* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-09 13:14 [PATCH 0/24] make atomic_read() behave consistently across all architectures Chris Snook
@ 2007-08-09 12:41 ` Arnd Bergmann
  2007-08-09 14:29   ` Chris Snook
  2007-08-14 22:31 ` Christoph Lameter
  1 sibling, 1 reply; 1546+ messages in thread
From: Arnd Bergmann @ 2007-08-09 12:41 UTC (permalink / raw)
  To: Chris Snook
  Cc: linux-kernel, linux-arch, torvalds, netdev, akpm, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl

On Thursday 09 August 2007, Chris Snook wrote:
> This patchset makes the behavior of atomic_read uniform by removing the
> volatile keyword from all atomic_t and atomic64_t definitions that currently
> have it, and instead explicitly casts the variable as volatile in
> atomic_read().  This leaves little room for creative optimization by the
> compiler, and is in keeping with the principles behind "volatile considered
> harmful".
> 

Just an idea: since all architectures already include asm-generic/atomic.h,
why not move the definitions of atomic_t and atomic64_t, as well as anything
that does not involve architecture specific inline assembly into the generic
header?

	Arnd <><

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* [PATCH 0/24] make atomic_read() behave consistently across all architectures
@ 2007-08-09 13:14 Chris Snook
  2007-08-09 12:41 ` Arnd Bergmann
  2007-08-14 22:31 ` Christoph Lameter
  0 siblings, 2 replies; 1546+ messages in thread
From: Chris Snook @ 2007-08-09 13:14 UTC (permalink / raw)
  To: linux-kernel, linux-arch, torvalds
  Cc: netdev, akpm, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl

As recent discussions[1] and bugs[2] have shown, there is a great deal of
confusion about the expected behavior of atomic_read(), compounded by the
fact that it is not the same on all architectures.  Since users expect calls
to atomic_read() to actually perform a read, it is not desirable to allow
the compiler to optimize this away.  Requiring the use of barrier() in this
case is inefficient, since we only want to re-load the atomic_t variable,
not everything else in scope.

This patchset makes the behavior of atomic_read uniform by removing the
volatile keyword from all atomic_t and atomic64_t definitions that currently
have it, and instead explicitly casts the variable as volatile in
atomic_read().  This leaves little room for creative optimization by the
compiler, and is in keeping with the principles behind "volatile considered
harmful".

Busy-waiters should still use cpu_relax(), but fast paths may be able to
reduce their use of barrier() between some atomic_read() calls.

	-- Chris

1)	http://lkml.org/lkml/2007/7/1/52
2)	http://lkml.org/lkml/2007/8/8/122


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-09 12:41 ` Arnd Bergmann
@ 2007-08-09 14:29   ` Chris Snook
  2007-08-09 15:30     ` Arnd Bergmann
  0 siblings, 1 reply; 1546+ messages in thread
From: Chris Snook @ 2007-08-09 14:29 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, linux-arch, torvalds, netdev, akpm, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl

Arnd Bergmann wrote:
> On Thursday 09 August 2007, Chris Snook wrote:
>> This patchset makes the behavior of atomic_read uniform by removing the
>> volatile keyword from all atomic_t and atomic64_t definitions that currently
>> have it, and instead explicitly casts the variable as volatile in
>> atomic_read().  This leaves little room for creative optimization by the
>> compiler, and is in keeping with the principles behind "volatile considered
>> harmful".
>>
> 
> Just an idea: since all architectures already include asm-generic/atomic.h,
> why not move the definitions of atomic_t and atomic64_t, as well as anything
> that does not involve architecture specific inline assembly into the generic
> header?
> 
> 	Arnd <><

a) chicken and egg: asm-generic/atomic.h depends on definitions in asm/atomic.h

If you can find a way to reshuffle the code and make it simpler, I personally am 
all for it.  I'm skeptical that you'll get much to show for the effort.

b) The definitions aren't precisely identical between all architectures, so it 
would be a mess of special cases, which gets us right back to where we are now.

	-- Chris


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-09 14:29   ` Chris Snook
@ 2007-08-09 15:30     ` Arnd Bergmann
  0 siblings, 0 replies; 1546+ messages in thread
From: Arnd Bergmann @ 2007-08-09 15:30 UTC (permalink / raw)
  To: Chris Snook
  Cc: linux-kernel, linux-arch, torvalds, netdev, akpm, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl

On Thursday 09 August 2007, Chris Snook wrote:
> a) chicken and egg: asm-generic/atomic.h depends on definitions in asm/atomic.h

Ok, I see.
 
> If you can find a way to reshuffle the code and make it simpler, I personally am 
> all for it. I'm skeptical that you'll get much to show for the effort. 

I guess it could be done using more macros or new headers, but I don't see
a way that would actually improve the situation.

> b) The definitions aren't precisely identical between all architectures, so it 
> would be a mess of special cases, which gets us right back to where we are now.

Why are they not identical? Anything beyond the 32/64 bit difference should
be the same afaics, or it might cause more bugs.

	Arnd <><


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-09 13:14 [PATCH 0/24] make atomic_read() behave consistently across all architectures Chris Snook
  2007-08-09 12:41 ` Arnd Bergmann
@ 2007-08-14 22:31 ` Christoph Lameter
  2007-08-14 22:45   ` Chris Snook
  2007-08-14 23:08   ` Satyam Sharma
  1 sibling, 2 replies; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-14 22:31 UTC (permalink / raw)
  To: Chris Snook
  Cc: linux-kernel, linux-arch, torvalds, netdev, akpm, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl

On Thu, 9 Aug 2007, Chris Snook wrote:

> This patchset makes the behavior of atomic_read uniform by removing the
> volatile keyword from all atomic_t and atomic64_t definitions that currently
> have it, and instead explicitly casts the variable as volatile in
> atomic_read().  This leaves little room for creative optimization by the
> compiler, and is in keeping with the principles behind "volatile considered
> harmful".

volatile is generally harmful even in atomic_read(). Barriers control
visibility and AFAICT things are fine.


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-14 22:31 ` Christoph Lameter
@ 2007-08-14 22:45   ` Chris Snook
  2007-08-14 22:51     ` Christoph Lameter
  2007-08-14 23:08   ` Satyam Sharma
  1 sibling, 1 reply; 1546+ messages in thread
From: Chris Snook @ 2007-08-14 22:45 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: linux-kernel, linux-arch, torvalds, netdev, akpm, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl

Christoph Lameter wrote:
> On Thu, 9 Aug 2007, Chris Snook wrote:
> 
>> This patchset makes the behavior of atomic_read uniform by removing the
>> volatile keyword from all atomic_t and atomic64_t definitions that currently
>> have it, and instead explicitly casts the variable as volatile in
>> atomic_read().  This leaves little room for creative optimization by the
>> compiler, and is in keeping with the principles behind "volatile considered
>> harmful".
> 
> volatile is generally harmful even in atomic_read(). Barriers control
> visibility and AFAICT things are fine.

But barriers force a flush of *everything* in scope, which we generally don't 
want.  On the other hand, we pretty much always want to flush atomic_* 
operations.  One way or another, we should be restricting the volatile behavior 
to the thing that needs it.  On most architectures, this patch set just moves 
that from the declaration, where it is considered harmful, to the use, where it 
is considered an occasional necessary evil.

See the resubmitted patchset, which also puts a cast in the atomic[64]_set 
operations.

	-- Chris


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-14 22:45   ` Chris Snook
@ 2007-08-14 22:51     ` Christoph Lameter
  0 siblings, 0 replies; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-14 22:51 UTC (permalink / raw)
  To: Chris Snook
  Cc: linux-kernel, linux-arch, torvalds, netdev, akpm, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl

On Tue, 14 Aug 2007, Chris Snook wrote:

> But barriers force a flush of *everything* in scope, which we generally don't
> want.  On the other hand, we pretty much always want to flush atomic_*
> operations.  One way or another, we should be restricting the volatile
> behavior to the thing that needs it.  On most architectures, this patch set
> just moves that from the declaration, where it is considered harmful, to the
> use, where it is considered an occasional necessary evil.

Then we would need

	atomic_read()

and

	atomic_read_volatile()

atomic_read_volatile() would imply an object sized memory barrier before 
and after?


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-14 23:08   ` Satyam Sharma
@ 2007-08-14 23:04     ` Chris Snook
  2007-08-14 23:14       ` Christoph Lameter
  2007-08-15  6:49         ` Herbert Xu
  2007-08-14 23:26     ` Paul E. McKenney
  2007-08-15 10:35     ` Stefan Richter
  2 siblings, 2 replies; 1546+ messages in thread
From: Chris Snook @ 2007-08-14 23:04 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, Linux Kernel Mailing List, linux-arch,
	torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

Satyam Sharma wrote:
> 
> On Tue, 14 Aug 2007, Christoph Lameter wrote:
> 
>> On Thu, 9 Aug 2007, Chris Snook wrote:
>>
>>> This patchset makes the behavior of atomic_read uniform by removing the
>>> volatile keyword from all atomic_t and atomic64_t definitions that currently
>>> have it, and instead explicitly casts the variable as volatile in
>>> atomic_read().  This leaves little room for creative optimization by the
>>> compiler, and is in keeping with the principles behind "volatile considered
>>> harmful".
>> volatile is generally harmful even in atomic_read(). Barriers control
>> visibility and AFAICT things are fine.
> 
> Frankly, I don't see the need for this series myself either. Personal
> opinion (others may differ), but I consider "volatile" to be a sad /
> unfortunate wart in C (numerous threads on this list and on the gcc
> lists/bugzilla over the years stand testimony to this) and if we _can_
> steer clear of it, then why not -- why use this ill-defined primitive
> whose implementation has often differed over compiler versions and
> platforms? Granted, barrier() _is_ heavy-handed in that it makes the
> optimizer forget _everything_, but then somebody did post a forget()
> macro on this thread itself ...
> 
> [ BTW, why do we want the compiler to not optimize atomic_read()'s in
>   the first place? Atomic ops guarantee atomicity, which has nothing
>   to do with "volatility" -- users that expect "volatility" from
>   atomic ops are the ones who must be fixed instead, IMHO. ]

Because atomic operations are generally used for synchronization, which requires 
volatile behavior.  Most such codepaths currently use an inefficient barrier(). 
  Some forget to and we get bugs, because people assume that atomic_read() 
actually reads something, and atomic_write() actually writes something.  Worse, 
these are architecture-specific, even compiler version-specific bugs that are 
often difficult to track down.

	-- Chris


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-14 22:31 ` Christoph Lameter
  2007-08-14 22:45   ` Chris Snook
@ 2007-08-14 23:08   ` Satyam Sharma
  2007-08-14 23:04     ` Chris Snook
                       ` (2 more replies)
  1 sibling, 3 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-14 23:08 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Chris Snook, Linux Kernel Mailing List, linux-arch, torvalds,
	netdev, Andrew Morton, ak, heiko.carstens, davem, schwidefsky,
	wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
	segher



On Tue, 14 Aug 2007, Christoph Lameter wrote:

> On Thu, 9 Aug 2007, Chris Snook wrote:
> 
> > This patchset makes the behavior of atomic_read uniform by removing the
> > volatile keyword from all atomic_t and atomic64_t definitions that currently
> > have it, and instead explicitly casts the variable as volatile in
> > atomic_read().  This leaves little room for creative optimization by the
> > compiler, and is in keeping with the principles behind "volatile considered
> > harmful".
> 
> volatile is generally harmful even in atomic_read(). Barriers control
> visibility and AFAICT things are fine.

Frankly, I don't see the need for this series myself either. Personal
opinion (others may differ), but I consider "volatile" to be a sad /
unfortunate wart in C (numerous threads on this list and on the gcc
lists/bugzilla over the years stand testimony to this) and if we _can_
steer clear of it, then why not -- why use this ill-defined primitive
whose implementation has often differed over compiler versions and
platforms? Granted, barrier() _is_ heavy-handed in that it makes the
optimizer forget _everything_, but then somebody did post a forget()
macro on this thread itself ...

[ BTW, why do we want the compiler to not optimize atomic_read()'s in
  the first place? Atomic ops guarantee atomicity, which has nothing
  to do with "volatility" -- users that expect "volatility" from
  atomic ops are the ones who must be fixed instead, IMHO. ]


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-14 23:04     ` Chris Snook
@ 2007-08-14 23:14       ` Christoph Lameter
  2007-08-15  6:49         ` Herbert Xu
  1 sibling, 0 replies; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-14 23:14 UTC (permalink / raw)
  To: Chris Snook
  Cc: Satyam Sharma, Linux Kernel Mailing List, linux-arch, torvalds,
	netdev, Andrew Morton, ak, heiko.carstens, davem, schwidefsky,
	wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
	segher

On Tue, 14 Aug 2007, Chris Snook wrote:

> Because atomic operations are generally used for synchronization, which
> requires volatile behavior.  Most such codepaths currently use an inefficient
> barrier().  Some forget to and we get bugs, because people assume that
> atomic_read() actually reads something, and atomic_write() actually writes
> something.  Worse, these are architecture-specific, even compiler
> version-specific bugs that are often difficult to track down.

Looks like we need to have lock and unlock semantics?

atomic_read()

which has no barrier or volatile implications.

atomic_read_for_lock

	Acquire semantics?


atomic_read_for_unlock

	Release semantics?



* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-14 23:08   ` Satyam Sharma
  2007-08-14 23:04     ` Chris Snook
@ 2007-08-14 23:26     ` Paul E. McKenney
  2007-08-15 10:35     ` Stefan Richter
  2 siblings, 0 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-14 23:26 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, torvalds, netdev, Andrew Morton, ak, heiko.carstens,
	davem, schwidefsky, wensong, horms, wjiang, cfriesen, zlynx,
	rpjday, jesper.juhl, segher

On Wed, Aug 15, 2007 at 04:38:54AM +0530, Satyam Sharma wrote:
> 
> 
> On Tue, 14 Aug 2007, Christoph Lameter wrote:
> 
> > On Thu, 9 Aug 2007, Chris Snook wrote:
> > 
> > > This patchset makes the behavior of atomic_read uniform by removing the
> > > volatile keyword from all atomic_t and atomic64_t definitions that currently
> > > have it, and instead explicitly casts the variable as volatile in
> > > atomic_read().  This leaves little room for creative optimization by the
> > > compiler, and is in keeping with the principles behind "volatile considered
> > > harmful".
> > 
> > volatile is generally harmful even in atomic_read(). Barriers control
> > visibility and AFAICT things are fine.
> 
> Frankly, I don't see the need for this series myself either. Personal
> opinion (others may differ), but I consider "volatile" to be a sad /
> unfortunate wart in C (numerous threads on this list and on the gcc
> lists/bugzilla over the years stand testimony to this) and if we _can_
> steer clear of it, then why not -- why use this ill-defined primitive
> whose implementation has often differed over compiler versions and
> platforms? Granted, barrier() _is_ heavy-handed in that it makes the
> optimizer forget _everything_, but then somebody did post a forget()
> macro on this thread itself ...
> 
> [ BTW, why do we want the compiler to not optimize atomic_read()'s in
>   the first place? Atomic ops guarantee atomicity, which has nothing
>   to do with "volatility" -- users that expect "volatility" from
>   atomic ops are the ones who must be fixed instead, IMHO. ]

Interactions between mainline code and interrupt/NMI handlers on the same
CPU (for example, when both are using per-CPU variables).  See examples
previously posted in this thread, or look at the rcu_read_lock() and
rcu_read_unlock() implementations in http://lkml.org/lkml/2007/8/7/280.

						Thanx, Paul


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-14 23:04     ` Chris Snook
@ 2007-08-15  6:49         ` Herbert Xu
  2007-08-15  6:49         ` Herbert Xu
  1 sibling, 0 replies; 1546+ messages in thread
From: Herbert Xu @ 2007-08-15  6:49 UTC (permalink / raw)
  To: Chris Snook
  Cc: satyam, clameter, linux-kernel, linux-arch, torvalds, netdev,
	akpm, ak, heiko.carstens, davem, schwidefsky, wensong, horms,
	wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher

Chris Snook <csnook@redhat.com> wrote:
> 
> Because atomic operations are generally used for synchronization, which requires 
> volatile behavior.  Most such codepaths currently use an inefficient barrier(). 
>  Some forget to and we get bugs, because people assume that atomic_read() 
> actually reads something, and atomic_write() actually writes something.  Worse, 
> these are architecture-specific, even compiler version-specific bugs that are 
> often difficult to track down.

I'm yet to see a single example from the current tree where
this patch series is the correct solution.  So far the only
example has been a buggy piece of code which has since been
fixed with a cpu_relax.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15  6:49         ` Herbert Xu
@ 2007-08-15  8:18         ` Heiko Carstens
  2007-08-15 13:53           ` Stefan Richter
  2007-08-16  0:39           ` [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert() Satyam Sharma
  -1 siblings, 2 replies; 1546+ messages in thread
From: Heiko Carstens @ 2007-08-15  8:18 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Chris Snook, satyam, clameter, linux-kernel, linux-arch, torvalds,
	netdev, akpm, ak, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Wed, Aug 15, 2007 at 02:49:03PM +0800, Herbert Xu wrote:
> Chris Snook <csnook@redhat.com> wrote:
> > 
> > Because atomic operations are generally used for synchronization, which requires 
> > volatile behavior.  Most such codepaths currently use an inefficient barrier(). 
> >  Some forget to and we get bugs, because people assume that atomic_read() 
> > actually reads something, and atomic_write() actually writes something.  Worse, 
> > these are architecture-specific, even compiler version-specific bugs that are 
> > often difficult to track down.
> 
> I'm yet to see a single example from the current tree where
> this patch series is the correct solution.  So far the only
> example has been a buggy piece of code which has since been
> fixed with a cpu_relax.

Btw.: we still have

include/asm-i386/mach-es7000/mach_wakecpu.h:  while (!atomic_read(deassert));
include/asm-i386/mach-default/mach_wakecpu.h: while (!atomic_read(deassert));

Looks like they need to be fixed as well.


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-14 23:08   ` Satyam Sharma
  2007-08-14 23:04     ` Chris Snook
  2007-08-14 23:26     ` Paul E. McKenney
@ 2007-08-15 10:35     ` Stefan Richter
  2007-08-15 12:04       ` Herbert Xu
                         ` (2 more replies)
  2 siblings, 3 replies; 1546+ messages in thread
From: Stefan Richter @ 2007-08-15 10:35 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, torvalds, netdev, Andrew Morton, ak, heiko.carstens,
	davem, schwidefsky, wensong, horms, wjiang, cfriesen, zlynx,
	rpjday, jesper.juhl, segher, Herbert Xu, Paul E. McKenney

Satyam Sharma wrote:
> [ BTW, why do we want the compiler to not optimize atomic_read()'s in
>   the first place? Atomic ops guarantee atomicity, which has nothing
>   to do with "volatility" -- users that expect "volatility" from
>   atomic ops are the ones who must be fixed instead, IMHO. ]

LDD3 says on page 125:  "The following operations are defined for the
type [atomic_t] and are guaranteed to be atomic with respect to all
processors of an SMP computer."

Doesn't "atomic WRT all processors" require volatility?
-- 
Stefan Richter
-=====-=-=== =--- -====
http://arcgraph.de/sr/


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 10:35     ` Stefan Richter
@ 2007-08-15 12:04       ` Herbert Xu
  2007-08-15 12:31       ` Satyam Sharma
  2007-08-15 19:59       ` Christoph Lameter
  2 siblings, 0 replies; 1546+ messages in thread
From: Herbert Xu @ 2007-08-15 12:04 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Satyam Sharma, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Paul E. McKenney

On Wed, Aug 15, 2007 at 12:35:31PM +0200, Stefan Richter wrote:
> 
> LDD3 says on page 125:  "The following operations are defined for the
> type [atomic_t] and are guaranteed to be atomic with respect to all
> processors of an SMP computer."
> 
> Doesn't "atomic WRT all processors" require volatility?

Not at all.  We also require this to be atomic without any
hint of volatility.

	extern int foo;
	foo = 4;

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 10:35     ` Stefan Richter
  2007-08-15 12:04       ` Herbert Xu
@ 2007-08-15 12:31       ` Satyam Sharma
  2007-08-15 13:08         ` Stefan Richter
                           ` (3 more replies)
  2007-08-15 19:59       ` Christoph Lameter
  2 siblings, 4 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-15 12:31 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher, Herbert Xu,
	Paul E. McKenney



On Wed, 15 Aug 2007, Stefan Richter wrote:

> Satyam Sharma wrote:
> > [ BTW, why do we want the compiler to not optimize atomic_read()'s in
> >   the first place? Atomic ops guarantee atomicity, which has nothing
> >   to do with "volatility" -- users that expect "volatility" from
> >   atomic ops are the ones who must be fixed instead, IMHO. ]
> 
> LDD3 says on page 125:  "The following operations are defined for the
> type [atomic_t] and are guaranteed to be atomic with respect to all
> processors of an SMP computer."
> 
> Doesn't "atomic WRT all processors" require volatility?

No, it definitely doesn't. Why should it?

"Atomic w.r.t. all processors" is just your normal, simple "atomicity"
for SMP systems (ensure that that object is modified / set / replaced
in main memory atomically) and has nothing to do with "volatile"
behaviour.

"Volatile behaviour" itself isn't consistently defined (at least
definitely not consistently implemented in various gcc versions across
platforms), but it is /expected/ to mean something like: "ensure that
every such access actually goes all the way to memory, and is not
re-ordered w.r.t. other accesses, as far as the compiler can take
care of these". The last "as far as compiler can take care" disclaimer
comes about due to CPUs doing their own re-ordering nowadays.

For example (say on i386):

(A)
$ cat tp1.c
int a;

void func(void)
{
	a = 10;
	a = 20;
}
$ gcc -Os -S tp1.c
$ cat tp1.s
...
movl    $20, a
...

(B)
$ cat tp2.c
volatile int a;

void func(void)
{
	a = 10;
	a = 20;
}
$ gcc -Os -S tp2.c
$ cat tp2.s
...
movl    $10, a
movl    $20, a
...

(C)
$ cat tp3.c
int a;

void func(void)
{
	*(volatile int *)&a = 10;
	*(volatile int *)&a = 20;
}
$ gcc -Os -S tp3.c
$ cat tp3.s
...
movl    $10, a
movl    $20, a
...

In (A) the compiler optimized "a = 10;" away, but the actual store
of the final value "20" to "a" was still "atomic". (B) and (C) also
exhibit "volatile" behaviour apart from the "atomicity".

But as others replied, it seems some callers out there depend upon
atomic ops exhibiting "volatile" behaviour as well, so that answers
my initial question, actually. I haven't looked at the code Paul
pointed me at, but I wonder if that "forget(x)" macro would help
those cases. I'd wish to avoid the "volatile" primitive, personally.


Satyam


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 12:31       ` Satyam Sharma
@ 2007-08-15 13:08         ` Stefan Richter
  2007-08-15 13:11           ` Stefan Richter
  2007-08-15 13:47           ` Satyam Sharma
  2007-08-15 18:31         ` Segher Boessenkool
                           ` (2 subsequent siblings)
  3 siblings, 2 replies; 1546+ messages in thread
From: Stefan Richter @ 2007-08-15 13:08 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher, Herbert Xu,
	Paul E. McKenney

Satyam Sharma wrote:
> On Wed, 15 Aug 2007, Stefan Richter wrote:
>> Doesn't "atomic WRT all processors" require volatility?
> 
> No, it definitely doesn't. Why should it?
> 
> "Atomic w.r.t. all processors" is just your normal, simple "atomicity"
> for SMP systems (ensure that that object is modified / set / replaced
> in main memory atomically) and has nothing to do with "volatile"
> behaviour.
> 
> "Volatile behaviour" itself isn't consistently defined (at least
> definitely not consistently implemented in various gcc versions across
> platforms), but it is /expected/ to mean something like: "ensure that
> every such access actually goes all the way to memory, and is not
> re-ordered w.r.t. other accesses, as far as the compiler can take
> care of these". The last "as far as compiler can take care" disclaimer
> comes about due to CPUs doing their own re-ordering nowadays.
> 
> For example (say on i386):

[...]

> In (A) the compiler optimized "a = 10;" away, but the actual store
> of the final value "20" to "a" was still "atomic". (B) and (C) also
> exhibit "volatile" behaviour apart from the "atomicity".
> 
> But as others replied, it seems some callers out there depend upon
> atomic ops exhibiting "volatile" behaviour as well, so that answers
> my initial question, actually. I haven't looked at the code Paul
> pointed me at, but I wonder if that "forget(x)" macro would help
> those cases. I'd wish to avoid the "volatile" primitive, personally.

So, looking at load instead of store, do I understand correctly that in
your opinion

	int b;

	b = atomic_read(&a);
	if (b)
		do_something_time_consuming();

	b = atomic_read(&a);
	if (b)
		do_something_more();

should be changed to explicitly forget(&a) after
do_something_time_consuming?

If so, how about the following:

static inline void A(atomic_t *a)
{
	int b = atomic_read(a);
	if (b)
		do_something_time_consuming();
}

static inline void B(atomic_t *a)
{
	int b = atomic_read(a);
	if (b)
		do_something_more();
}

static void C(atomic_t *a)
{
	A(a);
	B(b);
}

Would this need forget(a) after A(a)?

(Is the latter actually answered in C99 or is it compiler-dependent?)
-- 
Stefan Richter
-=====-=-=== =--- -====
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 13:08         ` Stefan Richter
@ 2007-08-15 13:11           ` Stefan Richter
  2007-08-15 13:47           ` Satyam Sharma
  1 sibling, 0 replies; 1546+ messages in thread
From: Stefan Richter @ 2007-08-15 13:11 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher, Herbert Xu,
	Paul E. McKenney

I wrote:
> static inline void A(atomic_t *a)
> {
> 	int b = atomic_read(a);
> 	if (b)
> 		do_something_time_consuming();
> }
> 
> static inline void B(atomic_t *a)
> {
> 	int b = atomic_read(a);
> 	if (b)
> 		do_something_more();
> }
> 
> static void C(atomic_t *a)
> {
> 	A(a);
> 	B(b);
	/* ^ typo */
	B(a);
> }
> 
> Would this need forget(a) after A(a)?
> 
> (Is the latter actually answered in C99 or is it compiler-dependent?)


-- 
Stefan Richter
-=====-=-=== =--- -====
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 13:08         ` Stefan Richter
  2007-08-15 13:11           ` Stefan Richter
@ 2007-08-15 13:47           ` Satyam Sharma
  2007-08-15 14:25             ` Paul E. McKenney
  1 sibling, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-15 13:47 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher, Herbert Xu,
	Paul E. McKenney



On Wed, 15 Aug 2007, Stefan Richter wrote:

> Satyam Sharma wrote:
> > On Wed, 15 Aug 2007, Stefan Richter wrote:
> >> Doesn't "atomic WRT all processors" require volatility?
> > 
> > No, it definitely doesn't. Why should it?
> > 
> > "Atomic w.r.t. all processors" is just your normal, simple "atomicity"
> > for SMP systems (ensure that that object is modified / set / replaced
> > in main memory atomically) and has nothing to do with "volatile"
> > behaviour.
> > 
> > "Volatile behaviour" itself isn't consistently defined (at least
> > definitely not consistently implemented in various gcc versions across
> > platforms), but it is /expected/ to mean something like: "ensure that
> > every such access actually goes all the way to memory, and is not
> > re-ordered w.r.t. to other accesses, as far as the compiler can take
> > care of these". The last "as far as compiler can take care" disclaimer
> > comes about due to CPUs doing their own re-ordering nowadays.
> > 
> > For example (say on i386):
> 
> [...]
> 
> > In (A) the compiler optimized "a = 10;" away, but the actual store
> > of the final value "20" to "a" was still "atomic". (B) and (C) also
> > exhibit "volatile" behaviour apart from the "atomicity".
> > 
> > But as others replied, it seems some callers out there depend upon
> > atomic ops exhibiting "volatile" behaviour as well, so that answers
> > my initial question, actually. I haven't looked at the code Paul
> > pointed me at, but I wonder if that "forget(x)" macro would help
> > those cases. I'd wish to avoid the "volatile" primitive, personally.
> 
> So, looking at load instead of store, do I understand correctly that in
> your opinion
> 
> 	int b;
> 
> 	b = atomic_read(&a);
> 	if (b)
> 		do_something_time_consuming();
> 
> 	b = atomic_read(&a);
> 	if (b)
> 		do_something_more();
> 
> should be changed to explicitly forget(&a) after
> do_something_time_consuming?

No, I'd actually prefer something like what Christoph Lameter suggested,
i.e. users (such as above) who want "volatile"-like behaviour from atomic
ops can use alternative functions. How about something like:

#define atomic_read_volatile(v)			\
	({					\
		forget(&(v)->counter);		\
		((v)->counter);			\
	})

Or possibly, implement these "volatile" atomic ops variants in inline asm
like the patch that Sebastian Siewior has submitted on another thread just
a while back.

Of course, if we find there are more callers in the kernel who want the
volatility behaviour than those who don't care, we can re-define the
existing ops to such variants, and re-name the existing definitions to
something else, say "atomic_read_nonvolatile" for all I care.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15  8:18         ` Heiko Carstens
@ 2007-08-15 13:53           ` Stefan Richter
  2007-08-15 14:35             ` Satyam Sharma
  2007-08-16  0:39           ` [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert() Satyam Sharma
  1 sibling, 1 reply; 1546+ messages in thread
From: Stefan Richter @ 2007-08-15 13:53 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Herbert Xu, Chris Snook, satyam, clameter, linux-kernel,
	linux-arch, torvalds, netdev, akpm, ak, davem, schwidefsky,
	wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
	segher

On 8/15/2007 10:18 AM, Heiko Carstens wrote:
> On Wed, Aug 15, 2007 at 02:49:03PM +0800, Herbert Xu wrote:
>> Chris Snook <csnook@redhat.com> wrote:
>> > 
>> > Because atomic operations are generally used for synchronization, which requires 
>> > volatile behavior.  Most such codepaths currently use an inefficient barrier(). 
>> >  Some forget to and we get bugs, because people assume that atomic_read() 
>> > actually reads something, and atomic_write() actually writes something.  Worse, 
>> > these are architecture-specific, even compiler version-specific bugs that are 
>> > often difficult to track down.
>> 
>> I'm yet to see a single example from the current tree where
>> this patch series is the correct solution.  So far the only
>> example has been a buggy piece of code which has since been
>> fixed with a cpu_relax.
> 
> Btw.: we still have
> 
> include/asm-i386/mach-es7000/mach_wakecpu.h:  while (!atomic_read(deassert));
> include/asm-i386/mach-default/mach_wakecpu.h: while (!atomic_read(deassert));
> 
> Looks like they need to be fixed as well.


I don't know if this here is affected:

/* drivers/ieee1394/ieee1394_core.h */
static inline unsigned int get_hpsb_generation(struct hpsb_host *host)
{
	return atomic_read(&host->generation);
}

/* drivers/ieee1394/nodemgr.c */
static int nodemgr_host_thread(void *__hi)
{
	[...]

	for (;;) {
		[... sleep until bus reset event ...]

		/* Pause for 1/4 second in 1/16 second intervals,
		 * to make sure things settle down. */
		g = get_hpsb_generation(host);
		for (i = 0; i < 4 ; i++) {
			if (msleep_interruptible(63) ||
			    kthread_should_stop())
				goto exit;

	/* Now get the generation in which the node ID's we collect
	 * are valid.  During the bus scan we will use this generation
	 * for the read transactions, so that if another reset occurs
	 * during the scan the transactions will fail instead of
	 * returning bogus data. */

			generation = get_hpsb_generation(host);

	/* If we get a reset before we are done waiting, then
	 * start the waiting over again */

			if (generation != g)
				g = generation, i = 0;
		}

		[... scan bus, using generation ...]

	}
exit:
[...]
}



-- 
Stefan Richter
-=====-=-=== =--- -====
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 13:47           ` Satyam Sharma
@ 2007-08-15 14:25             ` Paul E. McKenney
  2007-08-15 15:33               ` Herbert Xu
  2007-08-15 17:55               ` Satyam Sharma
  0 siblings, 2 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-15 14:25 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu

On Wed, Aug 15, 2007 at 07:17:29PM +0530, Satyam Sharma wrote:
> On Wed, 15 Aug 2007, Stefan Richter wrote:
> > Satyam Sharma wrote:
> > > On Wed, 15 Aug 2007, Stefan Richter wrote:
> > >> Doesn't "atomic WRT all processors" require volatility?
> > > 
> > > No, it definitely doesn't. Why should it?
> > > 
> > > "Atomic w.r.t. all processors" is just your normal, simple "atomicity"
> > > for SMP systems (ensure that that object is modified / set / replaced
> > > in main memory atomically) and has nothing to do with "volatile"
> > > behaviour.
> > > 
> > > "Volatile behaviour" itself isn't consistently defined (at least
> > > definitely not consistently implemented in various gcc versions across
> > > platforms), but it is /expected/ to mean something like: "ensure that
> > > every such access actually goes all the way to memory, and is not
> > > re-ordered w.r.t. to other accesses, as far as the compiler can take
> > > care of these". The last "as far as compiler can take care" disclaimer
> > > comes about due to CPUs doing their own re-ordering nowadays.
> > > 
> > > For example (say on i386):
> > 
> > [...]
> > 
> > > In (A) the compiler optimized "a = 10;" away, but the actual store
> > > of the final value "20" to "a" was still "atomic". (B) and (C) also
> > > exhibit "volatile" behaviour apart from the "atomicity".
> > > 
> > > But as others replied, it seems some callers out there depend upon
> > > atomic ops exhibiting "volatile" behaviour as well, so that answers
> > > my initial question, actually. I haven't looked at the code Paul
> > > pointed me at, but I wonder if that "forget(x)" macro would help
> > > those cases. I'd wish to avoid the "volatile" primitive, personally.
> > 
> > So, looking at load instead of store, do I understand correctly that in
> > your opinion
> > 
> > 	int b;
> > 
> > 	b = atomic_read(&a);
> > 	if (b)
> > 		do_something_time_consuming();
> > 
> > 	b = atomic_read(&a);
> > 	if (b)
> > 		do_something_more();
> > 
> > should be changed to explicitly forget(&a) after
> > do_something_time_consuming?
> 
> No, I'd actually prefer something like what Christoph Lameter suggested,
> i.e. users (such as above) who want "volatile"-like behaviour from atomic
> ops can use alternative functions. How about something like:
> 
> #define atomic_read_volatile(v)			\
> 	({					\
> 		forget(&(v)->counter);		\
> 		((v)->counter);			\
> 	})

Wouldn't the above "forget" the value, throw it away, then forget
that it forgot it, giving non-volatile semantics?

> Or possibly, implement these "volatile" atomic ops variants in inline asm
> like the patch that Sebastian Siewior has submitted on another thread just
> a while back.

Given that you are advocating a change (please keep in mind that
atomic_read() and atomic_set() had volatile semantics on almost all
platforms), care to give some example where these historical volatile
semantics are causing a problem?

> Of course, if we find there are more callers in the kernel who want the
> volatility behaviour than those who don't care, we can re-define the
> existing ops to such variants, and re-name the existing definitions to
> something else, say "atomic_read_nonvolatile" for all I care.

Do we really need another set of APIs?  Can you give even one example
where the pre-existing volatile semantics are causing enough of a problem
to justify adding yet more atomic_*() APIs?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 13:53           ` Stefan Richter
@ 2007-08-15 14:35             ` Satyam Sharma
  2007-08-15 14:52               ` Herbert Xu
  2007-08-15 19:58               ` Stefan Richter
  0 siblings, 2 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-15 14:35 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Heiko Carstens, Herbert Xu, Chris Snook, clameter,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Hi Stefan,


On Wed, 15 Aug 2007, Stefan Richter wrote:

> On 8/15/2007 10:18 AM, Heiko Carstens wrote:
> > On Wed, Aug 15, 2007 at 02:49:03PM +0800, Herbert Xu wrote:
> >> Chris Snook <csnook@redhat.com> wrote:
> >> > 
> >> > Because atomic operations are generally used for synchronization, which requires 
> >> > volatile behavior.  Most such codepaths currently use an inefficient barrier(). 
> >> >  Some forget to and we get bugs, because people assume that atomic_read() 
> >> > actually reads something, and atomic_write() actually writes something.  Worse, 
> >> > these are architecture-specific, even compiler version-specific bugs that are 
> >> > often difficult to track down.
> >> 
> >> I'm yet to see a single example from the current tree where
> >> this patch series is the correct solution.  So far the only
> >> example has been a buggy piece of code which has since been
> >> fixed with a cpu_relax.
> > 
> > Btw.: we still have
> > 
> > include/asm-i386/mach-es7000/mach_wakecpu.h:  while (!atomic_read(deassert));
> > include/asm-i386/mach-default/mach_wakecpu.h: while (!atomic_read(deassert));
> > 
> > Looks like they need to be fixed as well.
> 
> 
> I don't know if this here is affected:

Yes, I think it is. You're clearly expecting the read to actually happen
when you call get_hpsb_generation(). It's clearly not a busy-loop, so
cpu_relax() sounds like a pointless / wrong solution for this case, so I'm
now somewhat beginning to appreciate the motivation behind this series :-)

But as I said, there are ways to achieve the same goals of this series
without using "volatile".

I think I'll submit a RFC/patch or two on this myself (will also fix
the code pieces listed here).


> /* drivers/ieee1394/ieee1394_core.h */
> static inline unsigned int get_hpsb_generation(struct hpsb_host *host)
> {
> 	return atomic_read(&host->generation);
> }
> 
> /* drivers/ieee1394/nodemgr.c */
> static int nodemgr_host_thread(void *__hi)
> {
> 	[...]
> 
> 	for (;;) {
> 		[... sleep until bus reset event ...]
> 
> 		/* Pause for 1/4 second in 1/16 second intervals,
> 		 * to make sure things settle down. */
> 		g = get_hpsb_generation(host);
> 		for (i = 0; i < 4 ; i++) {
> 			if (msleep_interruptible(63) ||
> 			    kthread_should_stop())
> 				goto exit;

Totally unrelated, but this looks weird. IMHO you actually wanted:

	msleep_interruptible(63);
	if (kthread_should_stop())
		goto exit;

here, didn't you? Otherwise the thread will exit even when
kthread_should_stop() != TRUE (just because it received a signal),
and it is not good for a kthread to exit on its own if it uses
kthread_should_stop() or if some other piece of kernel code could
eventually call kthread_stop(tsk) on it.

Ok, probably the thread will never receive a signal in the first
place because it's spawned off kthreadd which ignores all signals
beforehand, but still ...

[PATCH] ieee1394: Fix kthread stopping in nodemgr_host_thread

The nodemgr host thread can exit on its own even when kthread_should_stop
is not true, on receiving a signal (might never happen in practice, as
it ignores signals). But considering kthread_stop() must not be mixed with
kthreads that can exit on their own, I think changing the code like this
is clearer. This change means the thread can cut its sleep short when it
receives a signal but looking at the code around, that sounds okay (and
again, it might never actually receive a signal in practice).

Signed-off-by: Satyam Sharma <satyam@infradead.org>

---

 drivers/ieee1394/nodemgr.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/ieee1394/nodemgr.c b/drivers/ieee1394/nodemgr.c
index 2ffd534..981a7da 100644
--- a/drivers/ieee1394/nodemgr.c
+++ b/drivers/ieee1394/nodemgr.c
@@ -1721,7 +1721,8 @@ static int nodemgr_host_thread(void *__hi)
 		 * to make sure things settle down. */
 		g = get_hpsb_generation(host);
 		for (i = 0; i < 4 ; i++) {
-			if (msleep_interruptible(63) || kthread_should_stop())
+			msleep_interruptible(63);
+			if (kthread_should_stop())
 				goto exit;
 
 			/* Now get the generation in which the node ID's we collect

^ permalink raw reply related	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 14:35             ` Satyam Sharma
@ 2007-08-15 14:52               ` Herbert Xu
  2007-08-15 16:09                 ` Stefan Richter
  2007-08-15 19:58               ` Stefan Richter
  1 sibling, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-15 14:52 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Stefan Richter, Heiko Carstens, Chris Snook, clameter,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote:
>
> > I don't know if this here is affected:
> 
> Yes, I think it is. You're clearly expecting the read to actually happen
> when you call get_hpsb_generation(). It's clearly not a busy-loop, so
> cpu_relax() sounds pointless / wrong solution for this case, so I'm now
> somewhat beginning to appreciate the motivation behind this series :-)

Nope, we're calling schedule which is a rather heavy-weight
barrier.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 14:25             ` Paul E. McKenney
@ 2007-08-15 15:33               ` Herbert Xu
  2007-08-15 16:08                 ` Paul E. McKenney
  2007-08-15 18:19                 ` David Howells
  2007-08-15 17:55               ` Satyam Sharma
  1 sibling, 2 replies; 1546+ messages in thread
From: Herbert Xu @ 2007-08-15 15:33 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Satyam Sharma, Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher

On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote:
> 
> Do we really need another set of APIs?  Can you give even one example
> where the pre-existing volatile semantics are causing enough of a problem
> to justify adding yet more atomic_*() APIs?

Let's turn this around.  Can you give a single example where
the volatile semantics is needed in a legitimate way?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 15:33               ` Herbert Xu
@ 2007-08-15 16:08                 ` Paul E. McKenney
  2007-08-15 17:18                   ` Satyam Sharma
  2007-08-15 18:19                 ` David Howells
  1 sibling, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-15 16:08 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Satyam Sharma, Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher

On Wed, Aug 15, 2007 at 11:33:36PM +0800, Herbert Xu wrote:
> On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote:
> > 
> > Do we really need another set of APIs?  Can you give even one example
> > where the pre-existing volatile semantics are causing enough of a problem
> > to justify adding yet more atomic_*() APIs?
> 
> Let's turn this around.  Can you give a single example where
> the volatile semantics is needed in a legitimate way?

Sorry, but you are the one advocating for the change.

Nice try, though!  ;-)

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 14:52               ` Herbert Xu
@ 2007-08-15 16:09                 ` Stefan Richter
  2007-08-15 16:27                   ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Stefan Richter @ 2007-08-15 16:09 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Satyam Sharma, Heiko Carstens, Chris Snook, clameter,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Herbert Xu wrote:
> On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote:
>>> I don't know if this here is affected:

[...something like]
	b = atomic_read(a);
	for (i = 0; i < 4; i++) {
		msleep_interruptible(63);
		c = atomic_read(a);
		if (c != b) {
			b = c;
			i = 0;
		}
	}

> Nope, we're calling schedule which is a rather heavy-weight
> barrier.

How does the compiler know that msleep() has got barrier()s?
-- 
Stefan Richter
-=====-=-=== =--- -====
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15  6:49         ` Herbert Xu
  (?)
  (?)
@ 2007-08-15 16:13         ` Chris Snook
  2007-08-15 23:40           ` Herbert Xu
  -1 siblings, 1 reply; 1546+ messages in thread
From: Chris Snook @ 2007-08-15 16:13 UTC (permalink / raw)
  To: Herbert Xu
  Cc: satyam, clameter, linux-kernel, linux-arch, torvalds, netdev,
	akpm, ak, heiko.carstens, davem, schwidefsky, wensong, horms,
	wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher

Herbert Xu wrote:
> Chris Snook <csnook@redhat.com> wrote:
>> Because atomic operations are generally used for synchronization, which requires 
>> volatile behavior.  Most such codepaths currently use an inefficient barrier(). 
>>  Some forget to and we get bugs, because people assume that atomic_read() 
>> actually reads something, and atomic_write() actually writes something.  Worse, 
>> these are architecture-specific, even compiler version-specific bugs that are 
>> often difficult to track down.
> 
> I'm yet to see a single example from the current tree where
> this patch series is the correct solution.  So far the only
> example has been a buggy piece of code which has since been
> fixed with a cpu_relax.

Part of the motivation here is to fix heisenbugs.  If I knew where they 
were, I'd be posting patches for them.  Unlike most bugs, where we want 
to expose them as obviously as possible, these can be extremely 
difficult to track down, and are often due to people assuming that the 
atomic_* operations have the same semantics they've historically had. 
Remember that until recently, all SMP architectures except s390 (which 
very few kernel developers outside of IBM, Red Hat, and SuSE do much 
work on) had volatile declarations for atomic_t.  Removing the volatile 
declarations from i386 and x86_64 may have created heisenbugs that won't 
manifest themselves until GCC 6.0 comes out and people start compiling 
kernels with -O5.  We should have consistent semantics for atomic_* 
operations.

The other motivation is to reduce the need for the barriers used to 
prevent/fix such problems which clobber all your registers, and instead 
force atomic_* operations to behave in the way they're actually used. 
After the (resubmitted) patchset is merged, we'll be able to remove a 
whole bunch of barriers, shrinking our source and our binaries, and 
improving performance.

	-- Chris

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 16:09                 ` Stefan Richter
@ 2007-08-15 16:27                   ` Paul E. McKenney
  2007-08-15 17:13                     ` Satyam Sharma
  2007-08-15 18:31                     ` Segher Boessenkool
  0 siblings, 2 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-15 16:27 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Herbert Xu, Satyam Sharma, Heiko Carstens, Chris Snook, clameter,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Wed, Aug 15, 2007 at 06:09:35PM +0200, Stefan Richter wrote:
> Herbert Xu wrote:
> > On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote:
> >>> I don't know if this here is affected:
> 
> [...something like]
> 	b = atomic_read(a);
> 	for (i = 0; i < 4; i++) {
> 		msleep_interruptible(63);
> 		c = atomic_read(a);
> 		if (c != b) {
> 			b = c;
> 			i = 0;
> 		}
> 	}
> 
> > Nope, we're calling schedule which is a rather heavy-weight
> > barrier.
> 
> How does the compiler know that msleep() has got barrier()s?

Because msleep_interruptible() is in a separate compilation unit,
the compiler has to assume that it might modify any arbitrary global.
In many cases, the compiler also has to assume that msleep_interruptible()
might call back into a function in the current compilation unit, thus
possibly modifying global static variables.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 16:27                   ` Paul E. McKenney
@ 2007-08-15 17:13                     ` Satyam Sharma
  2007-08-15 18:31                     ` Segher Boessenkool
  1 sibling, 0 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-15 17:13 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Stefan Richter, Herbert Xu, Heiko Carstens, Chris Snook, clameter,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher



On Wed, 15 Aug 2007, Paul E. McKenney wrote:

> On Wed, Aug 15, 2007 at 06:09:35PM +0200, Stefan Richter wrote:
> > Herbert Xu wrote:
> > > On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote:
> > >>> I don't know if this here is affected:
> > 
> > [...something like]
> > 	b = atomic_read(a);
> > 	for (i = 0; i < 4; i++) {
> > 		msleep_interruptible(63);
> > 		c = atomic_read(a);
> > 		if (c != b) {
> > 			b = c;
> > 			i = 0;
> > 		}
> > 	}
> > 
> > > Nope, we're calling schedule which is a rather heavy-weight
> > > barrier.
> > 
> > How does the compiler know that msleep() has got barrier()s?
> 
> Because msleep_interruptible() is in a separate compilation unit,
> the compiler has to assume that it might modify any arbitrary global.
> In many cases, the compiler also has to assume that msleep_interruptible()
> might call back into a function in the current compilation unit, thus
> possibly modifying global static variables.

Yup, I've just verified this with a testcase. So a call to any function
outside of the current compilation unit acts as a compiler barrier. Cool.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 16:08                 ` Paul E. McKenney
@ 2007-08-15 17:18                   ` Satyam Sharma
  2007-08-15 17:33                     ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-15 17:18 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Herbert Xu, Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher



On Wed, 15 Aug 2007, Paul E. McKenney wrote:

> On Wed, Aug 15, 2007 at 11:33:36PM +0800, Herbert Xu wrote:
> > On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote:
> > > 
> > > Do we really need another set of APIs?  Can you give even one example
> > > where the pre-existing volatile semantics are causing enough of a problem
> > > to justify adding yet more atomic_*() APIs?
> > 
> > Let's turn this around.  Can you give a single example where
> > the volatile semantics is needed in a legitimate way?
> 
> Sorry, but you are the one advocating for the change.

Not for i386 and x86_64 -- those have atomic ops without any "volatile"
semantics (currently as per existing definitions).

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 17:18                   ` Satyam Sharma
@ 2007-08-15 17:33                     ` Paul E. McKenney
  2007-08-15 18:05                       ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-15 17:33 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Herbert Xu, Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher

On Wed, Aug 15, 2007 at 10:48:28PM +0530, Satyam Sharma wrote:
> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> > On Wed, Aug 15, 2007 at 11:33:36PM +0800, Herbert Xu wrote:
> > > On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote:
> > > > 
> > > > Do we really need another set of APIs?  Can you give even one example
> > > > where the pre-existing volatile semantics are causing enough of a problem
> > > > to justify adding yet more atomic_*() APIs?
> > > 
> > > Let's turn this around.  Can you give a single example where
> > > the volatile semantics is needed in a legitimate way?
> > 
> > Sorry, but you are the one advocating for the change.
> 
> Not for i386 and x86_64 -- those have atomic ops without any "volatile"
> semantics (currently as per existing definitions).

I claim unit volumes with arm, and the majority of the architectures, but
I cannot deny the popularity of i386 and x86_64 with many developers.  ;-)

However, I am not aware of code in the kernel that would benefit
from the compiler coalescing multiple atomic_set() and atomic_read()
invocations, thus I don't see the downside to volatility in this case.
Are there some performance-critical code fragments that I am missing?

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 14:25             ` Paul E. McKenney
  2007-08-15 15:33               ` Herbert Xu
@ 2007-08-15 17:55               ` Satyam Sharma
  2007-08-15 19:07                 ` Paul E. McKenney
  2007-08-15 20:58                 ` Segher Boessenkool
  1 sibling, 2 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-15 17:55 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu

Hi Paul,


On Wed, 15 Aug 2007, Paul E. McKenney wrote:

> On Wed, Aug 15, 2007 at 07:17:29PM +0530, Satyam Sharma wrote:
> > [...]
> > No, I'd actually prefer something like what Christoph Lameter suggested,
> > i.e. users (such as above) who want "volatile"-like behaviour from atomic
> > ops can use alternative functions. How about something like:
> > 
> > #define atomic_read_volatile(v)			\
> > 	({					\
> > 		forget(&(v)->counter);		\
> > 		((v)->counter);			\
> > 	})
> 
> Wouldn't the above "forget" the value, throw it away, then forget
> that it forgot it, giving non-volatile semantics?

Nope, I don't think so. I wrote the following trivial testcases:
[ See especially tp4.c and tp4.s (last example). ]

==============================================================================
$ cat tp1.c # Using volatile access casts

#define atomic_read(a)	(*(volatile int *)&a)

int a;

void func(void)
{
	a = 0;
	while (atomic_read(a))
		;
}
==============================================================================
$ gcc -Os -S tp1.c; cat tp1.s

func:
	pushl	%ebp
	movl	%esp, %ebp
	movl	$0, a
.L2:
	movl	a, %eax
	testl	%eax, %eax
	jne	.L2
	popl	%ebp
	ret
==============================================================================
$ cat tp2.c # Using nothing; gcc will optimize the whole loop away

#define forget(x)

#define atomic_read(a)		\
	({			\
		forget(&(a));	\
		(a);		\
	})

int a;

void func(void)
{
	a = 0;
	while (atomic_read(a))
		;
}
==============================================================================
$ gcc -Os -S tp2.c; cat tp2.s

func:
	pushl	%ebp
	movl	%esp, %ebp
	popl	%ebp
	movl	$0, a
	ret
==============================================================================
$ cat tp3.c # Using a full memory clobber barrier

#define forget(x)	asm volatile ("":::"memory")

#define atomic_read(a)		\
	({			\
		forget(&(a));	\
		(a);		\
	})

int a;

void func(void)
{
	a = 0;
	while (atomic_read(a))
		;
}
==============================================================================
$ gcc -Os -S tp3.c; cat tp3.s

func:
	pushl	%ebp
	movl	%esp, %ebp
	movl	$0, a
.L2:
	cmpl	$0, a
	jne	.L2
	popl	%ebp
	ret
==============================================================================
$ cat tp4.c # Using a forget(var) macro

#define forget(a)	__asm__ __volatile__ ("" :"=m" (a) :"m" (a))

#define atomic_read(a)		\
	({			\
		forget(a);	\
		(a);		\
	})

int a;

void func(void)
{
	a = 0;
	while (atomic_read(a))
		;
}
==============================================================================
$ gcc -Os -S tp4.c; cat tp4.s

func:
	pushl	%ebp
	movl	%esp, %ebp
	movl	$0, a
.L2:
	cmpl	$0, a
	jne	.L2
	popl	%ebp
	ret
==============================================================================


Possibly these were too trivial to expose any potential problems that you
may have been referring to, so would be helpful if you could write a more
concrete example / sample code.


> > Or possibly, implement these "volatile" atomic ops variants in inline asm
> > like the patch that Sebastian Siewior has submitted on another thread just
> > a while back.
> 
> Given that you are advocating a change (please keep in mind that
> atomic_read() and atomic_set() had volatile semantics on almost all
> platforms), care to give some example where these historical volatile
> semantics are causing a problem?
> [...]
> Can you give even one example
> where the pre-existing volatile semantics are causing enough of a problem
> to justify adding yet more atomic_*() APIs?

Will take this to the other sub-thread ...


> > Of course, if we find there are more callers in the kernel who want the
> > volatility behaviour than those who don't care, we can re-define the
> > existing ops to such variants, and re-name the existing definitions to
> > something else, say "atomic_read_nonvolatile" for all I care.
> 
> Do we really need another set of APIs?

Well, if there's one set of users who do care about volatile behaviour,
and another set that doesn't, it only sounds correct to provide both
those APIs, instead of forcing one behaviour on all users.


Thanks,
Satyam


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 17:33                     ` Paul E. McKenney
@ 2007-08-15 18:05                       ` Satyam Sharma
  0 siblings, 0 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-15 18:05 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Herbert Xu, Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher



On Wed, 15 Aug 2007, Paul E. McKenney wrote:

> On Wed, Aug 15, 2007 at 10:48:28PM +0530, Satyam Sharma wrote:
> > [...]
> > Not for i386 and x86_64 -- those have atomic ops without any "volatile"
> > semantics (currently as per existing definitions).
> 
> I claim unit volumes with arm, and the majority of the architectures, but
> I cannot deny the popularity of i386 and x86_64 with many developers.  ;-)

Hmm, does arm really need that "volatile int counter;"? Hopefully RMK will
take a patch removing that "volatile" ... ;-)


> However, I am not aware of code in the kernel that would benefit
> from the compiler coalescing multiple atomic_set() and atomic_read()
> invocations, thus I don't see the downside to volatility in this case.
> Are there some performance-critical code fragments that I am missing?

I don't know, and yes, code with multiple atomic_set's and atomic_read's
getting optimized or coalesced does sound strange to start with. Anyway,
I'm not against "volatile semantics" per se. As replied elsewhere, I do
appreciate the motivation behind this series (to _avoid_ gotchas, not to
fix existing ones). Just that I'd like to avoid using "volatile", for
aforementioned reasons, especially given that there are perfectly
reasonable alternatives to achieve the same desired behaviour.


Satyam


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 15:33               ` Herbert Xu
  2007-08-15 16:08                 ` Paul E. McKenney
@ 2007-08-15 18:19                 ` David Howells
  2007-08-15 18:45                   ` Paul E. McKenney
  1 sibling, 1 reply; 1546+ messages in thread
From: David Howells @ 2007-08-15 18:19 UTC (permalink / raw)
  To: Herbert Xu
  Cc: dhowells, Paul E. McKenney, Satyam Sharma, Stefan Richter,
	Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Herbert Xu <herbert@gondor.apana.org.au> wrote:

> Let's turn this around.  Can you give a single example where
> the volatile semantics is needed in a legitimate way?

Accessing H/W registers?  But apart from that...

David


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 16:27                   ` Paul E. McKenney
  2007-08-15 17:13                     ` Satyam Sharma
@ 2007-08-15 18:31                     ` Segher Boessenkool
  2007-08-15 18:57                       ` Paul E. McKenney
  1 sibling, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-15 18:31 UTC (permalink / raw)
  To: paulmck
  Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
	rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
	linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
	Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang

>> How does the compiler know that msleep() has got barrier()s?
>
> Because msleep_interruptible() is in a separate compilation unit,
> the compiler has to assume that it might modify any arbitrary global.

No; compilation units have nothing to do with it, GCC can optimise
across compilation unit boundaries just fine, if you tell it to
compile more than one compilation unit at once.

What you probably mean is that the compiler has to assume any code
it cannot currently see can do anything (insofar as allowed by the
relevant standards etc.)

> In many cases, the compiler also has to assume that 
> msleep_interruptible()
> might call back into a function in the current compilation unit, thus
> possibly modifying global static variables.

It most often is smart enough to see what compilation-unit-local
variables might be modified that way, though :-)


Segher



* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 12:31       ` Satyam Sharma
  2007-08-15 13:08         ` Stefan Richter
@ 2007-08-15 18:31         ` Segher Boessenkool
  2007-08-15 19:40           ` Satyam Sharma
  2007-08-15 23:22         ` Paul Mackerras
  2007-08-16  3:37         ` Bill Fink
  3 siblings, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-15 18:31 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
	Linux Kernel Mailing List, Paul E. McKenney, netdev, ak, cfriesen,
	rpjday, jesper.juhl, linux-arch, Andrew Morton, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
	wensong, wjiang

> "Volatile behaviour" itself isn't consistently defined (at least
> definitely not consistently implemented in various gcc versions across
> platforms),

It should be consistent across platforms; if not, file a bug please.

> but it is /expected/ to mean something like: "ensure that
> every such access actually goes all the way to memory, and is not
> re-ordered w.r.t. to other accesses, as far as the compiler can take
> care of these". The last "as far as compiler can take care" disclaimer
> comes about due to CPUs doing their own re-ordering nowadays.

You can *expect* whatever you want, but this isn't in line with
reality at all.

volatile _does not_ make accesses go all the way to memory.
volatile _does not_ prevent reordering wrt other accesses.

What volatile does are a) never optimise away a read (or write)
to the object, since the data can change in ways the compiler
cannot see; and b) never move stores to the object across a
sequence point.  This does not mean other accesses cannot be
reordered wrt the volatile access.

If the abstract machine would do an access to a volatile-
qualified object, the generated machine code will do that
access too.  But, for example, it can still be optimised
away by the compiler, if it can prove it is allowed to.

If you want stuff to go all the way to memory, you need some
architecture-specific flush sequence; to make a store globally
visible before another store, you need mb(); before some following
read, you need mb(); to prevent reordering you need a barrier.


Segher



* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 18:19                 ` David Howells
@ 2007-08-15 18:45                   ` Paul E. McKenney
  2007-08-15 23:41                     ` Herbert Xu
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-15 18:45 UTC (permalink / raw)
  To: David Howells
  Cc: Herbert Xu, Satyam Sharma, Stefan Richter, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Wed, Aug 15, 2007 at 07:19:57PM +0100, David Howells wrote:
> Herbert Xu <herbert@gondor.apana.org.au> wrote:
> 
> > Let's turn this around.  Can you give a single example where
> > the volatile semantics is needed in a legitimate way?
> 
> Accessing H/W registers?  But apart from that...

Communicating between process context and interrupt/NMI handlers using
per-CPU variables.

							Thanx, Paul


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 18:31                     ` Segher Boessenkool
@ 2007-08-15 18:57                       ` Paul E. McKenney
  2007-08-15 19:54                         ` Satyam Sharma
  2007-08-15 21:05                         ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Segher Boessenkool
  0 siblings, 2 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-15 18:57 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
	rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
	linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
	Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang

On Wed, Aug 15, 2007 at 08:31:25PM +0200, Segher Boessenkool wrote:
> >>How does the compiler know that msleep() has got barrier()s?
> >
> >Because msleep_interruptible() is in a separate compilation unit,
> >the compiler has to assume that it might modify any arbitrary global.
> 
> No; compilation units have nothing to do with it, GCC can optimise
> across compilation unit boundaries just fine, if you tell it to
> compile more than one compilation unit at once.

Last I checked, the Linux kernel build system did compile each .c file
as a separate compilation unit.

> What you probably mean is that the compiler has to assume any code
> it cannot currently see can do anything (insofar as allowed by the
> relevant standards etc.)

Indeed.

> >In many cases, the compiler also has to assume that 
> >msleep_interruptible()
> >might call back into a function in the current compilation unit, thus
> >possibly modifying global static variables.
> 
> It most often is smart enough to see what compilation-unit-local
> variables might be modified that way, though :-)

Yep.  For example, if it knows the current value of a given such local
variable, and if all code paths that would change some other variable
cannot be reached given that current value of the first variable.
At least given that gcc doesn't know about multiple threads of execution!

							Thanx, Paul


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 17:55               ` Satyam Sharma
@ 2007-08-15 19:07                 ` Paul E. McKenney
  2007-08-15 21:07                   ` Segher Boessenkool
  2007-08-15 20:58                 ` Segher Boessenkool
  1 sibling, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-15 19:07 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu

On Wed, Aug 15, 2007 at 11:25:05PM +0530, Satyam Sharma wrote:
> Hi Paul,
> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> 
> > On Wed, Aug 15, 2007 at 07:17:29PM +0530, Satyam Sharma wrote:
> > > [...]
> > > No, I'd actually prefer something like what Christoph Lameter suggested,
> > > i.e. users (such as above) who want "volatile"-like behaviour from atomic
> > > ops can use alternative functions. How about something like:
> > > 
> > > #define atomic_read_volatile(v)			\
> > > 	({					\
> > > 		forget(&(v)->counter);		\
> > > 		((v)->counter);			\
> > > 	})
> > 
> > Wouldn't the above "forget" the value, throw it away, then forget
> > that it forgot it, giving non-volatile semantics?
> 
> Nope, I don't think so. I wrote the following trivial testcases:
> [ See especially tp4.c and tp4.s (last example). ]

Right.  I should have said "wouldn't the compiler be within its rights
to forget the value, throw it away, then forget that it forgot it".
The value coming out of the #define above is an unadorned ((v)->counter),
which has no volatile semantics.

> ==============================================================================
> $ cat tp1.c # Using volatile access casts
> 
> #define atomic_read(a)	(*(volatile int *)&a)
> 
> int a;
> 
> void func(void)
> {
> 	a = 0;
> 	while (atomic_read(a))
> 		;
> }
> ==============================================================================
> $ gcc -Os -S tp1.c; cat tp1.s
> 
> func:
> 	pushl	%ebp
> 	movl	%esp, %ebp
> 	movl	$0, a
> .L2:
> 	movl	a, %eax
> 	testl	%eax, %eax
> 	jne	.L2
> 	popl	%ebp
> 	ret
> ==============================================================================
> $ cat tp2.c # Using nothing; gcc will optimize the whole loop away
> 
> #define forget(x)
> 
> #define atomic_read(a)		\
> 	({			\
> 		forget(&(a));	\
> 		(a);		\
> 	})
> 
> int a;
> 
> void func(void)
> {
> 	a = 0;
> 	while (atomic_read(a))
> 		;
> }
> ==============================================================================
> $ gcc -Os -S tp2.c; cat tp2.s
> 
> func:
> 	pushl	%ebp
> 	movl	%esp, %ebp
> 	popl	%ebp
> 	movl	$0, a
> 	ret
> ==============================================================================
> $ cat tp3.c # Using a full memory clobber barrier
> 
> #define forget(x)	asm volatile ("":::"memory")
> 
> #define atomic_read(a)		\
> 	({			\
> 		forget(&(a));	\
> 		(a);		\
> 	})
> 
> int a;
> 
> void func(void)
> {
> 	a = 0;
> 	while (atomic_read(a))
> 		;
> }
> ==============================================================================
> $ gcc -Os -S tp3.c; cat tp3.s
> 
> func:
> 	pushl	%ebp
> 	movl	%esp, %ebp
> 	movl	$0, a
> .L2:
> 	cmpl	$0, a
> 	jne	.L2
> 	popl	%ebp
> 	ret
> ==============================================================================
> $ cat tp4.c # Using a forget(var) macro
> 
> #define forget(a)	__asm__ __volatile__ ("" :"=m" (a) :"m" (a))
> 
> #define atomic_read(a)		\
> 	({			\
> 		forget(a);	\
> 		(a);		\
> 	})
> 
> int a;
> 
> void func(void)
> {
> 	a = 0;
> 	while (atomic_read(a))
> 		;
> }
> ==============================================================================
> $ gcc -Os -S tp4.c; cat tp4.s
> 
> func:
> 	pushl	%ebp
> 	movl	%esp, %ebp
> 	movl	$0, a
> .L2:
> 	cmpl	$0, a
> 	jne	.L2
> 	popl	%ebp
> 	ret
> ==============================================================================
> 
> Possibly these were too trivial to expose any potential problems that you
> may have been referring to, so would be helpful if you could write a more
> concrete example / sample code.

The trick is to have a sufficiently complicated expression to force
the compiler to run out of registers.  If the value is non-volatile,
it will refetch it (and expect it not to have changed, possibly being
disappointed by an interrupt handler running on that same CPU).

> > > Or possibly, implement these "volatile" atomic ops variants in inline asm
> > > like the patch that Sebastian Siewior has submitted on another thread just
> > > a while back.
> > 
> > Given that you are advocating a change (please keep in mind that
> > atomic_read() and atomic_set() had volatile semantics on almost all
> > platforms), care to give some example where these historical volatile
> > semantics are causing a problem?
> > [...]
> > Can you give even one example
> > where the pre-existing volatile semantics are causing enough of a problem
> > to justify adding yet more atomic_*() APIs?
> 
> Will take this to the other sub-thread ...

OK.

> > > Of course, if we find there are more callers in the kernel who want the
> > > volatility behaviour than those who don't care, we can re-define the
> > > existing ops to such variants, and re-name the existing definitions to
> > > something else, say "atomic_read_nonvolatile" for all I care.
> > 
> > Do we really need another set of APIs?
> 
> Well, if there's one set of users who do care about volatile behaviour,
> and another set that doesn't, it only sounds correct to provide both
> those APIs, instead of forcing one behaviour on all users.

Well, if the second set doesn't care, they should be OK with the volatile
behavior in this case.

							Thanx, Paul


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 18:31         ` Segher Boessenkool
@ 2007-08-15 19:40           ` Satyam Sharma
  2007-08-15 20:42             ` Segher Boessenkool
  0 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-15 19:40 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
	Linux Kernel Mailing List, Paul E. McKenney, netdev, ak, cfriesen,
	rpjday, jesper.juhl, linux-arch, Andrew Morton, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
	wensong, wjiang



On Wed, 15 Aug 2007, Segher Boessenkool wrote:

> > "Volatile behaviour" itself isn't consistently defined (at least
> > definitely not consistently implemented in various gcc versions across
> > platforms),
> 
> It should be consistent across platforms; if not, file a bug please.
> 
> > but it is /expected/ to mean something like: "ensure that
> > every such access actually goes all the way to memory, and is not
> > re-ordered w.r.t. to other accesses, as far as the compiler can take
                              ^
                              (volatile)

(or, alternatively, "other accesses to the same volatile object" ...)

> > care of these". The last "as far as compiler can take care" disclaimer
> > comes about due to CPUs doing their own re-ordering nowadays.
> 
> You can *expect* whatever you want, but this isn't in line with
> reality at all.
> 
> volatile _does not_ prevent reordering wrt other accesses.
> [...]
> What volatile does are a) never optimise away a read (or write)
> to the object, since the data can change in ways the compiler
> cannot see; and b) never move stores to the object across a
> sequence point.  This does not mean other accesses cannot be
> reordered wrt the volatile access.
> 
> If the abstract machine would do an access to a volatile-
> qualified object, the generated machine code will do that
> access too.  But, for example, it can still be optimised
> away by the compiler, if it can prove it is allowed to.

As (now) indicated above, I had meant multiple volatile accesses to
the same object, obviously.

BTW:

#define atomic_read(a)	(*(volatile int *)&(a))
#define atomic_set(a,i)	(*(volatile int *)&(a) = (i))

int a;

void func(void)
{
	int b;

	b = atomic_read(a);
	atomic_set(a, 20);
	b = atomic_read(a);
}

gives:

func:
	pushl	%ebp
	movl	a, %eax
	movl	%esp, %ebp
	movl	$20, a
	movl	a, %eax
	popl	%ebp
	ret

so the first atomic_read() wasn't optimized away.


> volatile _does not_ make accesses go all the way to memory.
> [...]
> If you want stuff to go all the way to memory, you need some
> architecture-specific flush sequence; to make a store globally
> visible before another store, you need mb(); before some following
> read, you need mb(); to prevent reordering you need a barrier.

Sure, which explains the "as far as the compiler can take care" bit.
Poor phrase / choice of words, probably.


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 18:57                       ` Paul E. McKenney
@ 2007-08-15 19:54                         ` Satyam Sharma
  2007-08-15 20:17                           ` Paul E. McKenney
  2007-08-15 20:47                           ` Segher Boessenkool
  2007-08-15 21:05                         ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Segher Boessenkool
  1 sibling, 2 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-15 19:54 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Segher Boessenkool, horms, Stefan Richter,
	Linux Kernel Mailing List, rpjday, netdev, ak, cfriesen,
	Heiko Carstens, jesper.juhl, linux-arch, Andrew Morton, zlynx,
	clameter, schwidefsky, Chris Snook, Herbert Xu, davem,
	Linus Torvalds, wensong, wjiang

[ The Cc: list scares me. Should probably trim it. ]


On Wed, 15 Aug 2007, Paul E. McKenney wrote:

> On Wed, Aug 15, 2007 at 08:31:25PM +0200, Segher Boessenkool wrote:
> > >>How does the compiler know that msleep() has got barrier()s?
> > >
> > >Because msleep_interruptible() is in a separate compilation unit,
> > >the compiler has to assume that it might modify any arbitrary global.
> > 
> > No; compilation units have nothing to do with it, GCC can optimise
> > across compilation unit boundaries just fine, if you tell it to
> > compile more than one compilation unit at once.
> 
> Last I checked, the Linux kernel build system did compile each .c file
> as a separate compilation unit.
> 
> > What you probably mean is that the compiler has to assume any code
> > it cannot currently see can do anything (insofar as allowed by the
> > relevant standards etc.)

I think this was just terminology confusion here again. Isn't "any code
that it cannot currently see" the same as "another compilation unit",
and wouldn't the "compilation unit" itself expand if we ask gcc to
compile more than one unit at once? Or is there some more specific
"definition" for "compilation unit" (in gcc lingo, possibly?)


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 14:35             ` Satyam Sharma
  2007-08-15 14:52               ` Herbert Xu
@ 2007-08-15 19:58               ` Stefan Richter
  1 sibling, 0 replies; 1546+ messages in thread
From: Stefan Richter @ 2007-08-15 19:58 UTC (permalink / raw)
  To: Satyam Sharma; +Cc: Linux Kernel Mailing List, Linus Torvalds

(trimmed Cc)

Satyam Sharma wrote:
> [PATCH] ieee1394: Fix kthread stopping in nodemgr_host_thread
> 
> The nodemgr host thread can exit on its own even when kthread_should_stop
> is not true, on receiving a signal (might never happen in practice, as
> it ignores signals). But considering kthread_stop() must not be mixed with
> kthreads that can exit on their own, I think changing the code like this
> is clearer. This change means the thread can cut its sleep short when
> it receives a signal, but looking at the code around it, that sounds
> okay (and again, it might never actually receive a signal in practice).

Thanks, committed to linux1394-2.6.git.
-- 
Stefan Richter
-=====-=-=== =--- -====
http://arcgraph.de/sr/


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 10:35     ` Stefan Richter
  2007-08-15 12:04       ` Herbert Xu
  2007-08-15 12:31       ` Satyam Sharma
@ 2007-08-15 19:59       ` Christoph Lameter
  2 siblings, 0 replies; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-15 19:59 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Satyam Sharma, Chris Snook, Linux Kernel Mailing List, linux-arch,
	torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher, Herbert Xu, Paul E. McKenney

On Wed, 15 Aug 2007, Stefan Richter wrote:

> LDD3 says on page 125:  "The following operations are defined for the
> type [atomic_t] and are guaranteed to be atomic with respect to all
> processors of an SMP computer."
> 
> Doesn't "atomic WRT all processors" require volatility?

Atomic operations only require exclusive access to the cacheline while the 
value is modified.




* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 19:54                         ` Satyam Sharma
@ 2007-08-15 20:17                           ` Paul E. McKenney
  2007-08-15 20:52                             ` Segher Boessenkool
  2007-08-15 20:47                           ` Segher Boessenkool
  1 sibling, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-15 20:17 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Segher Boessenkool, horms, Stefan Richter,
	Linux Kernel Mailing List, rpjday, netdev, ak, cfriesen,
	Heiko Carstens, jesper.juhl, linux-arch, Andrew Morton, zlynx,
	clameter, schwidefsky, Chris Snook, Herbert Xu, davem,
	Linus Torvalds, wensong, wjiang

On Thu, Aug 16, 2007 at 01:24:42AM +0530, Satyam Sharma wrote:
> [ The Cc: list scares me. Should probably trim it. ]

Trim away!  ;-)

> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> 
> > On Wed, Aug 15, 2007 at 08:31:25PM +0200, Segher Boessenkool wrote:
> > > >>How does the compiler know that msleep() has got barrier()s?
> > > >
> > > >Because msleep_interruptible() is in a separate compilation unit,
> > > >the compiler has to assume that it might modify any arbitrary global.
> > > 
> > > No; compilation units have nothing to do with it, GCC can optimise
> > > across compilation unit boundaries just fine, if you tell it to
> > > compile more than one compilation unit at once.
> > 
> > Last I checked, the Linux kernel build system did compile each .c file
> > as a separate compilation unit.
> > 
> > > What you probably mean is that the compiler has to assume any code
> > > it cannot currently see can do anything (insofar as allowed by the
> > > relevant standards etc.)
> 
> I think this was just terminology confusion here again. Isn't "any code
> that it cannot currently see" the same as "another compilation unit",
> and wouldn't the "compilation unit" itself expand if we ask gcc to
> compile more than one unit at once? Or is there some more specific
> "definition" for "compilation unit" (in gcc lingo, possibly?)

This is indeed my understanding -- "compilation unit" is whatever the
compiler looks at in one go.  I have heard the word "module" used for
the minimal compilation unit covering a single .c file and everything
that it #includes, but there might be a better name for this.

							Thanx, Paul


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 19:40           ` Satyam Sharma
@ 2007-08-15 20:42             ` Segher Boessenkool
  2007-08-16  1:23               ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-15 20:42 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
	Linux Kernel Mailing List, Paul E. McKenney, netdev, ak, cfriesen,
	rpjday, jesper.juhl, linux-arch, Andrew Morton, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
	wensong, wjiang

>> What volatile does are a) never optimise away a read (or write)
>> to the object, since the data can change in ways the compiler
>> cannot see; and b) never move stores to the object across a
>> sequence point.  This does not mean other accesses cannot be
>> reordered wrt the volatile access.
>>
>> If the abstract machine would do an access to a volatile-
>> qualified object, the generated machine code will do that
>> access too.  But, for example, it can still be optimised
>> away by the compiler, if it can prove it is allowed to.
>
> As (now) indicated above, I had meant multiple volatile accesses to
> the same object, obviously.

Yes, accesses to volatile objects are never reordered with
respect to each other.

> BTW:
>
> #define atomic_read(a)	(*(volatile int *)&(a))
> #define atomic_set(a,i)	(*(volatile int *)&(a) = (i))
>
> int a;
>
> void func(void)
> {
> 	int b;
>
> 	b = atomic_read(a);
> 	atomic_set(a, 20);
> 	b = atomic_read(a);
> }
>
> gives:
>
> func:
> 	pushl	%ebp
> 	movl	a, %eax
> 	movl	%esp, %ebp
> 	movl	$20, a
> 	movl	a, %eax
> 	popl	%ebp
> 	ret
>
> so the first atomic_read() wasn't optimized away.

Of course.  It is executed by the abstract machine, so
it will be executed by the actual machine.  On the other
hand, try

	b = 0;
	if (b)
		b = atomic_read(a);

or similar.


Segher



* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 19:54                         ` Satyam Sharma
  2007-08-15 20:17                           ` Paul E. McKenney
@ 2007-08-15 20:47                           ` Segher Boessenkool
  2007-08-16  0:36                             ` Satyam Sharma
  1 sibling, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-15 20:47 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: horms, Stefan Richter, Linux Kernel Mailing List,
	Paul E. McKenney, ak, netdev, cfriesen, Heiko Carstens, rpjday,
	jesper.juhl, linux-arch, Andrew Morton, zlynx, clameter,
	schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
	wensong, wjiang

>>> What you probably mean is that the compiler has to assume any code
>>> it cannot currently see can do anything (insofar as allowed by the
>>> relevant standards etc.)
>
> I think this was just terminology confusion here again. Isn't "any code
> that it cannot currently see" the same as "another compilation unit",

It is not; try  gcc -combine  or the upcoming link-time optimisation
stuff, for example.

> and wouldn't the "compilation unit" itself expand if we ask gcc to
> compile more than one unit at once? Or is there some more specific
> "definition" for "compilation unit" (in gcc lingo, possibly?)

"compilation unit" is a C standard term.  It typically boils down
to "single .c file".


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 20:17                           ` Paul E. McKenney
@ 2007-08-15 20:52                             ` Segher Boessenkool
  2007-08-15 22:42                               ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-15 20:52 UTC (permalink / raw)
  To: paulmck
  Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
	rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
	linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
	Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang

>> I think this was just terminology confusion here again. Isn't "any 
>> code
>> that it cannot currently see" the same as "another compilation unit",
>> and wouldn't the "compilation unit" itself expand if we ask gcc to
>> compile more than one unit at once? Or is there some more specific
>> "definition" for "compilation unit" (in gcc lingo, possibly?)
>
> This is indeed my understanding -- "compilation unit" is whatever the
> compiler looks at in one go.  I have heard the word "module" used for
> the minimal compilation unit covering a single .c file and everything
> that it #includes, but there might be a better name for this.

Yes, that's what's called "compilation unit" :-)

[/me double checks]

Erm, the C standard actually calls it "translation unit".

To be exact, to avoid any more confusion:

5.1.1.1/1:
A C program need not all be translated at the same time. The
text of the program is kept in units called source files, (or
preprocessing files) in this International Standard. A source
file together with all the headers and source files included
via the preprocessing directive #include is known as a
preprocessing translation unit. After preprocessing, a
preprocessing translation unit is called a translation unit.



Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 17:55               ` Satyam Sharma
  2007-08-15 19:07                 ` Paul E. McKenney
@ 2007-08-15 20:58                 ` Segher Boessenkool
  1 sibling, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-15 20:58 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
	Linux Kernel Mailing List, Paul E. McKenney, netdev, ak, cfriesen,
	rpjday, jesper.juhl, linux-arch, Andrew Morton, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
	wensong, wjiang

>>> Of course, if we find there are more callers in the kernel who want 
>>> the
>>> volatility behaviour than those who don't care, we can re-define the
>>> existing ops to such variants, and re-name the existing definitions 
>>> to
>>> somethine else, say "atomic_read_nonvolatile" for all I care.
>>
>> Do we really need another set of APIs?
>
> Well, if there's one set of users who do care about volatile behaviour,
> and another set that doesn't, it only sounds correct to provide both
> those APIs, instead of forcing one behaviour on all users.

But since there currently is only one such API, and there are
users expecting the stronger behaviour, the only sane thing to
do is let the API provide that behaviour.  You can always add
a new API with weaker behaviour later, and move users that are
okay with it over to that new API.


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 18:57                       ` Paul E. McKenney
  2007-08-15 19:54                         ` Satyam Sharma
@ 2007-08-15 21:05                         ` Segher Boessenkool
  2007-08-15 22:44                           ` Paul E. McKenney
  1 sibling, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-15 21:05 UTC (permalink / raw)
  To: paulmck
  Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
	rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
	linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
	Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang

>> No; compilation units have nothing to do with it, GCC can optimise
>> across compilation unit boundaries just fine, if you tell it to
>> compile more than one compilation unit at once.
>
> Last I checked, the Linux kernel build system did compile each .c file
> as a separate compilation unit.

I have some patches to use -combine -fwhole-program for Linux.
Highly experimental, you need a patched bleeding edge toolchain.
If there's interest I'll clean it up and put it online.

David Woodhouse had some similar patches about a year ago.

>>> In many cases, the compiler also has to assume that
>>> msleep_interruptible()
>>> might call back into a function in the current compilation unit, thus
>>> possibly modifying global static variables.
>>
>> It most often is smart enough to see what compilation-unit-local
>> variables might be modified that way, though :-)
>
> Yep.  For example, if it knows the current value of a given such local
> variable, and if all code paths that would change some other variable
> cannot be reached given that current value of the first variable.

Or the most common thing: if neither the address of the translation-
unit local variable nor the address of any function writing to that
variable can "escape" from that translation unit, nothing outside
the translation unit can write to the variable.

> At least given that gcc doesn't know about multiple threads of 
> execution!

Heh, only about the threads it creates itself (not relevant to
the kernel, for sure :-) )


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 19:07                 ` Paul E. McKenney
@ 2007-08-15 21:07                   ` Segher Boessenkool
  0 siblings, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-15 21:07 UTC (permalink / raw)
  To: paulmck
  Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
	Satyam Sharma, Linux Kernel Mailing List, rpjday, netdev, ak,
	cfriesen, jesper.juhl, linux-arch, Andrew Morton, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
	wensong, wjiang

>> Possibly these were too trivial to expose any potential problems that 
>> you
>> may have been referring to, so would be helpful if you could write a 
>> more
>> concrete example / sample code.
>
> The trick is to have a sufficiently complicated expression to force
> the compiler to run out of registers.

You can use -ffixed-XXX to keep the testcase simple.


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 20:52                             ` Segher Boessenkool
@ 2007-08-15 22:42                               ` Paul E. McKenney
  0 siblings, 0 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-15 22:42 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
	rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
	linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
	Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang

On Wed, Aug 15, 2007 at 10:52:53PM +0200, Segher Boessenkool wrote:
> >>I think this was just terminology confusion here again. Isn't "any 
> >>code
> >>that it cannot currently see" the same as "another compilation unit",
> >>and wouldn't the "compilation unit" itself expand if we ask gcc to
> >>compile more than one unit at once? Or is there some more specific
> >>"definition" for "compilation unit" (in gcc lingo, possibly?)
> >
> >This is indeed my understanding -- "compilation unit" is whatever the
> >compiler looks at in one go.  I have heard the word "module" used for
> >the minimal compilation unit covering a single .c file and everything
> >that it #includes, but there might be a better name for this.
> 
> Yes, that's what's called "compilation unit" :-)
> 
> [/me double checks]
> 
> Erm, the C standard actually calls it "translation unit".
> 
> To be exact, to avoid any more confusion:
> 
> 5.1.1.1/1:
> A C program need not all be translated at the same time. The
> text of the program is kept in units called source files, (or
> preprocessing files) in this International Standard. A source
> file together with all the headers and source files included
> via the preprocessing directive #include is known as a
> preprocessing translation unit. After preprocessing, a
> preprocessing translation unit is called a translation unit.

I am OK with "translation" and "compilation" being near-synonyms.  ;-)

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 21:05                         ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Segher Boessenkool
@ 2007-08-15 22:44                           ` Paul E. McKenney
  2007-08-16  1:23                             ` Segher Boessenkool
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-15 22:44 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
	rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
	linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
	Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang

On Wed, Aug 15, 2007 at 11:05:35PM +0200, Segher Boessenkool wrote:
> >>No; compilation units have nothing to do with it, GCC can optimise
> >>across compilation unit boundaries just fine, if you tell it to
> >>compile more than one compilation unit at once.
> >
> >Last I checked, the Linux kernel build system did compile each .c file
> >as a separate compilation unit.
> 
> I have some patches to use -combine -fwhole-program for Linux.
> Highly experimental, you need a patched bleeding edge toolchain.
> If there's interest I'll clean it up and put it online.
> 
> David Woodhouse had some similar patches about a year ago.

Sounds exciting...  ;-)

> >>>In many cases, the compiler also has to assume that
> >>>msleep_interruptible()
> >>>might call back into a function in the current compilation unit, thus
> >>>possibly modifying global static variables.
> >>
> >>It most often is smart enough to see what compilation-unit-local
> >>variables might be modified that way, though :-)
> >
> >Yep.  For example, if it knows the current value of a given such local
> >variable, and if all code paths that would change some other variable
> >cannot be reached given that current value of the first variable.
> 
> Or the most common thing: if neither the address of the translation-
> unit local variable nor the address of any function writing to that
> variable can "escape" from that translation unit, nothing outside
> the translation unit can write to the variable.

But there is usually at least one externally callable function in
a .c file.

> >At least given that gcc doesn't know about multiple threads of 
> >execution!
> 
> Heh, only about the threads it creates itself (not relevant to
> the kernel, for sure :-) )

;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 12:31       ` Satyam Sharma
  2007-08-15 13:08         ` Stefan Richter
  2007-08-15 18:31         ` Segher Boessenkool
@ 2007-08-15 23:22         ` Paul Mackerras
  2007-08-16  0:26           ` Christoph Lameter
  2007-08-24 12:50           ` Denys Vlasenko
  2007-08-16  3:37         ` Bill Fink
  3 siblings, 2 replies; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-15 23:22 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu, Paul E. McKenney

Satyam Sharma writes:

> > Doesn't "atomic WRT all processors" require volatility?
> 
> No, it definitely doesn't. Why should it?
> 
> "Atomic w.r.t. all processors" is just your normal, simple "atomicity"
> for SMP systems (ensure that that object is modified / set / replaced
> in main memory atomically) and has nothing to do with "volatile"
> behaviour.

Atomic variables are "volatile" in the sense that they are liable to
be changed at any time by mechanisms that are outside the knowledge of
the C compiler, namely, other CPUs, or this CPU executing an interrupt
routine.

In the kernel we use atomic variables in precisely those situations
where a variable is potentially accessed concurrently by multiple
CPUs, and where each CPU needs to see updates done by other CPUs in a
timely fashion.  That is what they are for.  Therefore the compiler
must not cache values of atomic variables in registers; each
atomic_read must result in a load and each atomic_set must result in a
store.  Anything else will just lead to subtle bugs.

I have no strong opinion about whether or not the best way to achieve
this is through the use of the "volatile" C keyword.  Segher's idea of
using asm instead seems like a good one to me.

Paul.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 16:13         ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Chris Snook
@ 2007-08-15 23:40           ` Herbert Xu
  2007-08-15 23:51             ` Paul E. McKenney
  2007-08-16  1:26             ` Segher Boessenkool
  0 siblings, 2 replies; 1546+ messages in thread
From: Herbert Xu @ 2007-08-15 23:40 UTC (permalink / raw)
  To: Chris Snook
  Cc: satyam, clameter, linux-kernel, linux-arch, torvalds, netdev,
	akpm, ak, heiko.carstens, davem, schwidefsky, wensong, horms,
	wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher

On Wed, Aug 15, 2007 at 12:13:12PM -0400, Chris Snook wrote:
>
> Part of the motivation here is to fix heisenbugs.  If I knew where they 

By the same token we should probably disable optimisations
altogether since that too can create heisenbugs.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 18:45                   ` Paul E. McKenney
@ 2007-08-15 23:41                     ` Herbert Xu
  2007-08-15 23:53                       ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-15 23:41 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: David Howells, Satyam Sharma, Stefan Richter, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Wed, Aug 15, 2007 at 11:45:20AM -0700, Paul E. McKenney wrote:
> On Wed, Aug 15, 2007 at 07:19:57PM +0100, David Howells wrote:
> > Herbert Xu <herbert@gondor.apana.org.au> wrote:
> > 
> > > Let's turn this around.  Can you give a single example where
> > > the volatile semantics is needed in a legitimate way?
> > 
> > Accessing H/W registers?  But apart from that...
> 
> Communicating between process context and interrupt/NMI handlers using
> per-CPU variables.

Remember we're talking about atomic_read/atomic_set.  Please
cite the actual file/function name you have in mind.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 23:40           ` Herbert Xu
@ 2007-08-15 23:51             ` Paul E. McKenney
  2007-08-16  1:30               ` Segher Boessenkool
  2007-08-16  1:26             ` Segher Boessenkool
  1 sibling, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-15 23:51 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Chris Snook, satyam, clameter, linux-kernel, linux-arch, torvalds,
	netdev, akpm, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher

On Thu, Aug 16, 2007 at 07:40:21AM +0800, Herbert Xu wrote:
> On Wed, Aug 15, 2007 at 12:13:12PM -0400, Chris Snook wrote:
> >
> > Part of the motivation here is to fix heisenbugs.  If I knew where they 
> 
> By the same token we should probably disable optimisations
> altogether since that too can create heisenbugs.

Precisely the point -- uses of volatile (whether in casts or on asms)
in these cases are intended to disable those optimizations likely to
result in heisenbugs.  But they are also intended to leave other
valuable optimizations in force.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 23:41                     ` Herbert Xu
@ 2007-08-15 23:53                       ` Paul E. McKenney
  2007-08-16  0:12                         ` Herbert Xu
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-15 23:53 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David Howells, Satyam Sharma, Stefan Richter, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Thu, Aug 16, 2007 at 07:41:46AM +0800, Herbert Xu wrote:
> On Wed, Aug 15, 2007 at 11:45:20AM -0700, Paul E. McKenney wrote:
> > On Wed, Aug 15, 2007 at 07:19:57PM +0100, David Howells wrote:
> > > Herbert Xu <herbert@gondor.apana.org.au> wrote:
> > > 
> > > > Let's turn this around.  Can you give a single example where
> > > > the volatile semantics is needed in a legitimate way?
> > > 
> > > Accessing H/W registers?  But apart from that...
> > 
> > Communicating between process context and interrupt/NMI handlers using
> > per-CPU variables.
> 
> Remember we're talking about atomic_read/atomic_set.  Please
> cite the actual file/function name you have in mind.

Yep, we are indeed talking about atomic_read()/atomic_set().

We have been through this issue already in this thread.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 23:53                       ` Paul E. McKenney
@ 2007-08-16  0:12                         ` Herbert Xu
  2007-08-16  0:23                           ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16  0:12 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: David Howells, Satyam Sharma, Stefan Richter, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Wed, Aug 15, 2007 at 04:53:35PM -0700, Paul E. McKenney wrote:
>
> > > Communicating between process context and interrupt/NMI handlers using
> > > per-CPU variables.
> > 
> > Remember we're talking about atomic_read/atomic_set.  Please
> > cite the actual file/function name you have in mind.
> 
> Yep, we are indeed talking about atomic_read()/atomic_set().
> 
> We have been through this issue already in this thread.

Sorry, but I must've missed it.  Could you cite the file or
function for my benefit?

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  0:12                         ` Herbert Xu
@ 2007-08-16  0:23                           ` Paul E. McKenney
  2007-08-16  0:30                             ` Herbert Xu
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-16  0:23 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David Howells, Satyam Sharma, Stefan Richter, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Thu, Aug 16, 2007 at 08:12:48AM +0800, Herbert Xu wrote:
> On Wed, Aug 15, 2007 at 04:53:35PM -0700, Paul E. McKenney wrote:
> >
> > > > Communicating between process context and interrupt/NMI handlers using
> > > > per-CPU variables.
> > > 
> > > Remember we're talking about atomic_read/atomic_set.  Please
> > > cite the actual file/function name you have in mind.
> > 
> > Yep, we are indeed talking about atomic_read()/atomic_set().
> > 
> > We have been through this issue already in this thread.
> 
> Sorry, but I must've missed it.  Could you cite the file or
> function for my benefit?

I might summarize the thread if there is interest, but I am not able to
do so right this minute.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 23:22         ` Paul Mackerras
@ 2007-08-16  0:26           ` Christoph Lameter
  2007-08-16  0:34             ` Paul Mackerras
  2007-08-16  0:39             ` Paul E. McKenney
  2007-08-24 12:50           ` Denys Vlasenko
  1 sibling, 2 replies; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-16  0:26 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Satyam Sharma, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu, Paul E. McKenney

On Thu, 16 Aug 2007, Paul Mackerras wrote:

> In the kernel we use atomic variables in precisely those situations
> where a variable is potentially accessed concurrently by multiple
> CPUs, and where each CPU needs to see updates done by other CPUs in a
> timely fashion.  That is what they are for.  Therefore the compiler
> must not cache values of atomic variables in registers; each
> atomic_read must result in a load and each atomic_set must result in a
> store.  Anything else will just lead to subtle bugs.

This may have been the intent. However, today visibility is controlled 
using barriers, and we have barriers that we use with atomic operations. 
Having volatile be the default just leads to confusion. atomic_read() 
should just read with no extras. Extras can be added by using variants 
like atomic_read_volatile() or so.


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  0:23                           ` Paul E. McKenney
@ 2007-08-16  0:30                             ` Herbert Xu
  2007-08-16  0:49                               ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16  0:30 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: David Howells, Satyam Sharma, Stefan Richter, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Wed, Aug 15, 2007 at 05:23:10PM -0700, Paul E. McKenney wrote:
> On Thu, Aug 16, 2007 at 08:12:48AM +0800, Herbert Xu wrote:
> > On Wed, Aug 15, 2007 at 04:53:35PM -0700, Paul E. McKenney wrote:
> > >
> > > > > Communicating between process context and interrupt/NMI handlers using
> > > > > per-CPU variables.
> > > > 
> > > > Remember we're talking about atomic_read/atomic_set.  Please
> > > > cite the actual file/function name you have in mind.
> > > 
> > > Yep, we are indeed talking about atomic_read()/atomic_set().
> > > 
> > > We have been through this issue already in this thread.
> > 
> > Sorry, but I must've missed it.  Could you cite the file or
> > function for my benefit?
> 
> I might summarize the thread if there is interest, but I am not able to
> do so right this minute.

Thanks.  But I don't need a summary of the thread, I'm asking
for an extant code snippet in our kernel that benefits from
the volatile change and is not part of a busy-wait.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: your mail
  2007-08-16  0:36                             ` Satyam Sharma
@ 2007-08-16  0:32                               ` Herbert Xu
  2007-08-16  0:58                                 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Satyam Sharma
  2007-08-16  1:38                               ` Segher Boessenkool
  1 sibling, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16  0:32 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Segher Boessenkool, horms, Stefan Richter,
	Linux Kernel Mailing List, Paul E. McKenney, ak, netdev, cfriesen,
	Heiko Carstens, rpjday, jesper.juhl, linux-arch, Andrew Morton,
	zlynx, clameter, schwidefsky, Chris Snook, davem, Linus Torvalds,
	wensong, wjiang

On Thu, Aug 16, 2007 at 06:06:00AM +0530, Satyam Sharma wrote:
> 
> that are:
> 
> 	while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) {
> 		mdelay(1);
> 		msecs--;
> 	}
> 
> where mdelay() becomes __const_udelay() which happens to be in another
> translation unit (arch/i386/lib/delay.c) and hence saves this callsite
> from being a bug :-)

The udelay itself certainly should have some form of cpu_relax in it.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  0:26           ` Christoph Lameter
@ 2007-08-16  0:34             ` Paul Mackerras
  2007-08-16  0:40               ` Christoph Lameter
  2007-08-16  0:39             ` Paul E. McKenney
  1 sibling, 1 reply; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-16  0:34 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Satyam Sharma, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu, Paul E. McKenney

Christoph Lameter writes:

> On Thu, 16 Aug 2007, Paul Mackerras wrote:
> 
> > In the kernel we use atomic variables in precisely those situations
> > where a variable is potentially accessed concurrently by multiple
> > CPUs, and where each CPU needs to see updates done by other CPUs in a
> > timely fashion.  That is what they are for.  Therefore the compiler
> > must not cache values of atomic variables in registers; each
> > atomic_read must result in a load and each atomic_set must result in a
> > store.  Anything else will just lead to subtle bugs.
> 
> This may have been the intend. However, today the visibility is controlled 
> using barriers. And we have barriers that we use with atomic operations. 

Those barriers are for when we need ordering between atomic variables
and other memory locations.  An atomic variable by itself doesn't and
shouldn't need any barriers for other CPUs to be able to see what's
happening to it.

Paul.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* (no subject)
  2007-08-15 20:47                           ` Segher Boessenkool
@ 2007-08-16  0:36                             ` Satyam Sharma
  2007-08-16  0:32                               ` your mail Herbert Xu
  2007-08-16  1:38                               ` Segher Boessenkool
  0 siblings, 2 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-16  0:36 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: horms, Stefan Richter, Linux Kernel Mailing List,
	Paul E. McKenney, ak, netdev, cfriesen, Heiko Carstens, rpjday,
	jesper.juhl, linux-arch, Andrew Morton, zlynx, clameter,
	schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
	wensong, wjiang



On Wed, 15 Aug 2007, Segher Boessenkool wrote:

> > > > What you probably mean is that the compiler has to assume any code
> > > > it cannot currently see can do anything (insofar as allowed by the
> > > > relevant standards etc.)
> > 
> > I think this was just terminology confusion here again. Isn't "any code
> > that it cannot currently see" the same as "another compilation unit",
> 
> It is not; try  gcc -combine  or the upcoming link-time optimisation
> stuff, for example.
> 
> > and wouldn't the "compilation unit" itself expand if we ask gcc to
> > compile more than one unit at once? Or is there some more specific
> > "definition" for "compilation unit" (in gcc lingo, possibly?)
> 
> "compilation unit" is a C standard term.  It typically boils down
> to "single .c file".

As you mentioned later, "single .c file with all the other files (headers
or other .c files) that it pulls in via #include" is actually "translation
unit", both in the C standard as well as gcc docs. "Compilation unit"
doesn't seem to be nearly as standard a term, though in most places it
is indeed meant to be the same as "translation unit". But with the new gcc
inter-module-analysis stuff that you referred to above, I suspect one may
reasonably want to call a "compilation unit" all that the compiler sees
at a given instant.

BTW I did some auditing (only inside include/asm-{i386,x86_64}/ and
arch/{i386,x86_64}/) and found a couple more callsites that don't use
cpu_relax():

arch/i386/kernel/crash.c:101
arch/x86_64/kernel/crash.c:97

that are:

	while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) {
		mdelay(1);
		msecs--;
	}

where mdelay() becomes __const_udelay() which happens to be in another
translation unit (arch/i386/lib/delay.c) and hence saves this callsite
from being a bug :-)

Curiously, __const_udelay() is still marked as "inline" where it is
implemented in lib/delay.c, which is weird considering it won't ever
be inlined, will it? With the kernel presently being compiled one
translation unit at a time, I don't see how the implementation would
be visible to any callsite out there to be able to inline it.


Satyam

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
  2007-08-15  8:18         ` Heiko Carstens
  2007-08-15 13:53           ` Stefan Richter
@ 2007-08-16  0:39           ` Satyam Sharma
  2007-08-24 11:59             ` Denys Vlasenko
  1 sibling, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-16  0:39 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Herbert Xu, Chris Snook, clameter, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher



On Wed, 15 Aug 2007, Heiko Carstens wrote:

> [...]
> Btw.: we still have
> 
> include/asm-i386/mach-es7000/mach_wakecpu.h:  while (!atomic_read(deassert));
> include/asm-i386/mach-default/mach_wakecpu.h: while (!atomic_read(deassert));
> 
> Looks like they need to be fixed as well.


[PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()

Use cpu_relax() in the busy loops, as atomic_read() doesn't automatically
imply volatility for i386 and x86_64. x86_64 doesn't have this issue because
it open-codes the while loop in smpboot.c:smp_callin() itself, which already
uses cpu_relax().

For i386, however, smpboot.c:smp_callin() calls wait_for_init_deassert()
which is buggy for mach-default and mach-es7000 cases.

[ I test-built a kernel -- smp_callin() itself got inlined in its only
  callsite, smpboot.c:start_secondary() -- and the relevant piece of
  code disassembles to the following:

0xc1019704 <start_secondary+12>:        mov    0xc144c4c8,%eax
0xc1019709 <start_secondary+17>:        test   %eax,%eax
0xc101970b <start_secondary+19>:        je     0xc1019709 <start_secondary+17>

  init_deasserted (at 0xc144c4c8) gets fetched into %eax only once and
  then we loop over the test of the stale value in the register only,
  so these look like real bugs to me. With the fix below, this becomes:

0xc1019706 <start_secondary+14>:        pause
0xc1019708 <start_secondary+16>:        cmpl   $0x0,0xc144c4c8
0xc101970f <start_secondary+23>:        je     0xc1019706 <start_secondary+14>

  which looks nice and healthy. ]

Thanks to Heiko Carstens for noticing this.

Signed-off-by: Satyam Sharma <satyam@infradead.org>

---

 include/asm-i386/mach-default/mach_wakecpu.h |    3 ++-
 include/asm-i386/mach-es7000/mach_wakecpu.h  |    3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/asm-i386/mach-default/mach_wakecpu.h b/include/asm-i386/mach-default/mach_wakecpu.h
index 673b85c..3ebb178 100644
--- a/include/asm-i386/mach-default/mach_wakecpu.h
+++ b/include/asm-i386/mach-default/mach_wakecpu.h
@@ -15,7 +15,8 @@
 
 static inline void wait_for_init_deassert(atomic_t *deassert)
 {
-	while (!atomic_read(deassert));
+	while (!atomic_read(deassert))
+		cpu_relax();
 	return;
 }
 
diff --git a/include/asm-i386/mach-es7000/mach_wakecpu.h b/include/asm-i386/mach-es7000/mach_wakecpu.h
index efc903b..84ff583 100644
--- a/include/asm-i386/mach-es7000/mach_wakecpu.h
+++ b/include/asm-i386/mach-es7000/mach_wakecpu.h
@@ -31,7 +31,8 @@ wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip)
 static inline void wait_for_init_deassert(atomic_t *deassert)
 {
 #ifdef WAKE_SECONDARY_VIA_INIT
-	while (!atomic_read(deassert));
+	while (!atomic_read(deassert))
+		cpu_relax();
 #endif
 	return;
 }

^ permalink raw reply related	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  0:26           ` Christoph Lameter
  2007-08-16  0:34             ` Paul Mackerras
@ 2007-08-16  0:39             ` Paul E. McKenney
  2007-08-16  0:42               ` Christoph Lameter
  1 sibling, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-16  0:39 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu

On Wed, Aug 15, 2007 at 05:26:34PM -0700, Christoph Lameter wrote:
> On Thu, 16 Aug 2007, Paul Mackerras wrote:
> 
> > In the kernel we use atomic variables in precisely those situations
> > where a variable is potentially accessed concurrently by multiple
> > CPUs, and where each CPU needs to see updates done by other CPUs in a
> > timely fashion.  That is what they are for.  Therefore the compiler
> > must not cache values of atomic variables in registers; each
> > atomic_read must result in a load and each atomic_set must result in a
> > store.  Anything else will just lead to subtle bugs.
> 
> This may have been the intent. However, today the visibility is controlled 
> using barriers, and we have barriers that we use with atomic operations. 
> Having volatile be the default just leads to confusion. Atomic read should 
> just read with no extras. Extras can be added by using variants like 
> atomic_read_volatile or so.

Seems to me that we face greater chance of confusion without the
volatile than with, particularly as compiler optimizations become
more aggressive.  Yes, we could simply disable optimization, but
optimization can be quite helpful.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  0:34             ` Paul Mackerras
@ 2007-08-16  0:40               ` Christoph Lameter
  0 siblings, 0 replies; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-16  0:40 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Satyam Sharma, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu, Paul E. McKenney

On Thu, 16 Aug 2007, Paul Mackerras wrote:

> Those barriers are for when we need ordering between atomic variables
> and other memory locations.  An atomic variable by itself doesn't and
> shouldn't need any barriers for other CPUs to be able to see what's
> happening to it.

It does not need any barriers. As soon as one cpu acquires the 
cacheline for write it will be invalidated in the caches of the others. So 
the other cpu will have to refetch. No need for volatile.

The issue here may be that the compiler has fetched the atomic variable 
earlier and put it into a register. However, that prefetching is limited 
because it cannot cross function calls etc. The only problem could be 
loops where the compiler does not refetch the variable, since it assumes 
that it does not change and there are no function calls in the body of the 
loop. But AFAIK these loops need cpu_relax() and other measures anyway to 
avoid bad effects from busy waiting.


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  0:39             ` Paul E. McKenney
@ 2007-08-16  0:42               ` Christoph Lameter
  2007-08-16  0:53                 ` Paul E. McKenney
  2007-08-16  1:51                 ` Paul Mackerras
  0 siblings, 2 replies; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-16  0:42 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu

On Wed, 15 Aug 2007, Paul E. McKenney wrote:

> Seems to me that we face greater chance of confusion without the
> volatile than with, particularly as compiler optimizations become
> more aggressive.  Yes, we could simply disable optimization, but
> optimization can be quite helpful.

A volatile default would disable optimizations for atomic_read. 
atomic_read without volatile would allow for full optimization by the 
compiler. Seems that this is what one wants in many cases.


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  0:30                             ` Herbert Xu
@ 2007-08-16  0:49                               ` Paul E. McKenney
  2007-08-16  0:53                                 ` Herbert Xu
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-16  0:49 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David Howells, Satyam Sharma, Stefan Richter, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Thu, Aug 16, 2007 at 08:30:23AM +0800, Herbert Xu wrote:
> On Wed, Aug 15, 2007 at 05:23:10PM -0700, Paul E. McKenney wrote:
> > On Thu, Aug 16, 2007 at 08:12:48AM +0800, Herbert Xu wrote:
> > > On Wed, Aug 15, 2007 at 04:53:35PM -0700, Paul E. McKenney wrote:
> > > >
> > > > > > Communicating between process context and interrupt/NMI handlers using
> > > > > > per-CPU variables.
> > > > > 
> > > > > Remeber we're talking about atomic_read/atomic_set.  Please
> > > > > cite the actual file/function name you have in mind.
> > > > 
> > > > Yep, we are indeed talking about atomic_read()/atomic_set().
> > > > 
> > > > We have been through this issue already in this thread.
> > > 
> > > Sorry, but I must've missed it.  Could you cite the file or
> > > function for my benefit?
> > 
> > I might summarize the thread if there is interest, but I am not able to
> > do so right this minute.
> 
> Thanks.  But I don't need a summary of the thread, I'm asking
> for an extant code snippet in our kernel that benefits from
> the volatile change and is not part of a busy-wait.

Sorry, can't help you there.  I really do believe that the information
you need (as opposed to the specific item you are asking for) really
has been put forth in this thread.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  0:58                                 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Satyam Sharma
@ 2007-08-16  0:51                                   ` Herbert Xu
  2007-08-16  1:18                                     ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16  0:51 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Segher Boessenkool, horms, Stefan Richter,
	Linux Kernel Mailing List, Paul E. McKenney, ak, netdev, cfriesen,
	Heiko Carstens, rpjday, jesper.juhl, linux-arch, Andrew Morton,
	zlynx, clameter, schwidefsky, Chris Snook, davem, Linus Torvalds,
	wensong, wjiang

On Thu, Aug 16, 2007 at 06:28:42AM +0530, Satyam Sharma wrote:
>
> > The udelay itself certainly should have some form of cpu_relax in it.
> 
> Yes, a form of barrier() must be present in mdelay() or udelay() itself
> as you say; having it in __const_udelay() is *not* enough (superfluous,
> actually, considering it is already a separate translation unit and
> invisible to the compiler).

As long as __const_udelay does something which has the same
effect as a barrier, it is enough even if it's in the same unit.
As a matter of fact it does on i386, where __delay either uses
rep_nop or asm volatile.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  0:49                               ` Paul E. McKenney
@ 2007-08-16  0:53                                 ` Herbert Xu
  2007-08-16  1:14                                   ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16  0:53 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: David Howells, Satyam Sharma, Stefan Richter, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Wed, Aug 15, 2007 at 05:49:50PM -0700, Paul E. McKenney wrote:
> On Thu, Aug 16, 2007 at 08:30:23AM +0800, Herbert Xu wrote:
>
> > Thanks.  But I don't need a summary of the thread, I'm asking
> > for an extant code snippet in our kernel that benefits from
> > the volatile change and is not part of a busy-wait.
> 
> Sorry, can't help you there.  I really do believe that the information
> you need (as opposed to the specific item you are asking for) really
> has been put forth in this thread.

That only leads me to believe that such a code snippet simply
does not exist.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  0:42               ` Christoph Lameter
@ 2007-08-16  0:53                 ` Paul E. McKenney
  2007-08-16  0:59                   ` Christoph Lameter
  2007-08-16  1:51                 ` Paul Mackerras
  1 sibling, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-16  0:53 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu

On Wed, Aug 15, 2007 at 05:42:07PM -0700, Christoph Lameter wrote:
> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> 
> > Seems to me that we face greater chance of confusion without the
> > volatile than with, particularly as compiler optimizations become
> > more aggressive.  Yes, we could simply disable optimization, but
> > optimization can be quite helpful.
> 
> A volatile default would disable optimizations for atomic_read. 
> atomic_read without volatile would allow for full optimization by the 
> compiler. Seems that this is what one wants in many cases.

The volatile cast should not disable all that many optimizations,
for example, it is much less hurtful than barrier().  Furthermore,
the main optimizations disabled (pulling atomic_read() and atomic_set()
out of loops) really do need to be disabled.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  0:32                               ` your mail Herbert Xu
@ 2007-08-16  0:58                                 ` Satyam Sharma
  2007-08-16  0:51                                   ` Herbert Xu
  0 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-16  0:58 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Segher Boessenkool, horms, Stefan Richter,
	Linux Kernel Mailing List, Paul E. McKenney, ak, netdev, cfriesen,
	Heiko Carstens, rpjday, jesper.juhl, linux-arch, Andrew Morton,
	zlynx, clameter, schwidefsky, Chris Snook, davem, Linus Torvalds,
	wensong, wjiang

[ Sorry for the empty subject line in the previous mail. I intended to
  make a patch so I cleared it to change it, but ultimately neither made
  a patch nor restored the subject line. Done that now. ]


On Thu, 16 Aug 2007, Herbert Xu wrote:

> On Thu, Aug 16, 2007 at 06:06:00AM +0530, Satyam Sharma wrote:
> > 
> > that are:
> > 
> > 	while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) {
> > 		mdelay(1);
> > 		msecs--;
> > 	}
> > 
> > where mdelay() becomes __const_udelay() which happens to be in another
> > translation unit (arch/i386/lib/delay.c) and hence saves this callsite
> > from being a bug :-)
> 
> The udelay itself certainly should have some form of cpu_relax in it.

Yes, a form of barrier() must be present in mdelay() or udelay() itself
as you say; having it in __const_udelay() is *not* enough (superfluous,
actually, considering it is already a separate translation unit and
invisible to the compiler).

However, there are no compiler barriers on the macro-definition-path
between mdelay(1) and __const_udelay(), so the only thing that saves us
from being a bug here is indeed the different-translation-unit concept.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  0:53                 ` Paul E. McKenney
@ 2007-08-16  0:59                   ` Christoph Lameter
  2007-08-16  1:14                     ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-16  0:59 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu

On Wed, 15 Aug 2007, Paul E. McKenney wrote:

> The volatile cast should not disable all that many optimizations,
> for example, it is much less hurtful than barrier().  Furthermore,
> the main optimizations disabled (pulling atomic_read() and atomic_set()
> out of loops) really do need to be disabled.

In many cases you do not need a barrier. Having volatile there *will* 
impact optimization because the compiler cannot use a register that may 
contain the value that was fetched earlier. And the compiler cannot choose 
freely when to fetch the value. The order of memory accesses is fixed if 
you use volatile. If the variable is not volatile then the compiler can 
arrange memory accesses any way it sees fit and thus generate better code.


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  0:59                   ` Christoph Lameter
@ 2007-08-16  1:14                     ` Paul E. McKenney
  2007-08-16  1:41                       ` Christoph Lameter
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-16  1:14 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu

On Wed, Aug 15, 2007 at 05:59:41PM -0700, Christoph Lameter wrote:
> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> 
> > The volatile cast should not disable all that many optimizations,
> > for example, it is much less hurtful than barrier().  Furthermore,
> > the main optimizations disabled (pulling atomic_read() and atomic_set()
> > out of loops) really do need to be disabled.
> 
> In many cases you do not need a barrier. Having volatile there *will* 
> impact optimization because the compiler cannot use a register that may 
> contain the value that was fetched earlier. And the compiler cannot choose 
> freely when to fetch the value. The order of memory accesses are fixed if 
> you use volatile. If the variable is not volatile then the compiler can 
> arrange memory accesses any way they fit and thus generate better code.

Understood.  My point is not that the impact is precisely zero, but
rather that the impact on optimization is much less hurtful than the
problems that could arise otherwise, particularly as compilers become
more aggressive in their optimizations.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  0:53                                 ` Herbert Xu
@ 2007-08-16  1:14                                   ` Paul E. McKenney
  0 siblings, 0 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-16  1:14 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David Howells, Satyam Sharma, Stefan Richter, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Thu, Aug 16, 2007 at 08:53:16AM +0800, Herbert Xu wrote:
> On Wed, Aug 15, 2007 at 05:49:50PM -0700, Paul E. McKenney wrote:
> > On Thu, Aug 16, 2007 at 08:30:23AM +0800, Herbert Xu wrote:
> >
> > > Thanks.  But I don't need a summary of the thread, I'm asking
> > > for an extant code snippet in our kernel that benefits from
> > > the volatile change and is not part of a busy-wait.
> > 
> > Sorry, can't help you there.  I really do believe that the information
> > you need (as opposed to the specific item you are asking for) really
> > has been put forth in this thread.
> 
> That only leads me to believe that such a code snippet simply
> does not exist.

Whatever...

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  0:51                                   ` Herbert Xu
@ 2007-08-16  1:18                                     ` Satyam Sharma
  0 siblings, 0 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-16  1:18 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Segher Boessenkool, horms, Stefan Richter,
	Linux Kernel Mailing List, Paul E. McKenney, ak, netdev, cfriesen,
	Heiko Carstens, rpjday, jesper.juhl, linux-arch, Andrew Morton,
	zlynx, clameter, schwidefsky, Chris Snook, davem, Linus Torvalds,
	wensong, wjiang

Hi Herbert,


On Thu, 16 Aug 2007, Herbert Xu wrote:

> On Thu, Aug 16, 2007 at 06:28:42AM +0530, Satyam Sharma wrote:
> >
> > > The udelay itself certainly should have some form of cpu_relax in it.
> > 
> > Yes, a form of barrier() must be present in mdelay() or udelay() itself
> > as you say, having it in __const_udelay() is *not* enough (superflous
> > actually, considering it is already a separate translation unit and
> > invisible to the compiler).
> 
> As long as __const_udelay does something which has the same
> effect as barrier it is enough even if it's in the same unit.

Only if __const_udelay() is inlined. But as I said, __const_udelay()
-- although marked "inline" -- will never be inlined anywhere in the
kernel in reality. It's an exported symbol, and never inlined from
modules. Even from built-in targets, the definition of __const_udelay
is invisible when gcc is compiling the compilation units of those
callsites. The compiler has no idea whether that function has barriers
or not, so we're saved here _only_ by the lucky fact that
__const_udelay() is in a different compilation unit.


> As a matter of fact it does on i386 where __delay either uses
> rep_nop or asm/volatile.

__delay() can be either delay_tsc() or delay_loop() on i386.

delay_tsc() uses the rep_nop() there for its own little busy
loop, actually. But for a call site that inlines __const_udelay()
-- if it were ever moved to a .h file and marked inline -- the
call to __delay() will _still_ be across compilation units. So,
again for this case, it does not matter if the callee function
has compiler barriers or not (it would've been a different story
if we were discussing real/CPU barriers, I think), what saves us
here is just the fact that a call is made to a function from a
different compilation unit, which is invisible to the compiler
when compiling the callsite, and hence acting as the compiler
barrier.

Regarding delay_loop(), it uses "volatile" for the "asm" which
has quite different semantics from the C language "volatile"
type-qualifier keyword and does not imply any compiler barrier
at all.


Satyam

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 22:44                           ` Paul E. McKenney
@ 2007-08-16  1:23                             ` Segher Boessenkool
  2007-08-16  2:22                               ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-16  1:23 UTC (permalink / raw)
  To: paulmck
  Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
	rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
	linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
	Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang

>>>> No; compilation units have nothing to do with it, GCC can optimise
>>>> across compilation unit boundaries just fine, if you tell it to
>>>> compile more than one compilation unit at once.
>>>
>>> Last I checked, the Linux kernel build system did compile each .c 
>>> file
>>> as a separate compilation unit.
>>
>> I have some patches to use -combine -fwhole-program for Linux.
>> Highly experimental, you need a patched bleeding edge toolchain.
>> If there's interest I'll clean it up and put it online.
>>
>> David Woodhouse had some similar patches about a year ago.
>
> Sounds exciting...  ;-)

Yeah, the breakage is *quite* spectacular :-)

>>>>> In many cases, the compiler also has to assume that
>>>>> msleep_interruptible()
>>>>> might call back into a function in the current compilation unit, 
>>>>> thus
>>>>> possibly modifying global static variables.
>>>>
>>>> It most often is smart enough to see what compilation-unit-local
>>>> variables might be modified that way, though :-)
>>>
>>> Yep.  For example, if it knows the current value of a given such 
>>> local
>>> variable, and if all code paths that would change some other variable
>>> cannot be reached given that current value of the first variable.
>>
>> Or the most common thing: if neither the address of the translation-
>> unit local variable nor the address of any function writing to that
>> variable can "escape" from that translation unit, nothing outside
>> the translation unit can write to the variable.
>
> But there is usually at least one externally callable function in
> a .c file.

Of course, but often none of those will (indirectly) write a certain
static variable.


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 20:42             ` Segher Boessenkool
@ 2007-08-16  1:23               ` Satyam Sharma
  0 siblings, 0 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-16  1:23 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
	Linux Kernel Mailing List, Paul E. McKenney, netdev, ak, cfriesen,
	rpjday, jesper.juhl, linux-arch, Andrew Morton, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
	wensong, wjiang



On Wed, 15 Aug 2007, Segher Boessenkool wrote:

> [...]
> > BTW:
> > 
> > #define atomic_read(a)	(*(volatile int *)&(a))
> > #define atomic_set(a,i)	(*(volatile int *)&(a) = (i))
> > 
> > int a;
> > 
> > void func(void)
> > {
> > 	int b;
> > 
> > 	b = atomic_read(a);
> > 	atomic_set(a, 20);
> > 	b = atomic_read(a);
> > }
> > 
> > gives:
> > 
> > func:
> > 	pushl	%ebp
> > 	movl	a, %eax
> > 	movl	%esp, %ebp
> > 	movl	$20, a
> > 	movl	a, %eax
> > 	popl	%ebp
> > 	ret
> > 
> > so the first atomic_read() wasn't optimized away.
> 
> Of course.  It is executed by the abstract machine, so
> it will be executed by the actual machine.  On the other
> hand, try
> 
> 	b = 0;
> 	if (b)
> 		b = atomic_read(a);
> 
> or similar.

Yup, obviously. Volatile accesses (or any access to volatile objects),
or even "__volatile__ asms" (which gcc normally promises never to elide)
can always be optimized away in cases such as these, where the compiler
can trivially determine that the code in question is not reachable.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 23:40           ` Herbert Xu
  2007-08-15 23:51             ` Paul E. McKenney
@ 2007-08-16  1:26             ` Segher Boessenkool
  2007-08-16  2:23               ` Nick Piggin
  1 sibling, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-16  1:26 UTC (permalink / raw)
  To: Herbert Xu
  Cc: heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
	akpm, torvalds, jesper.juhl, linux-arch, zlynx, satyam, clameter,
	schwidefsky, Chris Snook, davem, wensong, wjiang

>> Part of the motivation here is to fix heisenbugs.  If I knew where 
>> they
>
> By the same token we should probably disable optimisations
> altogether since that too can create heisenbugs.

Almost everything is a tradeoff; and so is this.  I don't
believe most people would find disabling all compiler
optimisations an acceptable price to pay for some peace
of mind.


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 23:51             ` Paul E. McKenney
@ 2007-08-16  1:30               ` Segher Boessenkool
  2007-08-16  2:30                 ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-16  1:30 UTC (permalink / raw)
  To: paulmck
  Cc: heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
	akpm, torvalds, jesper.juhl, linux-arch, zlynx, satyam, clameter,
	schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang

>>> Part of the motivation here is to fix heisenbugs.  If I knew where 
>>> they
>>
>> By the same token we should probably disable optimisations
>> altogether since that too can create heisenbugs.
>
> Precisely the point -- use of volatile (whether in casts or on asms)
> in these cases are intended to disable those optimizations likely to
> result in heisenbugs.

The only thing volatile on an asm does is create a side effect
on the asm statement; in effect, it tells the compiler "do not
remove this asm even if you don't need any of its outputs".

It's not disabling optimisation likely to result in bugs,
heisen- or otherwise; _not_ putting the volatile on an asm
that needs it simply _is_ a bug :-)


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2007-08-16  0:36                             ` Satyam Sharma
  2007-08-16  0:32                               ` your mail Herbert Xu
@ 2007-08-16  1:38                               ` Segher Boessenkool
  1 sibling, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-16  1:38 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: horms, Stefan Richter, Linux Kernel Mailing List,
	Paul E. McKenney, ak, netdev, cfriesen, Heiko Carstens, rpjday,
	jesper.juhl, linux-arch, Andrew Morton, zlynx, clameter,
	schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
	wensong, wjiang

>> "compilation unit" is a C standard term.  It typically boils down
>> to "single .c file".
>
> As you mentioned later, "single .c file with all the other files 
> (headers
> or other .c files) that it pulls in via #include" is actually 
> "translation
> unit", both in the C standard as well as gcc docs.

Yeah.  "single .c file after preprocessing".  Same thing :-)

> "Compilation unit"
> doesn't seem to be nearly as standard a term, though in most places it
> is indeed meant to be same as "translation unit", but with the new gcc
> inter-module-analysis stuff that you referred to above, I suspect one 
> may
> reasonably want to call a "compilation unit" as all that the compiler 
> sees
> at a given instant.

That would be a bit confusing, would it not?  They'd better find
some better name for that if they want to name it at all (remember,
none of these optimisations should have any effect on the semantics
of the program, you just get fewer .o files etc.).


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  1:14                     ` Paul E. McKenney
@ 2007-08-16  1:41                       ` Christoph Lameter
  2007-08-16  2:15                         ` Satyam Sharma
  2007-08-16  2:32                         ` Paul E. McKenney
  0 siblings, 2 replies; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-16  1:41 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu

On Wed, 15 Aug 2007, Paul E. McKenney wrote:

> Understood.  My point is not that the impact is precisely zero, but
> rather that the impact on optimization is much less hurtful than the
> problems that could arise otherwise, particularly as compilers become
> more aggressive in their optimizations.

The problems arise because barriers are not used as required. Volatile 
has wishy-washy semantics and somehow marries memory barriers with data 
access. It is clearer to separate the two. Conceptual cleanness usually 
translates into better code. If one really wants the volatile then let's 
make it explicit and use

	atomic_read_volatile()

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  0:42               ` Christoph Lameter
  2007-08-16  0:53                 ` Paul E. McKenney
@ 2007-08-16  1:51                 ` Paul Mackerras
  2007-08-16  2:00                   ` Herbert Xu
  2007-08-16  2:07                   ` Segher Boessenkool
  1 sibling, 2 replies; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-16  1:51 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, Satyam Sharma, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu

Christoph Lameter writes:

> A volatile default would disable optimizations for atomic_read. 
> atomic_read without volatile would allow for full optimization by the 
> compiler. Seems that this is what one wants in many cases.

Name one such case.

An atomic_read should do a load from memory.  If the programmer puts
an atomic_read() in the code then the compiler should emit a load for
it, not re-use a value returned by a previous atomic_read.  I do not
believe it would ever be useful for the compiler to collapse two
atomic_read statements into a single load.

Paul.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  1:51                 ` Paul Mackerras
@ 2007-08-16  2:00                   ` Herbert Xu
  2007-08-16  2:05                     ` Paul Mackerras
  2007-08-16  2:07                   ` Segher Boessenkool
  1 sibling, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16  2:00 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Christoph Lameter, Paul E. McKenney, Satyam Sharma,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Thu, Aug 16, 2007 at 11:51:42AM +1000, Paul Mackerras wrote:
> 
> Name one such case.

See sk_stream_mem_schedule in net/core/stream.c:

        /* Under limit. */
        if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) {
                if (*sk->sk_prot->memory_pressure)
                        *sk->sk_prot->memory_pressure = 0;
                return 1;
        }

        /* Over hard limit. */
        if (atomic_read(sk->sk_prot->memory_allocated) > sk->sk_prot->sysctl_mem[2]) {
                sk->sk_prot->enter_memory_pressure();
                goto suppress_allocation;
        }

We don't need to reload sk->sk_prot->memory_allocated here.

Now where is your example again?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  2:00                   ` Herbert Xu
@ 2007-08-16  2:05                     ` Paul Mackerras
  2007-08-16  2:11                       ` Herbert Xu
                                         ` (2 more replies)
  0 siblings, 3 replies; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-16  2:05 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Christoph Lameter, Paul E. McKenney, Satyam Sharma,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Herbert Xu writes:

> See sk_stream_mem_schedule in net/core/stream.c:
> 
>         /* Under limit. */
>         if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) {
>                 if (*sk->sk_prot->memory_pressure)
>                         *sk->sk_prot->memory_pressure = 0;
>                 return 1;
>         }
> 
>         /* Over hard limit. */
>         if (atomic_read(sk->sk_prot->memory_allocated) > sk->sk_prot->sysctl_mem[2]) {
>                 sk->sk_prot->enter_memory_pressure();
>                 goto suppress_allocation;
>         }
> 
> We don't need to reload sk->sk_prot->memory_allocated here.

Are you sure?  How do you know some other CPU hasn't changed the value
in between?

Paul.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  1:51                 ` Paul Mackerras
  2007-08-16  2:00                   ` Herbert Xu
@ 2007-08-16  2:07                   ` Segher Boessenkool
  1 sibling, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-16  2:07 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
	Satyam Sharma, Linux Kernel Mailing List, Paul E. McKenney,
	netdev, ak, cfriesen, rpjday, jesper.juhl, linux-arch,
	Andrew Morton, zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
	Linus Torvalds, wensong, wjiang

>> A volatile default would disable optimizations for atomic_read.
>> atomic_read without volatile would allow for full optimization by the
>> compiler. Seems that this is what one wants in many cases.
>
> Name one such case.
>
> An atomic_read should do a load from memory.  If the programmer puts
> an atomic_read() in the code then the compiler should emit a load for
> it, not re-use a value returned by a previous atomic_read.  I do not
> believe it would ever be useful for the compiler to collapse two
> atomic_read statements into a single load.

An atomic_read() implemented as a "normal" C variable read would
allow that read to be combined with another "normal" read from
that variable.  This could perhaps be marginally useful, although
I'd bet you cannot see it unless counting cycles on a simulator
or counting bits in the binary size.

With an asm() implementation, the compiler can not do this; with
a "volatile" implementation (either volatile variable or volatile-cast),
this invokes undefined behaviour (in both C and GCC).


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  2:15                         ` Satyam Sharma
@ 2007-08-16  2:08                           ` Herbert Xu
  2007-08-16  2:18                             ` Christoph Lameter
  2007-08-16  2:18                             ` Chris Friesen
  0 siblings, 2 replies; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16  2:08 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, Paul E. McKenney, Paul Mackerras,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Thu, Aug 16, 2007 at 07:45:44AM +0530, Satyam Sharma wrote:
>
> Completely agreed, again. To summarize again (had done so about ~100 mails
> earlier in this thread too :-) ...
> 
> atomic_{read,set}_volatile() -- guarantees volatility also along with
> atomicity (the two _are_ different concepts after all, irrespective of
> whether callsites normally want one with the other or not)
> 
> atomic_{read,set}_nonvolatile() -- only guarantees atomicity, compiler
> free to elide / coalesce / optimize such accesses, can keep the object
> in question cached in a local register, leads to smaller text, etc.
> 
> As to which one should be the default atomic_read() is a question of
> whether majority of callsites (more weightage to important / hot
> codepaths, lesser to obscure callsites) want a particular behaviour.
> 
> Do we have a consensus here? (hoping against hope, probably :-)

I can certainly agree with this.

But I have to say that I still don't know of a single place
where one would actually use the volatile variant.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  2:05                     ` Paul Mackerras
@ 2007-08-16  2:11                       ` Herbert Xu
  2007-08-16  2:35                         ` Paul E. McKenney
  2007-08-16  3:15                         ` Paul Mackerras
  2007-08-16  2:15                       ` Christoph Lameter
  2007-08-16  2:33                       ` Satyam Sharma
  2 siblings, 2 replies; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16  2:11 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Christoph Lameter, Paul E. McKenney, Satyam Sharma,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Thu, Aug 16, 2007 at 12:05:56PM +1000, Paul Mackerras wrote:
> Herbert Xu writes:
> 
> > See sk_stream_mem_schedule in net/core/stream.c:
> > 
> >         /* Under limit. */
> >         if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) {
> >                 if (*sk->sk_prot->memory_pressure)
> >                         *sk->sk_prot->memory_pressure = 0;
> >                 return 1;
> >         }
> > 
> >         /* Over hard limit. */
> >         if (atomic_read(sk->sk_prot->memory_allocated) > sk->sk_prot->sysctl_mem[2]) {
> >                 sk->sk_prot->enter_memory_pressure();
> >                 goto suppress_allocation;
> >         }
> > 
> > We don't need to reload sk->sk_prot->memory_allocated here.
> 
> Are you sure?  How do you know some other CPU hasn't changed the value
> in between?

Yes I'm sure, because we don't care if others have increased
the reservation.

Note that even if we did we'd be using barriers so volatile
won't do us any good here.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  1:41                       ` Christoph Lameter
@ 2007-08-16  2:15                         ` Satyam Sharma
  2007-08-16  2:08                           ` Herbert Xu
  2007-08-16  2:32                         ` Paul E. McKenney
  1 sibling, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-16  2:15 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, Paul Mackerras, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu



On Wed, 15 Aug 2007, Christoph Lameter wrote:

> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> 
> > Understood.  My point is not that the impact is precisely zero, but
> > rather that the impact on optimization is much less hurtful than the
> > problems that could arise otherwise, particularly as compilers become
> > more aggressive in their optimizations.
> 
> The problems arise because barriers are not used as required. Volatile 
> has wishy-washy semantics and somehow marries memory barriers with data 
> access. It is clearer to separate the two. Conceptual cleanness usually 
> translates into better code. If one really wants the volatile then let's 
> make it explicit and use
> 
> 	atomic_read_volatile()

Completely agreed, again. To summarize again (had done so about ~100 mails
earlier in this thread too :-) ...

atomic_{read,set}_volatile() -- guarantees volatility also along with
atomicity (the two _are_ different concepts after all, irrespective of
whether callsites normally want one with the other or not)

atomic_{read,set}_nonvolatile() -- only guarantees atomicity, compiler
free to elide / coalesce / optimize such accesses, can keep the object
in question cached in a local register, leads to smaller text, etc.

As to which one should be the default atomic_read() is a question of
whether majority of callsites (more weightage to important / hot
codepaths, lesser to obscure callsites) want a particular behaviour.

Do we have a consensus here? (hoping against hope, probably :-)


[ This thread has gotten completely out of hand ... for my mail client
  alpine as well, it now seems. Reminds of that 1000+ GPLv3 fest :-) ]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  2:05                     ` Paul Mackerras
  2007-08-16  2:11                       ` Herbert Xu
@ 2007-08-16  2:15                       ` Christoph Lameter
  2007-08-16  2:17                         ` Christoph Lameter
  2007-08-16  2:33                       ` Satyam Sharma
  2 siblings, 1 reply; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-16  2:15 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Herbert Xu, Paul E. McKenney, Satyam Sharma, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Thu, 16 Aug 2007, Paul Mackerras wrote:

> > We don't need to reload sk->sk_prot->memory_allocated here.
> 
> Are you sure?  How do you know some other CPU hasn't changed the value
> in between?

The cpu knows because the cacheline was not invalidated.


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  2:15                       ` Christoph Lameter
@ 2007-08-16  2:17                         ` Christoph Lameter
  0 siblings, 0 replies; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-16  2:17 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Herbert Xu, Paul E. McKenney, Satyam Sharma, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Wed, 15 Aug 2007, Christoph Lameter wrote:

> On Thu, 16 Aug 2007, Paul Mackerras wrote:
> 
> > > We don't need to reload sk->sk_prot->memory_allocated here.
> > 
> > Are you sure?  How do you know some other CPU hasn't changed the value
> > in between?
> 
> The cpu knows because the cacheline was not invalidated.

Crap, my statement above is wrong... We do not care that the 
value was changed; otherwise we would have put a barrier in there.


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  2:08                           ` Herbert Xu
@ 2007-08-16  2:18                             ` Christoph Lameter
  2007-08-16  3:23                               ` Paul Mackerras
  2007-08-16  2:18                             ` Chris Friesen
  1 sibling, 1 reply; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-16  2:18 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Satyam Sharma, Paul E. McKenney, Paul Mackerras, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Thu, 16 Aug 2007, Herbert Xu wrote:

> > Do we have a consensus here? (hoping against hope, probably :-)
> 
> I can certainly agree with this.

I agree too.

> But I have to say that I still don't know of a single place
> where one would actually use the volatile variant.

I suspect that what you say is true after we have looked at all callers.



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  2:08                           ` Herbert Xu
  2007-08-16  2:18                             ` Christoph Lameter
@ 2007-08-16  2:18                             ` Chris Friesen
  1 sibling, 0 replies; 1546+ messages in thread
From: Chris Friesen @ 2007-08-16  2:18 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Satyam Sharma, Christoph Lameter, Paul E. McKenney,
	Paul Mackerras, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, zlynx, rpjday, jesper.juhl, segher

Herbert Xu wrote:

> But I have to say that I still don't know of a single place
> where one would actually use the volatile variant.

Given that many of the existing users do currently have "volatile", are 
you comfortable simply removing that behaviour from them?  Are you sure 
that you will not introduce any issues?

Forcing a re-read is only a performance penalty.  Removing it can cause 
behavioural changes.

I would be more comfortable making the default match the majority of the 
current implementations (ie: volatile semantics).  Then, if someone 
cares about performance they can explicitly validate the call path and 
convert it over to the non-volatile version.

Correctness before speed...

Chris

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  1:23                             ` Segher Boessenkool
@ 2007-08-16  2:22                               ` Paul E. McKenney
  0 siblings, 0 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-16  2:22 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
	rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
	linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
	Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang

On Thu, Aug 16, 2007 at 03:23:28AM +0200, Segher Boessenkool wrote:
> >>>>No; compilation units have nothing to do with it, GCC can optimise
> >>>>across compilation unit boundaries just fine, if you tell it to
> >>>>compile more than one compilation unit at once.
> >>>
> >>>Last I checked, the Linux kernel build system did compile each .c 
> >>>file
> >>>as a separate compilation unit.
> >>
> >>I have some patches to use -combine -fwhole-program for Linux.
> >>Highly experimental, you need a patched bleeding edge toolchain.
> >>If there's interest I'll clean it up and put it online.
> >>
> >>David Woodhouse had some similar patches about a year ago.
> >
> >Sounds exciting...  ;-)
> 
> Yeah, the breakage is *quite* spectacular :-)

I bet!!!  ;-)

> >>>>>In many cases, the compiler also has to assume that
> >>>>>msleep_interruptible()
> >>>>>might call back into a function in the current compilation unit, 
> >>>>>thus
> >>>>>possibly modifying global static variables.
> >>>>
> >>>>It most often is smart enough to see what compilation-unit-local
> >>>>variables might be modified that way, though :-)
> >>>
> >>>Yep.  For example, if it knows the current value of a given such 
> >>>local
> >>>variable, and if all code paths that would change some other variable
> >>>cannot be reached given that current value of the first variable.
> >>
> >>Or the most common thing: if neither the address of the translation-
> >>unit local variable nor the address of any function writing to that
> >>variable can "escape" from that translation unit, nothing outside
> >>the translation unit can write to the variable.
> >
> >But there is usually at least one externally callable function in
> >a .c file.
> 
> Of course, but often none of those will (indirectly) write a certain
> static variable.

But there has to be some path to the static functions, assuming that
they are not dead code.  Yes, there can be cases where the compiler
knows enough about the state of the variables to rule out some of code
paths to them, but I have no idea how often this happens in kernel
code.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  1:26             ` Segher Boessenkool
@ 2007-08-16  2:23               ` Nick Piggin
  2007-08-16 19:32                 ` Segher Boessenkool
  0 siblings, 1 reply; 1546+ messages in thread
From: Nick Piggin @ 2007-08-16  2:23 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Herbert Xu, heiko.carstens, horms, linux-kernel, rpjday, ak,
	netdev, cfriesen, akpm, torvalds, jesper.juhl, linux-arch, zlynx,
	satyam, clameter, schwidefsky, Chris Snook, davem, wensong,
	wjiang

Segher Boessenkool wrote:
>>> Part of the motivation here is to fix heisenbugs.  If I knew where they
>>
>>
>> By the same token we should probably disable optimisations
>> altogether since that too can create heisenbugs.
> 
> 
> Almost everything is a tradeoff; and so is this.  I don't
> believe most people would find disabling all compiler
> optimisations an acceptable price to pay for some peace
> of mind.

So why is this a good tradeoff?

I also think that just adding things to APIs in the hope it might fix
up some bugs isn't really a good road to go down. Where do you stop?

On the actual proposal to make the atomic operations volatile: I think the
better approach in the long term, for both maintainability of the
code and education of coders, is to make the use of barriers _more_
explicit rather than sprinkling these "just in case" ones around.

You may get rid of a few atomic_read heisenbugs (in noise when
compared to all bugs), but if the coder was using a regular atomic
load, or a test_bit (which is also atomic), etc. then they're going
to have problems.

It would be better for Linux if everyone was to have better awareness
of barriers than to hide some of the cases where they're required.
A pretty large number of bugs I see in lock free code in the VM is
due to memory ordering problems. It's hard to find those bugs, or
even be aware when you're writing buggy code if you don't have some
feel for barriers.

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  1:30               ` Segher Boessenkool
@ 2007-08-16  2:30                 ` Paul E. McKenney
  2007-08-16 19:33                   ` Segher Boessenkool
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-16  2:30 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
	akpm, torvalds, jesper.juhl, linux-arch, zlynx, satyam, clameter,
	schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang

On Thu, Aug 16, 2007 at 03:30:44AM +0200, Segher Boessenkool wrote:
> >>>Part of the motivation here is to fix heisenbugs.  If I knew where 
> >>>they
> >>
> >>By the same token we should probably disable optimisations
> >>altogether since that too can create heisenbugs.
> >
> >Precisely the point -- use of volatile (whether in casts or on asms)
> >in these cases is intended to disable those optimizations likely to
> >result in heisenbugs.
> 
> The only thing volatile on an asm does is create a side effect
> on the asm statement; in effect, it tells the compiler "do not
> remove this asm even if you don't need any of its outputs".
> 
> It's not disabling optimisation likely to result in bugs,
> heisen- or otherwise; _not_ putting the volatile on an asm
> that needs it simply _is_ a bug :-)

Yep.  And the reason it is a bug is that it fails to disable
the relevant compiler optimizations.  So I suspect that we might
actually be saying the same thing here.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  1:41                       ` Christoph Lameter
  2007-08-16  2:15                         ` Satyam Sharma
@ 2007-08-16  2:32                         ` Paul E. McKenney
  1 sibling, 0 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-16  2:32 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu

On Wed, Aug 15, 2007 at 06:41:40PM -0700, Christoph Lameter wrote:
> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> 
> > Understood.  My point is not that the impact is precisely zero, but
> > rather that the impact on optimization is much less hurtful than the
> > problems that could arise otherwise, particularly as compilers become
> > more aggressive in their optimizations.
> 
> The problems arise because barriers are not used as required. Volatile 
> has wishy-washy semantics and somehow marries memory barriers with data 
> access. It is clearer to separate the two. Conceptual cleanness usually 
> translates into better code. If one really wants the volatile then let's 
> make it explicit and use
> 
> 	atomic_read_volatile()

There are indeed architectures where you can cause gcc to emit memory
barriers in response to volatile.  I am assuming that we are -not-
making gcc do this.  Given this, then volatiles and memory barrier
instructions are orthogonal -- one controls the compiler, the other
controls the CPU.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  2:05                     ` Paul Mackerras
  2007-08-16  2:11                       ` Herbert Xu
  2007-08-16  2:15                       ` Christoph Lameter
@ 2007-08-16  2:33                       ` Satyam Sharma
  2007-08-16  3:01                         ` Satyam Sharma
  2007-08-16  3:05                         ` Paul Mackerras
  2 siblings, 2 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-16  2:33 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Herbert Xu, Christoph Lameter, Paul E. McKenney, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher



On Thu, 16 Aug 2007, Paul Mackerras wrote:

> Herbert Xu writes:
> 
> > See sk_stream_mem_schedule in net/core/stream.c:
> > 
> >         /* Under limit. */
> >         if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) {
> >                 if (*sk->sk_prot->memory_pressure)
> >                         *sk->sk_prot->memory_pressure = 0;
> >                 return 1;
> >         }
> > 
> >         /* Over hard limit. */
> >         if (atomic_read(sk->sk_prot->memory_allocated) > sk->sk_prot->sysctl_mem[2]) {
> >                 sk->sk_prot->enter_memory_pressure();
> >                 goto suppress_allocation;
> >         }
> > 
> > We don't need to reload sk->sk_prot->memory_allocated here.
> 
> Are you sure?  How do you know some other CPU hasn't changed the value
> in between?

I can't speak for this particular case, but there could be similar code
examples elsewhere, where we do the atomic ops on an atomic_t object
inside a higher-level locking scheme that would take care of the kind of
problem you're referring to here. It would be useful for such or similar
code if the compiler kept the value of that atomic object in a register.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  2:11                       ` Herbert Xu
@ 2007-08-16  2:35                         ` Paul E. McKenney
  2007-08-16  3:15                         ` Paul Mackerras
  1 sibling, 0 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-16  2:35 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Paul Mackerras, Christoph Lameter, Satyam Sharma, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Thu, Aug 16, 2007 at 10:11:05AM +0800, Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 12:05:56PM +1000, Paul Mackerras wrote:
> > Herbert Xu writes:
> > 
> > > See sk_stream_mem_schedule in net/core/stream.c:
> > > 
> > >         /* Under limit. */
> > >         if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) {
> > >                 if (*sk->sk_prot->memory_pressure)
> > >                         *sk->sk_prot->memory_pressure = 0;
> > >                 return 1;
> > >         }
> > > 
> > >         /* Over hard limit. */
> > >         if (atomic_read(sk->sk_prot->memory_allocated) > sk->sk_prot->sysctl_mem[2]) {
> > >                 sk->sk_prot->enter_memory_pressure();
> > >                 goto suppress_allocation;
> > >         }
> > > 
> > > We don't need to reload sk->sk_prot->memory_allocated here.
> > 
> > Are you sure?  How do you know some other CPU hasn't changed the value
> > in between?
> 
> Yes I'm sure, because we don't care if others have increased
> the reservation.
> 
> Note that even if we did we'd be using barriers so volatile
> won't do us any good here.

If the load-coalescing is important to performance, why not load into
a local variable?

						Thanx, Paul


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  2:33                       ` Satyam Sharma
@ 2007-08-16  3:01                         ` Satyam Sharma
  2007-08-16  4:11                           ` Paul Mackerras
  2007-08-16  3:05                         ` Paul Mackerras
  1 sibling, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-16  3:01 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Herbert Xu, Christoph Lameter, Paul E. McKenney, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher



On Thu, 16 Aug 2007, Satyam Sharma wrote:

> On Thu, 16 Aug 2007, Paul Mackerras wrote:
> > Herbert Xu writes:
> > 
> > > See sk_stream_mem_schedule in net/core/stream.c:
> > > 
> > >         /* Under limit. */
> > >         if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) {
> > >                 if (*sk->sk_prot->memory_pressure)
> > >                         *sk->sk_prot->memory_pressure = 0;
> > >                 return 1;
> > >         }
> > > 
> > >         /* Over hard limit. */
> > >         if (atomic_read(sk->sk_prot->memory_allocated) > sk->sk_prot->sysctl_mem[2]) {
> > >                 sk->sk_prot->enter_memory_pressure();
> > >                 goto suppress_allocation;
> > >         }
> > > 
> > > We don't need to reload sk->sk_prot->memory_allocated here.
> > 
> > Are you sure?  How do you know some other CPU hasn't changed the value
> > in between?
> 
> I can't speak for this particular case, but there could be similar code
> examples elsewhere, where we do the atomic ops on an atomic_t object
> inside a higher-level locking scheme that would take care of the kind of
> problem you're referring to here. It would be useful for such or similar
> code if the compiler kept the value of that atomic object in a register.

We might not be using atomic_t (and ops) if we already have a higher-level
locking scheme, actually. So as Herbert mentioned, such cases might just
not care. [ Too much of this thread, too little sleep, sorry! ]

Anyway, the problem, of course, is that this conversion to a stronger /
safer-by-default behaviour doesn't happen with zero cost to performance.
Converting atomic ops to "volatile" behaviour did add ~2K to kernel text
for archs such as i386 (possibly to important codepaths) that didn't have
those semantics already, so it would be constructive to actually look at
those differences and see if there were really any heisenbugs that got
rectified. Or if there were legitimate optimizations that got wrongly
disabled. Onus lies on those proposing the modifications, I'd say ;-)


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  2:33                       ` Satyam Sharma
  2007-08-16  3:01                         ` Satyam Sharma
@ 2007-08-16  3:05                         ` Paul Mackerras
  2007-08-16 19:39                           ` Segher Boessenkool
  1 sibling, 1 reply; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-16  3:05 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Herbert Xu, Christoph Lameter, Paul E. McKenney, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

Satyam Sharma writes:

> I can't speak for this particular case, but there could be similar code
> examples elsewhere, where we do the atomic ops on an atomic_t object
> inside a higher-level locking scheme that would take care of the kind of
> problem you're referring to here. It would be useful for such or similar
> code if the compiler kept the value of that atomic object in a register.

If there is a higher-level locking scheme then there is no point to
using atomic_t variables.  Atomic_t is specifically for the situation
where multiple CPUs are updating a variable without locking.

Paul.


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  2:11                       ` Herbert Xu
  2007-08-16  2:35                         ` Paul E. McKenney
@ 2007-08-16  3:15                         ` Paul Mackerras
  2007-08-16  3:43                           ` Herbert Xu
  1 sibling, 1 reply; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-16  3:15 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Christoph Lameter, Paul E. McKenney, Satyam Sharma,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Herbert Xu writes:

> > Are you sure?  How do you know some other CPU hasn't changed the value
> > in between?
> 
> Yes I'm sure, because we don't care if others have increased
> the reservation.

But others can also reduce the reservation.  Also, the code sets and
clears *sk->sk_prot->memory_pressure nonatomically with respect to the
reads of sk->sk_prot->memory_allocated, so in fact the code doesn't
guarantee any particular relationship between the two.

That code looks like a beautiful example of buggy, racy code where
someone has sprinkled magic fix-the-races dust (otherwise known as
atomic_t) around in a vain attempt to fix the races.

That's assuming that all that stuff actually performs any useful
purpose, of course, and that there isn't some lock held by the
callers.  In the latter case it is pointless using atomic_t.

Paul.


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  2:18                             ` Christoph Lameter
@ 2007-08-16  3:23                               ` Paul Mackerras
  2007-08-16  3:33                                 ` Herbert Xu
                                                   ` (2 more replies)
  0 siblings, 3 replies; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-16  3:23 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Herbert Xu, Satyam Sharma, Paul E. McKenney, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

Christoph Lameter writes:

> > But I have to say that I still don't know of a single place
> > where one would actually use the volatile variant.
> 
> I suspect that what you say is true after we have looked at all callers.

It seems that there could be a lot of places where atomic_t is used in
a non-atomic fashion, and that those uses are either buggy, or there
is some lock held at the time which guarantees that other CPUs aren't
changing the value.  In both cases there is no point in using
atomic_t; we might as well just use an ordinary int.

In particular, atomic_read seems to lend itself to buggy uses.  People
seem to do things like:

	atomic_add(something, &v);
	if (atomic_read(&v) > something_else) ...

and expect that there is some relationship between the value that the
atomic_add stored and the value that the atomic_read will return,
which there isn't.  People seem to think that using atomic_t magically
gets rid of races.  It doesn't.
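
To make the point concrete, here is a userspace C11 sketch (not the
kernel API; the function names are hypothetical). The usual repair for
this pattern is to use the value the add itself returns, which is what
the kernel's atomic_add_return() provides:

```c
#include <stdatomic.h>

atomic_int v;

/* The criticized pattern: the later read is a separate operation, so it
 * may observe updates made by other CPUs between the add and the read. */
int racy_check(int delta, int limit)
{
	atomic_fetch_add(&v, delta);
	return atomic_load(&v) > limit;	/* unrelated second access */
}

/* The repair: use the value the add itself produced (the kernel
 * analogue is atomic_add_return()), so the comparison is against the
 * result of this thread's own atomic update. */
int closed_check(int delta, int limit)
{
	int now = atomic_fetch_add(&v, delta) + delta;	/* old + delta */
	return now > limit;
}
```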

I'd go so far as to say that anywhere where you want a non-"volatile"
atomic_read, either your code is buggy, or else an int would work just
as well.

Paul.



* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  3:23                               ` Paul Mackerras
@ 2007-08-16  3:33                                 ` Herbert Xu
  2007-08-16  3:48                                   ` Paul Mackerras
  2007-08-16 18:48                                 ` Christoph Lameter
  2007-08-16 19:44                                 ` Segher Boessenkool
  2 siblings, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16  3:33 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Thu, Aug 16, 2007 at 01:23:06PM +1000, Paul Mackerras wrote:
>
> In particular, atomic_read seems to lend itself to buggy uses.  People
> seem to do things like:
> 
> 	atomic_add(something, &v);
> 	if (atomic_read(&v) > something_else) ...

If you're referring to the code in sk_stream_mem_schedule
then it's working as intended.  The atomicity guarantees
that the atomic_add/atomic_sub won't be seen in parts by
other readers.

We certainly do not need to see other atomic_add/atomic_sub
operations immediately.

If you're referring to another code snippet please cite.

> I'd go so far as to say that anywhere where you want a non-"volatile"
> atomic_read, either your code is buggy, or else an int would work just
> as well.

An int won't work here because += and -= do not have the
atomicity guarantees that atomic_add/atomic_sub do.  In
particular, this may cause an atomic_read on another CPU
to give a bogus reading.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 12:31       ` Satyam Sharma
                           ` (2 preceding siblings ...)
  2007-08-15 23:22         ` Paul Mackerras
@ 2007-08-16  3:37         ` Bill Fink
  2007-08-16  5:20           ` Satyam Sharma
  3 siblings, 1 reply; 1546+ messages in thread
From: Bill Fink @ 2007-08-16  3:37 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu, Paul E. McKenney

On Wed, 15 Aug 2007, Satyam Sharma wrote:

> (C)
> $ cat tp3.c
> int a;
> 
> void func(void)
> {
> 	*(volatile int *)&a = 10;
> 	*(volatile int *)&a = 20;
> }
> $ gcc -Os -S tp3.c
> $ cat tp3.s
> ...
> movl    $10, a
> movl    $20, a
> ...

I'm curious about one minor tangential point.  Why, instead of:

	b = *(volatile int *)&a;

why can't this just be expressed as:

	b = (volatile int)a;

Isn't it the contents of a that's volatile, i.e. its value can change
invisibly to the compiler, and that's why you want to force a read from
memory?  Why do you need the "*(volatile int *)&" construct?

						-Bill


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  3:15                         ` Paul Mackerras
@ 2007-08-16  3:43                           ` Herbert Xu
  0 siblings, 0 replies; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16  3:43 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Christoph Lameter, Paul E. McKenney, Satyam Sharma,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Thu, Aug 16, 2007 at 01:15:05PM +1000, Paul Mackerras wrote:
> 
> But others can also reduce the reservation.  Also, the code sets and
> clears *sk->sk_prot->memory_pressure nonatomically with respect to the
> reads of sk->sk_prot->memory_allocated, so in fact the code doesn't
> guarantee any particular relationship between the two.

Yes others can reduce the reservation, but the point of this
is that the code doesn't care.  We'll either see the value
before or after the reduction and in either case we'll do
something sensible.

The worst that can happen is when we're just below the hard
limit and multiple CPUs fail to allocate but that's not really
a problem because if the machine is making progress at all
then we will eventually scale back and allow these allocations
to succeed.

As to the non-atomic operation on memory_pressure, that's OK
because we only ever assign values to it and never do other
operations such as += or -=.  Remember that int/long assignments
must be atomic or Linux won't run on your architecture.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  3:33                                 ` Herbert Xu
@ 2007-08-16  3:48                                   ` Paul Mackerras
  2007-08-16  4:03                                     ` Herbert Xu
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-16  3:48 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Herbert Xu writes:

> If you're referring to the code in sk_stream_mem_schedule
> then it's working as intended.  The atomicity guarantees

You mean it's intended that *sk->sk_prot->memory_pressure can end up
as 1 when sk->sk_prot->memory_allocated is small (less than
->sysctl_mem[0]), or as 0 when ->memory_allocated is large (greater
than ->sysctl_mem[2])?  Because that's the effect of the current code.
If so I wonder why you bother computing it.

> that the atomic_add/atomic_sub won't be seen in parts by
> other readers.
> 
> We certainly do not need to see other atomic_add/atomic_sub
> operations immediately.
> 
> If you're referring to another code snippet please cite.
> 
> > I'd go so far as to say that anywhere where you want a non-"volatile"
> > atomic_read, either your code is buggy, or else an int would work just
> > as well.
> 
> An int won't work here because += and -= do not have the
> atomicity guarantees that atomic_add/atomic_sub do.  In
> particular, this may cause an atomic_read on another CPU
> to give a bogus reading.

The point is that guaranteeing the atomicity of the increment or
decrement does not suffice to make the code race-free.  In this case
the race arises from the fact that reading ->memory_allocated and
setting *->memory_pressure are separate operations.  To make that code
work properly you need a lock.  And once you have the lock an ordinary
int would suffice for ->memory_allocated.
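
A minimal userspace sketch of that point (pthread-based, all names
hypothetical): once a single lock covers both fields, plain ints
suffice, and the pressure flag is always computed from the counter
value it is stored next to, with no window in between:

```c
#include <pthread.h>

static pthread_mutex_t mem_lock = PTHREAD_MUTEX_INITIALIZER;
static int memory_allocated;	/* plain int: only touched under mem_lock */
static int memory_pressure;	/* ditto: set consistently with the counter */

void account(int pages, int pressure_limit)
{
	pthread_mutex_lock(&mem_lock);
	memory_allocated += pages;
	memory_pressure = memory_allocated > pressure_limit;
	pthread_mutex_unlock(&mem_lock);
}
```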

Paul.


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  3:48                                   ` Paul Mackerras
@ 2007-08-16  4:03                                     ` Herbert Xu
  2007-08-16  4:34                                       ` Paul Mackerras
  0 siblings, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16  4:03 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Thu, Aug 16, 2007 at 01:48:32PM +1000, Paul Mackerras wrote:
> Herbert Xu writes:
> 
> > If you're referring to the code in sk_stream_mem_schedule
> > then it's working as intended.  The atomicity guarantees
> 
> You mean it's intended that *sk->sk_prot->memory_pressure can end up
> as 1 when sk->sk_prot->memory_allocated is small (less than
> ->sysctl_mem[0]), or as 0 when ->memory_allocated is large (greater
> than ->sysctl_mem[2])?  Because that's the effect of the current code.
> If so I wonder why you bother computing it.

You need to remember that there are three different limits:
minimum, pressure, and maximum.  By default we should never
be in a situation where what you say can occur.

If you set all three limits to the same thing, then yes it
won't work as intended but it's still well-behaved.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  3:01                         ` Satyam Sharma
@ 2007-08-16  4:11                           ` Paul Mackerras
  2007-08-16  5:39                             ` Herbert Xu
  2007-08-16 18:54                             ` Christoph Lameter
  0 siblings, 2 replies; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-16  4:11 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Herbert Xu, Christoph Lameter, Paul E. McKenney, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

Satyam Sharma writes:

> Anyway, the problem, of course, is that this conversion to a stronger /
> safer-by-default behaviour doesn't happen with zero cost to performance.
> Converting atomic ops to "volatile" behaviour did add ~2K to kernel text
> for archs such as i386 (possibly to important codepaths) that didn't have
> those semantics already, so it would be constructive to actually look at
> those differences and see if there were really any heisenbugs that got
> rectified. Or if there were legitimate optimizations that got wrongly
> disabled. Onus lies on those proposing the modifications, I'd say ;-)

The uses of atomic_read where one might want it to allow caching of
the result seem to me to fall into 3 categories:

1. Places that are buggy because of a race arising from the way it's
   used.

2. Places where there is a race but it doesn't matter because we're
   doing some clever trick.

3. Places where there is some locking in place that eliminates any
   potential race.

In case 1, adding volatile won't solve the race, of course, but it's
hard to argue that we shouldn't do something because it will slow down
buggy code.  Case 2 is hopefully pretty rare and accompanied by large
comment blocks, and in those cases caching the result of atomic_read
explicitly in a local variable would probably make the code clearer.
And in case 3 there is no reason to use atomic_t at all; we might as
well just use an int.
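
Case 2 can be made explicit like this (a userspace C11 sketch, names
hypothetical): read once into a local, so every comparison sees the
same snapshot, instead of relying on whether the compiler does or does
not cache repeated atomic_read()-style accesses:

```c
#include <stdatomic.h>

atomic_int refs;

/* One load, one snapshot: both comparisons below are guaranteed to see
 * the same value, whatever other CPUs do to refs meanwhile. */
int classify(int low, int high)
{
	int snap = atomic_load(&refs);	/* explicit local caching */
	if (snap < low)
		return -1;
	if (snap > high)
		return 1;
	return 0;
}
```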

So I don't see any good reason to make the atomic API more complex by
having "volatile" and "non-volatile" versions of atomic_read.  It
should just have the "volatile" behaviour.

Paul.


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  4:03                                     ` Herbert Xu
@ 2007-08-16  4:34                                       ` Paul Mackerras
  2007-08-16  5:37                                         ` Herbert Xu
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-16  4:34 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Herbert Xu writes:

> > You mean it's intended that *sk->sk_prot->memory_pressure can end up
> > as 1 when sk->sk_prot->memory_allocated is small (less than
> > ->sysctl_mem[0]), or as 0 when ->memory_allocated is large (greater
> > than ->sysctl_mem[2])?  Because that's the effect of the current code.
> > If so I wonder why you bother computing it.
> 
> You need to remember that there are three different limits:
> minimum, pressure, and maximum.  By default we should never
> be in a situation where what you say can occur.
> 
> If you set all three limits to the same thing, then yes it
> won't work as intended but it's still well-behaved.

I'm not talking about setting all three limits to the same thing.

I'm talking about this situation:

CPU 0 comes into __sk_stream_mem_reclaim, reads memory_allocated, but
then before it can do the store to *memory_pressure, CPUs 1-1023 all
go through sk_stream_mem_schedule, collectively increase
memory_allocated to more than sysctl_mem[2] and set *memory_pressure.
Finally CPU 0 gets to do its store and it sets *memory_pressure back
to 0, but by this stage memory_allocated is way larger than
sysctl_mem[2].

Yes, it's unlikely, but that is the nature of race conditions - they
are unlikely, and only show up at inconvenient times, never when
someone who could fix the bug is watching. :)

Similarly it would be possible for other CPUs to decrease
memory_allocated from greater than sysctl_mem[2] to less than
sysctl_mem[0] in the interval between when we read memory_allocated
and set *memory_pressure to 1.  And it's quite possible for their
setting of *memory_pressure to 0 to happen before our setting of it to
1, so that it ends up at 1 when it should be 0.

Now, maybe it's the case that it doesn't really matter whether
*->memory_pressure is 0 or 1.  But if so, why bother computing it at
all?

People seem to think that using atomic_t means they don't need to use
a spinlock.  That's fine if there is only one variable involved, but
as soon as there's more than one, there's the possibility of a race,
whether or not you use atomic_t, and whether or not atomic_read has
"volatile" behaviour.

Paul.


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  3:37         ` Bill Fink
@ 2007-08-16  5:20           ` Satyam Sharma
  2007-08-16  5:57             ` Satyam Sharma
  2007-08-16 20:50             ` Segher Boessenkool
  0 siblings, 2 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-16  5:20 UTC (permalink / raw)
  To: Bill Fink
  Cc: Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu, Paul E. McKenney

Hi Bill,


On Wed, 15 Aug 2007, Bill Fink wrote:

> On Wed, 15 Aug 2007, Satyam Sharma wrote:
> 
> > (C)
> > $ cat tp3.c
> > int a;
> > 
> > void func(void)
> > {
> > 	*(volatile int *)&a = 10;
> > 	*(volatile int *)&a = 20;
> > }
> > $ gcc -Os -S tp3.c
> > $ cat tp3.s
> > ...
> > movl    $10, a
> > movl    $20, a
> > ...
> 
> I'm curious about one minor tangential point.  Why, instead of:
> 
> 	b = *(volatile int *)&a;
> 
> why can't this just be expressed as:
> 
> 	b = (volatile int)a;
> 
> Isn't it the contents of a that's volatile, i.e. its value can change
> invisibly to the compiler, and that's why you want to force a read from
> memory?  Why do you need the "*(volatile int *)&" construct?

"b = (volatile int)a;" doesn't help us because a cast to a qualified type
has the same effect as a cast to an unqualified version of that type, as
mentioned in 6.5.4:4 (footnote 86) of the standard. Note that "volatile"
is a type-qualifier, not a type itself, so a cast of the _object_ itself
to a qualified-type i.e. (volatile int) would not make the access itself
volatile-qualified.

To serve our purposes, it is necessary for us to take the address of this
(non-volatile) object, cast the resulting _pointer_ to the corresponding
volatile-qualified pointer-type, and then dereference it. This makes that
particular _access_ be volatile-qualified, without the object itself being
such. Also note that the (dereferenced) result is also a valid lvalue and
hence can be used in "*(volatile int *)&a = b;" kind of construction
(which we use for the atomic_set case).
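
For reference, the construct boils down to roughly the following shape
(macro names hypothetical; the proposed atomic_read()/atomic_set()
expand to essentially this on the counter field):

```c
/* Each access goes through a volatile-qualified lvalue, so the compiler
 * must emit it and may not coalesce it with neighbouring accesses. */
#define FORCED_READ(x)		(*(volatile int *)&(x))
#define FORCED_WRITE(x, val)	(*(volatile int *)&(x) = (val))

int a;				/* the object itself is NOT volatile */

int demo(void)
{
	FORCED_WRITE(a, 10);	/* both stores are emitted ...      */
	FORCED_WRITE(a, 20);	/* ... neither may be optimized away */
	return FORCED_READ(a);	/* forced load from memory          */
}
```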


Satyam


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  4:34                                       ` Paul Mackerras
@ 2007-08-16  5:37                                         ` Herbert Xu
  2007-08-16  6:00                                           ` Paul Mackerras
  0 siblings, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16  5:37 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Thu, Aug 16, 2007 at 02:34:25PM +1000, Paul Mackerras wrote:
>
> I'm talking about this situation:
> 
> CPU 0 comes into __sk_stream_mem_reclaim, reads memory_allocated, but
> then before it can do the store to *memory_pressure, CPUs 1-1023 all
> go through sk_stream_mem_schedule, collectively increase
> memory_allocated to more than sysctl_mem[2] and set *memory_pressure.
> Finally CPU 0 gets to do its store and it sets *memory_pressure back
> to 0, but by this stage memory_allocated is way larger than
> sysctl_mem[2].

It doesn't matter.  The memory pressure flag is an *advisory*
flag.  If we get it wrong the worst that'll happen is that we'd
waste some time doing work that'll be thrown away.

Please look at the places where it's used before jumping to
conclusions.

> Now, maybe it's the case that it doesn't really matter whether
> *->memory_pressure is 0 or 1.  But if so, why bother computing it at
> all?

As long as we get it right most of the time (and I think you
would agree that we do get it right most of the time), then
this flag has achieved its purpose.

> People seem to think that using atomic_t means they don't need to use
> a spinlock.  That's fine if there is only one variable involved, but
> as soon as there's more than one, there's the possibility of a race,
> whether or not you use atomic_t, and whether or not atomic_read has
> "volatile" behaviour.

In any case, this actually illustrates why the addition of
volatile is completely pointless.  Even if this code was
broken, which it definitely is not, having the volatile
there wouldn't have helped at all.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  4:11                           ` Paul Mackerras
@ 2007-08-16  5:39                             ` Herbert Xu
  2007-08-16  6:56                               ` Paul Mackerras
  2007-08-16 18:54                             ` Christoph Lameter
  1 sibling, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16  5:39 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Satyam Sharma, Christoph Lameter, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Thu, Aug 16, 2007 at 02:11:43PM +1000, Paul Mackerras wrote:
>
> The uses of atomic_read where one might want it to allow caching of
> the result seem to me to fall into 3 categories:
> 
> 1. Places that are buggy because of a race arising from the way it's
>    used.
> 
> 2. Places where there is a race but it doesn't matter because we're
>    doing some clever trick.
> 
> 3. Places where there is some locking in place that eliminates any
>    potential race.

Agreed.

> In case 1, adding volatile won't solve the race, of course, but it's
> hard to argue that we shouldn't do something because it will slow down
> buggy code.  Case 2 is hopefully pretty rare and accompanied by large
> comment blocks, and in those cases caching the result of atomic_read
> explicitly in a local variable would probably make the code clearer.
> And in case 3 there is no reason to use atomic_t at all; we might as
> well just use an int.

Since adding volatile doesn't help any of the 3 cases, and
takes away optimisations from both 2 and 3, I wonder what
is the point of the addition after all?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  5:20           ` Satyam Sharma
@ 2007-08-16  5:57             ` Satyam Sharma
  2007-08-16  9:25               ` Satyam Sharma
  2007-08-16 21:00               ` Segher Boessenkool
  2007-08-16 20:50             ` Segher Boessenkool
  1 sibling, 2 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-16  5:57 UTC (permalink / raw)
  To: Bill Fink
  Cc: Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu, Paul E. McKenney



On Thu, 16 Aug 2007, Satyam Sharma wrote:

> Hi Bill,
> 
> 
> On Wed, 15 Aug 2007, Bill Fink wrote:
> 
> > On Wed, 15 Aug 2007, Satyam Sharma wrote:
> > 
> > > (C)
> > > $ cat tp3.c
> > > int a;
> > > 
> > > void func(void)
> > > {
> > > 	*(volatile int *)&a = 10;
> > > 	*(volatile int *)&a = 20;
> > > }
> > > $ gcc -Os -S tp3.c
> > > $ cat tp3.s
> > > ...
> > > movl    $10, a
> > > movl    $20, a
> > > ...
> > 
> > I'm curious about one minor tangential point.  Why, instead of:
> > 
> > 	b = *(volatile int *)&a;
> > 
> > why can't this just be expressed as:
> > 
> > 	b = (volatile int)a;
> > 
> > Isn't it the contents of a that's volatile, i.e. its value can change
> > invisibly to the compiler, and that's why you want to force a read from
> > memory?  Why do you need the "*(volatile int *)&" construct?
> 
> "b = (volatile int)a;" doesn't help us because a cast to a qualified type
> has the same effect as a cast to an unqualified version of that type, as
> mentioned in 6.5.4:4 (footnote 86) of the standard. Note that "volatile"
> is a type-qualifier, not a type itself, so a cast of the _object_ itself
> to a qualified-type i.e. (volatile int) would not make the access itself
> volatile-qualified.
> 
> To serve our purposes, it is necessary for us to take the address of this
> (non-volatile) object, cast the resulting _pointer_ to the corresponding
> volatile-qualified pointer-type, and then dereference it. This makes that
> particular _access_ be volatile-qualified, without the object itself being
> such. Also note that the (dereferenced) result is also a valid lvalue and
> hence can be used in "*(volatile int *)&a = b;" kind of construction
> (which we use for the atomic_set case).

Here, I should obviously admit that the semantics of *(volatile int *)&
aren't any neater or well-defined in the _language standard_ at all. The
standard does say (verbatim) "what constitutes an access to an object
that has volatile-qualified type is implementation-defined", but GCC
does help us out here by doing the right thing. Accessing the non-volatile
object there using the volatile-qualified pointer-type cast makes GCC
treat the object stored at that memory address itself as if it were a 
volatile object, thus making the access end up with what we're calling
"volatility" semantics here.

Honestly, given such confusion, and the propensity of the "volatile"
type-qualifier keyword to be ill-defined (or at least poorly understood,
often inconsistently implemented), I'd (again) express my opinion that it
would be best to avoid its usage, given other alternatives do exist.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  5:37                                         ` Herbert Xu
@ 2007-08-16  6:00                                           ` Paul Mackerras
  2007-08-16 18:50                                             ` Christoph Lameter
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-16  6:00 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Herbert Xu writes:

> It doesn't matter.  The memory pressure flag is an *advisory*
> flag.  If we get it wrong the worst that'll happen is that we'd
> waste some time doing work that'll be thrown away.

Ah, so it's the "racy but I don't care because it's only an
optimization" case.  That's fine.  Somehow I find it hard to believe
that all the racy uses of atomic_read in the kernel are like that,
though. :)

> In any case, this actually illustrates why the addition of
> volatile is completely pointless.  Even if this code was
> broken, which it definitely is not, having the volatile
> there wouldn't have helped at all.

Yes, adding volatile to racy code doesn't somehow make it race-free.
Neither does using atomic_t, despite what some seem to believe.

I have actually started going through all the uses of atomic_read in
the kernel.  So far out of the first 100 I have found none where we
have two atomic_reads of the same variable and the compiler could
usefully use the value from the first as the result of the second.
But there's still > 2500 to go...

Paul.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  5:39                             ` Herbert Xu
@ 2007-08-16  6:56                               ` Paul Mackerras
  2007-08-16  7:09                                 ` Herbert Xu
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-16  6:56 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Satyam Sharma, Christoph Lameter, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Herbert Xu writes:

> On Thu, Aug 16, 2007 at 02:11:43PM +1000, Paul Mackerras wrote:
> >
> > The uses of atomic_read where one might want it to allow caching of
> > the result seem to me to fall into 3 categories:
> > 
> > 1. Places that are buggy because of a race arising from the way it's
> >    used.
> > 
> > 2. Places where there is a race but it doesn't matter because we're
> >    doing some clever trick.
> > 
> > 3. Places where there is some locking in place that eliminates any
> >    potential race.
> 
> Agreed.
> 
> > In case 1, adding volatile won't solve the race, of course, but it's
> > hard to argue that we shouldn't do something because it will slow down
> > buggy code.  Case 2 is hopefully pretty rare and accompanied by large
> > comment blocks, and in those cases caching the result of atomic_read
> > explicitly in a local variable would probably make the code clearer.
> > And in case 3 there is no reason to use atomic_t at all; we might as
> > well just use an int.
> 
> Since adding volatile doesn't help any of the 3 cases, and
> takes away optimisations from both 2 and 3, I wonder what
> is the point of the addition after all?

Note that I said these are the cases _where one might want to allow
caching_, so of course adding volatile doesn't help _these_ cases.
There are of course other cases where one definitely doesn't want to
allow the compiler to cache the value, such as when polling an atomic
variable waiting for another CPU to change it, and from my inspection
so far these cases seem to be the majority.

The reasons for having "volatile" behaviour of atomic_read (whether or
not that is achieved by use of the "volatile" C keyword) are

- It matches the normal expectation based on the name "atomic_read"
- It matches the behaviour of the other atomic_* primitives
- It avoids bugs in the cases where "volatile" behaviour is required

To my mind these outweigh the small benefit for some code of the
non-volatile (caching-allowed) behaviour.  In fact it's pretty minor
either way, and since x86[-64] has this behaviour, one can expect the
potential bugs in generic code to have mostly been found, although
perhaps not all of them since x86[-64] has less aggressive reordering
of memory accesses and fewer registers in which to cache things than
some other architectures.

Paul.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  6:56                               ` Paul Mackerras
@ 2007-08-16  7:09                                 ` Herbert Xu
  2007-08-16  8:06                                   ` Stefan Richter
  2007-08-16 14:48                                   ` Ilpo Järvinen
  0 siblings, 2 replies; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16  7:09 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Satyam Sharma, Christoph Lameter, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Thu, Aug 16, 2007 at 04:56:21PM +1000, Paul Mackerras wrote:
>
> Note that I said these are the cases _where one might want to allow
> caching_, so of course adding volatile doesn't help _these_ cases.
> There are of course other cases where one definitely doesn't want to
> allow the compiler to cache the value, such as when polling an atomic
> variable waiting for another CPU to change it, and from my inspection
> so far these cases seem to be the majority.

We've been through that already.  If it's a busy-wait it
should use cpu_relax.  If it's scheduling away that already
forces the compiler to reread anyway.

Do you have an actual example where volatile is needed?

> - It matches the normal expectation based on the name "atomic_read"
> - It matches the behaviour of the other atomic_* primitives

Can't argue since you left out what those expectations
or properties are.

> - It avoids bugs in the cases where "volatile" behaviour is required

Do you (or anyone else for that matter) have an example of this?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  7:09                                 ` Herbert Xu
@ 2007-08-16  8:06                                   ` Stefan Richter
  2007-08-16  8:10                                     ` Herbert Xu
  2007-08-16 14:48                                   ` Ilpo Järvinen
  1 sibling, 1 reply; 1546+ messages in thread
From: Stefan Richter @ 2007-08-16  8:06 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 04:56:21PM +1000, Paul Mackerras wrote:
>>
>> Note that I said these are the cases _where one might want to allow
>> caching_, so of course adding volatile doesn't help _these_ cases.
>> There are of course other cases where one definitely doesn't want to
>> allow the compiler to cache the value, such as when polling an atomic
>> variable waiting for another CPU to change it, and from my inspection
>> so far these cases seem to be the majority.
> 
> We've been through that already.  If it's a busy-wait it
> should use cpu_relax.  If it's scheduling away that already
> forces the compiler to reread anyway.
> 
> Do you have an actual example where volatile is needed?
> 
>> - It matches the normal expectation based on the name "atomic_read"
>> - It matches the behaviour of the other atomic_* primitives
> 
> Can't argue since you left out what those expectations
> or properties are.

We use atomic_t for data that is concurrently locklessly written and
read at arbitrary times.  My naive expectation as driver author (driver
maintainer) is that all atomic_t accessors, including atomic_read, (and
atomic bitops) work with the then-current value of the atomic data.

>> - It avoids bugs in the cases where "volatile" behaviour is required
> 
> Do you (or anyone else for that matter) have an example of this?

The only code I somewhat know, the ieee1394 subsystem, was perhaps
authored and is currently maintained with the expectation that each
occurrence of atomic_read actually results in a load operation, i.e. is
not optimized away.  This means all atomic_t (bus generation, packet and
buffer refcounts, and some other state variables)* and likewise all
atomic bitops in that subsystem.

If that assumption is wrong, then what is the API or language primitive
to force a load operation to occur?


*)  Interesting what a quick LXR session in search for all atomic_t
usages in 'my' subsystem brings to light.  I now noticed an apparently
unused struct member in the bitrotting pcilynx driver, and more
importantly, a pairing of two atomic_t variables in raw1394 that should
be audited for race conditions and for possible replacement by plain int.
-- 
Stefan Richter
-=====-=-=== =--- =----
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  8:06                                   ` Stefan Richter
@ 2007-08-16  8:10                                     ` Herbert Xu
  2007-08-16  9:54                                       ` Stefan Richter
                                                         ` (2 more replies)
  0 siblings, 3 replies; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16  8:10 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Thu, Aug 16, 2007 at 10:06:31AM +0200, Stefan Richter wrote:
> > 
> > Do you (or anyone else for that matter) have an example of this?
> 
> The only code I somewhat know, the ieee1394 subsystem, was perhaps
> authored and is currently maintained with the expectation that each
> occurrence of atomic_read actually results in a load operation, i.e. is
> not optimized away.  This means all atomic_t (bus generation, packet and
> buffer refcounts, and some other state variables)* and likewise all
> atomic bitops in that subsystem.

Can you find an actual atomic_read code snippet there that is
broken without the volatile modifier?

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  5:57             ` Satyam Sharma
@ 2007-08-16  9:25               ` Satyam Sharma
  2007-08-16 21:00               ` Segher Boessenkool
  1 sibling, 0 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-16  9:25 UTC (permalink / raw)
  To: Bill Fink
  Cc: Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu, Paul E. McKenney

[ Bill tells me in private communication he gets this already, but I
  think it's more complicated than the shoddy explanation I'd made
  earlier, so I wish to make this clearer in detail one last time,
  for the benefit of others listening in or reading the archives. ]


On Thu, 16 Aug 2007, Satyam Sharma wrote:

> On Thu, 16 Aug 2007, Satyam Sharma wrote:
> [...]
> > On Wed, 15 Aug 2007, Bill Fink wrote:
> > > [...]
> > > I'm curious about one minor tangential point.  Why, instead of:
> > > 
> > > 	b = *(volatile int *)&a;
> > > 
> > > why can't this just be expressed as:
> > > 
> > > 	b = (volatile int)a;
> > > 
> > > Isn't it the contents of a that's volatile, i.e. its value can change
> > > invisibly to the compiler, and that's why you want to force a read from
> > > memory?  Why do you need the "*(volatile int *)&" construct?
> > 
> > "b = (volatile int)a;" doesn't help us because a cast to a qualified type
> > has the same effect as a cast to an unqualified version of that type, as
> > mentioned in 6.5.4:4 (footnote 86) of the standard. Note that "volatile"
> > is a type-qualifier, not a type itself, so a cast of the _object_ itself
> > to a qualified-type i.e. (volatile int) would not make the access itself
> > volatile-qualified.

Casts don't produce lvalues, and the cast ((volatile int)a) does not
produce the object-int-a-qualified-as-"volatile" -- in fact, the
result of the above cast is simply the _value_ of "int a", with
the access to that object having _already_ taken place, as per the
actual type-qualification of the object (that was originally declared
as being _non-volatile_, in fact). Hence, defining atomic_read() as:

#define atomic_read(v)          ((volatile int)((v)->counter))

would be buggy and not give "volatility" semantics at all, unless the
"counter" object itself were volatile-qualified already (which it
isn't).

The result of the cast itself being the _value_ of the int object, and
not the object itself (i.e., not an lvalue), is thereby independent of
type-qualification in that cast itself (it just wouldn't make any
difference), hence the "cast to a qualified type has the same effect
as a cast to an unqualified version of that type" bit in section 6.5.4:4
of the standard.


> > To serve our purposes, it is necessary for us to take the address of this
> > (non-volatile) object, cast the resulting _pointer_ to the corresponding
> > volatile-qualified pointer-type, and then dereference it. This makes that
> > particular _access_ be volatile-qualified, without the object itself being
> > such. Also note that the (dereferenced) result is also a valid lvalue and
> > hence can be used in "*(volatile int *)&a = b;" kind of construction
> > (which we use for the atomic_set case).

Dereferencing using the *(pointer-type-cast)& construct, OTOH, serves
us well:

#define atomic_read(v)          (*(volatile int *)&(v)->counter)

Firstly, note that the cast here being (volatile int *) and not
(int * volatile) qualifies the type of the _object_ being pointed to
by the pointer in question as being volatile-qualified, and not the
pointer itself (6.2.5:27 of the standard, and 6.3.2.3:2 allows us to
convert from a pointer-to-non-volatile-qualified-int to a pointer-to-
volatile-qualified-int, which suits us just fine) -- but note that
the _access_ to that address itself has not yet occurred.

_After_ specifying the memory address as containing a volatile-qualified-
int-type object, (and GCC co-operates as mentioned below), we proceed to
dereference it, which is when the _actual access_ occurs, therefore with
"volatility" semantics this time.

Interesting.


> Here, I should obviously admit that the semantics of *(volatile int *)&
> aren't any neater or well-defined in the _language standard_ at all. The
> standard does say (verbatim) "what constitutes an access to an object
> that has volatile-qualified type is implementation-defined", but GCC
> does help us out here by doing the right thing. Accessing the non-volatile
> object there using the volatile-qualified pointer-type cast makes GCC
> treat the object stored at that memory address itself as if it were a 
> volatile object, thus making the access end up with what we're calling
> "volatility" semantics here.
> 
> Honestly, given such confusion, and the propensity of the "volatile"
> type-qualifier keyword to be ill-defined (or at least poorly understood,
> often inconsistently implemented), I'd (again) express my opinion that it
> would be best to avoid its usage, given other alternatives do exist.


Satyam

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  8:10                                     ` Herbert Xu
@ 2007-08-16  9:54                                       ` Stefan Richter
  2007-08-16 10:31                                         ` Stefan Richter
  2007-08-16 10:35                                         ` Herbert Xu
  2007-08-16 19:48                                       ` Chris Snook
  2007-08-17  5:09                                       ` Paul Mackerras
  2 siblings, 2 replies; 1546+ messages in thread
From: Stefan Richter @ 2007-08-16  9:54 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 10:06:31AM +0200, Stefan Richter wrote:
>> > 
>> > Do you (or anyone else for that matter) have an example of this?
>> 
>> The only code I somewhat know, the ieee1394 subsystem, was perhaps
>> authored and is currently maintained with the expectation that each
>> occurrence of atomic_read actually results in a load operation, i.e. is
>> not optimized away.  This means all atomic_t (bus generation, packet and
>> buffer refcounts, and some other state variables)* and likewise all
>> atomic bitops in that subsystem.
> 
> Can you find an actual atomic_read code snippet there that is
> broken without the volatile modifier?

What do I have to look for?  atomic_read after another read or write
access to the same variable, in the same function scope?  Or in the sum
of scopes of functions that could be merged by function inlining?

One example was discussed here earlier:  The for (;;) loop in
nodemgr_host_thread.  There an msleep_interruptible implicitly acted as
barrier (at the moment because it's in a different translation unit; if
it were the same, then because it hopefully has its own barriers).  So that
happens to work, although such an implicit barrier is bad style:  Better
enforce the desired behaviour (== guaranteed load operation) *explicitly*.
-- 
Stefan Richter
-=====-=-=== =--- =----
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  9:54                                       ` Stefan Richter
@ 2007-08-16 10:31                                         ` Stefan Richter
  2007-08-16 10:42                                           ` Herbert Xu
  2007-08-16 10:35                                         ` Herbert Xu
  1 sibling, 1 reply; 1546+ messages in thread
From: Stefan Richter @ 2007-08-16 10:31 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

I wrote:
> Herbert Xu wrote:
>> On Thu, Aug 16, 2007 at 10:06:31AM +0200, Stefan Richter wrote:
[...]
>>> expectation that each
>>> occurrence of atomic_read actually results in a load operation, i.e. is
>>> not optimized away.
[...]
>> Can you find an actual atomic_read code snippet there that is
>> broken without the volatile modifier?

PS:  Just to clarify, I'm not speaking for the volatile modifier.  I'm
not speaking for any particular implementation of atomic_t and its
accessors at all.  All I am saying is that
  - we use atomically accessed data types because we concurrently but
    locklessly access this data,
  - hence a read access to this data that could be optimized away
    makes *no sense at all*.

The only sensible read accessor to an atomic datatype is a read accessor
that will not be optimized away.

So, the architecture guys can implement atomic_read however they want
--- as long as it cannot be optimized away.*

PPS:  If somebody has code where he can afford to let the compiler
coalesce atomic_read with a previous access to the same data, i.e.
doesn't need and doesn't want all guarantees that the atomic_read API
makes (or IMO should make), then he can replace the atomic_read by a
local temporary variable.


*) Exceptions:
	if (known_to_be_false)
		read_access(a);
and the like.
-- 
Stefan Richter
-=====-=-=== =--- =----
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  9:54                                       ` Stefan Richter
  2007-08-16 10:31                                         ` Stefan Richter
@ 2007-08-16 10:35                                         ` Herbert Xu
  1 sibling, 0 replies; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16 10:35 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Thu, Aug 16, 2007 at 11:54:44AM +0200, Stefan Richter wrote:
> 
> One example was discussed here earlier:  The for (;;) loop in
> nodemgr_host_thread.  There an msleep_interruptible implicitly acted as
> barrier (at the moment because it's in a different translation unit; if
> it were the same, then because it hopefully has own barriers).  So that
> happens to work, although such an implicit barrier is bad style:  Better
> enforce the desired behaviour (== guaranteed load operation) *explicitly*.

Hmm, it's not bad style at all.  Let's assume that everything
is in the same scope.  Such a loop must either call a function
that busy-waits, which should always have a cpu_relax or
something equivalent, or it'll call a function that schedules
away which immediately invalidates any values the compiler might
have cached for the atomic_read.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 10:31                                         ` Stefan Richter
@ 2007-08-16 10:42                                           ` Herbert Xu
  2007-08-16 16:34                                             ` Paul E. McKenney
  2007-08-17  5:04                                             ` Paul Mackerras
  0 siblings, 2 replies; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16 10:42 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Thu, Aug 16, 2007 at 12:31:03PM +0200, Stefan Richter wrote:
> 
> PS:  Just to clarify, I'm not speaking for the volatile modifier.  I'm
> not speaking for any particular implementation of atomic_t and its
> accessors at all.  All I am saying is that
>   - we use atomically accessed data types because we concurrently but
>     locklessly access this data,
>   - hence a read access to this data that could be optimized away
>     makes *no sense at all*.

No sane compiler can optimise away an atomic_read per se.
That's only possible if there's a preceding atomic_set or
atomic_read, with no barriers in the middle.

If that's the case, then one has to conclude that doing
away with the second read is acceptable, as otherwise
a memory (or at least a compiler) barrier should have been
used.

In fact, volatile doesn't guarantee that the memory gets
read anyway.  You might be reading some stale value out
of the cache.  Granted this doesn't happen on x86 but
when you're coding for the kernel you can't make such
assumptions.

So the point here is that if you don't mind getting a stale
value from the CPU cache when doing an atomic_read, then
surely you won't mind getting a stale value from the compiler
"cache".

> So, the architecture guys can implement atomic_read however they want
> --- as long as it cannot be optimized away.*

They can implement it however they want as long as it stays
atomic.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  7:09                                 ` Herbert Xu
  2007-08-16  8:06                                   ` Stefan Richter
@ 2007-08-16 14:48                                   ` Ilpo Järvinen
  2007-08-16 16:19                                     ` Stefan Richter
                                                       ` (2 more replies)
  1 sibling, 3 replies; 1546+ messages in thread
From: Ilpo Järvinen @ 2007-08-16 14:48 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, Netdev,
	Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
	wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
	segher

On Thu, 16 Aug 2007, Herbert Xu wrote:

> We've been through that already.  If it's a busy-wait it
> should use cpu_relax. 

I looked around a bit using some command lines and ended up wondering
whether these are equivalent to the busy-wait case (and should be fixed) or not:

./drivers/telephony/ixj.c
6674:   while (atomic_read(&j->DSPWrite) > 0)
6675-           atomic_dec(&j->DSPWrite);

...besides that, there are a couple more similar cases in the same file
(with braces)...


-- 
 i.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 14:48                                   ` Ilpo Järvinen
@ 2007-08-16 16:19                                     ` Stefan Richter
  2007-08-16 19:55                                     ` Chris Snook
  2007-08-16 19:55                                     ` Chris Snook
  2 siblings, 0 replies; 1546+ messages in thread
From: Stefan Richter @ 2007-08-16 16:19 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Herbert Xu, Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Ilpo Järvinen wrote:
> I looked around a bit by using some command lines and ended up wondering 
> if these are equal to busy-wait case (and should be fixed) or not:
> 
> ./drivers/telephony/ixj.c
> 6674:   while (atomic_read(&j->DSPWrite) > 0)
> 6675-           atomic_dec(&j->DSPWrite);
> 
> ...besides that, there are couple of more similar cases in the same file 
> (with braces)...

Generally, ixj.c has several occurrences of paired atomic writes and
atomic reads which potentially do not do what the author intended.
-- 
Stefan Richter
-=====-=-=== =--- =----
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 10:42                                           ` Herbert Xu
@ 2007-08-16 16:34                                             ` Paul E. McKenney
  2007-08-16 23:59                                               ` Herbert Xu
  2007-08-17  3:15                                               ` Nick Piggin
  2007-08-17  5:04                                             ` Paul Mackerras
  1 sibling, 2 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-16 16:34 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Stefan Richter, Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Thu, Aug 16, 2007 at 06:42:50PM +0800, Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 12:31:03PM +0200, Stefan Richter wrote:
> > 
> > PS:  Just to clarify, I'm not speaking for the volatile modifier.  I'm
> > not speaking for any particular implementation of atomic_t and its
> > accessors at all.  All I am saying is that
> >   - we use atomically accessed data types because we concurrently but
> >     locklessly access this data,
> >   - hence a read access to this data that could be optimized away
> >     makes *no sense at all*.
> 
> No sane compiler can optimise away an atomic_read per se.
> That's only possible if there's a preceding atomic_set or
> atomic_read, with no barriers in the middle.
> 
> If that's the case, then one has to conclude that doing
> away with the second read is acceptable, as otherwise
> a memory (or at least a compiler) barrier should have been
> used.

The compiler can also reorder non-volatile accesses.  For an example
patch that cares about this, please see:

	http://lkml.org/lkml/2007/8/7/280

This patch uses an ORDERED_WRT_IRQ() in rcu_read_lock() and
rcu_read_unlock() to ensure that accesses aren't reordered with respect
to interrupt handlers and NMIs/SMIs running on that same CPU.

> In fact, volatile doesn't guarantee that the memory gets
> read anyway.  You might be reading some stale value out
> of the cache.  Granted this doesn't happen on x86 but
> when you're coding for the kernel you can't make such
> assumptions.
> 
> So the point here is that if you don't mind getting a stale
> value from the CPU cache when doing an atomic_read, then
> surely you won't mind getting a stale value from the compiler
> "cache".

Absolutely disagree.  An interrupt/NMI/SMI handler running on the CPU
will see the same value (whether in cache or in store buffer) that
the mainline code will see.  In this case, we don't care about CPU
misordering, only about compiler misordering.  It is easy to see
other uses that combine communication with handlers on the current
CPU with communication among CPUs -- again, see prior messages in
this thread.

> > So, the architecture guys can implement atomic_read however they want
> > --- as long as it cannot be optimized away.*
> 
> They can implement it however they want as long as it stays
> atomic.

Precisely.  And volatility is a key property of "atomic".  Let's please
not throw it away.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  3:23                               ` Paul Mackerras
  2007-08-16  3:33                                 ` Herbert Xu
@ 2007-08-16 18:48                                 ` Christoph Lameter
  2007-08-16 19:44                                 ` Segher Boessenkool
  2 siblings, 0 replies; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-16 18:48 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Herbert Xu, Satyam Sharma, Paul E. McKenney, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Thu, 16 Aug 2007, Paul Mackerras wrote:

> 
> It seems that there could be a lot of places where atomic_t is used in
> a non-atomic fashion, and that those uses are either buggy, or there
> is some lock held at the time which guarantees that other CPUs aren't
> changing the value.  In both cases there is no point in using
> atomic_t; we might as well just use an ordinary int.

The point of atomic_t is to do atomic *changes* to the variable.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  6:00                                           ` Paul Mackerras
@ 2007-08-16 18:50                                             ` Christoph Lameter
  0 siblings, 0 replies; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-16 18:50 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Herbert Xu, Satyam Sharma, Paul E. McKenney, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Thu, 16 Aug 2007, Paul Mackerras wrote:

> Herbert Xu writes:
> 
> > It doesn't matter.  The memory pressure flag is an *advisory*
> > flag.  If we get it wrong the worst that'll happen is that we'd
> > waste some time doing work that'll be thrown away.
> 
> Ah, so it's the "racy but I don't care because it's only an
> optimization" case.  That's fine.  Somehow I find it hard to believe
> that all the racy uses of atomic_read in the kernel are like that,
> though. :)

My use of atomic_read in SLUB is like that. Volatile does not magically 
sync up reads somehow.


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  4:11                           ` Paul Mackerras
  2007-08-16  5:39                             ` Herbert Xu
@ 2007-08-16 18:54                             ` Christoph Lameter
  2007-08-16 20:07                               ` Paul E. McKenney
  1 sibling, 1 reply; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-16 18:54 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Satyam Sharma, Herbert Xu, Paul E. McKenney, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Thu, 16 Aug 2007, Paul Mackerras wrote:

> The uses of atomic_read where one might want it to allow caching of
> the result seem to me to fall into 3 categories:
> 
> 1. Places that are buggy because of a race arising from the way it's
>    used.
> 
> 2. Places where there is a race but it doesn't matter because we're
>    doing some clever trick.
> 
> 3. Places where there is some locking in place that eliminates any
>    potential race.
> 
> In case 1, adding volatile won't solve the race, of course, but it's
> hard to argue that we shouldn't do something because it will slow down
> buggy code.  Case 2 is hopefully pretty rare and accompanied by large
> comment blocks, and in those cases caching the result of atomic_read
> explicitly in a local variable would probably make the code clearer.
> And in case 3 there is no reason to use atomic_t at all; we might as
> well just use an int.

In 2 + 3 you may increment the atomic variable in some places. The value 
of the atomic variable may not matter because you only do optimizations.

Checking an atomic_t for a definite state has to involve either
some side conditions (lock only taken if refcount is <= 0 or so) or be done 
by changing the state (see e.g. atomic_inc_not_zero).

> So I don't see any good reason to make the atomic API more complex by
> having "volatile" and "non-volatile" versions of atomic_read.  It
> should just have the "volatile" behaviour.

If you want to make it less complex then drop volatile which causes weird 
side effects without solving any problems as you just pointed out.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  2:23               ` Nick Piggin
@ 2007-08-16 19:32                 ` Segher Boessenkool
  2007-08-17  2:19                   ` Nick Piggin
  0 siblings, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-16 19:32 UTC (permalink / raw)
  To: Nick Piggin
  Cc: heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
	akpm, torvalds, jesper.juhl, linux-arch, zlynx, satyam, clameter,
	schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang

>>>> Part of the motivation here is to fix heisenbugs.  If I knew where 
>>>> they
>>>
>>>
>>> By the same token we should probably disable optimisations
>>> altogether since that too can create heisenbugs.
>> Almost everything is a tradeoff; and so is this.  I don't
>> believe most people would find disabling all compiler
>> optimisations an acceptable price to pay for some peace
>> of mind.
>
> So why is this a good tradeoff?

It certainly is better than disabling all compiler optimisations!

> I also think that just adding things to APIs in the hope it might fix
> up some bugs isn't really a good road to go down. Where do you stop?

I look at it the other way: keeping the "volatile" semantics in
atomic_XXX() (or adding them to it, whatever) helps _prevent_ bugs;
certainly most people expect that behaviour, and also that behaviour
is *needed* in some places and no other interface provides that
functionality.


[some confusion about barriers wrt atomics snipped]


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  2:30                 ` Paul E. McKenney
@ 2007-08-16 19:33                   ` Segher Boessenkool
  0 siblings, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-16 19:33 UTC (permalink / raw)
  To: paulmck
  Cc: heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
	akpm, torvalds, jesper.juhl, linux-arch, zlynx, satyam, clameter,
	schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang

>> The only thing volatile on an asm does is create a side effect
>> on the asm statement; in effect, it tells the compiler "do not
>> remove this asm even if you don't need any of its outputs".
>>
>> It's not disabling optimisation likely to result in bugs,
>> heisen- or otherwise; _not_ putting the volatile on an asm
>> that needs it simply _is_ a bug :-)
>
> Yep.  And the reason it is a bug is that it fails to disable
> the relevant compiler optimizations.  So I suspect that we might
> actually be saying the same thing here.

We're not saying the same thing, but we do agree :-)


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  3:05                         ` Paul Mackerras
@ 2007-08-16 19:39                           ` Segher Boessenkool
  0 siblings, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-16 19:39 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
	Satyam Sharma, Linux Kernel Mailing List, Paul E. McKenney,
	netdev, ak, cfriesen, rpjday, jesper.juhl, linux-arch,
	Andrew Morton, zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
	Linus Torvalds, wensong, wjiang

>> I can't speak for this particular case, but there could be similar 
>> code
>> examples elsewhere, where we do the atomic ops on an atomic_t object
>> inside a higher-level locking scheme that would take care of the kind 
>> of
>> problem you're referring to here. It would be useful for such or 
>> similar
>> code if the compiler kept the value of that atomic object in a 
>> register.
>
> If there is a higher-level locking scheme then there is no point to
> using atomic_t variables.  Atomic_t is specifically for the situation
> where multiple CPUs are updating a variable without locking.

And don't forget about the case where it is an I/O device that is
updating the memory (in buffer descriptors or similar).  The driver
needs to do a "volatile" atomic read to get at the most recent version
of that data, which can be important for optimising latency (or throughput
even).  There is no other way the kernel can get that info -- doing an
MMIO read is way way too expensive.


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  3:23                               ` Paul Mackerras
  2007-08-16  3:33                                 ` Herbert Xu
  2007-08-16 18:48                                 ` Christoph Lameter
@ 2007-08-16 19:44                                 ` Segher Boessenkool
  2 siblings, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-16 19:44 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
	Satyam Sharma, Linux Kernel Mailing List, Paul E. McKenney,
	netdev, ak, cfriesen, rpjday, jesper.juhl, linux-arch,
	Andrew Morton, zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
	Linus Torvalds, wensong, wjiang

> I'd go so far as to say that anywhere where you want a non-"volatile"
> atomic_read, either your code is buggy, or else an int would work just
> as well.

Even, the only way to implement a "non-volatile" atomic_read() is
essentially as a plain int (you can do some tricks so you cannot
assign to the result and stuff like that, but that's not the issue
here).

So if that would be the behaviour we wanted, just get rid of that
whole atomic_read() thing, so no one can misuse it anymore.


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  8:10                                     ` Herbert Xu
  2007-08-16  9:54                                       ` Stefan Richter
@ 2007-08-16 19:48                                       ` Chris Snook
  2007-08-17  0:02                                         ` Herbert Xu
  2007-08-17  5:09                                       ` Paul Mackerras
  2 siblings, 1 reply; 1546+ messages in thread
From: Chris Snook @ 2007-08-16 19:48 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Stefan Richter, Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 10:06:31AM +0200, Stefan Richter wrote:
>>> Do you (or anyone else for that matter) have an example of this?
>> The only code I somewhat know, the ieee1394 subsystem, was perhaps
>> authored and is currently maintained with the expectation that each
>> occurrence of atomic_read actually results in a load operation, i.e. is
>> not optimized away.  This means all atomic_t (bus generation, packet and
>> buffer refcounts, and some other state variables)* and likewise all
>> atomic bitops in that subsystem.
> 
> Can you find an actual atomic_read code snippet there that is
> broken without the volatile modifier?

A whole bunch of atomic_read uses will be broken without the volatile 
modifier once we start removing barriers that aren't needed if volatile 
behavior is guaranteed.

barrier() clobbers all your registers.  volatile atomic_read() only 
clobbers one register, and more often than not it's a register you 
wanted to clobber anyway.

	-- Chris

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 14:48                                   ` Ilpo Järvinen
  2007-08-16 16:19                                     ` Stefan Richter
@ 2007-08-16 19:55                                     ` Chris Snook
  2007-08-16 20:20                                       ` Christoph Lameter
  2007-08-16 21:08                                         ` Luck, Tony
  2007-08-16 19:55                                     ` Chris Snook
  2 siblings, 2 replies; 1546+ messages in thread
From: Chris Snook @ 2007-08-16 19:55 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Herbert Xu, Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Ilpo Järvinen wrote:
> On Thu, 16 Aug 2007, Herbert Xu wrote:
> 
>> We've been through that already.  If it's a busy-wait it
>> should use cpu_relax. 
> 
> I looked around a bit by using some command lines and ended up wondering 
> if these are equal to busy-wait case (and should be fixed) or not:
> 
> ./drivers/telephony/ixj.c
> 6674:   while (atomic_read(&j->DSPWrite) > 0)
> 6675-           atomic_dec(&j->DSPWrite);
> 
> ...besides that, there are couple of more similar cases in the same file 
> (with braces)...

atomic_dec() already has volatile behavior everywhere, so this is 
semantically okay, but this code (and any like it) should be calling 
cpu_relax() each iteration through the loop, unless there's a compelling 
reason not to.  I'll allow that for some hardware drivers (possibly this 
one) such a compelling reason may exist, but hardware-independent core 
subsystems probably have no excuse.

If the maintainer of this code doesn't see a compelling reason to add 
cpu_relax() in this loop, then it should be patched.

	-- Chris

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 14:48                                   ` Ilpo Järvinen
  2007-08-16 16:19                                     ` Stefan Richter
  2007-08-16 19:55                                     ` Chris Snook
@ 2007-08-16 19:55                                     ` Chris Snook
  2 siblings, 0 replies; 1546+ messages in thread
From: Chris Snook @ 2007-08-16 19:55 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Herbert Xu, Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Ilpo Järvinen wrote:
> On Thu, 16 Aug 2007, Herbert Xu wrote:
> 
>> We've been through that already.  If it's a busy-wait it
>> should use cpu_relax. 
> 
> I looked around a bit by using some command lines and ended up wondering 
> if these are equal to busy-wait case (and should be fixed) or not:
> 
> ./drivers/telephony/ixj.c
> 6674:   while (atomic_read(&j->DSPWrite) > 0)
> 6675-           atomic_dec(&j->DSPWrite);
> 
> ...besides that, there are couple of more similar cases in the same file 
> (with braces)...

atomic_dec() already has volatile behavior everywhere, so this is 
semantically okay, but this code (and any like it) should be calling 
cpu_relax() each iteration through the loop, unless there's a compelling 
reason not to.  I'll allow that for some hardware drivers (possibly this 
one) such a compelling reason may exist, but hardware-independent core 
subsystems probably have no excuse.

If the maintainer of this code doesn't see a compelling reason not to 
add cpu_relax() in this loop, then it should be patched.

	-- Chris

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 18:54                             ` Christoph Lameter
@ 2007-08-16 20:07                               ` Paul E. McKenney
  0 siblings, 0 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-16 20:07 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul Mackerras, Satyam Sharma, Herbert Xu, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Thu, Aug 16, 2007 at 11:54:54AM -0700, Christoph Lameter wrote:
> On Thu, 16 Aug 2007, Paul Mackerras wrote:
> > So I don't see any good reason to make the atomic API more complex by
> > having "volatile" and "non-volatile" versions of atomic_read.  It
> > should just have the "volatile" behaviour.
> 
> If you want to make it less complex then drop volatile which causes weird 
> side effects without solving any problems as you just pointed out.

The other set of problems are communication between process context
and interrupt/NMI handlers.  Volatile does help here.  And the performance
impact of volatile is pretty near zero, so why have the non-volatile
variant?

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 19:55                                     ` Chris Snook
@ 2007-08-16 20:20                                       ` Christoph Lameter
  2007-08-17  1:02                                         ` Paul E. McKenney
                                                           ` (2 more replies)
  2007-08-16 21:08                                         ` Luck, Tony
  1 sibling, 3 replies; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-16 20:20 UTC (permalink / raw)
  To: Chris Snook
  Cc: Ilpo Järvinen, Herbert Xu, Paul Mackerras, Satyam Sharma,
	Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Thu, 16 Aug 2007, Chris Snook wrote:

> atomic_dec() already has volatile behavior everywhere, so this is semantically
> okay, but this code (and any like it) should be calling cpu_relax() each
> iteration through the loop, unless there's a compelling reason not to.  I'll
> allow that for some hardware drivers (possibly this one) such a compelling
> reason may exist, but hardware-independent core subsystems probably have no
> excuse.

No, it does not have any volatile semantics. atomic_dec() can be reordered 
at will by the compiler within the current basic block if you do not add a 
barrier.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  5:20           ` Satyam Sharma
  2007-08-16  5:57             ` Satyam Sharma
@ 2007-08-16 20:50             ` Segher Boessenkool
  2007-08-16 22:40               ` David Schwartz
  2007-08-17  4:24               ` Satyam Sharma
  1 sibling, 2 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-16 20:50 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
	Bill Fink, Linux Kernel Mailing List, Paul E. McKenney, netdev,
	ak, cfriesen, rpjday, jesper.juhl, linux-arch, Andrew Morton,
	zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
	Linus Torvalds, wensong, wjiang

> Note that "volatile"
> is a type-qualifier, not a type itself, so a cast of the _object_ 
> itself
> to a qualified-type i.e. (volatile int) would not make the access 
> itself
> volatile-qualified.

There is no such thing as "volatile-qualified access" defined
anywhere; there only is the concept of a "volatile-qualified
*object*".

> To serve our purposes, it is necessary for us to take the address of 
> this
> (non-volatile) object, cast the resulting _pointer_ to the 
> corresponding
> volatile-qualified pointer-type, and then dereference it. This makes 
> that
> particular _access_ be volatile-qualified, without the object itself 
> being
> such. Also note that the (dereferenced) result is also a valid lvalue 
> and
> hence can be used in "*(volatile int *)&a = b;" kind of construction
> (which we use for the atomic_set case).

There is a quite convincing argument that such an access _is_ an
access to a volatile object; see GCC PR21568 comment #9.  This
probably isn't the last word on the matter though...


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  5:57             ` Satyam Sharma
  2007-08-16  9:25               ` Satyam Sharma
@ 2007-08-16 21:00               ` Segher Boessenkool
  2007-08-17  4:32                 ` Satyam Sharma
  1 sibling, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-16 21:00 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
	Bill Fink, Linux Kernel Mailing List, Paul E. McKenney, netdev,
	ak, cfriesen, rpjday, jesper.juhl, linux-arch, Andrew Morton,
	zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
	Linus Torvalds, wensong, wjiang

> Here, I should obviously admit that the semantics of *(volatile int *)&
> aren't any neater or well-defined in the _language standard_ at all. 
> The
> standard does say (verbatim) "precisely what constitutes as access to
> object of volatile-qualified type is implementation-defined", but GCC
> does help us out here by doing the right thing.

Where do you get that idea?  GCC manual, section 6.1, "When
is a Volatile Object Accessed?" doesn't say anything of the
kind.  See PR33053 and some others.

> Honestly, given such confusion, and the propensity of the "volatile"
> type-qualifier keyword to be ill-defined (or at least poorly 
> understood,
> often inconsistently implemented), I'd (again) express my opinion that 
> it
> would be best to avoid its usage, given other alternatives do exist.

Yeah.  Or we can have an email thread like this every time
someone proposes a patch that uses an atomic variable ;-)


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 19:55                                     ` Chris Snook
@ 2007-08-16 21:08                                         ` Luck, Tony
  2007-08-16 21:08                                         ` Luck, Tony
  1 sibling, 0 replies; 1546+ messages in thread
From: Luck, Tony @ 2007-08-16 21:08 UTC (permalink / raw)
  To: Chris Snook, Ilpo Järvinen
  Cc: Herbert Xu, Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

>> 6674:   while (atomic_read(&j->DSPWrite) > 0)
>> 6675-           atomic_dec(&j->DSPWrite);
>
> If the maintainer of this code doesn't see a compelling reason to add 
> cpu_relax() in this loop, then it should be patched.

Shouldn't it be just re-written without the loop:

	if ((tmp = atomic_read(&j->DSPWrite)) > 0)
		atomic_sub(tmp, &j->DSPWrite);

Has all the same bugs, but runs much faster :-)

-Tony

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE: [PATCH 0/24] make atomic_read() behave consistently across all architectures
@ 2007-08-16 21:08                                         ` Luck, Tony
  0 siblings, 0 replies; 1546+ messages in thread
From: Luck, Tony @ 2007-08-16 21:08 UTC (permalink / raw)
  To: Chris Snook, Ilpo Järvinen
  Cc: Herbert Xu, Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

>> 6674:   while (atomic_read(&j->DSPWrite) > 0)
>> 6675-           atomic_dec(&j->DSPWrite);
>
> If the maintainer of this code doesn't see a compelling reason to add 
> cpu_relax() in this loop, then it should be patched.

Shouldn't it be just re-written without the loop:

	if ((tmp = atomic_read(&j->DSPWrite)) > 0)
		atomic_sub(tmp, &j->DSPWrite);

Has all the same bugs, but runs much faster :-)

-Tony

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 20:50             ` Segher Boessenkool
@ 2007-08-16 22:40               ` David Schwartz
  2007-08-17  4:36                 ` Satyam Sharma
  2007-08-17  4:24               ` Satyam Sharma
  1 sibling, 1 reply; 1546+ messages in thread
From: David Schwartz @ 2007-08-16 22:40 UTC (permalink / raw)
  To: Linux-Kernel@Vger. Kernel. Org


> There is a quite convincing argument that such an access _is_ an
> access to a volatile object; see GCC PR21568 comment #9.  This
> probably isn't the last word on the matter though...

I find this argument completely convincing and retract the contrary argument
that I've made many times in this forum and others. You learn something new
every day.

Just in case it wasn't clear:
int i;
*(volatile int *)&i=2;

In this case, there *is* an access to a volatile object. This is the end
result of the standard's definition of what it means to apply the
'volatile int *' cast to '&i' and then apply the '*' operator to the result
and use it as an lvalue.

C does not define the type of an object by how it is defined but by how it
is accessed!

DS



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 16:34                                             ` Paul E. McKenney
@ 2007-08-16 23:59                                               ` Herbert Xu
  2007-08-17  1:01                                                 ` Paul E. McKenney
  2007-08-17  3:15                                               ` Nick Piggin
  1 sibling, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-16 23:59 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Stefan Richter, Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Thu, Aug 16, 2007 at 09:34:41AM -0700, Paul E. McKenney wrote:
>
> The compiler can also reorder non-volatile accesses.  For an example
> patch that cares about this, please see:
> 
> 	http://lkml.org/lkml/2007/8/7/280
> 
> This patch uses an ORDERED_WRT_IRQ() in rcu_read_lock() and
> rcu_read_unlock() to ensure that accesses aren't reordered with respect
> to interrupt handlers and NMIs/SMIs running on that same CPU.

Good, finally we have some code to discuss (even though it's
not actually in the kernel yet).

First of all, I think this illustrates that what you want
here has nothing to do with atomic ops.  The ORDERED_WRT_IRQ
macro occurs a lot more times in your patch than atomic
reads/sets.  So *assuming* that it was necessary at all,
then having an ordered variant of the atomic_read/atomic_set
ops could do just as well.

However, I still don't know which atomic_read/atomic_set in
your patch would be broken if there were no volatile.  Could
you please point them out?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 19:48                                       ` Chris Snook
@ 2007-08-17  0:02                                         ` Herbert Xu
  2007-08-17  2:04                                           ` Chris Snook
  0 siblings, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-17  0:02 UTC (permalink / raw)
  To: Chris Snook
  Cc: Stefan Richter, Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Thu, Aug 16, 2007 at 03:48:54PM -0400, Chris Snook wrote:
>
> >Can you find an actual atomic_read code snippet there that is
> >broken without the volatile modifier?
> 
> A whole bunch of atomic_read uses will be broken without the volatile 
> modifier once we start removing barriers that aren't needed if volatile 
> behavior is guaranteed.

Could you please cite the file/function names so we can
see whether removing the barrier makes sense?

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 23:59                                               ` Herbert Xu
@ 2007-08-17  1:01                                                 ` Paul E. McKenney
  2007-08-17  7:39                                                   ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-17  1:01 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Stefan Richter, Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Fri, Aug 17, 2007 at 07:59:02AM +0800, Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 09:34:41AM -0700, Paul E. McKenney wrote:
> >
> > The compiler can also reorder non-volatile accesses.  For an example
> > patch that cares about this, please see:
> > 
> > 	http://lkml.org/lkml/2007/8/7/280
> > 
> > This patch uses an ORDERED_WRT_IRQ() in rcu_read_lock() and
> > rcu_read_unlock() to ensure that accesses aren't reordered with respect
> > to interrupt handlers and NMIs/SMIs running on that same CPU.
> 
> Good, finally we have some code to discuss (even though it's
> not actually in the kernel yet).

There was some earlier in this thread as well.

> First of all, I think this illustrates that what you want
> here has nothing to do with atomic ops.  The ORDERED_WRT_IRQ
> macro occurs a lot more times in your patch than atomic
> reads/sets.  So *assuming* that it was necessary at all,
> then having an ordered variant of the atomic_read/atomic_set
> ops could do just as well.

Indeed.  If I could trust atomic_read()/atomic_set() to cause the compiler
to maintain ordering, then I could just use them instead of having to
create an ORDERED_WRT_IRQ().  (Or ACCESS_ONCE(), as it is called in a
different patch.)

> However, I still don't know which atomic_read/atomic_set in
> your patch would be broken if there were no volatile.  Could
> you please point them out?

Suppose I tried replacing the ORDERED_WRT_IRQ() calls with
atomic_read() and atomic_set().  Starting with __rcu_read_lock():

o	If "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++"
	was ordered by the compiler after
	"ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1", then
	suppose an NMI/SMI happened after the rcu_read_lock_nesting but
	before the rcu_flipctr.

	Then if there was an rcu_read_lock() in the SMI/NMI
	handler (which is perfectly legal), the nested rcu_read_lock()
	would believe that it could take the then-clause of the
	enclosing "if" statement.  But because the rcu_flipctr per-CPU
	variable had not yet been incremented, an RCU updater would
	be within its rights to assume that there were no RCU reads
	in progress, thus possibly yanking a data structure out from
	under the reader in the SMI/NMI function.

	Fatal outcome.  Note that only one CPU is involved here
	because these are all either per-CPU or per-task variables.

o	If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1"
	was ordered by the compiler to follow the
	"ORDERED_WRT_IRQ(me->rcu_flipctr_idx) = idx", and an NMI/SMI
	happened between the two, then an __rcu_read_lock() in the NMI/SMI
	would incorrectly take the "else" clause of the enclosing "if"
	statement.  If some other CPU flipped the rcu_ctrlblk.completed
	in the meantime, then the __rcu_read_lock() would (correctly)
	write the new value into rcu_flipctr_idx.

	Well and good so far.  But the problem arises in
	__rcu_read_unlock(), which then decrements the wrong counter.
	Depending on exactly how subsequent events played out, this could
	result in either prematurely ending grace periods or never-ending
	grace periods, both of which are fatal outcomes.

And the following are not needed in the current version of the
patch, but will be in a future version that either avoids disabling
irqs or that dispenses with the smp_read_barrier_depends() that I
have 99% convinced myself is unneeded:

o	nesting = ORDERED_WRT_IRQ(me->rcu_read_lock_nesting);

o	idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed) & 0x1;

Furthermore, in that future version, irq handlers can cause the same
mischief that SMI/NMI handlers can in this version.

Next, looking at __rcu_read_unlock():

o	If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting - 1"
	was reordered by the compiler to follow the
	"ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])--",
	then if an NMI/SMI containing an rcu_read_lock() occurs between
	the two, this nested rcu_read_lock() would incorrectly believe
	that it was protected by an enclosing RCU read-side critical
	section as described in the first reversal discussed for
	__rcu_read_lock() above.  Again, fatal outcome.

This is what we have now.  It is not hard to imagine situations that
interact with -both- interrupt handlers -and- other CPUs, as described
earlier.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 20:20                                       ` Christoph Lameter
@ 2007-08-17  1:02                                         ` Paul E. McKenney
  2007-08-17  1:28                                           ` Herbert Xu
  2007-08-17  2:16                                         ` Paul Mackerras
  2007-08-17 17:41                                         ` Segher Boessenkool
  2 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-17  1:02 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Chris Snook, Ilpo Järvinen, Herbert Xu, Paul Mackerras,
	Satyam Sharma, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Thu, Aug 16, 2007 at 01:20:26PM -0700, Christoph Lameter wrote:
> On Thu, 16 Aug 2007, Chris Snook wrote:
> 
> > atomic_dec() already has volatile behavior everywhere, so this is semantically
> > okay, but this code (and any like it) should be calling cpu_relax() each
> > iteration through the loop, unless there's a compelling reason not to.  I'll
> > allow that for some hardware drivers (possibly this one) such a compelling
> > reason may exist, but hardware-independent core subsystems probably have no
> > excuse.
> 
> No it does not have any volatile semantics. atomic_dec() can be reordered 
> at will by the compiler within the current basic unit if you do not add a 
> barrier.

Yep.  Or you can use atomic_dec_return() instead of using a barrier.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  1:02                                         ` Paul E. McKenney
@ 2007-08-17  1:28                                           ` Herbert Xu
  2007-08-17  5:07                                             ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-17  1:28 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Christoph Lameter, Chris Snook, Ilpo Järvinen,
	Paul Mackerras, Satyam Sharma, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, Netdev,
	Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
	wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
	segher

On Thu, Aug 16, 2007 at 06:02:32PM -0700, Paul E. McKenney wrote:
> 
> Yep.  Or you can use atomic_dec_return() instead of using a barrier.

Or you could use smp_mb__{before,after}_atomic_dec.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  0:02                                         ` Herbert Xu
@ 2007-08-17  2:04                                           ` Chris Snook
  2007-08-17  2:13                                             ` Herbert Xu
  2007-08-17  2:31                                             ` Nick Piggin
  0 siblings, 2 replies; 1546+ messages in thread
From: Chris Snook @ 2007-08-17  2:04 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Stefan Richter, Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 03:48:54PM -0400, Chris Snook wrote:
>>> Can you find an actual atomic_read code snippet there that is
>>> broken without the volatile modifier?
>> A whole bunch of atomic_read uses will be broken without the volatile 
>> modifier once we start removing barriers that aren't needed if volatile 
>> behavior is guaranteed.
> 
> Could you please cite the file/function names so we can
> see whether removing the barrier makes sense?
> 
> Thanks,

At a glance, several architectures' implementations of smp_call_function() have 
one or more legitimate atomic_read() busy-waits that shouldn't be using 
cpu_relax().  Some of them do work in the loop.

I'm sure there are plenty more examples that various maintainers could find in 
their own code.

	-- Chris

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  2:04                                           ` Chris Snook
@ 2007-08-17  2:13                                             ` Herbert Xu
  2007-08-17  2:31                                             ` Nick Piggin
  1 sibling, 0 replies; 1546+ messages in thread
From: Herbert Xu @ 2007-08-17  2:13 UTC (permalink / raw)
  To: Chris Snook
  Cc: Stefan Richter, Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Thu, Aug 16, 2007 at 10:04:24PM -0400, Chris Snook wrote:
>
> >Could you please cite the file/function names so we can
> >see whether removing the barrier makes sense?
> 
> At a glance, several architectures' implementations of smp_call_function() 
> have one or more legitimate atomic_read() busy-waits that shouldn't be 
> using cpu_relax().  Some of them do work in the loop.

Care to name one so we can discuss it?

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 20:20                                       ` Christoph Lameter
  2007-08-17  1:02                                         ` Paul E. McKenney
@ 2007-08-17  2:16                                         ` Paul Mackerras
  2007-08-17  3:03                                           ` Linus Torvalds
  2007-08-17 17:41                                         ` Segher Boessenkool
  2 siblings, 1 reply; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-17  2:16 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Chris Snook, Ilpo Järvinen, Herbert Xu, Satyam Sharma,
	Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Christoph Lameter writes:

> No it does not have any volatile semantics. atomic_dec() can be reordered 
> at will by the compiler within the current basic unit if you do not add a 
> barrier.

Volatile doesn't mean it can't be reordered; volatile means the
accesses can't be eliminated.

Paul.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 19:32                 ` Segher Boessenkool
@ 2007-08-17  2:19                   ` Nick Piggin
  2007-08-17  3:16                     ` Paul Mackerras
  2007-08-17 17:37                     ` Segher Boessenkool
  0 siblings, 2 replies; 1546+ messages in thread
From: Nick Piggin @ 2007-08-17  2:19 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
	akpm, torvalds, jesper.juhl, linux-arch, zlynx, satyam, clameter,
	schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang

Segher Boessenkool wrote:
>>>>> Part of the motivation here is to fix heisenbugs.  If I knew where 
>>>>> they
>>>>
>>>>
>>>>
>>>> By the same token we should probably disable optimisations
>>>> altogether since that too can create heisenbugs.
>>>
>>> Almost everything is a tradeoff; and so is this.  I don't
>>> believe most people would find disabling all compiler
>>> optimisations an acceptable price to pay for some peace
>>> of mind.
>>
>>
>> So why is this a good tradeoff?
> 
> 
> It certainly is better than disabling all compiler optimisations!

It's easy to be better than something really stupid :)

So i386 and x86-64 don't have volatiles there, and it saves them a
few K of kernel text. What you need to justify is why it is a good
tradeoff to make them volatile (and btw, it is much harder to go
the other way after we let people make those assumptions).


>> I also think that just adding things to APIs in the hope it might fix
>> up some bugs isn't really a good road to go down. Where do you stop?
> 
> 
> I look at it the other way: keeping the "volatile" semantics in
> atomic_XXX() (or adding them to it, whatever) helps _prevent_ bugs;

Yeah, but we could add lots of things to help prevent bugs that
would never be included. I would also contend that it helps _hide_
bugs and encourages people to be lazy when thinking about these
things.

Also, you dismiss the fact that we'd actually be *adding* volatile
semantics back to the 2 most widely tested architectures (in terms
of test time, number of testers, variety of configurations, and
coverage of driver code). This is a very important difference from
just keeping volatile semantics because it is basically a one-way
API change.


> certainly most people expect that behaviour, and also that behaviour
> is *needed* in some places and no other interface provides that
> functionality.

I don't know that most people would expect that behaviour. Is there
any documentation anywhere that would suggest this?

Also, barrier() most definitely provides the required functionality.
It is overkill in some situations, but volatile is overkill in _most_
situations. If that's what you're worried about, we should add a new
ordering primitive.


> [some confusion about barriers wrt atomics snipped]

What were you confused about?

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  2:04                                           ` Chris Snook
  2007-08-17  2:13                                             ` Herbert Xu
@ 2007-08-17  2:31                                             ` Nick Piggin
  1 sibling, 0 replies; 1546+ messages in thread
From: Nick Piggin @ 2007-08-17  2:31 UTC (permalink / raw)
  To: Chris Snook
  Cc: Herbert Xu, Stefan Richter, Paul Mackerras, Satyam Sharma,
	Christoph Lameter, Paul E. McKenney, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Chris Snook wrote:
> Herbert Xu wrote:
> 
>> On Thu, Aug 16, 2007 at 03:48:54PM -0400, Chris Snook wrote:
>>
>>>> Can you find an actual atomic_read code snippet there that is
>>>> broken without the volatile modifier?
>>>
>>> A whole bunch of atomic_read uses will be broken without the volatile 
>>> modifier once we start removing barriers that aren't needed if 
>>> volatile behavior is guaranteed.
>>
>>
>> Could you please cite the file/function names so we can
>> see whether removing the barrier makes sense?
>>
>> Thanks,
> 
> 
> At a glance, several architectures' implementations of 
> smp_call_function() have one or more legitimate atomic_read() busy-waits 
> that shouldn't be using cpu_relax().  Some of them do work in the loop.

sh looks like the only one there that would be broken (and that's only
because they don't have a cpu_relax there, but it should be added anyway).
Sure, if we removed volatile from other architectures, it would be wise
to audit arch code because arch maintainers do sometimes make assumptions
about their implementation details... however we can assume most generic
code is safe without volatile.

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  2:16                                         ` Paul Mackerras
@ 2007-08-17  3:03                                           ` Linus Torvalds
  2007-08-17  3:43                                             ` Paul Mackerras
  2007-08-17 22:09                                             ` Segher Boessenkool
  0 siblings, 2 replies; 1546+ messages in thread
From: Linus Torvalds @ 2007-08-17  3:03 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Christoph Lameter, Chris Snook, Ilpo Järvinen, Herbert Xu,
	Satyam Sharma, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher



On Fri, 17 Aug 2007, Paul Mackerras wrote:
>
> Volatile doesn't mean it can't be reordered; volatile means the
> accesses can't be eliminated.

It also does limit re-ordering. 

Of course, since *normal* accesses aren't necessarily limited wrt 
re-ordering, the question then becomes one of "with regard to *what* does 
it limit re-ordering?".

A C compiler that re-orders two different volatile accesses that have a 
sequence point in between them is pretty clearly a buggy compiler. So at a 
minimum, it limits re-ordering wrt other volatiles (assuming sequence 
points exists). It also means that the compiler cannot move it 
speculatively across conditionals, but other than that it's starting to 
get fuzzy.

In general, I'd *much* rather we used barriers. Anything that "depends" on 
volatile is pretty much set up to be buggy. But I'm certainly also willing 
to have that volatile inside "atomic_read/atomic_set()" if it avoids code 
that would otherwise break - ie if it hides a bug.

		Linus

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 16:34                                             ` Paul E. McKenney
  2007-08-16 23:59                                               ` Herbert Xu
@ 2007-08-17  3:15                                               ` Nick Piggin
  2007-08-17  4:02                                                 ` Paul Mackerras
                                                                   ` (2 more replies)
  1 sibling, 3 replies; 1546+ messages in thread
From: Nick Piggin @ 2007-08-17  3:15 UTC (permalink / raw)
  To: paulmck
  Cc: Herbert Xu, Stefan Richter, Paul Mackerras, Satyam Sharma,
	Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Paul E. McKenney wrote:
> On Thu, Aug 16, 2007 at 06:42:50PM +0800, Herbert Xu wrote:

>>In fact, volatile doesn't guarantee that the memory gets
>>read anyway.  You might be reading some stale value out
>>of the cache.  Granted this doesn't happen on x86 but
>>when you're coding for the kernel you can't make such
>>assumptions.
>>
>>So the point here is that if you don't mind getting a stale
>>value from the CPU cache when doing an atomic_read, then
>>surely you won't mind getting a stale value from the compiler
>>"cache".
> 
> 
> Absolutely disagree.  An interrupt/NMI/SMI handler running on the CPU
> will see the same value (whether in cache or in store buffer) that
> the mainline code will see.  In this case, we don't care about CPU
> misordering, only about compiler misordering.  It is easy to see
> other uses that combine communication with handlers on the current
> CPU with communication among CPUs -- again, see prior messages in
> this thread.

I still don't agree with the underlying assumption that everybody
(or lots of kernel code) treats atomic accesses as volatile.

Nobody that does has managed to explain my logic problem either:
loads and stores to long and ptr have always been considered to be
atomic, test_bit is atomic; so why should atomic_t loads and stores
be another special subclass? (and yes, it is perfectly legitimate to
want a non-volatile read for a data type that you also want to do
atomic RMW operations on)

Why are people making these undocumented and just plain false
assumptions about atomic_t? If they're using lockless code (ie.
which they must be if using atomics), then they actually need to be
thinking much harder about memory ordering issues. If that is too
much for them, then they can just use locks.


>>>So, the architecture guys can implement atomic_read however they want
>>>--- as long as it cannot be optimized away.*
>>
>>They can implement it however they want as long as it stays
>>atomic.
> 
> 
> Precisely.  And volatility is a key property of "atomic".  Let's please
> not throw it away.

It isn't, though (at least not since i386 and x86-64 don't have it).
_Adding_ it is trivial, and can be done any time. Throwing it away
(ie. making the API weaker) is _hard_. So let's not add it without
really good reasons. It most definitely results in worse code
generation in practice.

I don't know why people would assume volatile of atomics. AFAIK, most
of the documentation is pretty clear that all the atomic stuff can be
reordered etc. except for those that modify and return a value.

It isn't even intuitive: `*lp = value` is like the most fundamental
atomic operation in Linux.

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  2:19                   ` Nick Piggin
@ 2007-08-17  3:16                     ` Paul Mackerras
  2007-08-17  3:32                       ` Nick Piggin
  2007-08-17  3:42                       ` Linus Torvalds
  2007-08-17 17:37                     ` Segher Boessenkool
  1 sibling, 2 replies; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-17  3:16 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Segher Boessenkool, heiko.carstens, horms, linux-kernel, rpjday,
	ak, netdev, cfriesen, akpm, torvalds, jesper.juhl, linux-arch,
	zlynx, satyam, clameter, schwidefsky, Chris Snook, Herbert Xu,
	davem, wensong, wjiang

Nick Piggin writes:

> So i386 and x86-64 don't have volatiles there, and it saves them a
> few K of kernel text. What you need to justify is why it is a good

I'm really surprised it's as much as a few K.  I tried it on powerpc
and it only saved 40 bytes (10 instructions) for a G5 config.

Paul.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  3:16                     ` Paul Mackerras
@ 2007-08-17  3:32                       ` Nick Piggin
  2007-08-17  3:50                         ` Linus Torvalds
  2007-08-17  3:42                       ` Linus Torvalds
  1 sibling, 1 reply; 1546+ messages in thread
From: Nick Piggin @ 2007-08-17  3:32 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Segher Boessenkool, heiko.carstens, horms, linux-kernel, rpjday,
	ak, netdev, cfriesen, akpm, torvalds, jesper.juhl, linux-arch,
	zlynx, satyam, clameter, schwidefsky, Chris Snook, Herbert Xu,
	davem, wensong, wjiang

Paul Mackerras wrote:
> Nick Piggin writes:
> 
> 
>>So i386 and x86-64 don't have volatiles there, and it saves them a
>>few K of kernel text. What you need to justify is why it is a good
> 
> 
> I'm really surprised it's as much as a few K.  I tried it on powerpc
> and it only saved 40 bytes (10 instructions) for a G5 config.
> 
> Paul.
> 

I'm surprised too. Numbers were from the "...use asm() like the other
atomic operations already do" thread. According to them,

   text    data     bss     dec     hex filename
3434150  249176  176128 3859454  3ae3fe atomic_normal/vmlinux
3436203  249176  176128 3861507  3aec03 atomic_volatile/vmlinux

The first one is a stock kernel, the second is with atomic_read/set
cast to volatile. gcc-4.1 -- maybe if you have an earlier gcc it
won't optimise as much?

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  3:16                     ` Paul Mackerras
  2007-08-17  3:32                       ` Nick Piggin
@ 2007-08-17  3:42                       ` Linus Torvalds
  2007-08-17  5:18                         ` Paul E. McKenney
                                           ` (4 more replies)
  1 sibling, 5 replies; 1546+ messages in thread
From: Linus Torvalds @ 2007-08-17  3:42 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Nick Piggin, Segher Boessenkool, heiko.carstens, horms,
	linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
	linux-arch, zlynx, satyam, clameter, schwidefsky, Chris Snook,
	Herbert Xu, davem, wensong, wjiang



On Fri, 17 Aug 2007, Paul Mackerras wrote:
> 
> I'm really surprised it's as much as a few K.  I tried it on powerpc
> and it only saved 40 bytes (10 instructions) for a G5 config.

One of the things that "volatile" generally screws up is a simple

	volatile int i;

	i++;

which a compiler will generally get horribly, horribly wrong.

In a reasonable world, gcc should just make that be (on x86)

	addl $1,i(%rip)

on x86-64, which is indeed what it does without the volatile. But with the 
volatile, the compiler gets really nervous, and doesn't dare do it in one 
instruction, and thus generates crap like

        movl    i(%rip), %eax
        addl    $1, %eax
        movl    %eax, i(%rip)

instead. For no good reason, except that "volatile" just doesn't have any 
good/clear semantics for the compiler, so most compilers will just make it 
be "I will not touch this access in any way, shape, or form". Including 
even trivially correct instruction optimization/combination.

This is one of the reasons why we should never use "volatile". It 
pessimises code generation for no good reason - just because compilers 
don't know what the heck it even means! 

Now, people don't do "i++" on atomics (you'd use "atomic_inc()" for that), 
but people *do* do things like

	if (atomic_read(..) <= 1)
		..

On ppc, things like that probably don't much matter. But on x86, it makes 
a *huge* difference whether you do

	movl i(%rip),%eax
	cmpl $1,%eax

or if you can just use the value directly for the operation, like this:

	cmpl $1,i(%rip)

which is again a totally obvious and totally safe optimization, but is 
(again) something that gcc doesn't dare do, since "i" is volatile.

In other words: "volatile" is a horribly horribly bad way of doing things, 
because it generates *worse*code*, for no good reason. You just don't see 
it on powerpc, because it's already a load-store architecture, so there is 
no "good code" for doing direct-to-memory operations.

		Linus

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  3:03                                           ` Linus Torvalds
@ 2007-08-17  3:43                                             ` Paul Mackerras
  2007-08-17  3:53                                               ` Herbert Xu
  2007-08-17 22:09                                             ` Segher Boessenkool
  1 sibling, 1 reply; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-17  3:43 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Lameter, Chris Snook, Ilpo Järvinen, Herbert Xu,
	Satyam Sharma, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Linus Torvalds writes:

> In general, I'd *much* rather we used barriers. Anything that "depends" on 
> volatile is pretty much set up to be buggy. But I'm certainly also willing 
> to have that volatile inside "atomic_read/atomic_set()" if it avoids code 
> that would otherwise break - ie if it hides a bug.

The cost of doing so seems to me to be well down in the noise - 44
bytes of extra kernel text on a ppc64 G5 config, and I don't believe
the extra few cycles for the occasional extra load would be measurable
(they should all hit in the L1 dcache).  I don't mind if x86[-64] have
atomic_read/set be nonvolatile and find all the missing barriers, but
for now on powerpc, I think that not having to find those missing
barriers is worth the 0.00076% increase in kernel text size.

Paul.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  3:32                       ` Nick Piggin
@ 2007-08-17  3:50                         ` Linus Torvalds
  2007-08-17 23:59                           ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Linus Torvalds @ 2007-08-17  3:50 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Paul Mackerras, Segher Boessenkool, heiko.carstens, horms,
	linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
	linux-arch, zlynx, satyam, clameter, schwidefsky, Chris Snook,
	Herbert Xu, davem, wensong, wjiang



On Fri, 17 Aug 2007, Nick Piggin wrote:
> 
> I'm surprised too. Numbers were from the "...use asm() like the other
> atomic operations already do" thread. According to them,
> 
>   text    data     bss     dec     hex filename
> 3434150  249176  176128 3859454  3ae3fe atomic_normal/vmlinux
> 3436203  249176  176128 3861507  3aec03 atomic_volatile/vmlinux
> 
> The first one is a stock kernel, the second is with atomic_read/set
> cast to volatile. gcc-4.1 -- maybe if you have an earlier gcc it
> won't optimise as much?

No, see my earlier reply. "volatile" really *is* an incredible piece of 
crap.

Just try it yourself:

	volatile int i;
	int j;

	int testme(void)
	{
	        return i <= 1;
	}

	int testme2(void)
	{
	        return j <= 1;
	}

and compile with all the optimizations you can.

I get:

	testme:
	        movl    i(%rip), %eax
	        subl    $1, %eax
	        setle   %al
	        movzbl  %al, %eax
	        ret

vs

	testme2:
	        xorl    %eax, %eax
	        cmpl    $1, j(%rip)
	        setle   %al
	        ret

(now, whether that "xorl + setle" is better than "setle + movzbl", I don't 
really know - maybe it is. But that's not the point. The point is the 
difference between

                movl    i(%rip), %eax
                subl    $1, %eax

and

                cmpl    $1, j(%rip)

and imagine this being done for *every* single volatile access.

Just do a 

	git grep atomic_read

to see how atomics are actually used. A lot of them are exactly the above 
kind of "compare against a value".

			Linus


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  3:43                                             ` Paul Mackerras
@ 2007-08-17  3:53                                               ` Herbert Xu
  2007-08-17  6:26                                                 ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-17  3:53 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Linus Torvalds, Christoph Lameter, Chris Snook, Ilpo Järvinen,
	Satyam Sharma, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Fri, Aug 17, 2007 at 01:43:27PM +1000, Paul Mackerras wrote:
>
> The cost of doing so seems to me to be well down in the noise - 44
> bytes of extra kernel text on a ppc64 G5 config, and I don't believe
> the extra few cycles for the occasional extra load would be measurable
> (they should all hit in the L1 dcache).  I don't mind if x86[-64] have
> atomic_read/set be nonvolatile and find all the missing barriers, but
> for now on powerpc, I think that not having to find those missing
> barriers is worth the 0.00076% increase in kernel text size.

BTW, the sort of missing barriers that triggered this thread
aren't that subtle.  It'll result in a simple lock-up if the
loop condition holds upon entry.  At which point it's fairly
straightforward to find the culprit.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  3:15                                               ` Nick Piggin
@ 2007-08-17  4:02                                                 ` Paul Mackerras
  2007-08-17  4:39                                                   ` Nick Piggin
  2007-08-17  7:25                                                 ` Stefan Richter
  2007-08-17 22:14                                                 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Segher Boessenkool
  2 siblings, 1 reply; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-17  4:02 UTC (permalink / raw)
  To: Nick Piggin
  Cc: paulmck, Herbert Xu, Stefan Richter, Satyam Sharma,
	Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Nick Piggin writes:

> Why are people making these undocumented and just plain false
> assumptions about atomic_t?

Well, it has only been false since December 2006.  Prior to that
atomics *were* volatile on all platforms.

> If they're using lockless code (ie.
> which they must be if using atomics), then they actually need to be
> thinking much harder about memory ordering issues.

Indeed.  I believe that most uses of atomic_read other than in polling
loops or debug printk statements are actually racy.  In some cases the
race doesn't seem to matter, but I'm sure there are cases where it
does.

> If that is too
> much for them, then they can just use locks.

Why use locks when you can just sprinkle magic fix-the-races dust (aka
atomic_t) over your code? :) :)

> > Precisely.  And volatility is a key property of "atomic".  Let's please
> > not throw it away.
> 
> It isn't, though (at least not since i386 and x86-64 don't have it).

Conceptually it is, because atomic_t is specifically for variables
which are liable to be modified by other CPUs, and volatile _means_
"liable to be changed by mechanisms outside the knowledge of the
compiler".

> _Adding_ it is trivial, and can be done any time. Throwing it away
> (ie. making the API weaker) is _hard_. So let's not add it without

Well, in one sense it's not that hard - Linus did it just 8 months ago
in commit f9e9dcb3. :)

> really good reasons. It most definitely results in worse code
> generation in practice.

0.0008% increase in kernel text size on powerpc according to my
measurement. :)

> I don't know why people would assume volatile of atomics. AFAIK, most

By making something an atomic_t you're saying "other CPUs are going to
be modifying this, so treat it specially".  It's reasonable to assume
that special treatment extends to reading and setting it.

> of the documentation is pretty clear that all the atomic stuff can be
> reordered etc. except for those that modify and return a value.

Volatility isn't primarily about reordering (though as Linus says it
does restrict reordering to some extent).

Paul.


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 20:50             ` Segher Boessenkool
  2007-08-16 22:40               ` David Schwartz
@ 2007-08-17  4:24               ` Satyam Sharma
  2007-08-17 22:34                 ` Segher Boessenkool
  1 sibling, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17  4:24 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
	Bill Fink, Linux Kernel Mailing List, Paul E. McKenney, netdev,
	ak, cfriesen, rpjday, jesper.juhl, linux-arch, Andrew Morton,
	zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
	Linus Torvalds, wensong, wjiang, davids



On Thu, 16 Aug 2007, Segher Boessenkool wrote:

> > Note that "volatile"
> > is a type-qualifier, not a type itself, so a cast of the _object_ itself
> > to a qualified-type i.e. (volatile int) would not make the access itself
> > volatile-qualified.
> 
> There is no such thing as "volatile-qualified access" defined
> anywhere; there only is the concept of a "volatile-qualified
> *object*".

Sure, "volatile-qualified access" was not some standard term I used
there. Just something to mean "an access that would make the compiler
treat the object at that memory as if it were an object with a
volatile-qualified type".

Now the second wording *IS* technically correct, but come on, it's
24 words long whereas the original one was 3 -- and hopefully anybody
reading the shorter phrase *would* have known anyway what was meant,
without having to be pedantic about it :-)


Satyam


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 21:00               ` Segher Boessenkool
@ 2007-08-17  4:32                 ` Satyam Sharma
  2007-08-17 22:38                   ` Segher Boessenkool
  0 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17  4:32 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
	Bill Fink, Linux Kernel Mailing List, Paul E. McKenney, netdev,
	ak, cfriesen, rpjday, jesper.juhl, linux-arch, Andrew Morton,
	zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
	Linus Torvalds, wensong, wjiang



On Thu, 16 Aug 2007, Segher Boessenkool wrote:

> > Here, I should obviously admit that the semantics of *(volatile int *)&
> > aren't any neater or well-defined in the _language standard_ at all. The
> > standard does say (verbatim) "precisely what constitutes as access to
> > object of volatile-qualified type is implementation-defined", but GCC
> > does help us out here by doing the right thing.
> 
> Where do you get that idea?

Try a testcase (experimentally verify).

> GCC manual, section 6.1, "When
> is a Volatile Object Accessed?" doesn't say anything of the
> kind.

True, "implementation-defined" as per the C standard _is_ supposed to mean
"unspecified behaviour where each implementation documents how the choice
is made". So ok, GCC probably isn't "documenting" this
implementation-defined behaviour as it is supposed to, but one can't
really fault them much for that.


* RE: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 22:40               ` David Schwartz
@ 2007-08-17  4:36                 ` Satyam Sharma
  0 siblings, 0 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17  4:36 UTC (permalink / raw)
  To: David Schwartz; +Cc: Linux-Kernel@Vger. Kernel. Org

[ Your mailer drops Cc: lists, munges headers,
  does all sorts of badness. Please fix that. ]


On Thu, 16 Aug 2007, David Schwartz wrote:

> 
> > There is a quite convincing argument that such an access _is_ an
> > access to a volatile object; see GCC PR21568 comment #9.  This
> > probably isn't the last word on the matter though...
> 
> I find this argument completely convincing and retract the contrary argument
> that I've made many times in this forum and others. You learn something new
> every day.
> 
> Just in case it wasn't clear:
> int i;
> *(volatile int *)&i=2;
> 
> In this case, there *is* an access to a volatile object. This is the end
> result of the standard's definition of what it means to apply the
> 'volatile int *' cast to '&i' and then apply the '*' operator to the result
> and use it as an lvalue.

True, see my last mail in this sub-thread that explains precisely this :-)


Satyam


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  4:02                                                 ` Paul Mackerras
@ 2007-08-17  4:39                                                   ` Nick Piggin
  0 siblings, 0 replies; 1546+ messages in thread
From: Nick Piggin @ 2007-08-17  4:39 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: paulmck, Herbert Xu, Stefan Richter, Satyam Sharma,
	Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Paul Mackerras wrote:
> Nick Piggin writes:
> 
> 
>>Why are people making these undocumented and just plain false
>>assumptions about atomic_t?
> 
> 
> Well, it has only been false since December 2006.  Prior to that
> atomics *were* volatile on all platforms.

Hmm, although I don't think it has ever been guaranteed by the
API documentation (I concede documentation is often not treated
as the authoritative source here, but for atomic it is actually
very good and obviously indispensable as the memory ordering
reference).


>>If they're using lockless code (ie.
>>which they must be if using atomics), then they actually need to be
>>thinking much harder about memory ordering issues.
> 
> 
> Indeed.  I believe that most uses of atomic_read other than in polling
> loops or debug printk statements are actually racy.  In some cases the
> race doesn't seem to matter, but I'm sure there are cases where it
> does.
> 
> 
>>If that is too
>>much for them, then they can just use locks.
> 
> 
> Why use locks when you can just sprinkle magic fix-the-races dust (aka
> atomic_t) over your code? :) :)

I agree with your skepticism of a lot of lockless code. But I think
a lot of the more subtle race problems will not be fixed with volatile.
The big, dumb infinite loop bugs would be fixed, but they're pretty
trivial to debug and even audit for.


>>>Precisely.  And volatility is a key property of "atomic".  Let's please
>>>not throw it away.
>>
>>It isn't, though (at least not since i386 and x86-64 don't have it).
> 
> 
> Conceptually it is, because atomic_t is specifically for variables
> which are liable to be modified by other CPUs, and volatile _means_
> "liable to be changed by mechanisms outside the knowledge of the
> compiler".

Usually that is the case, yes. But also most of the time we don't
care that it has been changed and don't mind it being reordered or
eliminated.

One of the only places we really care about that at all is for
variables that are modified by the *same* CPU.


>>_Adding_ it is trivial, and can be done any time. Throwing it away
>>(ie. making the API weaker) is _hard_. So let's not add it without
> 
> 
> Well, in one sense it's not that hard - Linus did it just 8 months ago
> in commit f9e9dcb3. :)

Well it would have been harder if the documentation also guaranteed
that atomic_read/atomic_set was ordered. Or it would have been harder
for _me_ to make such a change, anyway ;)


>>really good reasons. It most definitely results in worse code
>>generation in practice.
> 
> 
> 0.0008% increase in kernel text size on powerpc according to my
> measurement. :)

I don't think you're making a bad choice by keeping it volatile on
powerpc and waiting for others to shake out more of the bugs. You
get to fix everybody else's memory ordering bugs :)


>>I don't know why people would assume volatile of atomics. AFAIK, most
> 
> 
> By making something an atomic_t you're saying "other CPUs are going to
> be modifying this, so treat it specially".  It's reasonable to assume
> that special treatment extends to reading and setting it.

But I don't actually know what that "special treatment" is. Well
actually, I do know that operations will never result in a partial
modification being exposed. I also know that the operators that
do not modify and return are not guaranteed to have any sort of
ordering constraints.

-- 
SUSE Labs, Novell Inc.


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 10:42                                           ` Herbert Xu
  2007-08-16 16:34                                             ` Paul E. McKenney
@ 2007-08-17  5:04                                             ` Paul Mackerras
  1 sibling, 0 replies; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-17  5:04 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Stefan Richter, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Herbert Xu writes:

> So the point here is that if you don't mind getting a stale
> value from the CPU cache when doing an atomic_read, then
> surely you won't mind getting a stale value from the compiler
> "cache".

No, that particular argument is bogus, because there is a cache
coherency protocol operating to keep the CPU cache coherent with
stores from other CPUs, but there isn't any such protocol (nor should
there be) for a register used as a "cache".

(Linux requires SMP systems to keep any CPU caches coherent as far as
accesses by other CPUs are concerned.  It doesn't support any SMP
systems that are not cache-coherent as far as CPU accesses are
concerned.  It does support systems with non-cache-coherent DMA.)

Paul.


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  1:28                                           ` Herbert Xu
@ 2007-08-17  5:07                                             ` Paul E. McKenney
  0 siblings, 0 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-17  5:07 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Christoph Lameter, Chris Snook, Ilpo Järvinen,
	Paul Mackerras, Satyam Sharma, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, Netdev,
	Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
	wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
	segher

On Fri, Aug 17, 2007 at 09:28:00AM +0800, Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 06:02:32PM -0700, Paul E. McKenney wrote:
> > 
> > Yep.  Or you can use atomic_dec_return() instead of using a barrier.
> 
> Or you could use smp_mb__{before,after}_atomic_dec.

Yep.  That would be an example of a barrier, either in the
atomic_dec() itself or in the smp_mb...().

							Thanx, Paul


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16  8:10                                     ` Herbert Xu
  2007-08-16  9:54                                       ` Stefan Richter
  2007-08-16 19:48                                       ` Chris Snook
@ 2007-08-17  5:09                                       ` Paul Mackerras
  2007-08-17  5:32                                         ` Herbert Xu
  2 siblings, 1 reply; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-17  5:09 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Stefan Richter, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Herbert Xu writes:

> Can you find an actual atomic_read code snippet there that is
> broken without the volatile modifier?

There are some in arch-specific code, for example line 1073 of
arch/mips/kernel/smtc.c.  On mips, cpu_relax() is just barrier(), so
the empty loop body is ok provided that atomic_read actually does the
load each time around the loop.

Paul.


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  3:42                       ` Linus Torvalds
@ 2007-08-17  5:18                         ` Paul E. McKenney
  2007-08-17  5:56                         ` Satyam Sharma
                                           ` (3 subsequent siblings)
  4 siblings, 0 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-17  5:18 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Paul Mackerras, Nick Piggin, Segher Boessenkool, heiko.carstens,
	horms, linux-kernel, rpjday, ak, netdev, cfriesen, akpm,
	jesper.juhl, linux-arch, zlynx, satyam, clameter, schwidefsky,
	Chris Snook, Herbert Xu, davem, wensong, wjiang

On Thu, Aug 16, 2007 at 08:42:23PM -0700, Linus Torvalds wrote:
> 
> 
> On Fri, 17 Aug 2007, Paul Mackerras wrote:
> > 
> > I'm really surprised it's as much as a few K.  I tried it on powerpc
> > and it only saved 40 bytes (10 instructions) for a G5 config.
> 
> One of the things that "volatile" generally screws up is a simple
> 
> 	volatile int i;
> 
> 	i++;
> 
> which a compiler will generally get horribly, horribly wrong.
> 
> In a reasonable world, gcc should just make that be (on x86)
> 
> 	addl $1,i(%rip)
> 
> on x86-64, which is indeed what it does without the volatile. But with the 
> volatile, the compiler gets really nervous, and doesn't dare do it in one 
> instruction, and thus generates crap like
> 
>         movl    i(%rip), %eax
>         addl    $1, %eax
>         movl    %eax, i(%rip)

Blech.  Sounds like a chat with some gcc people is in order.  Will
see what I can do.

						Thanx, Paul

> instead. For no good reason, except that "volatile" just doesn't have any 
> good/clear semantics for the compiler, so most compilers will just make it 
> be "I will not touch this access in any way, shape, or form". Including 
> even trivially correct instruction optimization/combination.
> 
> This is one of the reasons why we should never use "volatile". It 
> pessimises code generation for no good reason - just because compilers 
> don't know what the heck it even means! 
> 
> Now, people don't do "i++" on atomics (you'd use "atomic_inc()" for that), 
> but people *do* do things like
> 
> 	if (atomic_read(..) <= 1)
> 		..
> 
> On ppc, things like that probably don't much matter. But on x86, it makes 
> a *huge* difference whether you do
> 
> 	movl i(%rip),%eax
> 	cmpl $1,%eax
> 
> or if you can just use the value directly for the operation, like this:
> 
> 	cmpl $1,i(%rip)
> 
> which is again a totally obvious and totally safe optimization, but is 
> (again) something that gcc doesn't dare do, since "i" is volatile.
> 
> In other words: "volatile" is a horribly horribly bad way of doing things, 
> because it generates *worse*code*, for no good reason. You just don't see 
> it on powerpc, because it's already a load-store architecture, so there is 
> no "good code" for doing direct-to-memory operations.
> 
> 		Linus
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  5:09                                       ` Paul Mackerras
@ 2007-08-17  5:32                                         ` Herbert Xu
  2007-08-17  5:41                                           ` Paul Mackerras
  0 siblings, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-17  5:32 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Stefan Richter, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Fri, Aug 17, 2007 at 03:09:57PM +1000, Paul Mackerras wrote:
> Herbert Xu writes:
> 
> > Can you find an actual atomic_read code snippet there that is
> > broken without the volatile modifier?
> 
> There are some in arch-specific code, for example line 1073 of
> arch/mips/kernel/smtc.c.  On mips, cpu_relax() is just barrier(), so
> the empty loop body is ok provided that atomic_read actually does the
> load each time around the loop.

A barrier() is all you need to force the compiler to reread
the value.

The people advocating volatile in this thread are talking
about code that doesn't use barrier()/cpu_relax().

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  5:32                                         ` Herbert Xu
@ 2007-08-17  5:41                                           ` Paul Mackerras
  2007-08-17  8:28                                             ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-17  5:41 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Stefan Richter, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Herbert Xu writes:

> On Fri, Aug 17, 2007 at 03:09:57PM +1000, Paul Mackerras wrote:
> > Herbert Xu writes:
> > 
> > > Can you find an actual atomic_read code snippet there that is
> > > broken without the volatile modifier?
> > 
> > There are some in arch-specific code, for example line 1073 of
> > arch/mips/kernel/smtc.c.  On mips, cpu_relax() is just barrier(), so
> > the empty loop body is ok provided that atomic_read actually does the
> > load each time around the loop.
> 
> A barrier() is all you need to force the compiler to reread
> the value.
> 
> The people advocating volatile in this thread are talking
> about code that doesn't use barrier()/cpu_relax().

Did you look at it?  Here it is:

	/* Someone else is initializing in parallel - let 'em finish */
	while (atomic_read(&idle_hook_initialized) < 1000)
		;

Paul.


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  3:42                       ` Linus Torvalds
  2007-08-17  5:18                         ` Paul E. McKenney
@ 2007-08-17  5:56                         ` Satyam Sharma
  2007-08-17  7:26                           ` Nick Piggin
  2007-08-17 22:49                           ` Segher Boessenkool
  2007-08-17  6:42                         ` Geert Uytterhoeven
                                           ` (2 subsequent siblings)
  4 siblings, 2 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17  5:56 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Paul Mackerras, Nick Piggin, Segher Boessenkool, heiko.carstens,
	horms, Linux Kernel Mailing List, rpjday, ak, netdev, cfriesen,
	Andrew Morton, jesper.juhl, linux-arch, zlynx, clameter,
	schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang

Hi Linus,

[ and others; I think there's a communication gap in a lot of this
  thread, and a little summary would be useful. Hence this posting. ]


On Thu, 16 Aug 2007, Linus Torvalds wrote:

> On Fri, 17 Aug 2007, Paul Mackerras wrote:
> > 
> > I'm really surprised it's as much as a few K.  I tried it on powerpc
> > and it only saved 40 bytes (10 instructions) for a G5 config.
> 
> One of the things that "volatile" generally screws up is a simple
> 
> 	volatile int i;
> 
> 	i++;
> 
> which a compiler will generally get horribly, horribly wrong.
> 
> [...] For no good reason, except that "volatile" just doesn't have any 
> good/clear semantics for the compiler, so most compilers will just make it 
> be "I will not touch this access in any way, shape, or form". Including 
> even trivially correct instruction optimization/combination.
> 
> This is one of the reasons why we should never use "volatile". It 
> pessimises code generation for no good reason - just because compilers 
> don't know what the heck it even means! 
> [...]
> In other words: "volatile" is a horribly horribly bad way of doing things, 
> because it generates *worse*code*, for no good reason. You just don't see 
> it on powerpc, because it's already a load-store architecture, so there is 
> no "good code" for doing direct-to-memory operations.


True, and I bet *everybody* on this list has long agreed that using
"volatile" to type-qualify the _declaration_ of an object itself is
horribly bad (taste-wise, code-generation-wise, and often even buggy
for situations where real CPU barriers should have been used instead).

However, the discussion on this thread (IIRC) began with only "giving
volatility semantics" to atomic ops. Now that is different, and may not
require the use of the "volatile" keyword (at least not in the declaration
of the object) itself.

Sadly, most arch's *still* do type-qualify the declaration of the
"counter" member of atomic_t as "volatile". This is probably a historic
hangover, and I suspect not yet rectified because of lethargy.

Anyway, some of the variants I can think of are:

[1]

#define atomic_read_volatile(v)				\
	({						\
		forget((v)->counter);			\
		((v)->counter);				\
	})

where:

#define forget(a)	__asm__ __volatile__ ("" :"=m" (a) :"m" (a))

[ This is exactly equivalent to using "+m" in the constraints, as recently
  explained on a GCC list somewhere, in response to the patch in my bitops
  series a few weeks back where I thought "+m" was bogus. ]

[2]

#define atomic_read_volatile(v)		(*(volatile int *)&(v)->counter)

This is something that does work. It has reasonably good semantics
guaranteed by the C standard in conjunction with how GCC currently
behaves (and how it has behaved for all supported versions). I haven't
checked if it generates much different code than the first variant above,
(it probably will generate similar code to just declaring the object
as volatile, but would still be better in terms of code-clarity and
taste, IMHO), but in any case, we should pick whichever of these variants
works for us and generates good code.

[3]

static inline int atomic_read_volatile(atomic_t *v)
{
	... arch-dependent __asm__ __volatile__ stuff ...
}

I can reasonably bet this variant would often generate worse code than
at least the variant "[1]" above.


Now, why do we even require these "volatility" semantics variants?

Note, "volatility" semantics *know* / assume that they can have a meaning
_only_ as far as the compiler is concerned, so atomic_read_volatile()
doesn't really care about reading stale values from the cache on certain
non-x86 archs, etc.

The first argument is "safety":

Use of atomic_read() (possibly in conjunction with other atomic ops) in
a lot of code out there in the kernel *assumes* the compiler will not
optimize away those ops. (which is possible given current definitions
of atomic_set and atomic_read on archs such as x86 in present code).
An additional argument that builds on this one says that by ensuring
the compiler will not elide or coalesce these ops, we could even avoid
potential heisenbugs in the future.

However, there is a counter-argument:

As Herbert Xu has often been making the point, there is *no* bug out
there involving "atomic_read" in busy-while-loops that should not have
a compiler barrier (or cpu_relax() in fact) anyway. As for non-busy-loops,
they would invariably call schedule() at some point (possibly directly)
and thus have an "implicit" compiler barrier by virtue of calling out
a function that is not in scope of the current compilation unit (although
users in sched.c itself would probably require an explicit compiler
barrier).

The second pro-volatility-in-atomic-ops argument is performance:
(surprise!)

Using a full memory clobber compiler barrier in busy loops will disqualify
optimizations for loop invariants so it probably makes sense to *only*
make the compiler forget *that* particular address of the atomic counter
object, and none other. All 3 variants above would work nicely here.

So the final idea may be to have a cpu_relax_no_barrier() variant as a
rep;nop (pause) *without* an included full memory clobber, and replace
a lot of kernel busy-while-loops out there with:

-	cpu_relax();
+	cpu_relax_no_barrier();
+	forget(x);

or may be just:

-	cpu_relax();
+	cpu_relax_no_barrier();

because the "forget" / "volatility" / specific-variable-compiler-barrier
could be made implicit inside the atomic ops themselves.

This could especially make a difference for register-rich CPUs (probably
not x86) where using a full memory clobber will disqualify a hell of a
lot of compiler optimizations for loop-invariants.

On x86 itself, cpu_relax_no_barrier() could be:

#define cpu_relax_no_barrier()	__asm__ __volatile__ ("rep;nop":::);

and still continue to do its job as it is doing presently.

However, there is still a counter-argument:

As Herbert Xu and Christoph Lameter have often been saying, giving
"volatility" semantics to the atomic ops will disqualify compiler
optimizations such as eliding / coalescing of atomic ops, etc, and
probably some sections of code in the kernel (Christoph mentioned code
in SLUB, and I saw such code in sched) benefit from such optimizations.

Paul Mackerras has, otoh, mentioned that a lot of such places probably
don't need (or shouldn't use) atomic ops in the first place.
Alternatively, such callsites should probably just cache the atomic_read
in a local variable (which compiler could just as well make a register)
explicitly, and repeating atomic_read() isn't really necessary.

There could still be legitimate uses of atomic ops that don't care about
them being elided / coalesced, but given the loop-invariant-optimization
benefit, personally, I do see some benefit in the use of defining atomic
ops variants with "volatility" semantics (for only the atomic counter
object) but also having a non-volatile atomic ops API side-by-side for
performance critical users (probably sched, slub) that may require that.

Possibly, one of the two APIs above could turn out to be redundant, but
that's still very much the issue of debate presently.


Satyam

[ Sorry if I missed anything important, but this thread has been long
  and noisy, although I've tried to keep up ... ]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  3:53                                               ` Herbert Xu
@ 2007-08-17  6:26                                                 ` Satyam Sharma
  2007-08-17  8:38                                                   ` Nick Piggin
  0 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17  6:26 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Paul Mackerras, Linus Torvalds, Christoph Lameter, Chris Snook,
	Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher



On Fri, 17 Aug 2007, Herbert Xu wrote:

> On Fri, Aug 17, 2007 at 01:43:27PM +1000, Paul Mackerras wrote:
> >
> > The cost of doing so seems to me to be well down in the noise - 44
> > bytes of extra kernel text on a ppc64 G5 config, and I don't believe
> > the extra few cycles for the occasional extra load would be measurable
> > (they should all hit in the L1 dcache).  I don't mind if x86[-64] have
> > atomic_read/set be nonvolatile and find all the missing barriers, but
> > for now on powerpc, I think that not having to find those missing
> > barriers is worth the 0.00076% increase in kernel text size.
> 
> BTW, the sort of missing barriers that triggered this thread
> aren't that subtle.  It'll result in a simple lock-up if the
> loop condition holds upon entry.  At which point it's fairly
> straightforward to find the culprit.

Not necessarily. A barrier-less buggy code such as below:

	atomic_set(&v, 0);

	... /* some initial code */

	while (atomic_read(&v))
		;

	... /* code that MUST NOT be executed unless v becomes non-zero */

(where v->counter has no volatile access semantics)

could be generated by the compiler to simply *elide* or *do away* with
the loop itself, thereby making the:

"/* code that MUST NOT be executed unless v becomes non-zero */"

to be executed even when v is zero! That is subtle indeed, and causes
no hard lockups.

Granted, the above IS buggy code. But, the stated objective is to avoid
heisenbugs. And we have driver / subsystem maintainers such as Stefan
coming up and admitting that often a lot of code that's written to use
atomic_read() does assume the read will not be elided by the compiler.

See, I agree, "volatility" semantics != what we often want. However, if
what we want is compiler barrier, for only the object under consideration,
"volatility" semantics aren't really "nonsensical" or anything.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  3:42                       ` Linus Torvalds
  2007-08-17  5:18                         ` Paul E. McKenney
  2007-08-17  5:56                         ` Satyam Sharma
@ 2007-08-17  6:42                         ` Geert Uytterhoeven
  2007-08-17  8:52                         ` Andi Kleen
  2007-08-17 22:29                         ` Segher Boessenkool
  4 siblings, 0 replies; 1546+ messages in thread
From: Geert Uytterhoeven @ 2007-08-17  6:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Paul Mackerras, Nick Piggin, Segher Boessenkool, heiko.carstens,
	horms, linux-kernel, rpjday, ak, netdev, cfriesen, akpm,
	jesper.juhl, linux-arch, zlynx, satyam, clameter, schwidefsky,
	Chris Snook, Herbert Xu, davem, wensong, wjiang

On Thu, 16 Aug 2007, Linus Torvalds wrote:
> On Fri, 17 Aug 2007, Paul Mackerras wrote:
> > I'm really surprised it's as much as a few K.  I tried it on powerpc
> > and it only saved 40 bytes (10 instructions) for a G5 config.
> 
> One of the things that "volatile" generally screws up is a simple
> 
> 	volatile int i;
> 
> 	i++;
> 
> which a compiler will generally get horribly, horribly wrong.
> 
> In a reasonable world, gcc should just make that be (on x86)
> 
> 	addl $1,i(%rip)
> 
> on x86-64, which is indeed what it does without the volatile. But with the 
> volatile, the compiler gets really nervous, and doesn't dare do it in one 
> instruction, and thus generates crap like
> 
>         movl    i(%rip), %eax
>         addl    $1, %eax
>         movl    %eax, i(%rip)
> 
> instead. For no good reason, except that "volatile" just doesn't have any 
> good/clear semantics for the compiler, so most compilers will just make it 
> be "I will not touch this access in any way, shape, or form". Including 
> even trivially correct instruction optimization/combination.

Apart from having to fetch more bytes for the instructions (which does
matter), execution time is probably the same on modern processors, as they
convert the single instruction to RISC-style load, modify, store anyway.

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  3:15                                               ` Nick Piggin
  2007-08-17  4:02                                                 ` Paul Mackerras
@ 2007-08-17  7:25                                                 ` Stefan Richter
  2007-08-17  8:06                                                   ` Nick Piggin
  2007-08-17 22:14                                                 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Segher Boessenkool
  2 siblings, 1 reply; 1546+ messages in thread
From: Stefan Richter @ 2007-08-17  7:25 UTC (permalink / raw)
  To: Nick Piggin
  Cc: paulmck, Herbert Xu, Paul Mackerras, Satyam Sharma,
	Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Nick Piggin wrote:
> I don't know why people would assume volatile of atomics. AFAIK, most
> of the documentation is pretty clear that all the atomic stuff can be
> reordered etc. except for those that modify and return a value.

Which documentation is there?

For driver authors, there is LDD3.  It doesn't specifically cover
effects of optimization on accesses to atomic_t.

For architecture port authors, there is Documentation/atomic_ops.txt.
Driver authors also can learn something from that document, as it
indirectly documents the atomic_t and bitops APIs.

Prompted by this thread, I reread this document, and indeed, the
sentence "Unlike the above routines, it is required that explicit memory
barriers are performed before and after [atomic_{inc,dec}_return]"
indicates that atomic_read (one of the "above routines") is very
different from all other atomic_t accessors that return values.

This is strange.  Why is it that atomic_read stands out that way?  IMO
this API imbalance is quite unexpected by many people.  Wouldn't it be
beneficial to change the atomic_read API to behave the same like all
other atomic_t accessors that return values?

OK, it is also different from the other accessors that return data in so
far as it doesn't modify the data.  But as driver "author", i.e. user of
the API, I can't see much use of an atomic_read that can be reordered
and, more importantly, can be optimized away by the compiler.  Sure, now
that I learned of these properties I can start to audit code and insert
barriers where I believe they are needed, but this simply means that
almost all occurrences of atomic_read will get barriers (unless there
already are implicit but more or less obvious barriers like msleep).
-- 
Stefan Richter
-=====-=-=== =--- =---=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  5:56                         ` Satyam Sharma
@ 2007-08-17  7:26                           ` Nick Piggin
  2007-08-17  8:47                             ` Satyam Sharma
  2007-08-17 22:49                           ` Segher Boessenkool
  1 sibling, 1 reply; 1546+ messages in thread
From: Nick Piggin @ 2007-08-17  7:26 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Linus Torvalds, Paul Mackerras, Segher Boessenkool,
	heiko.carstens, horms, Linux Kernel Mailing List, rpjday, ak,
	netdev, cfriesen, Andrew Morton, jesper.juhl, linux-arch, zlynx,
	clameter, schwidefsky, Chris Snook, Herbert Xu, davem, wensong,
	wjiang

Satyam Sharma wrote:

> #define atomic_read_volatile(v)				\
> 	({						\
> 		forget((v)->counter);			\
> 		((v)->counter);				\
> 	})
> 
> where:

*vomit* :)

Not only do I hate the keyword volatile, but the barrier is only a
one-sided affair so its probable this is going to have slightly
different allowed reorderings than a real volatile access.

Also, why would you want to make these insane accessors for atomic_t
types? Just make sure everybody knows the basics of barriers, and they
can apply that knowledge to atomic_t and all other lockless memory
accesses as well.


> #define forget(a)	__asm__ __volatile__ ("" :"=m" (a) :"m" (a))

I like order(x) better, but it's not the most perfect name either.

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  1:01                                                 ` Paul E. McKenney
@ 2007-08-17  7:39                                                   ` Satyam Sharma
  2007-08-17 14:31                                                     ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17  7:39 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Herbert Xu, Stefan Richter, Paul Mackerras, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher



On Thu, 16 Aug 2007, Paul E. McKenney wrote:

> On Fri, Aug 17, 2007 at 07:59:02AM +0800, Herbert Xu wrote:
> > On Thu, Aug 16, 2007 at 09:34:41AM -0700, Paul E. McKenney wrote:
> > >
> > > The compiler can also reorder non-volatile accesses.  For an example
> > > patch that cares about this, please see:
> > > 
> > > 	http://lkml.org/lkml/2007/8/7/280
> > > 
> > > This patch uses an ORDERED_WRT_IRQ() in rcu_read_lock() and
> > > rcu_read_unlock() to ensure that accesses aren't reordered with respect
> > > to interrupt handlers and NMIs/SMIs running on that same CPU.
> > 
> > Good, finally we have some code to discuss (even though it's
> > not actually in the kernel yet).
> 
> There was some earlier in this thread as well.

Hmm, I never quite got what all this interrupt/NMI/SMI handling and
RCU business you mentioned earlier was all about, but now that you've
pointed to the actual code and issues with it ...


> > First of all, I think this illustrates that what you want
> > here has nothing to do with atomic ops.  The ORDERED_WRT_IRQ
> > macro occurs a lot more times in your patch than atomic
> > reads/sets.  So *assuming* that it was necessary at all,
> > then having an ordered variant of the atomic_read/atomic_set
> > ops could do just as well.
> 
> Indeed.  If I could trust atomic_read()/atomic_set() to cause the compiler
> to maintain ordering, then I could just use them instead of having to
> create an  ORDERED_WRT_IRQ().  (Or ACCESS_ONCE(), as it is called in a
> different patch.)

+#define WHATEVER(x)	(*(volatile typeof(x) *)&(x))

I suppose one could want volatile access semantics for stuff that's
a bit-field too, no?

Also, this gives *zero* "re-ordering" guarantees that your code wants
(as you've explained it below) -- neither w.r.t. CPU re-ordering (which
probably you don't care about) *nor* w.r.t. compiler re-ordering
(which you definitely _do_ care about).


> > However, I still don't know which atomic_read/atomic_set in
> > your patch would be broken if there were no volatile.  Could
> > you please point them out?
> 
> Suppose I tried replacing the ORDERED_WRT_IRQ() calls with
> atomic_read() and atomic_set().  Starting with __rcu_read_lock():
> 
> o	If "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++"
> 	was ordered by the compiler after
> 	"ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1", then
> 	suppose an NMI/SMI happened after the rcu_read_lock_nesting but
> 	before the rcu_flipctr.
> 
> 	Then if there was an rcu_read_lock() in the SMI/NMI
> 	handler (which is perfectly legal), the nested rcu_read_lock()
> 	would believe that it could take the then-clause of the
> 	enclosing "if" statement.  But because the rcu_flipctr per-CPU
> 	variable had not yet been incremented, an RCU updater would
> 	be within its rights to assume that there were no RCU reads
> 	in progress, thus possibly yanking a data structure out from
> 	under the reader in the SMI/NMI function.
> 
> 	Fatal outcome.  Note that only one CPU is involved here
> 	because these are all either per-CPU or per-task variables.

Ok, so you don't care about CPU re-ordering. Still, I should let you know
that your ORDERED_WRT_IRQ() -- bad name, btw -- is still buggy. What you
want is a full compiler optimization barrier().

[ Your code probably works now, and emits correct code, but that's
  just because of gcc did what it did. Nothing in any standard,
  or in any documented behaviour of gcc, or anything about the real
  (or expected) semantics of "volatile" is protecting the code here. ]


> o	If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1"
> 	was ordered by the compiler to follow the
> 	"ORDERED_WRT_IRQ(me->rcu_flipctr_idx) = idx", and an NMI/SMI
> 	happened between the two, then an __rcu_read_lock() in the NMI/SMI
> 	would incorrectly take the "else" clause of the enclosing "if"
> 	statement.  If some other CPU flipped the rcu_ctrlblk.completed
> 	in the meantime, then the __rcu_read_lock() would (correctly)
> 	write the new value into rcu_flipctr_idx.
> 
> 	Well and good so far.  But the problem arises in
> 	__rcu_read_unlock(), which then decrements the wrong counter.
> 	Depending on exactly how subsequent events played out, this could
> 	result in either prematurely ending grace periods or never-ending
> 	grace periods, both of which are fatal outcomes.
> 
> And the following are not needed in the current version of the
> patch, but will be in a future version that either avoids disabling
> irqs or that dispenses with the smp_read_barrier_depends() that I
> have 99% convinced myself is unneeded:
> 
> o	nesting = ORDERED_WRT_IRQ(me->rcu_read_lock_nesting);
> 
> o	idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed) & 0x1;
> 
> Furthermore, in that future version, irq handlers can cause the same
> mischief that SMI/NMI handlers can in this version.
> 
> Next, looking at __rcu_read_unlock():
> 
> o	If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting - 1"
> 	was reordered by the compiler to follow the
> 	"ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])--",
> 	then if an NMI/SMI containing an rcu_read_lock() occurs between
> 	the two, this nested rcu_read_lock() would incorrectly believe
> 	that it was protected by an enclosing RCU read-side critical
> 	section as described in the first reversal discussed for
> 	__rcu_read_lock() above.  Again, fatal outcome.
> 
> This is what we have now.  It is not hard to imagine situations that
> interact with -both- interrupt handlers -and- other CPUs, as described
> earlier.

It's not about interrupt/SMI/NMI handlers at all! What you clearly want,
simply put, is that a certain stream of C statements must be emitted
by the compiler _as they are_ with no re-ordering optimizations! You must
*definitely* use barrier(), IMHO.


Satyam

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  7:25                                                 ` Stefan Richter
@ 2007-08-17  8:06                                                   ` Nick Piggin
  2007-08-17  8:58                                                     ` Satyam Sharma
                                                                       ` (2 more replies)
  0 siblings, 3 replies; 1546+ messages in thread
From: Nick Piggin @ 2007-08-17  8:06 UTC (permalink / raw)
  To: Stefan Richter
  Cc: paulmck, Herbert Xu, Paul Mackerras, Satyam Sharma,
	Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Stefan Richter wrote:
> Nick Piggin wrote:
> 
>>I don't know why people would assume volatile of atomics. AFAIK, most
>>of the documentation is pretty clear that all the atomic stuff can be
>>reordered etc. except for those that modify and return a value.
> 
> 
> Which documentation is there?

Documentation/atomic_ops.txt


> For driver authors, there is LDD3.  It doesn't specifically cover
> effects of optimization on accesses to atomic_t.
> 
> For architecture port authors, there is Documentation/atomic_ops.txt.
> Driver authors also can learn something from that document, as it
> indirectly documents the atomic_t and bitops APIs.
>

"Semantics and Behavior of Atomic and Bitmask Operations" is
pretty direct :)

Sure, it says that it's for arch maintainers, but there is no
reason why users can't make use of it.


> Prompted by this thread, I reread this document, and indeed, the
> sentence "Unlike the above routines, it is required that explicit memory
> barriers are performed before and after [atomic_{inc,dec}_return]"
> indicates that atomic_read (one of the "above routines") is very
> different from all other atomic_t accessors that return values.
> 
> This is strange.  Why is it that atomic_read stands out that way?  IMO

It is not just atomic_read of course. It is atomic_add,sub,inc,dec,set.


> this API imbalance is quite unexpected by many people.  Wouldn't it be
> beneficial to change the atomic_read API to behave the same like all
> other atomic_t accessors that return values?

It is very consistent and well defined. Operations which both modify
the data _and_ return something are defined to have full barriers
before and after.

What do you want to add to the other atomic accessors? Full memory
barriers? Only compiler barriers? It's quite likely that if you think
some barriers will fix bugs, then there are other bugs lurking there
anyway.

Just use spinlocks if you're not absolutely clear about potential
races and memory ordering issues -- they're pretty cheap and simple.


> OK, it is also different from the other accessors that return data in so
> far as it doesn't modify the data.  But as driver "author", i.e. user of
> the API, I can't see much use of an atomic_read that can be reordered
> and, more importantly, can be optimized away by the compiler.

It will return to you an atomic snapshot of the data (loaded from
memory at some point since the last compiler barrier). All you have
to be aware of is compiler barriers and the Linux SMP memory ordering
model, which should be a given if you are writing lockless code.


> Sure, now
> that I learned of these properties I can start to audit code and insert
> barriers where I believe they are needed, but this simply means that
> almost all occurrences of atomic_read will get barriers (unless there
> already are implicit but more or less obvious barriers like msleep).

You might find that these places that appear to need barriers are
buggy for other reasons anyway. Can you point to some in-tree code
we can have a look at?

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  5:41                                           ` Paul Mackerras
@ 2007-08-17  8:28                                             ` Satyam Sharma
  0 siblings, 0 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17  8:28 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Herbert Xu, Stefan Richter, Christoph Lameter, Paul E. McKenney,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher



On Fri, 17 Aug 2007, Paul Mackerras wrote:

> Herbert Xu writes:
> 
> > On Fri, Aug 17, 2007 at 03:09:57PM +1000, Paul Mackerras wrote:
> > > Herbert Xu writes:
> > > 
> > > > Can you find an actual atomic_read code snippet there that is
> > > > broken without the volatile modifier?
> > > 
> > > There are some in arch-specific code, for example line 1073 of
> > > arch/mips/kernel/smtc.c.  On mips, cpu_relax() is just barrier(), so
> > > the empty loop body is ok provided that atomic_read actually does the
> > > load each time around the loop.
> > 
> > A barrier() is all you need to force the compiler to reread
> > the value.
> > 
> > The people advocating volatile in this thread are talking
> > about code that doesn't use barrier()/cpu_relax().
> 
> Did you look at it?  Here it is:
> 
> 	/* Someone else is initializing in parallel - let 'em finish */
> 	while (atomic_read(&idle_hook_initialized) < 1000)
> 		;


Honestly, this thread is suffering from HUGE communication gaps.

What Herbert (obviously) meant there was that "this loop could've
been okay _without_ using volatile-semantics-atomic_read() also, if
only it used cpu_relax()".

That does work, because cpu_relax() is _at least_ barrier() on all
archs (on some it also emits some arch-dependent "pause" kind of
instruction).

Now, saying that "MIPS does not have such an instruction so I won't
use cpu_relax() for arch-dependent-busy-while-loops in arch/mips/"
sounds like a wrong argument, because: tomorrow, such archs _may_
introduce such an instruction, so naturally, at that time we'd
change cpu_relax() appropriately (in reality, we would actually
*re-define* cpu_relax() and ensure that the correct version gets
pulled in depending on whether the callsite code is legacy or only
for the newer such CPUs of said arch, whatever), but loops such as
this would remain un-changed, because they never used cpu_relax()!

OTOH an argument that said the following would've made a stronger case:

"I don't want to use cpu_relax() because that's a full memory
clobber barrier() and I have loop-invariants / other variables
around in that code that I *don't* want the compiler to forget
just because it used cpu_relax(), and hence I will not use
cpu_relax() but instead make my atomic_read() itself have
"volatility" semantics. Not just that, but I will introduce a
cpu_relax_no_barrier() on MIPS, that would be a no-op #define
for now, but which may not be so forever, and continue to use
that in such busy loops."

In general, please read the thread-summary I've tried to do at:
http://lkml.org/lkml/2007/8/17/25
Feel free to continue / comment / correct stuff from there, there's
too much confusion and circular-arguments happening on this thread
otherwise.

[ I might've made an incorrect statement there about
  "volatile" w.r.t. cache on non-x86 archs, I think. ]


Satyam

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  6:26                                                 ` Satyam Sharma
@ 2007-08-17  8:38                                                   ` Nick Piggin
  2007-08-17  9:14                                                     ` Satyam Sharma
  2007-08-17 11:08                                                     ` Stefan Richter
  0 siblings, 2 replies; 1546+ messages in thread
From: Nick Piggin @ 2007-08-17  8:38 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Herbert Xu, Paul Mackerras, Linus Torvalds, Christoph Lameter,
	Chris Snook, Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Satyam Sharma wrote:
> 
> On Fri, 17 Aug 2007, Herbert Xu wrote:
> 
> 
>>On Fri, Aug 17, 2007 at 01:43:27PM +1000, Paul Mackerras wrote:
>>
>>BTW, the sort of missing barriers that triggered this thread
>>aren't that subtle.  It'll result in a simple lock-up if the
>>loop condition holds upon entry.  At which point it's fairly
>>straightforward to find the culprit.
> 
> 
> Not necessarily. A barrier-less buggy code such as below:
> 
> 	atomic_set(&v, 0);
> 
> 	... /* some initial code */
> 
> 	while (atomic_read(&v))
> 		;
> 
> 	... /* code that MUST NOT be executed unless v becomes non-zero */
> 
> (where v->counter has no volatile access semantics)
> 
> could be generated by the compiler to simply *elide* or *do away* with
> the loop itself, thereby making the:
> 
> "/* code that MUST NOT be executed unless v becomes non-zero */"
> 
> to be executed even when v is zero! That is subtle indeed, and causes
> no hard lockups.

Then I presume you mean

while (!atomic_read(&v))
     ;

Which is just the same old infinite loop bug solved with cpu_relax().
These are pretty trivial to audit and fix, and also to debug, I would
think.


> Granted, the above IS buggy code. But, the stated objective is to avoid
> heisenbugs.

Anyway, why are you making up code snippets that are buggy in other
ways in order to support this assertion being made that lots of kernel
code supposedly depends on volatile semantics. Just reference the
actual code.


> And we have driver / subsystem maintainers such as Stefan
> coming up and admitting that often a lot of code that's written to use
> atomic_read() does assume the read will not be elided by the compiler.

So these are broken on i386 and x86-64?

Are they definitely safe on SMP and weakly ordered machines with
just a simple compiler barrier there? Because I would not be
surprised if there are a lot of developers who don't really know
what to assume when it comes to memory ordering issues.

This is not a dig at driver writers: we still have memory ordering
problems in the VM too (and probably most of the subtle bugs in
lockless VM code are memory ordering ones). Let's not make up a
false sense of security and hope that sprinkling volatile around
will allow people to write bug-free lockless code. If a writer
can't be bothered reading API documentation and learning the Linux
memory model, they can still be productive writing safely locked
code.

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  7:26                           ` Nick Piggin
@ 2007-08-17  8:47                             ` Satyam Sharma
  2007-08-17  9:15                               ` Nick Piggin
  2007-08-17  9:48                               ` Paul Mackerras
  0 siblings, 2 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17  8:47 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Linus Torvalds, Paul Mackerras, Segher Boessenkool,
	heiko.carstens, horms, Linux Kernel Mailing List, rpjday, ak,
	netdev, cfriesen, Andrew Morton, jesper.juhl, linux-arch, zlynx,
	clameter, schwidefsky, Chris Snook, Herbert Xu, davem, wensong,
	wjiang



On Fri, 17 Aug 2007, Nick Piggin wrote:

> Satyam Sharma wrote:
> 
> > #define atomic_read_volatile(v)				\
> > 	({						\
> > 		forget((v)->counter);			\
> > 		((v)->counter);				\
> > 	})
> > 
> > where:
> 
> *vomit* :)

I wonder if this'll generate smaller and better code than _both_ the
other atomic_read_volatile() variants. Would need to build allyesconfig
on lots of diff arch's etc to test the theory though.


> Not only do I hate the keyword volatile, but the barrier is only a
> one-sided affair so its probable this is going to have slightly
> different allowed reorderings than a real volatile access.

True ...


> Also, why would you want to make these insane accessors for atomic_t
> types? Just make sure everybody knows the basics of barriers, and they
> can apply that knowledge to atomic_t and all other lockless memory
> accesses as well.

Code that looks like:

	while (!atomic_read(&v)) {
		...
		cpu_relax_no_barrier();
		forget(v.counter);
		        ^^^^^^^^
	}

would be uglier. Also think about code such as:

	a = atomic_read();
	if (!a)
		do_something();

	forget();
	a = atomic_read();
	... /* some code that depends on value of a, obviously */

	forget();
	a = atomic_read();
	...

So much explicit sprinkling of "forget()" looks ugly.

	atomic_read_volatile()

on the other hand, looks neater. The "_volatile()" suffix makes it also
no less explicit than an explicit barrier-like macro that this primitive
is something "special", for code clarity purposes.


> > #define forget(a)	__asm__ __volatile__ ("" :"=m" (a) :"m" (a))
> 
> I like order(x) better, but it's not the most perfect name either.

forget(x) is just a stupid-placeholder-for-a-better-name. order(x) sounds
good but we could leave quibbling about function or macro names for later,
this thread is noisy as it is :-)

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  3:42                       ` Linus Torvalds
                                           ` (2 preceding siblings ...)
  2007-08-17  6:42                         ` Geert Uytterhoeven
@ 2007-08-17  8:52                         ` Andi Kleen
  2007-08-17 10:08                           ` Satyam Sharma
  2007-08-17 22:29                         ` Segher Boessenkool
  4 siblings, 1 reply; 1546+ messages in thread
From: Andi Kleen @ 2007-08-17  8:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Paul Mackerras, Nick Piggin, Segher Boessenkool, heiko.carstens,
	horms, linux-kernel, rpjday, netdev, cfriesen, akpm, jesper.juhl,
	linux-arch, zlynx, satyam, clameter, schwidefsky, Chris Snook,
	Herbert Xu, davem, wensong, wjiang

On Friday 17 August 2007 05:42, Linus Torvalds wrote:
> On Fri, 17 Aug 2007, Paul Mackerras wrote:
> > I'm really surprised it's as much as a few K.  I tried it on powerpc
> > and it only saved 40 bytes (10 instructions) for a G5 config.
>
> One of the things that "volatile" generally screws up is a simple
>
> 	volatile int i;
>
> 	i++;

But for atomic_t people use atomic_inc() anyways which does this correctly.
It shouldn't really matter for atomic_t.

I'm worrying a bit that the volatile atomic_t change caused subtle code 
breakage like these delay read loops people here pointed out.
Wouldn't it be safer to just re-add the volatile to atomic_read() 
for 2.6.23? Or alternatively make it asm(), but volatile seems more
proven.

-Andi

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  8:06                                                   ` Nick Piggin
@ 2007-08-17  8:58                                                     ` Satyam Sharma
  2007-08-17  9:15                                                       ` Nick Piggin
  2007-08-17 10:48                                                     ` Stefan Richter
  2007-08-18 14:35                                                     ` LDD3 pitfalls (was Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures) Stefan Richter
  2 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17  8:58 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Stefan Richter, paulmck, Herbert Xu, Paul Mackerras,
	Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher



On Fri, 17 Aug 2007, Nick Piggin wrote:

> Stefan Richter wrote:
> [...]
> Just use spinlocks if you're not absolutely clear about potential
> races and memory ordering issues -- they're pretty cheap and simple.

I fully agree with this. As Paul Mackerras mentioned elsewhere,
a lot of authors sprinkle atomic_t in code thinking they're somehow
done with *locking*. This is sad, and I wonder if it's time for a
Documentation/atomic-considered-dodgy.txt kind of document :-)


> > Sure, now
> > that I learned of these properties I can start to audit code and insert
> > barriers where I believe they are needed, but this simply means that
> > almost all occurrences of atomic_read will get barriers (unless there
> > already are implicit but more or less obvious barriers like msleep).
> 
> You might find that these places that appear to need barriers are
> buggy for other reasons anyway. Can you point to some in-tree code
> we can have a look at?

Such code was mentioned elsewhere (query nodemgr_host_thread in cscope)
that managed to escape the requirement for a barrier only because of
some completely un-obvious compilation-unit-scope thing. But I find such
a non-explicit barrier in quite bad taste. Stefan, do consider plunking an
explicit call to barrier() there.


Satyam

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  8:38                                                   ` Nick Piggin
@ 2007-08-17  9:14                                                     ` Satyam Sharma
  2007-08-17  9:31                                                       ` Nick Piggin
  2007-08-17 11:08                                                     ` Stefan Richter
  1 sibling, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17  9:14 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Herbert Xu, Paul Mackerras, Linus Torvalds, Christoph Lameter,
	Chris Snook, Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher



On Fri, 17 Aug 2007, Nick Piggin wrote:

> Satyam Sharma wrote:
> [...]
> > Granted, the above IS buggy code. But, the stated objective is to avoid
> > heisenbugs.
    ^^^^^^^^^^

> Anyway, why are you making up code snippets that are buggy in other
> ways in order to support this assertion being made that lots of kernel
> code supposedly depends on volatile semantics. Just reference the
> actual code.

Because the point is *not* about existing bugs in kernel code. At some
point Chris Snook (who started this thread) did write that "If I knew
of the existing bugs in the kernel, I would be sending patches for them,
not this series" or something to that effect.

The point is about *author expectations*. If people do expect atomic_read()
(or a variant thereof) to have volatile semantics, why not give them such
a variant?

And by the way, the point is *also* about the fact that cpu_relax(), as
of today, implies a full memory clobber, which is not what a lot of such
loops want. (due to stuff mentioned elsewhere, summarized in that summary)


> > And we have driver / subsystem maintainers such as Stefan
> > coming up and admitting that often a lot of code that's written to use
> > atomic_read() does assume the read will not be elided by the compiler.
                                                             ^^^^^^^^^^^^^

(so it's about compiler barrier expectations only, though I fully agree
that those who're using atomic_t as if it were some magic thing that lets
them write lockless code are sorrily mistaken.)

> So these are broken on i386 and x86-64?

Possibly, but the point is not about existing bugs, as mentioned above.

Some such bugs have been found nonetheless -- reminds me, can somebody
please apply http://www.gossamer-threads.com/lists/linux/kernel/810674 ?


Satyam

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  8:47                             ` Satyam Sharma
@ 2007-08-17  9:15                               ` Nick Piggin
  2007-08-17 10:12                                 ` Satyam Sharma
  2007-08-17  9:48                               ` Paul Mackerras
  1 sibling, 1 reply; 1546+ messages in thread
From: Nick Piggin @ 2007-08-17  9:15 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Linus Torvalds, Paul Mackerras, Segher Boessenkool,
	heiko.carstens, horms, Linux Kernel Mailing List, rpjday, ak,
	netdev, cfriesen, Andrew Morton, jesper.juhl, linux-arch, zlynx,
	clameter, schwidefsky, Chris Snook, Herbert Xu, davem, wensong,
	wjiang

Satyam Sharma wrote:
> 
> On Fri, 17 Aug 2007, Nick Piggin wrote:

>>Also, why would you want to make these insane accessors for atomic_t
>>types? Just make sure everybody knows the basics of barriers, and they
>>can apply that knowledge to atomic_t and all other lockless memory
>>accesses as well.
> 
> 
> Code that looks like:
> 
> 	while (!atomic_read(&v)) {
> 		...
> 		cpu_relax_no_barrier();
> 		forget(v.counter);
> 		        ^^^^^^^^
> 	}
> 
> would be uglier. Also think about code such as:

I think they would both be equally ugly, but the atomic_read_volatile
variant would be more prone to subtle bugs because of the weird
implementation.

And it would be more ugly than introducing an order(x) statement for
all memory operations, and adding an order_atomic() wrapper for it
for atomic types.


> 	a = atomic_read();
> 	if (!a)
> 		do_something();
> 
> 	forget();
> 	a = atomic_read();
> 	... /* some code that depends on value of a, obviously */
> 
> 	forget();
> 	a = atomic_read();
> 	...
> 
> So much explicit sprinkling of "forget()" looks ugly.

Firstly, why is it ugly? It's nice because of those nice explicit
statements there that give us a good heads up and would have some
comments attached to them (also, lack of the word "volatile" is
always a plus).

Secondly, what sort of code would do such a thing? In most cases,
it is probably riddled with bugs anyway (unless it is doing a
really specific sequence of interrupts or something, but in that
case it is very likely to either require locking or busy waits
anyway -> ie. barriers).


> on the other hand, looks neater. The "_volatile()" suffix makes it also
> no less explicit than an explicit barrier-like macro that this primitive
> is something "special", for code clarity purposes.

Just don't use the word volatile, and have barriers both before
and after the memory operation, and I'm OK with it. I don't see
the point though, when you could just have a single barrier(x)
barrier function defined for all memory locations, rather than
this odd thing that only works for atomics (and would have to
be duplicated for atomic_set).

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  8:58                                                     ` Satyam Sharma
@ 2007-08-17  9:15                                                       ` Nick Piggin
  2007-08-17 10:03                                                         ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Nick Piggin @ 2007-08-17  9:15 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Stefan Richter, paulmck, Herbert Xu, Paul Mackerras,
	Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Satyam Sharma wrote:
> 
> On Fri, 17 Aug 2007, Nick Piggin wrote:

>>>Sure, now
>>>that I learned of these properties I can start to audit code and insert
>>>barriers where I believe they are needed, but this simply means that
>>>almost all occurrences of atomic_read will get barriers (unless there
>>>already are implicit but more or less obvious barriers like msleep).
>>
>>You might find that these places that appear to need barriers are
>>buggy for other reasons anyway. Can you point to some in-tree code
>>we can have a look at?
> 
> 
> Such code was mentioned elsewhere (query nodemgr_host_thread in cscope)
> that managed to escape the requirement for a barrier only because of
> some completely un-obvious compilation-unit-scope thing. But I find such
> a non-explicit barrier in quite bad taste. Stefan, do consider plunking an
> explicit call to barrier() there.

It is very obvious. msleep calls schedule() (ie. sleeps), which is
always a barrier.

The "unobvious" thing is that you wanted to know how the compiler knows
a function is a barrier -- answer is that if it does not *know* it is not
a barrier, it must assume it is a barrier. If the whole msleep call chain
including the scheduler were defined static in the current compilation
unit, then it would still be a barrier because it would actually be able
to see the barriers in schedule(void), if nothing else.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  9:14                                                     ` Satyam Sharma
@ 2007-08-17  9:31                                                       ` Nick Piggin
  2007-08-17 10:55                                                         ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Nick Piggin @ 2007-08-17  9:31 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Herbert Xu, Paul Mackerras, Linus Torvalds, Christoph Lameter,
	Chris Snook, Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Satyam Sharma wrote:
> 
> On Fri, 17 Aug 2007, Nick Piggin wrote:
> 
> 
>>Satyam Sharma wrote:
>>[...]
>>
>>>Granted, the above IS buggy code. But, the stated objective is to avoid
>>>heisenbugs.
> 
>     ^^^^^^^^^^
> 
> 
>>Anyway, why are you making up code snippets that are buggy in other
>>ways in order to support this assertion being made that lots of kernel
>>code supposedly depends on volatile semantics. Just reference the
>>actual code.
> 
> 
> Because the point is *not* about existing bugs in kernel code. At some
> point Chris Snook (who started this thread) did write that "If I knew
> of the existing bugs in the kernel, I would be sending patches for them,
> not this series" or something to that effect.
> 
> The point is about *author expectations*. If people do expect atomic_read()
> (or a variant thereof) to have volatile semantics, why not give them such
> a variant?

Because they should be thinking about them in terms of barriers, over
which the compiler / CPU is not to reorder accesses or cache memory
operations, rather than "special" "volatile" accesses. Linux's whole
memory ordering and locking model is completely geared around the
former.


> And by the way, the point is *also* about the fact that cpu_relax(), as
> of today, implies a full memory clobber, which is not what a lot of such
> loops want. (due to stuff mentioned elsewhere, summarized in that summary)

That's not the point, because as I also mentioned, the logical extension
to Linux's barrier API to handle this is the order(x) macro. Again, not
special volatile accessors.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  8:47                             ` Satyam Sharma
  2007-08-17  9:15                               ` Nick Piggin
@ 2007-08-17  9:48                               ` Paul Mackerras
  2007-08-17 10:23                                 ` Satyam Sharma
  1 sibling, 1 reply; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-17  9:48 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Nick Piggin, Linus Torvalds, Segher Boessenkool, heiko.carstens,
	horms, Linux Kernel Mailing List, rpjday, ak, netdev, cfriesen,
	Andrew Morton, jesper.juhl, linux-arch, zlynx, clameter,
	schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang

Satyam Sharma writes:

> I wonder if this'll generate smaller and better code than _both_ the
> other atomic_read_volatile() variants. Would need to build allyesconfig
> on lots of diff arch's etc to test the theory though.

I'm sure it would be a tiny effect.

This whole thread is arguing about effects that are quite
insignificant.  On the one hand we have the non-volatile proponents,
who want to let the compiler do extra optimizations - which amounts to
letting it elide maybe a dozen loads in the whole kernel, loads which
would almost always be L1 cache hits.

On the other hand we have the volatile proponents, who are concerned
that some code somewhere in the kernel might be buggy without the
volatile behaviour, and who also want to be able to remove some
barriers and thus save a few bytes of code and a few loads here and
there (and possibly some stores too).

Either way the effect on code size and execution time is minuscule.

In the end the strongest argument is actually that gcc generates
unnecessarily verbose code on x86[-64] for volatile accesses.  Even
then we're only talking about ~2000 bytes, or less than 1 byte per
instance of atomic_read on average, about 0.06% of the kernel text
size.

The x86[-64] developers seem to be willing to bear the debugging cost
involved in having the non-volatile behaviour for atomic_read, in
order to save the 2kB.  That's fine with me.  Either way I think
somebody should audit all the uses of atomic_read, not just for
missing barriers, but also to find the places where it's used in a
racy manner.  Then we can work out where the races matter and fix them
if they do.

Paul.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  9:15                                                       ` Nick Piggin
@ 2007-08-17 10:03                                                         ` Satyam Sharma
  2007-08-17 11:50                                                           ` Nick Piggin
  0 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17 10:03 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Stefan Richter, paulmck, Herbert Xu, Paul Mackerras,
	Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher



On Fri, 17 Aug 2007, Nick Piggin wrote:

> Satyam Sharma wrote:
> > 
> > On Fri, 17 Aug 2007, Nick Piggin wrote:
> 
> > > > Sure, now
> > > > that I learned of these properties I can start to audit code and insert
> > > > barriers where I believe they are needed, but this simply means that
> > > > almost all occurrences of atomic_read will get barriers (unless there
> > > > already are implicit but more or less obvious barriers like msleep).
> > > 
> > > You might find that these places that appear to need barriers are
> > > buggy for other reasons anyway. Can you point to some in-tree code
> > > we can have a look at?
> > 
> > 
> > Such code was mentioned elsewhere (query nodemgr_host_thread in cscope)
> > that managed to escape the requirement for a barrier only because of
> > some completely un-obvious compilation-unit-scope thing. But I find such
> > a non-explicit barrier in quite bad taste. Stefan, do consider plunking an
> > explicit call to barrier() there.
> 
> It is very obvious. msleep calls schedule() (ie. sleeps), which is
> always a barrier.

Probably you didn't mean that, but no, schedule() is not a barrier because
it sleeps. It's a barrier because it's invisible.

> The "unobvious" thing is that you wanted to know how the compiler knows
> a function is a barrier -- answer is that if it does not *know* it is not
> a barrier, it must assume it is a barrier.

True, that's clearly what happens here. But you're definitely joking
that this is "obvious" in terms of code-clarity, right?

Just 5 minutes back you mentioned elsewhere you like seeing lots of
explicit calls to barrier() (with comments, no less, hmm? :-)

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  8:52                         ` Andi Kleen
@ 2007-08-17 10:08                           ` Satyam Sharma
  0 siblings, 0 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17 10:08 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Linus Torvalds, Paul Mackerras, Nick Piggin, Segher Boessenkool,
	heiko.carstens, horms, Linux Kernel Mailing List, rpjday, netdev,
	cfriesen, Andrew Morton, jesper.juhl, linux-arch, zlynx, clameter,
	schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang



On Fri, 17 Aug 2007, Andi Kleen wrote:

> On Friday 17 August 2007 05:42, Linus Torvalds wrote:
> > On Fri, 17 Aug 2007, Paul Mackerras wrote:
> > > I'm really surprised it's as much as a few K.  I tried it on powerpc
> > > and it only saved 40 bytes (10 instructions) for a G5 config.
> >
> > One of the things that "volatile" generally screws up is a simple
> >
> > 	volatile int i;
> >
> > 	i++;
> 
> But for atomic_t people use atomic_inc() anyways which does this correctly.
> It shouldn't really matter for atomic_t.
> 
> I'm worrying a bit that the volatile atomic_t change caused subtle code 
> breakage like these delay read loops people here pointed out.

Umm, I followed most of the thread, but which breakage is this?

> Wouldn't it be safer to just re-add the volatile to atomic_read() 
> for 2.6.23? Or alternatively make it asm(), but volatile seems more
> proven.

The problem with volatile is not just trashy code generation (which also
definitely is a major problem), but definition holes, and implementation
inconsistencies. Making it asm() is not the only other alternative to
volatile either (read another reply to this mail), but considering most
of the thread has been about people not wanting even an
atomic_read_volatile() variant, making atomic_read() itself have volatile
semantics sounds ... strange :-)


PS: http://lkml.org/lkml/2007/8/15/407 was submitted a couple days back,
any word if you saw that?

I have another one for you:


[PATCH] i386, x86_64: __const_udelay() should not be marked inline

Because it can never get inlined in any callsite (each translation unit
is compiled separately for the kernel and so the implementation of
__const_udelay() would be invisible to all other callsites). In fact it
turns out, the correctness of callsites at arch/x86_64/kernel/crash.c:97
and arch/i386/kernel/crash.c:101 explicitly _depends_ upon it not being
inlined, and also it's an exported symbol (modules may want to call
mdelay() and udelay() that often becomes __const_udelay() after some
macro-ing in various headers). So let's not mark it as "inline" either.

Signed-off-by: Satyam Sharma <satyam@infradead.org>

---

 arch/i386/lib/delay.c   |    2 +-
 arch/x86_64/lib/delay.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/i386/lib/delay.c b/arch/i386/lib/delay.c
index f6edb11..0082c99 100644
--- a/arch/i386/lib/delay.c
+++ b/arch/i386/lib/delay.c
@@ -74,7 +74,7 @@ void __delay(unsigned long loops)
 	delay_fn(loops);
 }
 
-inline void __const_udelay(unsigned long xloops)
+void __const_udelay(unsigned long xloops)
 {
 	int d0;
 
diff --git a/arch/x86_64/lib/delay.c b/arch/x86_64/lib/delay.c
index 2dbebd3..d0cd9cd 100644
--- a/arch/x86_64/lib/delay.c
+++ b/arch/x86_64/lib/delay.c
@@ -38,7 +38,7 @@ void __delay(unsigned long loops)
 }
 EXPORT_SYMBOL(__delay);
 
-inline void __const_udelay(unsigned long xloops)
+void __const_udelay(unsigned long xloops)
 {
 	__delay(((xloops * HZ * cpu_data[raw_smp_processor_id()].loops_per_jiffy) >> 32) + 1);
 }

^ permalink raw reply related	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  9:15                               ` Nick Piggin
@ 2007-08-17 10:12                                 ` Satyam Sharma
  2007-08-17 12:14                                   ` Nick Piggin
  0 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17 10:12 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Linus Torvalds, Paul Mackerras, Segher Boessenkool,
	heiko.carstens, horms, Linux Kernel Mailing List, rpjday, ak,
	netdev, cfriesen, Andrew Morton, jesper.juhl, linux-arch, zlynx,
	clameter, schwidefsky, Chris Snook, Herbert Xu, davem, wensong,
	wjiang



On Fri, 17 Aug 2007, Nick Piggin wrote:

> Satyam Sharma wrote:
> > 
> > On Fri, 17 Aug 2007, Nick Piggin wrote:
> 
> > > Also, why would you want to make these insane accessors for atomic_t
> > > types? Just make sure everybody knows the basics of barriers, and they
> > > can apply that knowledge to atomic_t and all other lockless memory
> > > accesses as well.
> > 
> > 
> > Code that looks like:
> > 
> > 	while (!atomic_read(&v)) {
> > 		...
> > 		cpu_relax_no_barrier();
> > 		forget(v.counter);
> > 		        ^^^^^^^^
> > 	}
> > 
> > would be uglier. Also think about code such as:
> 
> I think they would both be equally ugly,

You think both these are equivalent in terms of "looks":

					|
while (!atomic_read(&v)) {		|	while (!atomic_read_xxx(&v)) {
	...				|		...
	cpu_relax_no_barrier();		|		cpu_relax_no_barrier();
	order_atomic(&v);		|	}
}					|

(where order_atomic() is an atomic_t
specific wrapper as you mentioned below)

?

Well, taste varies, but ...

> but the atomic_read_volatile
> variant would be more prone to subtle bugs because of the weird
> implementation.

What bugs?

> And it would be more ugly than introducing an order(x) statement for
> all memory operations, and adding an order_atomic() wrapper for it
> for atomic types.

Oh, that order() / forget() macro [forget() was named such by Chuck Ebbert
earlier in this thread where he first mentioned it, btw] could definitely
be generically introduced for any memory operations.

> > 	a = atomic_read();
> > 	if (!a)
> > 		do_something();
> > 
> > 	forget();
> > 	a = atomic_read();
> > 	... /* some code that depends on value of a, obviously */
> > 
> > 	forget();
> > 	a = atomic_read();
> > 	...
> > 
> > So much explicit sprinkling of "forget()" looks ugly.
> 
> Firstly, why is it ugly? It's nice because of those nice explicit
> statements there that give us a good heads up and would have some
> comments attached to them

atomic_read_xxx (where xxx = whatever naming sounds nice to you) would
obviously also give a heads up, and could also have some comments
attached to it.

> (also, lack of the word "volatile" is always a plus).

Ok, xxx != volatile.

> Secondly, what sort of code would do such a thing?

See the nodemgr_host_thread() that does something similar, though not
exactly same.

> > on the other hand, looks neater. The "_volatile()" suffix makes it also
> > no less explicit than an explicit barrier-like macro that this primitive
> > is something "special", for code clarity purposes.
> 
> Just don't use the word volatile,

That sounds amazingly frivolous, but hey, why not. As I said, ok,
xxx != volatile.

> and have barriers both before and after the memory operation,

How could that lead to bugs? (if you can point to existing code,
but just some testcase / sample code would be fine as well).

> [...] I don't see
> the point though, when you could just have a single barrier(x)
> barrier function defined for all memory locations,

As I said, barrier() is too heavy-handed.

> rather than
> this odd thing that only works for atomics

Why would it work only for atomics? You could use that generic macro
for anything you well damn please.

> (and would have to
> be duplicated for atomic_set).

#define atomic_set_xxx for something similar. Big deal ... NOT.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  9:48                               ` Paul Mackerras
@ 2007-08-17 10:23                                 ` Satyam Sharma
  0 siblings, 0 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17 10:23 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Nick Piggin, Linus Torvalds, Segher Boessenkool, heiko.carstens,
	horms, Linux Kernel Mailing List, rpjday, ak, netdev, cfriesen,
	Andrew Morton, jesper.juhl, linux-arch, zlynx, clameter,
	schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang



On Fri, 17 Aug 2007, Paul Mackerras wrote:

> Satyam Sharma writes:
> 
> > I wonder if this'll generate smaller and better code than _both_ the
> > other atomic_read_volatile() variants. Would need to build allyesconfig
> > on lots of diff arch's etc to test the theory though.
> 
> I'm sure it would be a tiny effect.
> 
> This whole thread is arguing about effects that are quite
> insignificant.

Hmm, the fact that this thread became what it did, probably means that
most developers on this list do not mind thinking/arguing about effects
or optimizations that are otherwise "tiny". But yeah, they are tiny
nonetheless.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  8:06                                                   ` Nick Piggin
  2007-08-17  8:58                                                     ` Satyam Sharma
@ 2007-08-17 10:48                                                     ` Stefan Richter
  2007-08-17 10:58                                                       ` Stefan Richter
  2007-08-18 14:35                                                     ` LDD3 pitfalls (was Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures) Stefan Richter
  2 siblings, 1 reply; 1546+ messages in thread
From: Stefan Richter @ 2007-08-17 10:48 UTC (permalink / raw)
  To: Nick Piggin
  Cc: paulmck, Herbert Xu, Paul Mackerras, Satyam Sharma,
	Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Nick Piggin wrote:
> Stefan Richter wrote:
>> For architecture port authors, there is Documentation/atomic_ops.txt.
>> Driver authors also can learn something from that document, as it
>> indirectly documents the atomic_t and bitops APIs.
> 
> "Semantics and Behavior of Atomic and Bitmask Operations" is
> pretty direct :)

"Indirect", "pretty direct"... It's subjective.

(It is not an API documentation; it is an implementation specification.)

> Sure, it says that it's for arch maintainers, but there is no
> reason why users can't make use of it.
> 
> 
>> Prompted by this thread, I reread this document, and indeed, the
>> sentence "Unlike the above routines, it is required that explicit memory
>> barriers are performed before and after [atomic_{inc,dec}_return]"
>> indicates that atomic_read (one of the "above routines") is very
>> different from all other atomic_t accessors that return values.
>> 
>> This is strange.  Why is it that atomic_read stands out that way?  IMO
> 
> It is not just atomic_read of course. It is atomic_add,sub,inc,dec,set.

Yes, but unlike these, atomic_read returns a value.

Without me (the API user) providing extra barriers, that value may
become something else whenever someone touches code in the vicinity of
the atomic_read.

>> this API imbalance is quite unexpected by many people.  Wouldn't it be
>> beneficial to change the atomic_read API to behave the same like all
>> other atomic_t accessors that return values?
> 
> It is very consistent and well defined. Operations which both modify
> the data _and_ return something are defined to have full barriers
> before and after.

You are right, atomic_read is not only different from accessors that
don't return values; it is also different from all other accessors that
return values (because they all also modify the value).  There is just
no actual API documentation, which contributes to the issue that some
people (or at least one: me) learn a little bit late how special
atomic_read is.

> What do you want to add to the other atomic accessors? Full memory
> barriers? Only compiler barriers? It's quite likely that if you think
> some barriers will fix bugs, then there are other bugs lurking there
> anyway.

A lot of different though related issues are discussed in this thread,
but I personally am only occupied by one particular thing:  What kind of
return values do I get from atomic_read.

> Just use spinlocks if you're not absolutely clear about potential
> races and memory ordering issues -- they're pretty cheap and simple.

Probably good advice, like generally if driver guys consider lockless
algorithms.

>> OK, it is also different from the other accessors that return data in so
>> far as it doesn't modify the data.  But as driver "author", i.e. user of
>> the API, I can't see much use of an atomic_read that can be reordered
>> and, more importantly, can be optimized away by the compiler.
> 
> It will return to you an atomic snapshot of the data (loaded from
> memory at some point since the last compiler barrier). All you have
> to be aware of compiler barriers and the Linux SMP memory ordering
> model, which should be a given if you are writing lockless code.

OK, that's what I slowly realized during this discussion, and I
appreciate the explanations that were given here.

>> Sure, now
>> that I learned of these properties I can start to audit code and insert
>> barriers where I believe they are needed, but this simply means that
>> almost all occurrences of atomic_read will get barriers (unless there
>> already are implicit but more or less obvious barriers like msleep).
> 
> You might find that these places that appear to need barriers are
> buggy for other reasons anyway. Can you point to some in-tree code
> we can have a look at?

I could, or could not, if I were through with auditing the code.  I
remembered one case and posted it (nodemgr_host_thread) which was safe
because msleep_interruptible provided the necessary barrier there, and
this implicit barrier is not in danger to be removed by future patches.
-- 
Stefan Richter
-=====-=-=== =--- =---=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  9:31                                                       ` Nick Piggin
@ 2007-08-17 10:55                                                         ` Satyam Sharma
  2007-08-17 12:39                                                           ` Nick Piggin
  0 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17 10:55 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Herbert Xu, Paul Mackerras, Linus Torvalds, Christoph Lameter,
	Chris Snook, Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher



On Fri, 17 Aug 2007, Nick Piggin wrote:

> Satyam Sharma wrote:
> > [...]
> > The point is about *author expectations*. If people do expect atomic_read()
> > (or a variant thereof) to have volatile semantics, why not give them such
> > a variant?
> 
> Because they should be thinking about them in terms of barriers, over
> which the compiler / CPU is not to reorder accesses or cache memory
> operations, rather than "special" "volatile" accesses.

This is obviously just a taste thing: whether to have that forget(x)
barrier as something the author explicitly sprinkles in appropriate
places in the code himself, or to use a primitive that includes it
itself.

I'm not saying "taste matters aren't important" (they are), but I'm really
skeptical if most folks would find the former tasteful.

> > And by the way, the point is *also* about the fact that cpu_relax(), as
> > of today, implies a full memory clobber, which is not what a lot of such
> > loops want. (due to stuff mentioned elsewhere, summarized in that summary)
> 
> That's not the point,

That's definitely the point, why not. This is why "barrier()", being
heavy-handed, is not the best option.

> because as I also mentioned, the logical extension
> to Linux's barrier API to handle this is the order(x) macro. Again, not
> special volatile accessors.

Sure, that forget(x) macro _is_ proposed to be made part of the generic
API. That doesn't explain why not to define/use primitives that have
volatility semantics in themselves, though (taste matters apart).

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 10:48                                                     ` Stefan Richter
@ 2007-08-17 10:58                                                       ` Stefan Richter
  0 siblings, 0 replies; 1546+ messages in thread
From: Stefan Richter @ 2007-08-17 10:58 UTC (permalink / raw)
  To: Nick Piggin
  Cc: paulmck, Herbert Xu, Paul Mackerras, Satyam Sharma,
	Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

I wrote:
> Nick Piggin wrote:
>> You might find that these places that appear to need barriers are
>> buggy for other reasons anyway. Can you point to some in-tree code
>> we can have a look at?
> 
> I could, or could not, if I were through with auditing the code.  I
> remembered one case and posted it (nodemgr_host_thread) which was safe
> because msleep_interruptible provided the necessary barrier there, and
> this implicit barrier is not in danger of being removed by future patches.

PS, just in case anybody holds his breath for more example code from me,
I don't plan to continue with an actual audit of the drivers I maintain.
It's an important issue, but my current time budget will restrict me to
look at it ad hoc, per case.  (Open bugs have higher priority than
potential bugs.)
-- 
Stefan Richter
-=====-=-=== =--- =---=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  8:38                                                   ` Nick Piggin
  2007-08-17  9:14                                                     ` Satyam Sharma
@ 2007-08-17 11:08                                                     ` Stefan Richter
  1 sibling, 0 replies; 1546+ messages in thread
From: Stefan Richter @ 2007-08-17 11:08 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Satyam Sharma, Herbert Xu, Paul Mackerras, Linus Torvalds,
	Christoph Lameter, Chris Snook, Ilpo Jarvinen, Paul E. McKenney,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Nick Piggin wrote:
> Satyam Sharma wrote:
>> And we have driver / subsystem maintainers such as Stefan
>> coming up and admitting that often a lot of code that's written to use
>> atomic_read() does assume the read will not be elided by the compiler.
> 
> So these are broken on i386 and x86-64?

The ieee1394 and firewire subsystems have open, undiagnosed bugs, also
on i386 and x86-64.  But whether there is any bug because of wrong
assumptions about atomic_read among them, I don't know.  I don't know
which assumptions the authors made, I only know that I wasn't aware of
all the properties of atomic_read until now.

> Are they definitely safe on SMP and weakly ordered machines with
> just a simple compiler barrier there? Because I would not be
> surprised if there are a lot of developers who don't really know
> what to assume when it comes to memory ordering issues.
> 
> This is not a dig at driver writers: we still have memory ordering
> problems in the VM too (and probably most of the subtle bugs in
> lockless VM code are memory ordering ones). Let's not make up a
> false sense of security and hope that sprinkling volatile around
> will allow people to write bug-free lockless code. If a writer
> can't be bothered reading API documentation

...or, if there is none, the implementation specification (as in the case
of the atomic ops), or, if there is none, the implementation (as in the
case of some infrastructure code here and there)...

> and learning the Linux memory model, they can still be productive
> writing safely locked code.

Provided they are aware that they might not have the full picture of the
lockless primitives.  :-)
-- 
Stefan Richter
-=====-=-=== =--- =---=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 10:03                                                         ` Satyam Sharma
@ 2007-08-17 11:50                                                           ` Nick Piggin
  2007-08-17 12:50                                                             ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Nick Piggin @ 2007-08-17 11:50 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Stefan Richter, paulmck, Herbert Xu, Paul Mackerras,
	Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Satyam Sharma wrote:

>
>On Fri, 17 Aug 2007, Nick Piggin wrote:
>
>
>>Satyam Sharma wrote:
>>
>>It is very obvious. msleep calls schedule() (ie. sleeps), which is
>>always a barrier.
>>
>
>Probably you didn't mean that, but no, schedule() is not a barrier because
>it sleeps. It's a barrier because it's invisible.
>

Where did I say it is a barrier because it sleeps?

It is always a barrier because, at the lowest level, schedule() (and thus
anything that sleeps) is defined to always be a barrier. Regardless of
whatever obscure means the compiler might need to infer the barrier.

In other words, you can ignore those obscure details because schedule() is
always going to have an explicit barrier in it.


>>The "unobvious" thing is that you wanted to know how the compiler knows
>>a function is a barrier -- answer is that if it does not *know* it is not
>>a barrier, it must assume it is a barrier.
>>
>
>True, that's clearly what happens here. But you're definitely joking
>that this is "obvious" in terms of code-clarity, right?
>

No. If you accept that barrier() is implemented correctly, and you know
that sleeping is defined to be a barrier, then its perfectly clear. You
don't have to know how the compiler "knows" that some function contains
a barrier.


>Just 5 minutes back you mentioned elsewhere you like seeing lots of
>explicit calls to barrier() (with comments, no less, hmm? :-)
>

Sure, but there are well known primitives which contain barriers, and
trivial recognisable code sequences for which you don't need comments.
waiting-loops using sleeps or cpu_relax() are prime examples.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 10:12                                 ` Satyam Sharma
@ 2007-08-17 12:14                                   ` Nick Piggin
  2007-08-17 13:05                                     ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Nick Piggin @ 2007-08-17 12:14 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Linus Torvalds, Paul Mackerras, Segher Boessenkool,
	heiko.carstens, horms, Linux Kernel Mailing List, rpjday, ak,
	netdev, cfriesen, Andrew Morton, jesper.juhl, linux-arch, zlynx,
	clameter, schwidefsky, Chris Snook, Herbert Xu, davem, wensong,
	wjiang

Satyam Sharma wrote:

>
>On Fri, 17 Aug 2007, Nick Piggin wrote:
>
>>I think they would both be equally ugly,
>>
>
>You think both these are equivalent in terms of "looks":
>
>					|
>while (!atomic_read(&v)) {		|	while (!atomic_read_xxx(&v)) {
>	...				|		...
>	cpu_relax_no_barrier();		|		cpu_relax_no_barrier();
>	order_atomic(&v);		|	}
>}					|
>
>(where order_atomic() is an atomic_t
>specific wrapper as you mentioned below)
>
>?
>

I think the LHS is better if your atomic_read_xxx primitive is using the
crazy one-sided barrier, because with the LHS code you immediately know
what barriers are happening, while with the RHS you have to look at the
atomic_read_xxx definition.

If your atomic_read_xxx implementation was more intuitive, then both are
pretty well equal. More lines != ugly code.


>>but the atomic_read_volatile
>>variant would be more prone to subtle bugs because of the weird
>>implementation.
>>
>
>What bugs?
>

You can't think for yourself? Your atomic_read_volatile contains a compiler
barrier to the atomic variable before the load. 2 such reads from different
locations look like this:

asm volatile("" : "+m" (v1));
atomic_read(&v1);
asm volatile("" : "+m" (v2));
atomic_read(&v2);

Which implies that the load of v1 can be reordered to occur after the load
of v2. Bet you didn't expect that?

>>Secondly, what sort of code would do such a thing?
>>
>
>See the nodemgr_host_thread() that does something similar, though not
>exactly same.
>

I'm sorry, all this waffling about made up code which might do this and
that is just a waste of time. Seriously, the thread is bloated enough
and never going to get anywhere with all this handwaving. If someone is
saving up all the really real and actually good arguments for why we
must have a volatile here, now is the time to use them.

>>and have barriers both before and after the memory operation,
>>
>
>How could that lead to bugs? (if you can point to existing code,
>but just some testcase / sample code would be fine as well).
>

See above.

>As I said, barrier() is too heavy-handed.
>

Typo. I meant: defined for a single memory location (ie. order(x)).


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 10:55                                                         ` Satyam Sharma
@ 2007-08-17 12:39                                                           ` Nick Piggin
  2007-08-17 13:36                                                             ` Satyam Sharma
  2007-08-17 16:48                                                             ` Linus Torvalds
  0 siblings, 2 replies; 1546+ messages in thread
From: Nick Piggin @ 2007-08-17 12:39 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Herbert Xu, Paul Mackerras, Linus Torvalds, Christoph Lameter,
	Chris Snook, Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Satyam Sharma wrote:

>
>On Fri, 17 Aug 2007, Nick Piggin wrote:
>
>
>>Because they should be thinking about them in terms of barriers, over
>>which the compiler / CPU is not to reorder accesses or cache memory
>>operations, rather than "special" "volatile" accesses.
>>
>
>This is obviously just a taste thing: whether to have that forget(x)
>barrier as something the author explicitly sprinkles in appropriate
>places in the code himself, or to use a primitive that includes it
>itself.
>

That's not obviously just taste to me. Not when the primitive has many
(perhaps the majority of) uses that do not require said barriers. And
this is not solely about the code generation (which, as Paul says, is
relatively minor even on x86). I prefer people to think explicitly
about barriers in their lockless code.


>I'm not saying "taste matters aren't important" (they are), but I'm really
>skeptical if most folks would find the former tasteful.
>

So I /do/ have better taste than most folks? Thanks! :-)


>>>And by the way, the point is *also* about the fact that cpu_relax(), as
>>>of today, implies a full memory clobber, which is not what a lot of such
>>>loops want. (due to stuff mentioned elsewhere, summarized in that summary)
>>>
>>That's not the point,
>>
>
>That's definitely the point, why not. This is why "barrier()", being
>heavy-handed, is not the best option.
>

That is _not_ the point (of why a volatile atomic_read is good) because
there has already been an alternative posted that better conforms with
the Linux barrier API and is much more widely useful and more usable. If
you are so worried about barrier() being too heavyweight, then you're
off to a poor start by wanting to add a few K of kernel text by making
atomic_read volatile.


>>because as I also mentioned, the logical extension
>>to Linux's barrier API to handle this is the order(x) macro. Again, not
>>special volatile accessors.
>>
>
>Sure, that forget(x) macro _is_ proposed to be made part of the generic
>API. That doesn't explain why not to define/use primitives that have
>volatility semantics in themselves, though (taste matters apart).
>

If you follow the discussion.... You were thinking of a reason why the
semantics *should* be changed or added, and I was rebutting your argument
that it must be used when a full barrier() is too heavy (ie. by pointing
out that order() has superior semantics anyway).

Why do I keep repeating the same things? I'll not continue bloating this
thread until a new valid point comes up...

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 11:50                                                           ` Nick Piggin
@ 2007-08-17 12:50                                                             ` Satyam Sharma
  2007-08-17 12:56                                                               ` Nick Piggin
  0 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17 12:50 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Stefan Richter, paulmck, Herbert Xu, Paul Mackerras,
	Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher



On Fri, 17 Aug 2007, Nick Piggin wrote:

> Satyam Sharma wrote:
> > On Fri, 17 Aug 2007, Nick Piggin wrote:
> > > Satyam Sharma wrote:
> > > 
> > > It is very obvious. msleep calls schedule() (ie. sleeps), which is
> > > always a barrier.
> > 
> > Probably you didn't mean that, but no, schedule() is not a barrier because
> > it sleeps. It's a barrier because it's invisible.
> 
> Where did I say it is a barrier because it sleeps?

Just below. What you wrote:

> It is always a barrier because, at the lowest level, schedule() (and thus
> anything that sleeps) is defined to always be a barrier.

"It is always a barrier because, at the lowest level, anything that sleeps
is defined to always be a barrier".


> Regardless of
> whatever obscure means the compiler might need to infer the barrier.
> 
> In other words, you can ignore those obscure details because schedule() is
> always going to have an explicit barrier in it.

I didn't quite understand what you said here, so I'll tell what I think:

* foo() is a compiler barrier if the definition of foo() is invisible to
  the compiler at a callsite.

* foo() is also a compiler barrier if the definition of foo() includes
  a barrier, and it is inlined at the callsite.

If the above is wrong, or if there's something else at play as well,
do let me know.

> > > The "unobvious" thing is that you wanted to know how the compiler knows
> > > a function is a barrier -- answer is that if it does not *know* it is not
> > > a barrier, it must assume it is a barrier.
> > 
> > True, that's clearly what happens here. But you're definitely joking
> > that this is "obvious" in terms of code-clarity, right?
> 
> No. If you accept that barrier() is implemented correctly, and you know
> that sleeping is defined to be a barrier,

Curiously, that's the second time you've said "sleeping is defined to
be a (compiler) barrier". How does the compiler even know if foo() is
a function that "sleeps"? Do compilers have some notion of "sleeping"
to ensure they automatically assume a compiler barrier whenever such
a function is called? Or are you saying that the compiler can see the
barrier() inside said function ... nopes, you're saying quite the
opposite below.


> then its perfectly clear. You
> don't have to know how the compiler "knows" that some function contains
> a barrier.

I think I do, why not? Would appreciate if you could elaborate on this.


Satyam

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 12:50                                                             ` Satyam Sharma
@ 2007-08-17 12:56                                                               ` Nick Piggin
  2007-08-18  2:15                                                                 ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Nick Piggin @ 2007-08-17 12:56 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Stefan Richter, paulmck, Herbert Xu, Paul Mackerras,
	Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Satyam Sharma wrote:

>
>On Fri, 17 Aug 2007, Nick Piggin wrote:
>
>
>>Satyam Sharma wrote:
>>
>>>On Fri, 17 Aug 2007, Nick Piggin wrote:
>>>
>>>>Satyam Sharma wrote:
>>>>
>>>>It is very obvious. msleep calls schedule() (ie. sleeps), which is
>>>>always a barrier.
>>>>
>>>Probably you didn't mean that, but no, schedule() is not a barrier because
>>>it sleeps. It's a barrier because it's invisible.
>>>
>>Where did I say it is a barrier because it sleeps?
>>
>
>Just below. What you wrote:
>
>
>>It is always a barrier because, at the lowest level, schedule() (and thus
>>anything that sleeps) is defined to always be a barrier.
>>
>
>"It is always a barrier because, at the lowest level, anything that sleeps
>is defined to always be a barrier".
>

... because it must call schedule and schedule is a barrier.


>>Regardless of
>>whatever obscure means the compiler might need to infer the barrier.
>>
>>In other words, you can ignore those obscure details because schedule() is
>>always going to have an explicit barrier in it.
>>
>
>I didn't quite understand what you said here, so I'll tell what I think:
>
>* foo() is a compiler barrier if the definition of foo() is invisible to
>  the compiler at a callsite.
>
>* foo() is also a compiler barrier if the definition of foo() includes
>  a barrier, and it is inlined at the callsite.
>
>If the above is wrong, or if there's something else at play as well,
>do let me know.
>

Right.


>>>>The "unobvious" thing is that you wanted to know how the compiler knows
>>>>a function is a barrier -- answer is that if it does not *know* it is not
>>>>a barrier, it must assume it is a barrier.
>>>>
>>>True, that's clearly what happens here. But you're definitely joking
>>>that this is "obvious" in terms of code-clarity, right?
>>>
>>No. If you accept that barrier() is implemented correctly, and you know
>>that sleeping is defined to be a barrier,
>>
>
>Curiously, that's the second time you've said "sleeping is defined to
>be a (compiler) barrier".
>

_In Linux,_ sleeping is defined to be a compiler barrier.

>How does the compiler even know if foo() is
>a function that "sleeps"? Do compilers have some notion of "sleeping"
>to ensure they automatically assume a compiler barrier whenever such
>a function is called? Or are you saying that the compiler can see the
>barrier() inside said function ... nopes, you're saying quite the
>opposite below.
>

You're getting too worried about the compiler implementation. Start
by assuming that it does work ;)


>>then its perfectly clear. You
>>don't have to know how the compiler "knows" that some function contains
>>a barrier.
>>
>
>I think I do, why not? Would appreciate if you could elaborate on this.
>

If a function is not completely visible to the compiler (so it can't
determine whether a barrier could be in it or not), then it must always
assume it will contain a barrier so it always does the right thing.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 12:14                                   ` Nick Piggin
@ 2007-08-17 13:05                                     ` Satyam Sharma
  0 siblings, 0 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17 13:05 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Linus Torvalds, Paul Mackerras, Segher Boessenkool,
	heiko.carstens, horms, Linux Kernel Mailing List, rpjday, ak,
	netdev, cfriesen, Andrew Morton, jesper.juhl, linux-arch, zlynx,
	clameter, schwidefsky, Chris Snook, Herbert Xu, davem, wensong,
	wjiang



On Fri, 17 Aug 2007, Nick Piggin wrote:

> Satyam Sharma wrote:
> [...]
> > You think both these are equivalent in terms of "looks":
> > 
> > 					|
> > while (!atomic_read(&v)) {		|	while (!atomic_read_xxx(&v)) {
> > 	...				|		...
> > 	cpu_relax_no_barrier();		|		cpu_relax_no_barrier();
> > 	order_atomic(&v);		|	}
> > }					|
> > 
> > (where order_atomic() is an atomic_t
> > specific wrapper as you mentioned below)
> > 
> > ?
> 
> I think the LHS is better if your atomic_read_xxx primitive is using the
> crazy one-sided barrier,
  ^^^^^

I'd say it's purposefully one-sided.

> because the LHS code you immediately know what
> barriers are happening, and with the RHS you have to look at the
> atomic_read_xxx definition.

No. As I said, the _xxx (whatever the heck you want to name it as) should
give the same heads-up that your "order_atomic" thing is supposed to give.


> If your atomic_read_xxx implementation was more intuitive, then both are
> pretty well equal. More lines != ugly code.
> 
> > [...]
> > What bugs?
> 
> You can't think for yourself? Your atomic_read_volatile contains a compiler
> barrier to the atomic variable before the load. 2 such reads from different
> locations look like this:
> 
> asm volatile("" : "+m" (v1));
> atomic_read(&v1);
> asm volatile("" : "+m" (v2));
> atomic_read(&v2);
> 
> Which implies that the load of v1 can be reordered to occur after the load
> of v2.

And how would that be a bug? (sorry, I really can't think for myself)


> > > Secondly, what sort of code would do such a thing?
> > 
> > See the nodemgr_host_thread() that does something similar, though not
> > exactly same.
> 
> I'm sorry, all this waffling about made up code which might do this and
> that is just a waste of time.

First, you could try looking at the code.

And by the way, as I've already said (why do you *require* people to have to
repeat things to you?) this isn't even about only existing code.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 12:39                                                           ` Nick Piggin
@ 2007-08-17 13:36                                                             ` Satyam Sharma
  2007-08-17 16:48                                                             ` Linus Torvalds
  1 sibling, 0 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17 13:36 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Herbert Xu, Paul Mackerras, Linus Torvalds, Christoph Lameter,
	Chris Snook, Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher



On Fri, 17 Aug 2007, Nick Piggin wrote:

> Satyam Sharma wrote:
> 
> > On Fri, 17 Aug 2007, Nick Piggin wrote:
> > 
> > > Because they should be thinking about them in terms of barriers, over
> > > which the compiler / CPU is not to reorder accesses or cache memory
> > > operations, rather than "special" "volatile" accesses.
> > 
> > This is obviously just a taste thing: whether to have that forget(x)
> > barrier as something the author explicitly sprinkles in appropriate
> > places in the code himself, or to use a primitive that includes it
> > itself.
> 
> That's not obviously just taste to me. Not when the primitive has many
> (perhaps the majority of) uses that do not require said barriers. And
> this is not solely about the code generation (which, as Paul says, is
> relatively minor even on x86).

See, you do *require* people to have to repeat the same things to you!

As has been written about enough times already, and if you followed the
discussion on this thread, I am *not* proposing that atomic_read()'s
semantics be changed to have any extra barriers. What is proposed is a
different atomic_read_xxx() variant thereof, which those who do want
that behaviour can use.

Now whether to have a kind of barrier ("volatile", whatever) in the
atomic_read_xxx() itself, or whether to make the code writer himself
explicitly write the order(x) in appropriate places in the
code _is_ a matter of taste.


> > That's definitely the point, why not. This is why "barrier()", being
> > heavy-handed, is not the best option.
> 
> That is _not_ the point [...]

Again, you're requiring me to repeat things that were already made evident
on this thread (if you follow it).

This _is_ the point, because a lot of loops out there (too many of them,
I WILL NOT bother citing file_name:line_number) end up having to use a
barrier just because they're using a loop-exit-condition that depends
on a value returned by atomic_read(). It would be good for them if they
used an atomic_read_xxx() primitive that gave these "volatility" semantics
without junking compiler optimizations for other memory references.

> because there has already been an alternative posted

Whether that alternative (explicitly using forget(x), or wrappers thereof,
such as the "order_atomic" you proposed) is better than other alternatives
(such as atomic_read_xxx() which includes the volatility behaviour in
itself) is still open, and precisely what we started discussing just one
mail back.

(The above was also mostly stuff I had to repeat for you, sadly.)

> that better conforms with Linux barrier
> API and is much more widely useful and more usable.

I don't think so.

(Now *this* _is_ the "taste-dependent matter" that I mentioned earlier.)

> If you are so worried about barrier() being too heavyweight, then you're
> off to a poor start by wanting to add a few K of kernel text by making
> atomic_read volatile.

Repeating myself, for the N'th time, NO, I DON'T want to make atomic_read
have "volatile" semantics.

> > > because as I also mentioned, the logical extension
> > > to Linux's barrier API to handle this is the order(x) macro. Again, not
> > > special volatile accessors.
> > 
> > Sure, that forget(x) macro _is_ proposed to be made part of the generic
> > API. Doesn't explain why not to define/use primitives that has volatility
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > semantics in itself, though (taste matters apart).
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> If you follow the discussion.... You were thinking of a reason why the
> semantics *should* be changed or added, and I was rebutting your argument
> that it must be used when a full barrier() is too heavy (ie. by pointing
> out that order() has superior semantics anyway).

Amazing. Either you have reading comprehension problems, or else, please
try reading this thread (or at least this sub-thread) again. I don't want
_you_ blaming _me_ for having to repeat things to you all over again.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  7:39                                                   ` Satyam Sharma
@ 2007-08-17 14:31                                                     ` Paul E. McKenney
  2007-08-17 18:31                                                       ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-17 14:31 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Herbert Xu, Stefan Richter, Paul Mackerras, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Fri, Aug 17, 2007 at 01:09:08PM +0530, Satyam Sharma wrote:
> 
> 
> On Thu, 16 Aug 2007, Paul E. McKenney wrote:
> 
> > On Fri, Aug 17, 2007 at 07:59:02AM +0800, Herbert Xu wrote:
> > > On Thu, Aug 16, 2007 at 09:34:41AM -0700, Paul E. McKenney wrote:
> > > >
> > > > The compiler can also reorder non-volatile accesses.  For an example
> > > > patch that cares about this, please see:
> > > > 
> > > > 	http://lkml.org/lkml/2007/8/7/280
> > > > 
> > > > This patch uses an ORDERED_WRT_IRQ() in rcu_read_lock() and
> > > > rcu_read_unlock() to ensure that accesses aren't reordered with respect
> > > > to interrupt handlers and NMIs/SMIs running on that same CPU.
> > > 
> > > Good, finally we have some code to discuss (even though it's
> > > not actually in the kernel yet).
> > 
> > There was some earlier in this thread as well.
> 
> Hmm, I never quite got what all this interrupt/NMI/SMI handling and
> RCU business you mentioned earlier was all about, but now that you've
> pointed to the actual code and issues with it ...

Glad to help...

> > > First of all, I think this illustrates that what you want
> > > here has nothing to do with atomic ops.  The ORDERED_WRT_IRQ
> > > macro occurs a lot more times in your patch than atomic
> > > reads/sets.  So *assuming* that it was necessary at all,
> > > then having an ordered variant of the atomic_read/atomic_set
> > > ops could do just as well.
> > 
> > Indeed.  If I could trust atomic_read()/atomic_set() to cause the compiler
> > to maintain ordering, then I could just use them instead of having to
> > create an  ORDERED_WRT_IRQ().  (Or ACCESS_ONCE(), as it is called in a
> > different patch.)
> 
> +#define WHATEVER(x)	(*(volatile typeof(x) *)&(x))
> 
> I suppose one could want volatile access semantics for stuff that's
> a bit-field too, no?

One could, but this is not supported in general.  So if you want that,
you need to use the usual bit-mask tricks and (for setting) atomic
operations.
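
[Editorial sketch: a minimal userspace illustration of the bit-mask approach
described above.  FLAG_READY and the helper names are invented; real kernel
code would use set_bit()/test_bit() and an atomic RMW for the store side.]

```c
/* Sketch only: the flag lives in an ordinary word, and each access goes
 * through a volatile cast plus an explicit mask, since "volatile" on a
 * bit-field is not supported in general. */
#define FLAG_READY (1UL << 3)

static unsigned long flags_word;

/* The volatile cast forces a real load of the containing word. */
static int flag_ready(void)
{
	return (*(volatile unsigned long *)&flags_word & FLAG_READY) != 0;
}

/* Non-atomic read-modify-write: kernel code would use an atomic op here. */
static void set_flag_ready(void)
{
	*(volatile unsigned long *)&flags_word |= FLAG_READY;
}
```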

> Also, this gives *zero* "re-ordering" guarantees that your code wants
> (as you've explained it below) -- neither w.r.t. CPU re-ordering (which
> probably you don't care about) *nor* w.r.t. compiler re-ordering
> (which you definitely _do_ care about).

You are correct about CPU re-ordering (and about the fact that this
example doesn't care about it), but not about compiler re-ordering.

The compiler is prohibited from moving a volatile access across a sequence
point.  One example of a sequence point is a statement boundary.  Because
all of the volatile accesses in this code are separated by statement
boundaries, a conforming compiler is prohibited from reordering them.

> > > However, I still don't know which atomic_read/atomic_set in
> > > your patch would be broken if there were no volatile.  Could
> > > you please point them out?
> > 
> > Suppose I tried replacing the ORDERED_WRT_IRQ() calls with
> > atomic_read() and atomic_set().  Starting with __rcu_read_lock():
> > 
> > o	If "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++"
> > 	was ordered by the compiler after
> > 	"ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1", then
> > 	suppose an NMI/SMI happened after the rcu_read_lock_nesting but
> > 	before the rcu_flipctr.
> > 
> > 	Then if there was an rcu_read_lock() in the SMI/NMI
> > 	handler (which is perfectly legal), the nested rcu_read_lock()
> > 	would believe that it could take the then-clause of the
> > 	enclosing "if" statement.  But because the rcu_flipctr per-CPU
> > 	variable had not yet been incremented, an RCU updater would
> > 	be within its rights to assume that there were no RCU reads
> > 	in progress, thus possibly yanking a data structure out from
> > 	under the reader in the SMI/NMI function.
> > 
> > 	Fatal outcome.  Note that only one CPU is involved here
> > 	because these are all either per-CPU or per-task variables.
> 
> Ok, so you don't care about CPU re-ordering. Still, I should let you know
> that your ORDERED_WRT_IRQ() -- bad name, btw -- is still buggy. What you
> want is a full compiler optimization barrier().

No.  See above.

> [ Your code probably works now, and emits correct code, but that's
>   just because gcc did what it did. Nothing in any standard,
>   or in any documented behaviour of gcc, or anything about the real
>   (or expected) semantics of "volatile" is protecting the code here. ]

Really?  Why doesn't the prohibition against moving volatile accesses
across sequence points take care of this?

> > o	If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1"
> > 	was ordered by the compiler to follow the
> > 	"ORDERED_WRT_IRQ(me->rcu_flipctr_idx) = idx", and an NMI/SMI
> > 	happened between the two, then an __rcu_read_lock() in the NMI/SMI
> > 	would incorrectly take the "else" clause of the enclosing "if"
> > 	statement.  If some other CPU flipped the rcu_ctrlblk.completed
> > 	in the meantime, then the __rcu_read_lock() would (correctly)
> > 	write the new value into rcu_flipctr_idx.
> > 
> > 	Well and good so far.  But the problem arises in
> > 	__rcu_read_unlock(), which then decrements the wrong counter.
> > 	Depending on exactly how subsequent events played out, this could
> > 	result in either prematurely ending grace periods or never-ending
> > 	grace periods, both of which are fatal outcomes.
> > 
> > And the following are not needed in the current version of the
> > patch, but will be in a future version that either avoids disabling
> > irqs or that dispenses with the smp_read_barrier_depends() that I
> > have 99% convinced myself is unneeded:
> > 
> > o	nesting = ORDERED_WRT_IRQ(me->rcu_read_lock_nesting);
> > 
> > o	idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed) & 0x1;
> > 
> > Furthermore, in that future version, irq handlers can cause the same
> > mischief that SMI/NMI handlers can in this version.
> > 
> > Next, looking at __rcu_read_unlock():
> > 
> > o	If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting - 1"
> > 	was reordered by the compiler to follow the
> > 	"ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])--",
> > 	then if an NMI/SMI containing an rcu_read_lock() occurs between
> > 	the two, this nested rcu_read_lock() would incorrectly believe
> > 	that it was protected by an enclosing RCU read-side critical
> > 	section as described in the first reversal discussed for
> > 	__rcu_read_lock() above.  Again, fatal outcome.
> > 
> > This is what we have now.  It is not hard to imagine situations that
> > interact with -both- interrupt handlers -and- other CPUs, as described
> > earlier.
> 
> It's not about interrupt/SMI/NMI handlers at all! What you clearly want,
> simply put, is that a certain stream of C statements must be emitted
> by the compiler _as they are_ with no re-ordering optimizations! You must
> *definitely* use barrier(), IMHO.

Almost.  I don't care about most of the operations, only about the loads
and stores marked volatile.  Again, although the compiler is free to
reorder volatile accesses that occur -within- a single statement, it
is prohibited by the standard from moving volatile accesses from one
statement to another.  Therefore, this code can legitimately use volatile.

Or am I missing something subtle?

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 12:39                                                           ` Nick Piggin
  2007-08-17 13:36                                                             ` Satyam Sharma
@ 2007-08-17 16:48                                                             ` Linus Torvalds
  2007-08-17 18:50                                                               ` Chris Friesen
                                                                                 ` (2 more replies)
  1 sibling, 3 replies; 1546+ messages in thread
From: Linus Torvalds @ 2007-08-17 16:48 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Satyam Sharma, Herbert Xu, Paul Mackerras, Christoph Lameter,
	Chris Snook, Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher



On Fri, 17 Aug 2007, Nick Piggin wrote:
> 
> That's not obviously just taste to me. Not when the primitive has many
> (perhaps, the majority) of uses that do not require said barriers. And
> this is not solely about the code generation (which, as Paul says, is
> relatively minor even on x86). I prefer people to think explicitly
> about barriers in their lockless code.

Indeed.

I think the important issues are:

 - "volatile" itself is simply a badly/weakly defined issue. The semantics 
   of it as far as the compiler is concerned are really not very good, and 
   in practice tend to boil down to "I will generate such bad code that 
   nobody can accuse me of optimizing anything away".

 - "volatile" - regardless of how well or badly defined it is - is purely 
   a compiler thing. It has absolutely no meaning for the CPU itself, so 
   it at no point implies any CPU barriers. As a result, even if the 
   compiler generates crap code and doesn't re-order anything, there's 
   nothing that says what the CPU will do.

 - in other words, the *only* possible meaning for "volatile" is a purely 
   single-CPU meaning. And if you only have a single CPU involved in the 
   process, the "volatile" is by definition pointless (because even 
   without a volatile, the compiler is required to make the C code appear 
   consistent as far as a single CPU is concerned).

So, let's take the example *buggy* code where we use "volatile" to wait 
for other CPUs:

	atomic_set(&var, 0);
	while (!atomic_read(&var))
		/* nothing */;


which generates an endless loop if we don't have atomic_read() imply 
volatile.

The point here is that it's buggy whether the volatile is there or not! 
Exactly because the user expects multi-processing behaviour, but 
"volatile" doesn't actually give any real guarantees about it. Another CPU 
may have done:

	external_ptr = kmalloc(..);
	/* Setup is now complete, inform the waiter */
	atomic_inc(&var);

but the fact is, since the other CPU isn't serialized in any way, the 
"while-loop" (even in the presence of "volatile") doesn't actually work 
right! Whatever the "atomic_read()" was waiting for may not have 
completed, because we have no barriers!
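
[Editorial sketch: what a *correct* version of this handshake needs is a
write/read barrier pairing between the two CPUs.  C11 atomics are used
below only so the example compiles standalone in userspace; the kernel
equivalent (especially in 2007) would be smp_wmb()/smp_rmb() or proper
locking, not these calls.]

```c
#include <stdatomic.h>

static atomic_int var;
static int payload;

/* Writer: publish the data, then set the flag with release ordering so
 * the payload store cannot pass the flag store. */
void producer(void)
{
	payload = 42;
	atomic_store_explicit(&var, 1, memory_order_release);
}

/* Reader: spin on the flag with acquire ordering, pairing with the
 * release above; afterwards the payload is guaranteed visible. */
int consumer(void)
{
	while (!atomic_load_explicit(&var, memory_order_acquire))
		/* cpu_relax() here in kernel code */;
	return payload;
}
```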

So if "volatile" makes a difference, it is invariably a sign of a bug in 
serialization (the one exception is for IO - we use "volatile" to avoid 
having to use inline asm for IO on x86 - and for "random values" like 
jiffies).

So the question should *not* be whether "volatile" actually fixes bugs. It 
*never* fixes a bug. But what it can do is to hide the obvious ones. In 
other words, adding a volatile in the above kind of situation of 
"atomic_read()" will certainly turn an obvious bug into something that 
works "practically all of the time".

So anybody who argues for "volatile" fixing bugs is fundamentally 
incorrect. It does NO SUCH THING. By arguing that, such people only show 
that they have no idea what they are talking about.

So the only reason to add back "volatile" to the atomic_read() sequence is 
not to fix bugs, but to _hide_ the bugs better. They're still there, they 
are just a lot harder to trigger, and tend to be a lot subtler.

And hey, sometimes "hiding bugs well enough" is ok. In this case, I'd 
argue that we've successfully *not* had the volatile there for eight 
months on x86-64, and that should tell people something. 

(Does _removing_ the volatile fix bugs? No - callers still need to think 
about barriers etc, and lots of people don't. So I'm not claiming that 
removing volatile fixes any bugs either, but I *am* claiming that:

 - removing volatile makes some bugs easier to see (which is mostly a good 
   thing: they were there before, anyway).

 - removing volatile generates better code (which is a good thing, even if 
   it's just 0.1%)

 - removing volatile removes a huge mental *bug* that lots of people seem 
   to have, as shown by this whole thread. Anybody who thinks that 
   "volatile" actually fixes anything has a gaping hole in their head, and 
   we should remove volatile just to make sure that nobody thinks that it 
   means something that it doesn't mean!

In other words, this whole discussion has just convinced me that we should 
*not* add back "volatile" to "atomic_read()" - I was willing to do it for 
practical and "hide the bugs" reasons, but having seen people argue for 
it, thinking that it actually fixes something, I'm now convinced that the 
*last* thing we should do is to encourage that kind of superstitious 
thinking.

"volatile" is like a black cat crossing the road. Sure, it affects 
*something* (at a minimum: before, the black cat was on one side of the 
road, afterwards it is on the other side of the road), but it has no 
bigger and longer-lasting direct effects. 

People who think "volatile" really matters are just fooling themselves.

		Linus


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  2:19                   ` Nick Piggin
  2007-08-17  3:16                     ` Paul Mackerras
@ 2007-08-17 17:37                     ` Segher Boessenkool
  1 sibling, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-17 17:37 UTC (permalink / raw)
  To: Nick Piggin
  Cc: heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
	akpm, torvalds, jesper.juhl, linux-arch, zlynx, satyam, clameter,
	schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang

>>>>>> Part of the motivation here is to fix heisenbugs.  If I knew 
>>>>>> where they
>>>>>
>>>>> By the same token we should probably disable optimisations
>>>>> altogether since that too can create heisenbugs.
>>>>
>>>> Almost everything is a tradeoff; and so is this.  I don't
>>>> believe most people would find disabling all compiler
>>>> optimisations an acceptable price to pay for some peace
>>>> of mind.
>>>
>>>
>>> So why is this a good tradeoff?
>> It certainly is better than disabling all compiler optimisations!
>
> It's easy to be better than something really stupid :)

Sure, it wasn't me who made the comparison though.

> So i386 and x86-64 don't have volatiles there, and it saves them a
> few K of kernel text.

Which has to be investigated.  A few kB is a lot more than expected.

> What you need to justify is why it is a good
> tradeoff to make them volatile (which btw, is much harder to go
> the other way after we let people make those assumptions).

My point is that people *already* made those assumptions.  There
are two ways to clean up this mess:

1) Have the "volatile" semantics by default, change the users
    that don't need it;
2) Have "non-volatile" semantics by default, change the users
    that do need it.

Option 2) randomly breaks stuff all over the place, option 1)
doesn't.  Yeah 1) could cause some extremely minor speed or
code size regression, but only temporarily until everything has
been audited.

>>> I also think that just adding things to APIs in the hope it might fix
>>> up some bugs isn't really a good road to go down. Where do you stop?
>> I look at it the other way: keeping the "volatile" semantics in
>> atomic_XXX() (or adding them to it, whatever) helps _prevent_ bugs;
>
> Yeah, but we could add lots of things to help prevent bugs and
> would never be included. I would also contend that it helps _hide_
> bugs and encourages people to be lazy when thinking about these
> things.

Sure.  We aren't _adding_ anything here though, not on the platforms
where it is most likely to show up, anyway.

> Also, you dismiss the fact that we'd actually be *adding* volatile
> semantics back to the 2 most widely tested architectures (in terms
> of test time, number of testers, variety of configurations, and
> coverage of driver code).

I'm not dismissing that.  x86 however is one of the few architectures
where mistakenly leaving out a "volatile" will not easily show up on
user testing, since the compiler will very often produce a memory
reference anyway because it has no registers to play with.

> This is a very important different from
> just keeping volatile semantics because it is basically a one-way
> API change.

That's a good point.  Maybe we should create _two_ new APIs, one
explicitly going each way.
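
[Editorial sketch of what such a pair might look like.  Neither accessor
name exists in the kernel, and atomic_t is re-declared here so the
fragment stands alone; this only illustrates the proposal.]

```c
/* Hypothetical accessors: one explicitly volatile, one explicitly not. */
typedef struct { int counter; } atomic_t;

/* Forces a real load from memory on every call. */
static int atomic_read_volatile(atomic_t *v)
{
	return *(volatile int *)&v->counter;
}

/* Plain access: the compiler may cache or reorder it freely. */
static int atomic_read_plain(atomic_t *v)
{
	return v->counter;
}
```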

>> certainly most people expect that behaviour, and also that behaviour
>> is *needed* in some places and no other interface provides that
>> functionality.
>
> I don't know that most people would expect that behaviour.

I didn't conduct any formal poll either :-)

> Is there any documentation anywhere that would suggest this?

Not really I think, no.  But not the other way around, either.
Most uses of it seem to expect it though.

>> [some confusion about barriers wrt atomics snipped]
>
> What were you confused about?

Me?  Not much.


Segher



* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-16 20:20                                       ` Christoph Lameter
  2007-08-17  1:02                                         ` Paul E. McKenney
  2007-08-17  2:16                                         ` Paul Mackerras
@ 2007-08-17 17:41                                         ` Segher Boessenkool
  2007-08-17 18:38                                           ` Satyam Sharma
  2007-09-10 18:59                                           ` Christoph Lameter
  2 siblings, 2 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-17 17:41 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul Mackerras, heiko.carstens, horms, Stefan Richter,
	Satyam Sharma, Linux Kernel Mailing List, David Miller,
	Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
	Netdev, jesper.juhl, linux-arch, Andrew Morton, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
	wjiang

>> atomic_dec() already has volatile behavior everywhere, so this is
>> semantically okay, but this code (and any like it) should be calling
>> cpu_relax() each iteration through the loop, unless there's a
>> compelling reason not to.  I'll allow that for some hardware drivers
>> (possibly this one) such a compelling reason may exist, but
>> hardware-independent core subsystems probably have no excuse.
>
> No it does not have any volatile semantics. atomic_dec() can be
> reordered at will by the compiler within the current basic unit if you
> do not add a barrier.

"volatile" has nothing to do with reordering.  atomic_dec() writes
to memory, so it _does_ have "volatile semantics", implicitly, as
long as the compiler cannot optimise the atomic variable away
completely -- any store counts as a side effect.


Segher



* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 14:31                                                     ` Paul E. McKenney
@ 2007-08-17 18:31                                                       ` Satyam Sharma
  2007-08-17 18:56                                                         ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17 18:31 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Herbert Xu, Stefan Richter, Paul Mackerras, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher



On Fri, 17 Aug 2007, Paul E. McKenney wrote:

> On Fri, Aug 17, 2007 at 01:09:08PM +0530, Satyam Sharma wrote:
> > 
> > On Thu, 16 Aug 2007, Paul E. McKenney wrote:
> > 
> > > On Fri, Aug 17, 2007 at 07:59:02AM +0800, Herbert Xu wrote:
> > > > 
> > > > First of all, I think this illustrates that what you want
> > > > here has nothing to do with atomic ops.  The ORDERED_WRT_IRQ
> > > > macro occurs a lot more times in your patch than atomic
> > > > reads/sets.  So *assuming* that it was necessary at all,
> > > > then having an ordered variant of the atomic_read/atomic_set
> > > > ops could do just as well.
> > > 
> > > Indeed.  If I could trust atomic_read()/atomic_set() to cause the compiler
> > > to maintain ordering, then I could just use them instead of having to
> > > create an  ORDERED_WRT_IRQ().  (Or ACCESS_ONCE(), as it is called in a
> > > different patch.)
> > 
> > +#define WHATEVER(x)	(*(volatile typeof(x) *)&(x))
> > [...]
> > Also, this gives *zero* "re-ordering" guarantees that your code wants
> > (as you've explained it below) -- neither w.r.t. CPU re-ordering (which
> > probably you don't care about) *nor* w.r.t. compiler re-ordering
> > (which you definitely _do_ care about).
> 
> You are correct about CPU re-ordering (and about the fact that this
> example doesn't care about it), but not about compiler re-ordering.
> 
> The compiler is prohibited from moving a volatile access across a sequence
> point.  One example of a sequence point is a statement boundary.  Because
> all of the volatile accesses in this code are separated by statement
> boundaries, a conforming compiler is prohibited from reordering them.

Yes, you're right, and I believe precisely this was discussed elsewhere
as well today.

But I'd call attention to what Herbert mentioned there. You're using
ORDERED_WRT_IRQ() on stuff that is _not_ defined to be an atomic_t at all:

* Member "completed" of struct rcu_ctrlblk is a long.
* Per-cpu variable rcu_flipctr is an array of ints.
* Members "rcu_read_lock_nesting" and "rcu_flipctr_idx" of
  struct task_struct are ints.

So are you saying you're "having to use" this volatile-access macro
because you *couldn't* declare all the above as atomic_t and thus just
expect the right thing to happen by using the atomic ops API by default,
because it lacks volatile access semantics (on x86)?

If so, then I wonder if using the volatile access cast is really the
best way to achieve (at least in terms of code clarity) the kind of
re-ordering guarantees it wants there. (there could be alternative
solutions, such as using barrier(), or that at bottom of this mail)

What I mean is this: If you switch to atomic_t, and x86 switched to
make atomic_t have "volatile" semantics by default, the statements
would be simply a string of: atomic_inc(), atomic_add(), atomic_set(),
and atomic_read() statements, and nothing in there that clearly makes
it *explicit* that the code is correct (and not buggy) simply because
of the re-ordering guarantees that the C "volatile" type-qualifier
keyword gives us as per the standard. But now we're firmly in
"subjective" territory, so you or anybody could legitimately disagree.


> > > Suppose I tried replacing the ORDERED_WRT_IRQ() calls with
> > > atomic_read() and atomic_set().  Starting with __rcu_read_lock():
> > > 
> > > o	If "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++"
> > > 	was ordered by the compiler after
> > > 	"ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1", then
> > > 	suppose an NMI/SMI happened after the rcu_read_lock_nesting but
> > > 	before the rcu_flipctr.
> > > 
> > > 	Then if there was an rcu_read_lock() in the SMI/NMI
> > > 	handler (which is perfectly legal), the nested rcu_read_lock()
> > > 	would believe that it could take the then-clause of the
> > > 	enclosing "if" statement.  But because the rcu_flipctr per-CPU
> > > 	variable had not yet been incremented, an RCU updater would
> > > 	be within its rights to assume that there were no RCU reads
> > > 	in progress, thus possibly yanking a data structure out from
> > > 	under the reader in the SMI/NMI function.
> > > 
> > > 	Fatal outcome.  Note that only one CPU is involved here
> > > 	because these are all either per-CPU or per-task variables.
> > 
> > Ok, so you don't care about CPU re-ordering. Still, I should let you know
> > that your ORDERED_WRT_IRQ() -- bad name, btw -- is still buggy. What you
> > want is a full compiler optimization barrier().
> 
> No.  See above.

True, *(volatile foo *)& _will_ work for this case.

But multiple calls to barrier() (granted, would invalidate all other
optimizations also) would work as well, would it not?

[ Interestingly, if you declared all those objects mentioned earlier as
  atomic_t, and x86(-64) switched to an __asm__ __volatile__ based variant
  for atomic_{read,set}_volatile(), the bugs you want to avoid would still
  be there. "volatile" the C language type-qualifier does have compiler
  re-ordering semantics you mentioned earlier, but the "volatile" that
  applies to inline asm()s gives no re-ordering guarantees. ]


> > > o	If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1"
> > > 	was ordered by the compiler to follow the
> > > 	"ORDERED_WRT_IRQ(me->rcu_flipctr_idx) = idx", and an NMI/SMI
> > > 	happened between the two, then an __rcu_read_lock() in the NMI/SMI
> > > 	would incorrectly take the "else" clause of the enclosing "if"
> > > 	statement.  If some other CPU flipped the rcu_ctrlblk.completed
> > > 	in the meantime, then the __rcu_read_lock() would (correctly)
> > > 	write the new value into rcu_flipctr_idx.
> > > 
> > > 	Well and good so far.  But the problem arises in
> > > 	__rcu_read_unlock(), which then decrements the wrong counter.
> > > 	Depending on exactly how subsequent events played out, this could
> > > 	result in either prematurely ending grace periods or never-ending
> > > 	grace periods, both of which are fatal outcomes.
> > > 
> > > And the following are not needed in the current version of the
> > > patch, but will be in a future version that either avoids disabling
> > > irqs or that dispenses with the smp_read_barrier_depends() that I
> > > have 99% convinced myself is unneeded:
> > > 
> > > o	nesting = ORDERED_WRT_IRQ(me->rcu_read_lock_nesting);
> > > 
> > > o	idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed) & 0x1;
> > > 
> > > Furthermore, in that future version, irq handlers can cause the same
> > > mischief that SMI/NMI handlers can in this version.

So don't remove the local_irq_save/restore, which is well-established and
well-understood for such cases (it doesn't help you with SMI/NMI,
admittedly). This isn't really about RCU or per-cpu vars as such, it's
just about racy code where you don't want to get hit by a concurrent
interrupt (it does turn out that doing things in a _particular order_ will
not cause fatal/buggy behaviour, but it's still a race issue, after all).


> > > Next, looking at __rcu_read_unlock():
> > > 
> > > o	If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting - 1"
> > > 	was reordered by the compiler to follow the
> > > 	"ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])--",
> > > 	then if an NMI/SMI containing an rcu_read_lock() occurs between
> > > 	the two, this nested rcu_read_lock() would incorrectly believe
> > > 	that it was protected by an enclosing RCU read-side critical
> > > 	section as described in the first reversal discussed for
> > > 	__rcu_read_lock() above.  Again, fatal outcome.
> > > 
> > > This is what we have now.  It is not hard to imagine situations that
> > > interact with -both- interrupt handlers -and- other CPUs, as described
> > > earlier.

Unless somebody's going for a lockless implementation, such situations
normally use spin_lock_irqsave() based locking (or local_irq_save for
those who care only for current CPU) -- problem with the patch in question,
is that you want to prevent races with concurrent SMI/NMIs as well, which
is not something that a lot of code needs to consider.

[ Curiously, another thread is discussing something similar also:
  http://lkml.org/lkml/2007/8/15/393 "RFC: do get_rtc_time() correctly" ]

Anyway, I didn't look at the code in that patch very much in detail, but
why couldn't you implement some kind of synchronization variable that lets
rcu_read_lock() or rcu_read_unlock() -- when being called from inside an
NMI or SMI handler -- know that it has concurrently interrupted an ongoing
rcu_read_{un}lock() and so must do things differently ... (?)

I'm also wondering if there's other code that's not using locking in the
kernel that faces similar issues, and what they've done to deal with it
(if anything). Such bugs would be subtle, and difficult to diagnose.


Satyam


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 17:41                                         ` Segher Boessenkool
@ 2007-08-17 18:38                                           ` Satyam Sharma
  2007-08-17 23:17                                             ` Segher Boessenkool
  2007-09-10 18:59                                           ` Christoph Lameter
  1 sibling, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17 18:38 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
	Stefan Richter, Linux Kernel Mailing List, David Miller,
	Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
	Netdev, jesper.juhl, linux-arch, Andrew Morton, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
	wjiang



On Fri, 17 Aug 2007, Segher Boessenkool wrote:

> > > atomic_dec() already has volatile behavior everywhere, so this is
> > > semantically okay, but this code (and any like it) should be calling
> > > cpu_relax() each iteration through the loop, unless there's a
> > > compelling reason not to.  I'll allow that for some hardware drivers
> > > (possibly this one) such a compelling reason may exist, but
> > > hardware-independent core subsystems probably have no excuse.
> > 
> > No it does not have any volatile semantics. atomic_dec() can be reordered
> > at will by the compiler within the current basic unit if you do not add a
> > barrier.
> 
> "volatile" has nothing to do with reordering.

If you're talking of "volatile" the type-qualifier keyword, then
http://lkml.org/lkml/2007/8/16/231 (and sub-thread below it) shows
otherwise.

> atomic_dec() writes
> to memory, so it _does_ have "volatile semantics", implicitly, as
> long as the compiler cannot optimise the atomic variable away
> completely -- any store counts as a side effect.

I don't think an atomic_dec() implemented as an inline "asm volatile"
or one that uses a "forget" macro would have the same re-ordering
guarantees as an atomic_dec() that uses a volatile access cast.
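
[Editorial sketch: one form of the "forget" macro that has circulated on
the list.  It is GCC-specific: the empty asm claims to write the variable,
so the compiler must discard any cached copy and reload it on the next
use.  It is a compiler-only hint, not a CPU barrier, and the stored value
itself is untouched.]

```c
/* GCC-specific: force the compiler to forget its cached copy of x. */
#define forget(x) __asm__ __volatile__("" : "=m"(x))

static int counter;

static int read_fresh(void)
{
	forget(counter);	/* compiler must reload counter from memory */
	return counter;
}
```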


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 16:48                                                             ` Linus Torvalds
@ 2007-08-17 18:50                                                               ` Chris Friesen
  2007-08-17 18:54                                                                 ` Arjan van de Ven
  2007-08-17 19:08                                                                 ` Linus Torvalds
  2007-08-20 13:15                                                               ` Chris Snook
  2007-09-09 18:02                                                               ` Denys Vlasenko
  2 siblings, 2 replies; 1546+ messages in thread
From: Chris Friesen @ 2007-08-17 18:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
	Christoph Lameter, Chris Snook, Ilpo Jarvinen, Paul E. McKenney,
	Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
	Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
	wensong, horms, wjiang, zlynx, rpjday, jesper.juhl, segher

Linus Torvalds wrote:

>  - in other words, the *only* possible meaning for "volatile" is a purely 
>    single-CPU meaning. And if you only have a single CPU involved in the 
>    process, the "volatile" is by definition pointless (because even 
>    without a volatile, the compiler is required to make the C code appear 
>    consistent as far as a single CPU is concerned).

I assume you mean "except for IO-related code and 'random' values like 
jiffies" as you mention later on?  I assume other values set in 
interrupt handlers would count as "random" from a volatility perspective?

> So anybody who argues for "volatile" fixing bugs is fundamentally 
> incorrect. It does NO SUCH THING. By arguing that, such people only show 
> that you have no idea what they are talking about.

What about reading values modified in interrupt handlers, as in your 
"random" case?  Or is this a bug where the user of atomic_read() is 
invalidly expecting a read each time it is called?

Chris


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 18:50                                                               ` Chris Friesen
@ 2007-08-17 18:54                                                                 ` Arjan van de Ven
  2007-08-17 19:49                                                                   ` Paul E. McKenney
  2007-08-17 19:08                                                                 ` Linus Torvalds
  1 sibling, 1 reply; 1546+ messages in thread
From: Arjan van de Ven @ 2007-08-17 18:54 UTC (permalink / raw)
  To: Chris Friesen
  Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Herbert Xu,
	Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
	Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
	David Miller, schwidefsky, wensong, horms, wjiang, zlynx, rpjday,
	jesper.juhl, segher


On Fri, 2007-08-17 at 12:50 -0600, Chris Friesen wrote:
> Linus Torvalds wrote:
> 
> >  - in other words, the *only* possible meaning for "volatile" is a purely 
> >    single-CPU meaning. And if you only have a single CPU involved in the 
> >    process, the "volatile" is by definition pointless (because even 
> >    without a volatile, the compiler is required to make the C code appear 
> >    consistent as far as a single CPU is concerned).
> 
> I assume you mean "except for IO-related code and 'random' values like 
> jiffies" as you mention later on?  I assume other values set in 
> interrupt handlers would count as "random" from a volatility perspective?
> 
> > So anybody who argues for "volatile" fixing bugs is fundamentally 
> > incorrect. It does NO SUCH THING. By arguing that, such people only show 
> > that you have no idea what they are talking about.
> 
> What about reading values modified in interrupt handlers, as in your 
> "random" case?  Or is this a bug where the user of atomic_read() is 
> invalidly expecting a read each time it is called?

the interrupt handler case is an SMP case since you do not know
beforehand what cpu your interrupt handler will run on.




^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 18:31                                                       ` Satyam Sharma
@ 2007-08-17 18:56                                                         ` Paul E. McKenney
  0 siblings, 0 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-17 18:56 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Herbert Xu, Stefan Richter, Paul Mackerras, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher

On Sat, Aug 18, 2007 at 12:01:38AM +0530, Satyam Sharma wrote:
> 
> 
> On Fri, 17 Aug 2007, Paul E. McKenney wrote:
> 
> > On Fri, Aug 17, 2007 at 01:09:08PM +0530, Satyam Sharma wrote:
> > > 
> > > On Thu, 16 Aug 2007, Paul E. McKenney wrote:
> > > 
> > > > On Fri, Aug 17, 2007 at 07:59:02AM +0800, Herbert Xu wrote:
> > > > > 
> > > > > First of all, I think this illustrates that what you want
> > > > > here has nothing to do with atomic ops.  The ORDERED_WRT_IRQ
> > > > > macro occurs a lot more times in your patch than atomic
> > > > > reads/sets.  So *assuming* that it was necessary at all,
> > > > > then having an ordered variant of the atomic_read/atomic_set
> > > > > ops could do just as well.
> > > > 
> > > > Indeed.  If I could trust atomic_read()/atomic_set() to cause the compiler
> > > > to maintain ordering, then I could just use them instead of having to
> > > > create an  ORDERED_WRT_IRQ().  (Or ACCESS_ONCE(), as it is called in a
> > > > different patch.)
> > > 
> > > +#define WHATEVER(x)	(*(volatile typeof(x) *)&(x))
> > > [...]
> > > Also, this gives *zero* "re-ordering" guarantees that your code wants
> > > as you've explained it below) -- neither w.r.t. CPU re-ordering (which
> > > probably you don't care about) *nor* w.r.t. compiler re-ordering
> > > (which you definitely _do_ care about).
> > 
> > You are correct about CPU re-ordering (and about the fact that this
> > example doesn't care about it), but not about compiler re-ordering.
> > 
> > The compiler is prohibited from moving a volatile access across a sequence
> > point.  One example of a sequence point is a statement boundary.  Because
> > all of the volatile accesses in this code are separated by statement
> > boundaries, a conforming compiler is prohibited from reordering them.
> 
> Yes, you're right, and I believe precisely this was discussed elsewhere
> as well today.
> 
> But I'd call attention to what Herbert mentioned there. You're using
> ORDERED_WRT_IRQ() on stuff that is _not_ defined to be an atomic_t at all:
> 
> * Member "completed" of struct rcu_ctrlblk is a long.
> * Per-cpu variable rcu_flipctr is an array of ints.
> * Members "rcu_read_lock_nesting" and "rcu_flipctr_idx" of
>   struct task_struct are ints.
> 
> So are you saying you're "having to use" this volatile-access macro
> because you *couldn't* declare all the above as atomic_t and thus just
> expect the right thing to happen by using the atomic ops API by default,
> because it lacks volatile access semantics (on x86)?
> 
> If so, then I wonder if using the volatile access cast is really the
> best way to achieve (at least in terms of code clarity) the kind of
> re-ordering guarantees it wants there. (there could be alternative
> solutions, such as using barrier(), or that at bottom of this mail)
> 
> What I mean is this: If you switch to atomic_t, and x86 switched to
> make atomic_t have "volatile" semantics by default, the statements
> would be simply a string of: atomic_inc(), atomic_add(), atomic_set(),
> and atomic_read() statements, and nothing in there that clearly makes
> it *explicit* that the code is correct (and not buggy) simply because
> of the re-ordering guarantees that the C "volatile" type-qualifier
> keyword gives us as per the standard. But now we're firmly in
> "subjective" territory, so you or anybody could legitimately disagree.

In any case, given Linus's note, it appears that atomic_read() and
atomic_set() won't consistently have volatile semantics, at least
not while the compiler generates such ugly code for volatile accesses.
So I will continue with my current approach.

In any case, I will not be using atomic_inc() or atomic_add() in this
code, as doing so would more than double the overhead, even on machines
that are the most efficient at implementing atomic operations.

> > > > Suppose I tried replacing the ORDERED_WRT_IRQ() calls with
> > > > atomic_read() and atomic_set().  Starting with __rcu_read_lock():
> > > > 
> > > > o	If "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++"
> > > > 	was ordered by the compiler after
> > > > 	"ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1", then
> > > > 	suppose an NMI/SMI happened after the rcu_read_lock_nesting but
> > > > 	before the rcu_flipctr.
> > > > 
> > > > 	Then if there was an rcu_read_lock() in the SMI/NMI
> > > > 	handler (which is perfectly legal), the nested rcu_read_lock()
> > > > 	would believe that it could take the then-clause of the
> > > > 	enclosing "if" statement.  But because the rcu_flipctr per-CPU
> > > > 	variable had not yet been incremented, an RCU updater would
> > > > 	be within its rights to assume that there were no RCU reads
> > > > 	in progress, thus possibly yanking a data structure out from
> > > > 	under the reader in the SMI/NMI function.
> > > > 
> > > > 	Fatal outcome.  Note that only one CPU is involved here
> > > > 	because these are all either per-CPU or per-task variables.
> > > 
> > > Ok, so you don't care about CPU re-ordering. Still, I should let you know
> > > that your ORDERED_WRT_IRQ() -- bad name, btw -- is still buggy. What you
> > > want is a full compiler optimization barrier().
> > 
> > No.  See above.
> 
> True, *(volatile foo *)& _will_ work for this case.
> 
> But multiple calls to barrier() (granted, would invalidate all other
> optimizations also) would work as well, would it not?

They work, but are a bit slower.  So they do work, but not as well.

> [ Interestingly, if you declared all those objects mentioned earlier as
>   atomic_t, and x86(-64) switched to an __asm__ __volatile__ based variant
>   for atomic_{read,set}_volatile(), the bugs you want to avoid would still
>   be there. "volatile" the C language type-qualifier does have compiler
>   re-ordering semantics you mentioned earlier, but the "volatile" that
>   applies to inline asm()s gives no re-ordering guarantees. ]

Well, that certainly would be a point in favor of "volatile" over inline
asms.  ;-)

> > > > o	If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1"
> > > > 	was ordered by the compiler to follow the
> > > > 	"ORDERED_WRT_IRQ(me->rcu_flipctr_idx) = idx", and an NMI/SMI
> > > > 	happened between the two, then an __rcu_read_lock() in the NMI/SMI
> > > > 	would incorrectly take the "else" clause of the enclosing "if"
> > > > 	statement.  If some other CPU flipped the rcu_ctrlblk.completed
> > > > 	in the meantime, then the __rcu_read_lock() would (correctly)
> > > > 	write the new value into rcu_flipctr_idx.
> > > > 
> > > > 	Well and good so far.  But the problem arises in
> > > > 	__rcu_read_unlock(), which then decrements the wrong counter.
> > > > 	Depending on exactly how subsequent events played out, this could
> > > > 	result in either prematurely ending grace periods or never-ending
> > > > 	grace periods, both of which are fatal outcomes.
> > > > 
> > > > And the following are not needed in the current version of the
> > > > patch, but will be in a future version that either avoids disabling
> > > > irqs or that dispenses with the smp_read_barrier_depends() that I
> > > > have 99% convinced myself is unneeded:
> > > > 
> > > > o	nesting = ORDERED_WRT_IRQ(me->rcu_read_lock_nesting);
> > > > 
> > > > o	idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed) & 0x1;
> > > > 
> > > > Furthermore, in that future version, irq handlers can cause the same
> > > > mischief that SMI/NMI handlers can in this version.
> 
> So don't remove the local_irq_save/restore, which is well-established and
> well-understood for such cases (it doesn't help you with SMI/NMI,
> admittedly). This isn't really about RCU or per-cpu vars as such, it's
> just about racy code where you don't want to get hit by a concurrent
> interrupt (it does turn out that doing things in a _particular order_ will
> not cause fatal/buggy behaviour, but it's still a race issue, after all).

The local_irq_save/restore are something like 30% of the overhead of
these two functions, so will be looking hard at getting rid of them.
Doing so allows the scheduling-clock interrupt to get into the mix,
and also allows preemption.  The goal would be to find some trick that
suppresses preemption, fends off the grace-period-computation code
invoked from the scheduling-clock interrupt, and otherwise keeps
things on an even keel.

> > > > Next, looking at __rcu_read_unlock():
> > > > 
> > > > o	If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting - 1"
> > > > 	was reordered by the compiler to follow the
> > > > 	"ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])--",
> > > > 	then if an NMI/SMI containing an rcu_read_lock() occurs between
> > > > 	the two, this nested rcu_read_lock() would incorrectly believe
> > > > 	that it was protected by an enclosing RCU read-side critical
> > > > 	section as described in the first reversal discussed for
> > > > 	__rcu_read_lock() above.  Again, fatal outcome.
> > > > 
> > > > This is what we have now.  It is not hard to imagine situations that
> > > > interact with -both- interrupt handlers -and- other CPUs, as described
> > > > earlier.
> 
> Unless somebody's going for a lockless implementation, such situations
> normally use spin_lock_irqsave() based locking (or local_irq_save for
> those who care only for current CPU) -- problem with the patch in question,
> is that you want to prevent races with concurrent SMI/NMIs as well, which
> is not something that a lot of code needs to consider.

Or that needs to resolve similar races with IRQs without disabling them.
One reason to avoid disabling IRQs is to avoid degrading scheduling
latency.  In any case, I do agree that the amount of code that must
worry about this is quite small at the moment.  I believe that it
will become more common, but would imagine that this belief might not
be universal.  Yet, anyway.  ;-)

> [ Curiously, another thread is discussing something similar also:
>   http://lkml.org/lkml/2007/8/15/393 "RFC: do get_rtc_time() correctly" ]
> 
> Anyway, I didn't look at the code in that patch very much in detail, but
> why couldn't you implement some kind of synchronization variable that lets
> rcu_read_lock() or rcu_read_unlock() -- when being called from inside an
> NMI or SMI handler -- know that it has concurrently interrupted an ongoing
> rcu_read_{un}lock() and so must do things differently ... (?)

Given some low-level details of the current implementation, I could
imagine manipulating rcu_read_lock_nesting on entry to and exit from
all NMI/SMI handlers, but would like to avoid that kind of architecture
dependency.  I am not confident of locating all of them, for one thing...

> I'm also wondering if there's other code that's not using locking in the
> kernel that faces similar issues, and what they've done to deal with it
> (if anything). Such bugs would be subtle, and difficult to diagnose.

Agreed!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 18:50                                                               ` Chris Friesen
  2007-08-17 18:54                                                                 ` Arjan van de Ven
@ 2007-08-17 19:08                                                                 ` Linus Torvalds
  1 sibling, 0 replies; 1546+ messages in thread
From: Linus Torvalds @ 2007-08-17 19:08 UTC (permalink / raw)
  To: Chris Friesen
  Cc: Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
	Christoph Lameter, Chris Snook, Ilpo Jarvinen, Paul E. McKenney,
	Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
	Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
	wensong, horms, wjiang, zlynx, rpjday, jesper.juhl, segher



On Fri, 17 Aug 2007, Chris Friesen wrote:
> 
> I assume you mean "except for IO-related code and 'random' values like
> jiffies" as you mention later on?

Yes. There *are* valid uses for "volatile", but they have remained the 
same for the last few years:
 - "jiffies"
 - internal per-architecture IO implementations that can do them as normal 
   stores.

> I assume other values set in interrupt handlers would count as "random" 
> from a volatility perspective?

I don't really see any valid case. I can imagine that you have your own 
"jiffy" counter in a driver, but what's the point, really? I'd suggest not 
using volatile, and using barriers instead.

> 
> > So anybody who argues for "volatile" fixing bugs is fundamentally 
> > incorrect. It does NO SUCH THING. By arguing that, such people only 
> > show that you have no idea what they are talking about.

> What about reading values modified in interrupt handlers, as in your 
> "random" case?  Or is this a bug where the user of atomic_read() is 
> invalidly expecting a read each time it is called?

Quite frankly, the biggest reason for using "volatile" on jiffies was 
really historic. So even the "random" case is not really a very strong 
one. You'll notice that anybody who is actually careful will be using 
sequence locks for the jiffy accesses, if only because the *full* jiffy 
count is actually a 64-bit value, and so you cannot get it atomically on a 
32-bit architecture even on a single CPU (ie a timer interrupt might 
happen in between reading the low and the high word, so "volatile" is only 
used for the low 32 bits).

So even for jiffies, we actually have:

	extern u64 __jiffy_data jiffies_64;
	extern unsigned long volatile __jiffy_data jiffies;

where the *real* jiffies is not volatile: the volatile one is using linker 
tricks to alias the low 32 bits:

 - arch/i386/kernel/vmlinux.lds.S:

	...
	jiffies = jiffies_64;
	...

and the only reason we do all these games is (a) it works and (b) it's 
legacy.

Note how I do *not* say "(c) it's a good idea".

			Linus


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 19:49                                                                   ` Paul E. McKenney
@ 2007-08-17 19:49                                                                     ` Arjan van de Ven
  2007-08-17 20:12                                                                       ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Arjan van de Ven @ 2007-08-17 19:49 UTC (permalink / raw)
  To: paulmck
  Cc: Chris Friesen, Linus Torvalds, Nick Piggin, Satyam Sharma,
	Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
	Ilpo Jarvinen, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
	David Miller, schwidefsky, wensong, horms, wjiang, zlynx, rpjday,
	jesper.juhl, segher


On Fri, 2007-08-17 at 12:49 -0700, Paul E. McKenney wrote:
> > > What about reading values modified in interrupt handlers, as in your 
> > > "random" case?  Or is this a bug where the user of atomic_read() is 
> > > invalidly expecting a read each time it is called?
> > 
> > the interrupt handler case is an SMP case since you do not know
> > beforehand what cpu your interrupt handler will run on.
> 
> With the exception of per-CPU variables, yes.

if you're spinning waiting for a per-CPU variable to get changed by an
interrupt handler... you have bigger problems than "volatile" ;-)

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 18:54                                                                 ` Arjan van de Ven
@ 2007-08-17 19:49                                                                   ` Paul E. McKenney
  2007-08-17 19:49                                                                     ` Arjan van de Ven
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-17 19:49 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Chris Friesen, Linus Torvalds, Nick Piggin, Satyam Sharma,
	Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
	Ilpo Jarvinen, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
	David Miller, schwidefsky, wensong, horms, wjiang, zlynx, rpjday,
	jesper.juhl, segher

On Fri, Aug 17, 2007 at 11:54:33AM -0700, Arjan van de Ven wrote:
> 
> On Fri, 2007-08-17 at 12:50 -0600, Chris Friesen wrote:
> > Linus Torvalds wrote:
> > 
> > >  - in other words, the *only* possible meaning for "volatile" is a purely 
> > >    single-CPU meaning. And if you only have a single CPU involved in the 
> > >    process, the "volatile" is by definition pointless (because even 
> > >    without a volatile, the compiler is required to make the C code appear 
> > >    consistent as far as a single CPU is concerned).
> > 
> > I assume you mean "except for IO-related code and 'random' values like 
> > jiffies" as you mention later on?  I assume other values set in 
> > interrupt handlers would count as "random" from a volatility perspective?
> > 
> > > So anybody who argues for "volatile" fixing bugs is fundamentally 
> > > incorrect. It does NO SUCH THING. By arguing that, such people only show 
> > > that you have no idea what they are talking about.
> > 
> > What about reading values modified in interrupt handlers, as in your 
> > "random" case?  Or is this a bug where the user of atomic_read() is 
> > invalidly expecting a read each time it is called?
> 
> the interrupt handler case is an SMP case since you do not know
> beforehand what cpu your interrupt handler will run on.

With the exception of per-CPU variables, yes.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 19:49                                                                     ` Arjan van de Ven
@ 2007-08-17 20:12                                                                       ` Paul E. McKenney
  0 siblings, 0 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-17 20:12 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Chris Friesen, Linus Torvalds, Nick Piggin, Satyam Sharma,
	Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
	Ilpo Jarvinen, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
	David Miller, schwidefsky, wensong, horms, wjiang, zlynx, rpjday,
	jesper.juhl, segher

On Fri, Aug 17, 2007 at 12:49:00PM -0700, Arjan van de Ven wrote:
> 
> On Fri, 2007-08-17 at 12:49 -0700, Paul E. McKenney wrote:
> > > > What about reading values modified in interrupt handlers, as in your 
> > > > "random" case?  Or is this a bug where the user of atomic_read() is 
> > > > invalidly expecting a read each time it is called?
> > > 
> > > the interrupt handler case is an SMP case since you do not know
> > > beforehand what cpu your interrupt handler will run on.
> > 
> > With the exception of per-CPU variables, yes.
> 
> if you're spinning waiting for a per-CPU variable to get changed by an
> interrupt handler... you have bigger problems than "volatile" ;-)

That would be true, if you were doing that.  But you might instead be
simply making sure that the mainline actions were seen in order by the
interrupt handler.  My current example is the NMI-safe rcu_read_lock()
implementation for realtime.  Not the common case, I will admit, but
still real.  ;-)

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  3:03                                           ` Linus Torvalds
  2007-08-17  3:43                                             ` Paul Mackerras
@ 2007-08-17 22:09                                             ` Segher Boessenkool
  1 sibling, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-17 22:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
	Stefan Richter, Satyam Sharma, Ilpo Järvinen,
	Linux Kernel Mailing List, David Miller, Paul E. McKenney, ak,
	Netdev, cfriesen, rpjday, jesper.juhl, linux-arch, Andrew Morton,
	zlynx, schwidefsky, Chris Snook, Herbert Xu, wensong, wjiang

> Of course, since *normal* accesses aren't necessarily limited wrt
> re-ordering, the question then becomes one of "with regard to *what* 
> does
> it limit re-ordering?".
>
> A C compiler that re-orders two different volatile accesses that have a
> sequence point in between them is pretty clearly a buggy compiler. So 
> at a
> minimum, it limits re-ordering wrt other volatiles (assuming sequence
> points exists). It also means that the compiler cannot move it
> speculatively across conditionals, but other than that it's starting to
> get fuzzy.

This is actually really well-defined in C, not fuzzy at all.
"Volatile accesses" are a side effect, and no side effects can
be reordered with respect to sequence points.  The side effects
that matter in the kernel environment are: 1) accessing a volatile
object; 2) modifying an object; 3) volatile asm(); 4) calling a
function that does any of these.

We certainly should avoid volatile whenever possible, but "because
it's fuzzy wrt reordering" is not a reason -- all alternatives have
exactly the same issues.


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  3:15                                               ` Nick Piggin
  2007-08-17  4:02                                                 ` Paul Mackerras
  2007-08-17  7:25                                                 ` Stefan Richter
@ 2007-08-17 22:14                                                 ` Segher Boessenkool
  2 siblings, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-17 22:14 UTC (permalink / raw)
  To: Nick Piggin
  Cc: paulmck, Christoph Lameter, Paul Mackerras, heiko.carstens,
	Stefan Richter, horms, Satyam Sharma, Linux Kernel Mailing List,
	rpjday, netdev, ak, cfriesen, jesper.juhl, linux-arch,
	Andrew Morton, zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
	Linus Torvalds, wensong, wjiang

> (and yes, it is perfectly legitimate to
> want a non-volatile read for a data type that you also want to do
> atomic RMW operations on)

...which is undefined behaviour in C (and GCC) when that data is
declared volatile, which is a good argument against implementing
atomics that way in itself.


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  3:42                       ` Linus Torvalds
                                           ` (3 preceding siblings ...)
  2007-08-17  8:52                         ` Andi Kleen
@ 2007-08-17 22:29                         ` Segher Boessenkool
  4 siblings, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-17 22:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Paul Mackerras, heiko.carstens, horms, linux-kernel, rpjday, ak,
	netdev, cfriesen, akpm, Nick Piggin, linux-arch, jesper.juhl,
	satyam, zlynx, clameter, schwidefsky, Chris Snook, Herbert Xu,
	davem, wensong, wjiang

> In a reasonable world, gcc should just make that be (on x86)
>
> 	addl $1,i(%rip)
>
> on x86-64, which is indeed what it does without the volatile. But with 
> the
> volatile, the compiler gets really nervous, and doesn't dare do it in 
> one
> instruction, and thus generates crap like
>
>         movl    i(%rip), %eax
>         addl    $1, %eax
>         movl    %eax, i(%rip)
>
> instead. For no good reason, except that "volatile" just doesn't have 
> any
> good/clear semantics for the compiler, so most compilers will just 
> make it
> be "I will not touch this access in any way, shape, or form". Including
> even trivially correct instruction optimization/combination.

It's just a (target-specific, perhaps) missed-optimisation kind
of bug in GCC.  Care to file a bug report?

> but is
> (again) something that gcc doesn't dare do, since "i" is volatile.

Just nobody taught it it can do this; perhaps no one wanted to
add optimisations like that, maybe with a reasoning like "people
who hit the go-slow-in-unspecified-ways button should get what
they deserve" ;-)


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  4:24               ` Satyam Sharma
@ 2007-08-17 22:34                 ` Segher Boessenkool
  0 siblings, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-17 22:34 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
	Bill Fink, Linux Kernel Mailing List, Paul E. McKenney, netdev,
	ak, cfriesen, rpjday, jesper.juhl, linux-arch, Andrew Morton,
	zlynx, davids, schwidefsky, Chris Snook, Herbert Xu, davem,
	Linus Torvalds, wensong, wjiang

> Now the second wording *IS* technically correct, but come on, it's
> 24 words long whereas the original one was 3 -- and hopefully anybody
> reading the shorter phrase *would* have known anyway what was meant,
> without having to be pedantic about it :-)

Well you were talking pretty formal (and detailed) stuff, so
IMHO it's good to have that exactly correct.  Sure it's nicer
to use small words most of the time :-)


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  4:32                 ` Satyam Sharma
@ 2007-08-17 22:38                   ` Segher Boessenkool
  2007-08-18 14:42                     ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-17 22:38 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
	Bill Fink, Linux Kernel Mailing List, Paul E. McKenney, netdev,
	ak, cfriesen, rpjday, jesper.juhl, linux-arch, Andrew Morton,
	zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
	Linus Torvalds, wensong, wjiang

>>> Here, I should obviously admit that the semantics of *(volatile int 
>>> *)&
>>> aren't any neater or well-defined in the _language standard_ at all. 
>>> The
>>> standard does say (verbatim) "precisely what constitutes an access to
>>> an object of volatile-qualified type is implementation-defined", but GCC
>>> does help us out here by doing the right thing.
>>
>> Where do you get that idea?
>
> Try a testcase (experimentally verify).

That doesn't prove anything.  Experiments can only disprove
things.

>> GCC manual, section 6.1, "When
>> is a Volatile Object Accessed?" doesn't say anything of the
>> kind.
>
> True, "implementation-defined" as per the C standard _is_ supposed to 
> mean
> "unspecified behaviour where each implementation documents how the 
> choice
> is made". So ok, probably GCC isn't "documenting" this
> implementation-defined behaviour which it is supposed to, but can't 
> really
> fault them much for this, probably.

GCC _is_ documenting this, namely in this section 6.1.  It doesn't
mention volatile-casted stuff.  Draw your own conclusions.


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  5:56                         ` Satyam Sharma
  2007-08-17  7:26                           ` Nick Piggin
@ 2007-08-17 22:49                           ` Segher Boessenkool
  2007-08-17 23:51                             ` Satyam Sharma
  1 sibling, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-17 22:49 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Paul Mackerras, heiko.carstens, horms, Linux Kernel Mailing List,
	rpjday, ak, netdev, cfriesen, Nick Piggin, linux-arch,
	jesper.juhl, Andrew Morton, zlynx, clameter, schwidefsky,
	Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang

> #define forget(a)	__asm__ __volatile__ ("" :"=m" (a) :"m" (a))
>
> [ This is exactly equivalent to using "+m" in the constraints, as 
> recently
>   explained on a GCC list somewhere, in response to the patch in my 
> bitops
>   series a few weeks back where I thought "+m" was bogus. ]

[It wasn't explained on a GCC list in response to your patch, as
far as I can see -- if I missed it, please point me to an archived
version of it].

One last time: it isn't equivalent on older (but still supported
by Linux) versions of GCC.  Current versions of GCC allow it, but
it has no documented behaviour at all, so use it at your own risk.


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 18:38                                           ` Satyam Sharma
@ 2007-08-17 23:17                                             ` Segher Boessenkool
  2007-08-17 23:55                                               ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-17 23:17 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
	Stefan Richter, Linux Kernel Mailing List, David Miller,
	Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
	Netdev, jesper.juhl, linux-arch, zlynx, Andrew Morton,
	schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
	wjiang

>>> No it does not have any volatile semantics. atomic_dec() can be 
>>> reordered
>>> at will by the compiler within the current basic unit if you do not 
>>> add a
>>> barrier.
>>
>> "volatile" has nothing to do with reordering.
>
> If you're talking of "volatile" the type-qualifier keyword, then
> http://lkml.org/lkml/2007/8/16/231 (and sub-thread below it) shows
> otherwise.

I'm not sure what in that mail you mean, but anyway...

Yes, of course, the fact that "volatile" creates a side effect
prevents certain things from being reordered wrt the atomic_dec();
but the atomic_dec() has a side effect *already* so the volatile
doesn't change anything.

>> atomic_dec() writes
>> to memory, so it _does_ have "volatile semantics", implicitly, as
>> long as the compiler cannot optimise the atomic variable away
>> completely -- any store counts as a side effect.
>
> I don't think an atomic_dec() implemented as an inline "asm volatile"
> or one that uses a "forget" macro would have the same re-ordering
> guarantees as an atomic_dec() that uses a volatile access cast.

The "asm volatile" implementation does have exactly the same
reordering guarantees as the "volatile cast" thing, if that is
implemented by GCC in the "obvious" way.  Even a "plain" asm()
will do the same.


Segher


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 22:49                           ` Segher Boessenkool
@ 2007-08-17 23:51                             ` Satyam Sharma
  2007-08-17 23:55                               ` Segher Boessenkool
  0 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17 23:51 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Paul Mackerras, heiko.carstens, horms, Linux Kernel Mailing List,
	rpjday, ak, netdev, cfriesen, Nick Piggin, linux-arch,
	jesper.juhl, Andrew Morton, zlynx, clameter, schwidefsky,
	Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang



On Sat, 18 Aug 2007, Segher Boessenkool wrote:

> > #define forget(a)	__asm__ __volatile__ ("" :"=m" (a) :"m" (a))
> > 
> > [ This is exactly equivalent to using "+m" in the constraints, as recently
> >   explained on a GCC list somewhere, in response to the patch in my bitops
> >   series a few weeks back where I thought "+m" was bogus. ]
> 
> [It wasn't explained on a GCC list in response to your patch, as
> far as I can see -- if I missed it, please point me to an archived
> version of it].

http://gcc.gnu.org/ml/gcc-patches/2007-07/msg01758.html

is a follow-up in the thread on the gcc-patches@gcc.gnu.org mailing list,
which began with:

http://gcc.gnu.org/ml/gcc-patches/2007-07/msg01677.html

that was posted by Jan Hubicka, as he quotes in that initial posting,
after I had submitted:

http://lkml.org/lkml/2007/7/23/252

which was a (wrong) patch to "rectify" what I thought was the "bogus"
"+m" constraint, as per the quoted extract from gcc docs (that was
given in that (wrong) patch's changelog).

That's when _I_ came to know how GCC interprets "+m", but probably
this has been explained on those lists multiple times. Who cares,
anyway?


> One last time: it isn't equivalent on older (but still supported
> by Linux) versions of GCC.  Current versions of GCC allow it, but
> it has no documented behaviour at all, so use it at your own risk.

True.

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 23:17                                             ` Segher Boessenkool
@ 2007-08-17 23:55                                               ` Satyam Sharma
  2007-08-18  0:04                                                 ` Segher Boessenkool
  0 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-17 23:55 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
	Stefan Richter, Linux Kernel Mailing List, David Miller,
	Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
	Netdev, jesper.juhl, linux-arch, zlynx, Andrew Morton,
	schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
	wjiang



On Sat, 18 Aug 2007, Segher Boessenkool wrote:

> > > > No it does not have any volatile semantics. atomic_dec() can be
> > > > reordered
> > > > at will by the compiler within the current basic unit if you do not add
> > > > a
> > > > barrier.
> > > 
> > > "volatile" has nothing to do with reordering.
> > 
> > If you're talking of "volatile" the type-qualifier keyword, then
> > http://lkml.org/lkml/2007/8/16/231 (and sub-thread below it) shows
> > otherwise.
> 
> I'm not sure what in that mail you mean, but anyway...
> 
> Yes, of course, the fact that "volatile" creates a side effect
> prevents certain things from being reordered wrt the atomic_dec();
> but the atomic_dec() has a side effect *already* so the volatile
> doesn't change anything.

That's precisely what that sub-thread (read down to the last mail
there, and not the first mail only) shows. So yes, "volatile" does
have something to do with re-ordering (as guaranteed by the C
standard).


> > > atomic_dec() writes
> > > to memory, so it _does_ have "volatile semantics", implicitly, as
> > > long as the compiler cannot optimise the atomic variable away
> > > completely -- any store counts as a side effect.
> > 
> > I don't think an atomic_dec() implemented as an inline "asm volatile"
> > or one that uses a "forget" macro would have the same re-ordering
> > guarantees as an atomic_dec() that uses a volatile access cast.
> 
> The "asm volatile" implementation does have exactly the same
> reordering guarantees as the "volatile cast" thing,

I don't think so.

> if that is
> implemented by GCC in the "obvious" way.  Even a "plain" asm()
> will do the same.

Read the relevant GCC documentation.

[ of course, if the (latest) GCC documentation is *yet again*
  wrong, then alright, not much I can do about it, is there. ]

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 23:51                             ` Satyam Sharma
@ 2007-08-17 23:55                               ` Segher Boessenkool
  0 siblings, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-17 23:55 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Paul Mackerras, heiko.carstens, horms, Linux Kernel Mailing List,
	rpjday, ak, netdev, cfriesen, Nick Piggin, linux-arch,
	jesper.juhl, Andrew Morton, zlynx, clameter, schwidefsky,
	Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang

>>> #define forget(a)	__asm__ __volatile__ ("" :"=m" (a) :"m" (a))
>>>
>>> [ This is exactly equivalent to using "+m" in the constraints, as 
>>> recently
>>>   explained on a GCC list somewhere, in response to the patch in my 
>>> bitops
>>>   series a few weeks back where I thought "+m" was bogus. ]
>>
>> [It wasn't explained on a GCC list in response to your patch, as
>> far as I can see -- if I missed it, please point me to an archived
>> version of it].
>
> http://gcc.gnu.org/ml/gcc-patches/2007-07/msg01758.html

Ah yes, that old thread, thank you.

> That's when _I_ came to know how GCC interprets "+m", but probably
> this has been explained on those lists multiple times. Who cares,
> anyway?

I just couldn't find the thread you meant; I thought I might have
missed it, that's all :-)


Segher


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17  3:50                         ` Linus Torvalds
@ 2007-08-17 23:59                           ` Paul E. McKenney
  2007-08-18  0:09                             ` Herbert Xu
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-17 23:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Nick Piggin, Paul Mackerras, Segher Boessenkool, heiko.carstens,
	horms, linux-kernel, rpjday, ak, netdev, cfriesen, akpm,
	jesper.juhl, linux-arch, zlynx, satyam, clameter, schwidefsky,
	Chris Snook, Herbert Xu, davem, wensong, wjiang

On Thu, Aug 16, 2007 at 08:50:30PM -0700, Linus Torvalds wrote:
> Just try it yourself:
> 
> 	volatile int i;
> 	int j;
> 
> 	int testme(void)
> 	{
> 	        return i <= 1;
> 	}
> 
> 	int testme2(void)
> 	{
> 	        return j <= 1;
> 	}
> 
> and compile with all the optimizations you can.
> 
> I get:
> 
> 	testme:
> 	        movl    i(%rip), %eax
> 	        subl    $1, %eax
> 	        setle   %al
> 	        movzbl  %al, %eax
> 	        ret
> 
> vs
> 
> 	testme2:
> 	        xorl    %eax, %eax
> 	        cmpl    $1, j(%rip)
> 	        setle   %al
> 	        ret
> 
> (now, whether that "xorl + setle" is better than "setle + movzbl", I don't 
> really know - maybe it is. But that's not the point. The point is the 
> difference between
> 
>                 movl    i(%rip), %eax
>                 subl    $1, %eax
> 
> and
> 
>                 cmpl    $1, j(%rip)

gcc bugzilla bug #33102, for whatever that ends up being worth.  ;-)

							Thanx, Paul

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 23:55                                               ` Satyam Sharma
@ 2007-08-18  0:04                                                 ` Segher Boessenkool
  2007-08-18  1:56                                                   ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-18  0:04 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
	Stefan Richter, Linux Kernel Mailing List, David Miller,
	Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
	Netdev, jesper.juhl, linux-arch, zlynx, Andrew Morton,
	schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
	wjiang

>>>> atomic_dec() writes
>>>> to memory, so it _does_ have "volatile semantics", implicitly, as
>>>> long as the compiler cannot optimise the atomic variable away
>>>> completely -- any store counts as a side effect.
>>>
>>> I don't think an atomic_dec() implemented as an inline "asm volatile"
>>> or one that uses a "forget" macro would have the same re-ordering
>>> guarantees as an atomic_dec() that uses a volatile access cast.
>>
>> The "asm volatile" implementation does have exactly the same
>> reordering guarantees as the "volatile cast" thing,
>
> I don't think so.

"asm volatile" creates a side effect.  Side effects aren't
allowed to be reordered wrt sequence points.  This is exactly
the same reason as why "volatile accesses" cannot be reordered.

>> if that is
>> implemented by GCC in the "obvious" way.  Even a "plain" asm()
>> will do the same.
>
> Read the relevant GCC documentation.

I did, yes.

> [ of course, if the (latest) GCC documentation is *yet again*
>   wrong, then alright, not much I can do about it, is there. ]

There was (and is) nothing wrong about the "+m" documentation, if
that is what you are talking about.  It could be extended now, to
allow "+m" -- but that takes more than just "fixing" the documentation.


Segher


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 23:59                           ` Paul E. McKenney
@ 2007-08-18  0:09                             ` Herbert Xu
  2007-08-18  1:08                               ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-18  0:09 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Linus Torvalds, Nick Piggin, Paul Mackerras, Segher Boessenkool,
	heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
	akpm, jesper.juhl, linux-arch, zlynx, satyam, clameter,
	schwidefsky, Chris Snook, davem, wensong, wjiang

On Fri, Aug 17, 2007 at 04:59:12PM -0700, Paul E. McKenney wrote:
>
> gcc bugzilla bug #33102, for whatever that ends up being worth.  ;-)

I had totally forgotten that I'd already filed that bug more
than six years ago until they just closed yours as a duplicate
of mine :)

Good luck in getting it fixed!

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-18  0:09                             ` Herbert Xu
@ 2007-08-18  1:08                               ` Paul E. McKenney
  2007-08-18  1:24                                 ` Christoph Lameter
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-18  1:08 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Linus Torvalds, Nick Piggin, Paul Mackerras, Segher Boessenkool,
	heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
	akpm, jesper.juhl, linux-arch, zlynx, satyam, clameter,
	schwidefsky, Chris Snook, davem, wensong, wjiang

On Sat, Aug 18, 2007 at 08:09:13AM +0800, Herbert Xu wrote:
> On Fri, Aug 17, 2007 at 04:59:12PM -0700, Paul E. McKenney wrote:
> >
> > gcc bugzilla bug #33102, for whatever that ends up being worth.  ;-)
> 
> I had totally forgotten that I'd already filed that bug more
> than six years ago until they just closed yours as a duplicate
> of mine :)
> 
> Good luck in getting it fixed!

Well, just got done re-opening it for the third time.  And a local
gcc community member advised me not to give up too easily.  But I
must admit that I am impressed with the speed with which it was
identified as a duplicate.

Should be entertaining!  ;-)

						Thanx, Paul

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-18  1:08                               ` Paul E. McKenney
@ 2007-08-18  1:24                                 ` Christoph Lameter
  2007-08-18  1:41                                   ` Satyam Sharma
                                                     ` (2 more replies)
  0 siblings, 3 replies; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-18  1:24 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Herbert Xu, Linus Torvalds, Nick Piggin, Paul Mackerras,
	Segher Boessenkool, heiko.carstens, horms, linux-kernel, rpjday,
	ak, netdev, cfriesen, akpm, jesper.juhl, linux-arch, zlynx,
	satyam, schwidefsky, Chris Snook, davem, wensong, wjiang

On Fri, 17 Aug 2007, Paul E. McKenney wrote:

> On Sat, Aug 18, 2007 at 08:09:13AM +0800, Herbert Xu wrote:
> > On Fri, Aug 17, 2007 at 04:59:12PM -0700, Paul E. McKenney wrote:
> > >
> > > gcc bugzilla bug #33102, for whatever that ends up being worth.  ;-)
> > 
> > I had totally forgotten that I'd already filed that bug more
> > than six years ago until they just closed yours as a duplicate
> > of mine :)
> > 
> > Good luck in getting it fixed!
> 
> Well, just got done re-opening it for the third time.  And a local
> gcc community member advised me not to give up too easily.  But I
> must admit that I am impressed with the speed that it was identified
> as duplicate.
> 
> Should be entertaining!  ;-)

Right. ROTFL... volatile actually breaks atomic_t instead of making it 
safe. x++ becomes a register load, increment and a register store. Without 
volatile we can increment the memory directly. It seems that volatile 
requires that the variable is loaded into a register first and then 
operated upon. Understandable when you think about volatile being used to 
access memory mapped I/O registers where a RMW operation could be 
problematic.

See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=3506

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-18  1:24                                 ` Christoph Lameter
@ 2007-08-18  1:41                                   ` Satyam Sharma
  2007-08-18  4:13                                     ` Linus Torvalds
  2007-08-18 21:56                                   ` Paul E. McKenney
  2007-08-20 13:31                                   ` Chris Snook
  2 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-18  1:41 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, Herbert Xu, Linus Torvalds, Nick Piggin,
	Paul Mackerras, Segher Boessenkool, heiko.carstens, horms,
	linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
	linux-arch, zlynx, schwidefsky, Chris Snook, davem, wensong,
	wjiang



On Fri, 17 Aug 2007, Christoph Lameter wrote:

> On Fri, 17 Aug 2007, Paul E. McKenney wrote:
> 
> > On Sat, Aug 18, 2007 at 08:09:13AM +0800, Herbert Xu wrote:
> > > On Fri, Aug 17, 2007 at 04:59:12PM -0700, Paul E. McKenney wrote:
> > > >
> > > > gcc bugzilla bug #33102, for whatever that ends up being worth.  ;-)
> > > 
> > > I had totally forgotten that I'd already filed that bug more
> > > than six years ago until they just closed yours as a duplicate
> > > of mine :)
> > > 
> > > Good luck in getting it fixed!
> > 
> > Well, just got done re-opening it for the third time.  And a local
> > gcc community member advised me not to give up too easily.  But I
> > must admit that I am impressed with the speed that it was identified
> > as duplicate.
> > 
> > Should be entertaining!  ;-)
> 
> Right. ROTFL... volatile actually breaks atomic_t instead of making it 
> safe. x++ becomes a register load, increment and a register store. Without 
> volatile we can increment the memory directly.

No code does (or would do, or should do):

	x.counter++;

on an "atomic_t x;" anyway.

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-18  0:04                                                 ` Segher Boessenkool
@ 2007-08-18  1:56                                                   ` Satyam Sharma
  2007-08-18  2:15                                                     ` Segher Boessenkool
  0 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-18  1:56 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
	Stefan Richter, Linux Kernel Mailing List, David Miller,
	Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
	Netdev, jesper.juhl, linux-arch, zlynx, Andrew Morton,
	schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
	wjiang



On Sat, 18 Aug 2007, Segher Boessenkool wrote:

> > > > > atomic_dec() writes
> > > > > to memory, so it _does_ have "volatile semantics", implicitly, as
> > > > > long as the compiler cannot optimise the atomic variable away
> > > > > completely -- any store counts as a side effect.
> > > > 
> > > > I don't think an atomic_dec() implemented as an inline "asm volatile"
> > > > or one that uses a "forget" macro would have the same re-ordering
> > > > guarantees as an atomic_dec() that uses a volatile access cast.
> > > 
> > > The "asm volatile" implementation does have exactly the same
> > > reordering guarantees as the "volatile cast" thing,
> > 
> > I don't think so.
> 
> "asm volatile" creates a side effect.

Yeah.

> Side effects aren't
> allowed to be reordered wrt sequence points.

Yeah.

> This is exactly
> the same reason as why "volatile accesses" cannot be reordered.

No, the code in that sub-thread I earlier pointed you at *WAS* written
such that there was a sequence point after all the uses of that volatile
access cast macro, and _therefore_ we were safe from re-ordering
(behaviour guaranteed by the C standard).

But you seem to be missing the simple and basic fact that:

	(something_that_has_side_effects || statement)
			!= something_that_is_a_sequence_point

Now, one cannot fantasize that "volatile asms" are also sequence points.
In fact such an argument would be sadly mistaken, because "sequence
points" are defined by the C standard and it'd be horribly wrong to
even _try_ claiming that the C standard knows about "volatile asms".


> > > if that is
> > > implemented by GCC in the "obvious" way.  Even a "plain" asm()
> > > will do the same.
> > 
> > Read the relevant GCC documentation.
> 
> I did, yes.

No, you didn't read:

http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html

Read the bit about the need for artificial dependencies, and the example
given there:

	asm volatile("mtfsf 255,%0" : : "f" (fpenv));
	sum = x + y;

The docs explicitly say the addition can be moved before the "volatile
asm". Hopefully, as you know, (x + y) is a C "expression" and hence
a "sequence point" as defined by the standard. So the "volatile asm"
should've happened before it, right? Wrong.

I know there is also stuff written about "side-effects" there which
_could_ give the same semantic w.r.t. sequence points as the volatile
access casts, but hey, it's GCC's own documentation, you obviously can't
find fault with _me_ if there's wrong stuff written in there. Say that
to GCC ...

See, "volatile" C keyword, for all its ill-definition and dodgy
semantics, is still at least given somewhat of a treatment in the C
standard (whose quality is ... ummm, sadly not always good and clear,
but unsurprisingly, still about 5,482 orders-of-magnitude times
better than GCC docs). Semantics of "volatile" as applies to inline
asm, OTOH? You're completely relying on the compiler for that ...


> > [ of course, if the (latest) GCC documentation is *yet again*
> >   wrong, then alright, not much I can do about it, is there. ]
> 
> There was (and is) nothing wrong about the "+m" documentation, if
> that is what you are talking about.  It could be extended now, to
> allow "+m" -- but that takes more than just "fixing" the documentation.

No, there was (and is) _everything_ wrong about the "+" documentation as
applies to memory-constrained operands. I don't give a whit if it's
some workaround in their gimplifier, or the other, that makes it possible
to use "+m" (like the current kernel code does). The docs suggest
otherwise, so there's obviously a clear disconnect between the docs and
actual GCC behaviour.


[ You seem to often take issue with _amazingly_ petty and pedantic things,
  by the way :-) ]

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 12:56                                                               ` Nick Piggin
@ 2007-08-18  2:15                                                                 ` Satyam Sharma
  0 siblings, 0 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-18  2:15 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Stefan Richter, paulmck, Herbert Xu, Paul Mackerras,
	Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher



On Fri, 17 Aug 2007, Nick Piggin wrote:

> Satyam Sharma wrote:
> 
> > I didn't quite understand what you said here, so I'll tell what I think:
> > 
> > * foo() is a compiler barrier if the definition of foo() is invisible to
> >  the compiler at a callsite.
> > 
> > * foo() is also a compiler barrier if the definition of foo() includes
> >  a barrier, and it is inlined at the callsite.
> > 
> > If the above is wrong, or if there's something else at play as well,
> > do let me know.
> 
> [...]
> If a function is not completely visible to the compiler (so it can't
> determine whether a barrier could be in it or not), then it must always
> assume it will contain a barrier so it always does the right thing.

Yup, that's what I'd said just a few sentences above, as you can see. I
was actually asking for "elaboration" on "how a compiler determines that
function foo() (say foo == schedule), even when it cannot see that it has
a barrier(), as you'd mentioned, is a 'sleeping' function" actually, and
whether compilers have a "notion of sleep to automatically assume a
compiler barrier whenever such a sleeping function foo() is called". But
I think you've already qualified the discussion to this kernel, so okay,
I shouldn't nit-pick anymore.

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-18  1:56                                                   ` Satyam Sharma
@ 2007-08-18  2:15                                                     ` Segher Boessenkool
  2007-08-18  3:33                                                       ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-18  2:15 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
	Stefan Richter, Linux Kernel Mailing List, David Miller,
	Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
	Netdev, jesper.juhl, linux-arch, zlynx, Andrew Morton,
	schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
	wjiang

>>>> The "asm volatile" implementation does have exactly the same
>>>> reordering guarantees as the "volatile cast" thing,
>>>
>>> I don't think so.
>>
>> "asm volatile" creates a side effect.
>
> Yeah.
>
>> Side effects aren't
>> allowed to be reordered wrt sequence points.
>
> Yeah.
>
>> This is exactly
>> the same reason as why "volatile accesses" cannot be reordered.
>
> No, the code in that sub-thread I earlier pointed you at *WAS* written
> such that there was a sequence point after all the uses of that 
> volatile
> access cast macro, and _therefore_ we were safe from re-ordering
> (behaviour guaranteed by the C standard).

And exactly the same is true for the "asm" version.

> Now, one cannot fantasize that "volatile asms" are also sequence 
> points.

Sure you can do that.  I don't though.

> In fact such an argument would be sadly mistaken, because "sequence
> points" are defined by the C standard and it'd be horribly wrong to
> even _try_ claiming that the C standard knows about "volatile asms".

That's nonsense.  GCC can extend the C standard any way they
bloody well please -- witness the fact that they added an
extra class of side effects...

>>> Read the relevant GCC documentation.
>>
>> I did, yes.
>
> No, you didn't read:
>
> http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
>
> Read the bit about the need for artificial dependencies, and the 
> example
> given there:
>
> 	asm volatile("mtfsf 255,%0" : : "f" (fpenv));
> 	sum = x + y;
>
> The docs explicitly say the addition can be moved before the "volatile
> asm". Hopefully, as you know, (x + y) is a C "expression" and hence
> a "sequence point" as defined by the standard.

The _end of a full expression_ is a sequence point, not every
expression.  And that is irrelevant here anyway.

It is perfectly fine to compute x+y any time before the
assignment; the C compiler is allowed to compute it _after_
the assignment even, if it could figure out how ;-)

x+y does not contain a side effect, you know.

> I know there is also stuff written about "side-effects" there which
> _could_ give the same semantic w.r.t. sequence points as the volatile
> access casts,

s/could/does/

> but hey, it's GCC's own documentation, you obviously can't
> find fault with _me_ if there's wrong stuff written in there. Say that
> to GCC ...

There's nothing wrong there.

> See, "volatile" C keyword, for all its ill-definition and dodgy
> semantics, is still at least given somewhat of a treatment in the C
> standard (whose quality is ... ummm, sadly not always good and clear,
> but unsurprisingly, still about 5,482 orders-of-magnitude times
> better than GCC docs).

If you find any problems/shortcomings in the GCC documentation,
please file a PR, don't go whine on some unrelated mailing lists.
Thank you.

> Semantics of "volatile" as applies to inline
> asm, OTOH? You're completely relying on the compiler for that ...

Yes, and?  GCC promises the behaviour it has documented.

>>> [ of course, if the (latest) GCC documentation is *yet again*
>>>   wrong, then alright, not much I can do about it, is there. ]
>>
>> There was (and is) nothing wrong about the "+m" documentation, if
>> that is what you are talking about.  It could be extended now, to
>> allow "+m" -- but that takes more than just "fixing" the 
>> documentation.
>
> No, there was (and is) _everything_ wrong about the "+" documentation 
> as
> applies to memory-constrained operands. I don't give a whit if it's
> some workaround in their gimplifier, or the other, that makes it 
> possible
> to use "+m" (like the current kernel code does). The docs suggest
> otherwise, so there's obviously a clear disconnect between the docs and
> actual GCC behaviour.

The documentation simply doesn't say "+m" is allowed.  The code to
allow it was added for the benefit of people who do not read the
documentation.  Documentation for "+m" might get added later if it
is decided this [the code, not the documentation] is a sane thing
to have (which isn't directly obvious).

> [ You seem to often take issue with _amazingly_ petty and pedantic 
> things,
>   by the way :-) ]

If you're talking details, you better get them right.  Handwaving is
fine with me as long as you're not pretending you're not handwaving.

And I simply cannot stand false assertions.

You can always ignore me if _you_ take issue with _that_ :-)


Segher


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-18  2:15                                                     ` Segher Boessenkool
@ 2007-08-18  3:33                                                       ` Satyam Sharma
  2007-08-18  5:18                                                         ` Segher Boessenkool
  0 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-18  3:33 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
	Stefan Richter, Linux Kernel Mailing List, David Miller,
	Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
	Netdev, jesper.juhl, linux-arch, zlynx, Andrew Morton,
	schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
	wjiang



On Sat, 18 Aug 2007, Segher Boessenkool wrote:

> > > > > The "asm volatile" implementation does have exactly the same
> > > > > reordering guarantees as the "volatile cast" thing,
> > > > 
> > > > I don't think so.
> > > 
> > > "asm volatile" creates a side effect.
> > 
> > Yeah.
> > 
> > > Side effects aren't
> > > allowed to be reordered wrt sequence points.
> > 
> > Yeah.
> > 
> > > This is exactly
> > > the same reason as why "volatile accesses" cannot be reordered.
> > 
> > No, the code in that sub-thread I earlier pointed you at *WAS* written
> > such that there was a sequence point after all the uses of that volatile
> > access cast macro, and _therefore_ we were safe from re-ordering
> > (behaviour guaranteed by the C standard).
> 
> And exactly the same is true for the "asm" version.
> 
> > Now, one cannot fantasize that "volatile asms" are also sequence points.
> 
> Sure you can do that.  I don't though.
> 
> > In fact such an argument would be sadly mistaken, because "sequence
> > points" are defined by the C standard and it'd be horribly wrong to
> > even _try_ claiming that the C standard knows about "volatile asms".
> 
> That's nonsense.  GCC can extend the C standard any way they
> bloody well please -- witness the fact that they added an
> extra class of side effects...
> 
> > > > Read the relevant GCC documentation.
> > > 
> > > I did, yes.
> > 
> > No, you didn't read:
> > 
> > http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
> > 
> > Read the bit about the need for artificial dependencies, and the example
> > given there:
> > 
> > 	asm volatile("mtfsf 255,%0" : : "f" (fpenv));
> > 	sum = x + y;
> > 
> > The docs explicitly say the addition can be moved before the "volatile
> > asm". Hopefully, as you know, (x + y) is an C "expression" and hence
> > a "sequence point" as defined by the standard.
> 
> The _end of a full expression_ is a sequence point, not every
> expression.  And that is irrelevant here anyway.
> 
> It is perfectly fine to compute x+y any time before the
> assignment; the C compiler is allowed to compute it _after_
> the assignment even, if it could figure out how ;-)
> 
> x+y does not contain a side effect, you know.
> 
> > I know there is also stuff written about "side-effects" there which
> > _could_ give the same semantic w.r.t. sequence points as the volatile
> > access casts,
> 
> s/could/does/
> 
> > but hey, it's GCC's own documentation, you obviously can't
> > find fault with _me_ if there's wrong stuff written in there. Say that
> > to GCC ...
> 
> There's nothing wrong there.
> 
> > See, "volatile" C keyword, for all it's ill-definition and dodgy
> > semantics, is still at least given somewhat of a treatment in the C
> > standard (whose quality is ... ummm, sadly not always good and clear,
> > but unsurprisingly, still about 5,482 orders-of-magnitude times
> > better than GCC docs).
> 
> If you find any problems/shortcomings in the GCC documentation,
> please file a PR, don't go whine on some unrelated mailing lists.
> Thank you.
> 
> > Semantics of "volatile" as applies to inline
> > asm, OTOH? You're completely relying on the compiler for that ...
> 
> Yes, and?  GCC promises the behaviour it has documented.

LOTS there, which obviously isn't correct, but which I'll reply to later,
easier stuff first. (call this "handwaving" if you want, but don't worry,
I /will/ bother myself to reply)


> > > > [ of course, if the (latest) GCC documentation is *yet again*
> > > >   wrong, then alright, not much I can do about it, is there. ]
> > > 
> > > There was (and is) nothing wrong about the "+m" documentation, if
> > > that is what you are talking about.  It could be extended now, to
> > > allow "+m" -- but that takes more than just "fixing" the documentation.
> > 
> > No, there was (and is) _everything_ wrong about the "+" documentation as
> > applies to memory-constrained operands. I don't give a whit if it's
> > some workaround in their gimplifier, or the other, that makes it possible
> > to use "+m" (like the current kernel code does). The docs suggest
> > otherwise, so there's obviously a clear disconnect between the docs and
> > actual GCC behaviour.
> 
> The documentation simply doesn't say "+m" is allowed.  The code to
> allow it was added for the benefit of people who do not read the
> documentation.  Documentation for "+m" might get added later if it
> is decided this [the code, not the documentation] is a sane thing
> to have (which isn't directly obvious).

Huh?

"If the (current) documentation doesn't match up with the (current)
code, then _at least one_ of them has to be (as of current) wrong."

I wonder how could you even try to disagree with that.

And I didn't go whining about this ... you asked me. (I think I'd said
something to the effect of GCC docs are often wrong, which is true,
but probably you feel saying that is "not allowed" on non-gcc lists?)

As for the "PR" you're requesting me to file with GCC for this, that
gcc-patches@ thread did precisely that and more (submitted a patch to
said documentation -- and no, saying "documentation might get added
later" is totally bogus and nonsensical -- documentation exists to
document current behaviour, not past). But come on, this is wholly
petty. I wouldn't have replied, really, if you weren't so provoking.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-18  1:41                                   ` Satyam Sharma
@ 2007-08-18  4:13                                     ` Linus Torvalds
  2007-08-18 13:36                                       ` Satyam Sharma
                                                         ` (2 more replies)
  0 siblings, 3 replies; 1546+ messages in thread
From: Linus Torvalds @ 2007-08-18  4:13 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, Paul E. McKenney, Herbert Xu, Nick Piggin,
	Paul Mackerras, Segher Boessenkool, heiko.carstens, horms,
	linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
	linux-arch, zlynx, schwidefsky, Chris Snook, davem, wensong,
	wjiang



On Sat, 18 Aug 2007, Satyam Sharma wrote:
> 
> No code does (or would do, or should do):
> 
> 	x.counter++;
> 
> on an "atomic_t x;" anyway.

That's just an example of a general problem.

No, you don't use "x.counter++". But you *do* use

	if (atomic_read(&x) <= 1)

and loading into a register is stupid and pointless, when you could just 
do it as a regular memory-operand to the cmp instruction.

And as far as the compiler is concerned, the problem is the 100% same: 
combining operations with the volatile memop.

The fact is, a compiler that thinks that

	movl mem,reg
	cmpl $val,reg

is any better than

	cmpl $val,mem

is just not a very good compiler. But when talking about "volatile", 
that's exactly what you always get (and always have gotten - this is 
not a regression, and I doubt gcc is alone in this).

			Linus

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-18  3:33                                                       ` Satyam Sharma
@ 2007-08-18  5:18                                                         ` Segher Boessenkool
  2007-08-18 13:20                                                           ` Satyam Sharma
  0 siblings, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-18  5:18 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
	Stefan Richter, Linux Kernel Mailing List, David Miller,
	Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
	Netdev, jesper.juhl, linux-arch, zlynx, Andrew Morton,
	schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
	wjiang

>> The documentation simply doesn't say "+m" is allowed.  The code to
>> allow it was added for the benefit of people who do not read the
>> documentation.  Documentation for "+m" might get added later if it
>> is decided this [the code, not the documentation] is a sane thing
>> to have (which isn't directly obvious).
>
> Huh?
>
> "If the (current) documentation doesn't match up with the (current)
> code, then _at least one_ of them has to be (as of current) wrong."
>
> I wonder how could you even try to disagree with that.

Easy.

The GCC documentation you're referring to is the user's manual.
See the blurb on the first page:

"This manual documents how to use the GNU compilers, as well as their
features and incompatibilities, and how to report bugs.  It corresponds
to GCC version 4.3.0.  The internals of the GNU compilers, including
how to port them to new targets and some information about how to write
front ends for new languages, are documented in a separate manual."

_How to use_.  This documentation doesn't describe in minute detail
everything the compiler does (see the source code for that -- no, it
isn't described in the internals manual either).

If it doesn't tell you how to use "+m", and even tells you _not_ to
use it, maybe that is what it means to say?  It doesn't mean "+m"
doesn't actually do something.  It also doesn't mean it does what
you think it should do.  It might do just that of course.  But treating
writing C code as an empirical science isn't such a smart idea.

> And I didn't go whining about this ... you asked me. (I think I'd said
> something to the effect of GCC docs are often wrong,

No need to guess at what you said, even if you managed to delete
your own mail already, there are plenty of free web-based archives
around.  You said:

> See, "volatile" C keyword, for all it's ill-definition and dodgy
> semantics, is still at least given somewhat of a treatment in the C
> standard (whose quality is ... ummm, sadly not always good and clear,
> but unsurprisingly, still about 5,482 orders-of-magnitude times
> better than GCC docs).

and that to me reads as complaining that the ISO C standard "isn't
very good" and that the GCC documentation is 10**5482 times worse
even.  Which of course is hyperbole and cannot be true.  It also
isn't helpful in any way or form for anyone on this list.  I call
that whining.

> which is true,

Yes, documentation of that size often has shortcomings.  No surprise
there.  However, great effort is made to make it better documentation,
and especially to keep it up to date; if you find any errors or
omissions, please report them.  There are many ways how to do that,
see the GCC homepage.</end-of-marketing-blurb>

> but probably you feel saying that is "not allowed" on non-gcc lists?)

You're allowed to say whatever you want.  Let's have a quote again
shall we?  I said:

> If you find any problems/shortcomings in the GCC documentation,
> please file a PR, don't go whine on some unrelated mailing lists.
> Thank you.

I read that as a friendly request, not a prohibition.  Well maybe
not actually friendly, more a bit angry.  A request, either way.

> As for the "PR"

"Problem report", a bugzilla ticket.  Sorry for using terminology
unknown to you.

> you're requesting me to file with GCC for this, that
> gcc-patches@ thread did precisely that

Actually not -- PRs make sure issues aren't forgotten (although
they might gather dust, sure).  But yes, submitting patches is a
Great Thing(tm).

> and more (submitted a patch to
> said documentation -- and no, saying "documentation might get added
> later" is totally bogus and nonsensical -- documentation exists to
> document current behaviour, not past).

When code like you want to write becomes a supported feature, that
will be reflected in the user manual.  It is completely nonsensical
to expect everything that is *not* a supported feature to be mentioned
there.

> I wouldn't have replied, really, if you weren't so provoking.

Hey, maybe that character trait is good for something, then.
Now to build a business plan around it...


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-18  5:18                                                         ` Segher Boessenkool
@ 2007-08-18 13:20                                                           ` Satyam Sharma
  0 siblings, 0 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-18 13:20 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
	Stefan Richter, Linux Kernel Mailing List, David Miller,
	Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
	Netdev, jesper.juhl, linux-arch, zlynx, Andrew Morton,
	schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
	wjiang

[ LOL, you _are_ shockingly petty! ]


On Sat, 18 Aug 2007, Segher Boessenkool wrote:

> > > The documentation simply doesn't say "+m" is allowed.  The code to
> > > allow it was added for the benefit of people who do not read the
> > > documentation.  Documentation for "+m" might get added later if it
> > > is decided this [the code, not the documentation] is a sane thing
> > > to have (which isn't directly obvious).
> > 
> > Huh?
> > 
> > "If the (current) documentation doesn't match up with the (current)
> > code, then _at least one_ of them has to be (as of current) wrong."
> > 
> > I wonder how could you even try to disagree with that.
> 
> Easy.
> 
> The GCC documentation you're referring to is the user's manual.
> See the blurb on the first page:
> 
> "This manual documents how to use the GNU compilers, as well as their
> features and incompatibilities, and how to report bugs.  It corresponds
> to GCC version 4.3.0.  The internals of the GNU compilers, including
> how to port them to new targets and some information about how to write
> front ends for new languages, are documented in a separate manual."
> 
> _How to use_.  This documentation doesn't describe in minute detail
> everything the compiler does (see the source code for that -- no, it
> isn't described in the internals manual either).

Wow, now that's a nice "disclaimer". By your (poor) standards of writing
documentation, one can as well write any factually incorrect stuff that
one wants in a document once you've got such a blurb in place :-)


> If it doesn't tell you how to use "+m", and even tells you _not_ to
> use it, maybe that is what it means to say?  It doesn't mean "+m"
> doesn't actually do something.  It also doesn't mean it does what
> you think it should do.  It might do just that of course.  But treating
> writing C code as an empirical science isn't such a smart idea.

Oh, really? Considering how much is (left out of being) documented, often
one would reasonably have to experimentally see (with testcases) how the
compiler behaves for some given code. Well, at least _I_ do it often
(several others on this list do as well), and I think there's everything
smart about it rather than having to read gcc sources -- I'd be surprised
(unless you have infinite free time on your hands, which does look like
the case actually) if someone actually prefers reading gcc sources first
to know what/how gcc does something for some given code, rather than
simply write it out, compile and look at the generated code (saves time for
those who don't have an infinite amount of it).


> > And I didn't go whining about this ... you asked me. (I think I'd said
> > something to the effect of GCC docs are often wrong,
> 
> No need to guess at what you said, even if you managed to delete
> your own mail already, there are plenty of free web-based archives
> around.  You said:
> 
> > See, "volatile" C keyword, for all it's ill-definition and dodgy
> > semantics, is still at least given somewhat of a treatment in the C
> > standard (whose quality is ... ummm, sadly not always good and clear,
> > but unsurprisingly, still about 5,482 orders-of-magnitude times
> > better than GCC docs).

Try _reading_ what I said there, for a change, dude. I'd originally only
said "unless GCC's docs is yet again wrong" ... then _you_ asked me what,
after which this discussion began and I wrote the above [which I fully
agree with -- so what if I used hyperbole in my sentence (yup, that was
intended, and obviously, exaggeration), am I not even allowed to do that?
Man, you're a Nazi or what ...] I didn't go whining about on my own as
you'd had earlier suggested, until _you_ asked me.

[ Ick, I somehow managed to reply this ... this is such a ...
  *disgustingly* petty argument you made here. ]


> > which is true,
> 
> Yes, documentation of that size often has shortcomings.  No surprise
> there.  However, great effort is made to make it better documentation,
> and especially to keep it up to date; if you find any errors or
> omissions, please report them.  There are many ways how to do that,
> see the GCC homepage.</end-of-marketing-blurb>
                         ^^^^^^^^^^^^^^^^^^^^^^

Looks like you even get paid :-)


> > but probably you feel saying that is "not allowed" on non-gcc lists?)
> 
> [amazingly pointless stuff snipped]
> 
> > As for the "PR"
> > you're requesting me to file with GCC for this, that
> > gcc-patches@ thread did precisely that
> 
> [more amazingly pointless stuff snipped]
> 
> > and more (submitted a patch to
> > said documentation -- and no, saying "documentation might get added
> > later" is totally bogus and nonsensical -- documentation exists to
> > document current behaviour, not past).
> 
> When code like you want to write becomes a supported feature, that
> will be reflected in the user manual.  It is completely nonsensical
> to expect everything that is *not* a supported feature to be mentioned
> there.

What crap. It is _perfectly reasonable_ to expect (current) documentation
to keep up with (current) code behaviour. In fact trying to justify such
a state is completely bogus and nonsensical.


Satyam

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-18  4:13                                     ` Linus Torvalds
@ 2007-08-18 13:36                                       ` Satyam Sharma
  2007-08-18 21:54                                       ` Paul E. McKenney
  2007-08-24 12:19                                       ` Denys Vlasenko
  2 siblings, 0 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-18 13:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Lameter, Paul E. McKenney, Herbert Xu, Nick Piggin,
	Paul Mackerras, Segher Boessenkool, heiko.carstens, horms,
	linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
	linux-arch, zlynx, schwidefsky, Chris Snook, davem, wensong,
	wjiang



On Fri, 17 Aug 2007, Linus Torvalds wrote:

> On Sat, 18 Aug 2007, Satyam Sharma wrote:
> > 
> > No code does (or would do, or should do):
> > 
> > 	x.counter++;
> > 
> > on an "atomic_t x;" anyway.
> 
> That's just an example of a general problem.
> 
> No, you don't use "x.counter++". But you *do* use
> 
> 	if (atomic_read(&x) <= 1)
> 
> and loading into a register is stupid and pointless, when you could just 
> do it as a regular memory-operand to the cmp instruction.

True, but that makes this a bad/poor code generation issue with the
compiler, not something that affects the _correctness_ of atomic ops if
"volatile" is used for that counter object (as was suggested), because
we'd always use the atomic_inc() etc primitives to do increments, which
are always (should be!) implemented to be atomic.


> And as far as the compiler is concerned, the problem is the 100% same: 
> combining operations with the volatile memop.
> 
> The fact is, a compiler that thinks that
> 
> 	movl mem,reg
> 	cmpl $val,reg
> 
> is any better than
> 
> 	cmpl $val,mem
> 
> is just not a very good compiler.

Absolutely, this is definitely a bug report worth opening with gcc. And
what you've said to explain this previously sounds definitely correct --
seeing "volatile" for any access does appear to just scare the hell out
of gcc and makes it generate such (poor) code.


Satyam

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* LDD3 pitfalls (was Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures)
  2007-08-17  8:06                                                   ` Nick Piggin
  2007-08-17  8:58                                                     ` Satyam Sharma
  2007-08-17 10:48                                                     ` Stefan Richter
@ 2007-08-18 14:35                                                     ` Stefan Richter
  2007-08-20 13:28                                                       ` Chris Snook
  2 siblings, 1 reply; 1546+ messages in thread
From: Stefan Richter @ 2007-08-18 14:35 UTC (permalink / raw)
  To: Jonathan Corbet, Greg Kroah-Hartman
  Cc: Nick Piggin, paulmck, Herbert Xu, Paul Mackerras, Satyam Sharma,
	Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Nick Piggin wrote:
> Stefan Richter wrote:
>> Nick Piggin wrote:
>>
>>> I don't know why people would assume volatile of atomics. AFAIK, most
>>> of the documentation is pretty clear that all the atomic stuff can be
>>> reordered etc. except for those that modify and return a value.
>>
>>
>> Which documentation is there?
> 
> Documentation/atomic_ops.txt
> 
> 
>> For driver authors, there is LDD3.  It doesn't specifically cover
>> effects of optimization on accesses to atomic_t.
>>
>> For architecture port authors, there is Documentation/atomic_ops.txt.
>> Driver authors also can learn something from that document, as it
>> indirectly documents the atomic_t and bitops APIs.
>>
> 
> "Semantics and Behavior of Atomic and Bitmask Operations" is
> pretty direct :)
> 
> Sure, it says that it's for arch maintainers, but there is no
> reason why users can't make use of it.


Note, LDD3 page 238 says:  "It is worth noting that most of the other
kernel primitives dealing with synchronization, such as spinlock and
atomic_t operations, also function as memory barriers."

I don't know about Linux 2.6.10 against which LDD3 was written, but
currently only _some_ atomic_t operations function as memory barriers.

Besides, judging from some posts in this thread, saying that atomic_t
operations dealt with synchronization may not be entirely precise.
-- 
Stefan Richter
-=====-=-=== =--- =--=-
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 22:38                   ` Segher Boessenkool
@ 2007-08-18 14:42                     ` Satyam Sharma
  0 siblings, 0 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-18 14:42 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
	Bill Fink, Linux Kernel Mailing List, Paul E. McKenney, netdev,
	ak, cfriesen, rpjday, jesper.juhl, linux-arch, Andrew Morton,
	zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
	Linus Torvalds, wensong, wjiang



On Sat, 18 Aug 2007, Segher Boessenkool wrote:

> > > GCC manual, section 6.1, "When
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > is a Volatile Object Accessed?" doesn't say anything of the
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > kind.
      ^^^^^

> > True, "implementation-defined" as per the C standard _is_ supposed to mean
    ^^^^^

> > "unspecified behaviour where each implementation documents how the choice
> > is made". So ok, probably GCC isn't "documenting" this
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

> > implementation-defined behaviour which it is supposed to, but can't really
> > fault them much for this, probably.
> 
> GCC _is_ documenting this, namely in this section 6.1.

(Again totally petty, but) Yes, but ...

> It doesn't
  ^^^^^^^^^^
> mention volatile-casted stuff.  Draw your own conclusions.
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

... exactly. So that's why I said "GCC isn't documenting _this_".

Man, try _reading_ mails before replying to them ...

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-18  4:13                                     ` Linus Torvalds
  2007-08-18 13:36                                       ` Satyam Sharma
@ 2007-08-18 21:54                                       ` Paul E. McKenney
  2007-08-18 22:41                                         ` Linus Torvalds
  2007-08-24 12:19                                       ` Denys Vlasenko
  2 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-18 21:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Satyam Sharma, Christoph Lameter, Herbert Xu, Nick Piggin,
	Paul Mackerras, Segher Boessenkool, heiko.carstens, horms,
	linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
	linux-arch, zlynx, schwidefsky, Chris Snook, davem, wensong,
	wjiang

On Fri, Aug 17, 2007 at 09:13:35PM -0700, Linus Torvalds wrote:
> 
> 
> On Sat, 18 Aug 2007, Satyam Sharma wrote:
> > 
> > No code does (or would do, or should do):
> > 
> > 	x.counter++;
> > 
> > on an "atomic_t x;" anyway.
> 
> That's just an example of a general problem.
> 
> No, you don't use "x.counter++". But you *do* use
> 
> 	if (atomic_read(&x) <= 1)
> 
> and loading into a register is stupid and pointless, when you could just 
> do it as a regular memory-operand to the cmp instruction.
> 
> And as far as the compiler is concerned, the problem is the 100% same: 
> combining operations with the volatile memop.
> 
> The fact is, a compiler that thinks that
> 
> 	movl mem,reg
> 	cmpl $val,reg
> 
> is any better than
> 
> 	cmpl $val,mem
> 
> is just not a very good compiler. But when talking about "volatile", 
> that's exactly what ytou always get (and always have gotten - this is 
> not a regression, and I doubt gcc is alone in this).

One of the gcc guys claimed that he thought that the two-instruction
sequence would be faster on some x86 machines.  I pointed out that
there might be a concern about code size.  I chose not to point out
that people might also care about the other x86 machines.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-18  1:24                                 ` Christoph Lameter
  2007-08-18  1:41                                   ` Satyam Sharma
@ 2007-08-18 21:56                                   ` Paul E. McKenney
  2007-08-20 13:31                                   ` Chris Snook
  2 siblings, 0 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-18 21:56 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Herbert Xu, Linus Torvalds, Nick Piggin, Paul Mackerras,
	Segher Boessenkool, heiko.carstens, horms, linux-kernel, rpjday,
	ak, netdev, cfriesen, akpm, jesper.juhl, linux-arch, zlynx,
	satyam, schwidefsky, Chris Snook, davem, wensong, wjiang

On Fri, Aug 17, 2007 at 06:24:15PM -0700, Christoph Lameter wrote:
> On Fri, 17 Aug 2007, Paul E. McKenney wrote:
> 
> > On Sat, Aug 18, 2007 at 08:09:13AM +0800, Herbert Xu wrote:
> > > On Fri, Aug 17, 2007 at 04:59:12PM -0700, Paul E. McKenney wrote:
> > > >
> > > > gcc bugzilla bug #33102, for whatever that ends up being worth.  ;-)
> > > 
> > > I had totally forgotten that I'd already filed that bug more
> > > than six years ago until they just closed yours as a duplicate
> > > of mine :)
> > > 
> > > Good luck in getting it fixed!
> > 
> > Well, just got done re-opening it for the third time.  And a local
> > gcc community member advised me not to give up too easily.  But I
> > must admit that I am impressed with the speed that it was identified
> > as duplicate.
> > 
> > Should be entertaining!  ;-)
> 
> Right. ROTFL... volatile actually breaks atomic_t instead of making it 
> safe. x++ becomes a register load, increment and a register store. Without 
> volatile we can increment the memory directly. It seems that volatile 
> requires that the variable is loaded into a register first and then 
> operated upon. Understandable when you think about volatile being used to 
> access memory mapped I/O registers where a RMW operation could be 
> problematic.
> 
> See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=3506

Yep.  The initial reaction was in fact to close my bug as a duplicate
of 3506.  But I was not asking for atomicity, but rather for smaller
code to be generated, so I reopened it.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-18 21:54                                       ` Paul E. McKenney
@ 2007-08-18 22:41                                         ` Linus Torvalds
  2007-08-18 23:19                                           ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Linus Torvalds @ 2007-08-18 22:41 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Satyam Sharma, Christoph Lameter, Herbert Xu, Nick Piggin,
	Paul Mackerras, Segher Boessenkool, heiko.carstens, horms,
	linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
	linux-arch, zlynx, schwidefsky, Chris Snook, davem, wensong,
	wjiang



On Sat, 18 Aug 2007, Paul E. McKenney wrote:
> 
> One of the gcc guys claimed that he thought that the two-instruction
> sequence would be faster on some x86 machines.  I pointed out that
> there might be a concern about code size.  I chose not to point out
> that people might also care about the other x86 machines.  ;-)

Some (very few) x86 uarchs do tend to prefer "load-store" like code 
generation, and doing a "mov [mem],reg + op reg" instead of "op [mem]" can 
actually be faster on some of them. Not any that are relevant today, 
though.

Also, that has nothing to do with volatile, and should be controlled by 
optimization flags (like -mtune). In fact, I thought there was a separate 
flag to do that (ie something like "-mload-store"), but I can't find it, 
so maybe that's just my fevered brain..

			Linus

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-18 22:41                                         ` Linus Torvalds
@ 2007-08-18 23:19                                           ` Paul E. McKenney
  0 siblings, 0 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-18 23:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Satyam Sharma, Christoph Lameter, Herbert Xu, Nick Piggin,
	Paul Mackerras, Segher Boessenkool, heiko.carstens, horms,
	linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
	linux-arch, zlynx, schwidefsky, Chris Snook, davem, wensong,
	wjiang

On Sat, Aug 18, 2007 at 03:41:13PM -0700, Linus Torvalds wrote:
> 
> 
> On Sat, 18 Aug 2007, Paul E. McKenney wrote:
> > 
> > One of the gcc guys claimed that he thought that the two-instruction
> > sequence would be faster on some x86 machines.  I pointed out that
> > there might be a concern about code size.  I chose not to point out
> > that people might also care about the other x86 machines.  ;-)
> 
> Some (very few) x86 uarchs do tend to prefer "load-store" like code 
> generation, and doing a "mov [mem],reg + op reg" instead of "op [mem]" can 
> actually be faster on some of them. Not any that are relevant today, 
> though.

;-)

> Also, that has nothing to do with volatile, and should be controlled by 
> optimization flags (like -mtune). In fact, I thought there was a separate 
> flag to do that (ie something like "-mload-store"), but I can't find it, 
> so maybe that's just my fevered brain..

Good point, will suggest this if the need arises.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 16:48                                                             ` Linus Torvalds
  2007-08-17 18:50                                                               ` Chris Friesen
@ 2007-08-20 13:15                                                               ` Chris Snook
  2007-08-20 13:32                                                                 ` Herbert Xu
  2007-08-21  5:46                                                                 ` Linus Torvalds
  2007-09-09 18:02                                                               ` Denys Vlasenko
  2 siblings, 2 replies; 1546+ messages in thread
From: Chris Snook @ 2007-08-20 13:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
	Christoph Lameter, Ilpo Jarvinen, Paul E. McKenney,
	Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
	Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
	wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
	segher

Linus Torvalds wrote:
> So the only reason to add back "volatile" to the atomic_read() sequence is 
> not to fix bugs, but to _hide_ the bugs better. They're still there, they 
> are just a lot harder to trigger, and tend to be a lot subtler.

What about barrier removal?  With consistent semantics we could optimize a fair 
amount of code.  Whether or not that constitutes "premature" optimization is 
open to debate, but there's no question we could reduce our register wiping in 
some places.

	-- Chris


* Re: LDD3 pitfalls (was Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures)
  2007-08-18 14:35                                                     ` LDD3 pitfalls (was Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures) Stefan Richter
@ 2007-08-20 13:28                                                       ` Chris Snook
  0 siblings, 0 replies; 1546+ messages in thread
From: Chris Snook @ 2007-08-20 13:28 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Nick Piggin, paulmck,
	Herbert Xu, Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher

Stefan Richter wrote:
> Nick Piggin wrote:
>> Stefan Richter wrote:
>>> Nick Piggin wrote:
>>>
>>>> I don't know why people would assume volatile of atomics. AFAIK, most
>>>> of the documentation is pretty clear that all the atomic stuff can be
>>>> reordered etc. except for those that modify and return a value.
>>>
>>> Which documentation is there?
>> Documentation/atomic_ops.txt
>>
>>
>>> For driver authors, there is LDD3.  It doesn't specifically cover
>>> effects of optimization on accesses to atomic_t.
>>>
>>> For architecture port authors, there is Documentation/atomic_ops.txt.
>>> Driver authors also can learn something from that document, as it
>>> indirectly documents the atomic_t and bitops APIs.
>>>
>> "Semantics and Behavior of Atomic and Bitmask Operations" is
>> pretty direct :)
>>
>> Sure, it says that it's for arch maintainers, but there is no
>> reason why users can't make use of it.
> 
> 
> Note, LDD3 page 238 says:  "It is worth noting that most of the other
> kernel primitives dealing with synchronization, such as spinlock and
> atomic_t operations, also function as memory barriers."
> 
> I don't know about Linux 2.6.10 against which LDD3 was written, but
> currently only _some_ atomic_t operations function as memory barriers.
> 
> Besides, judging from some posts in this thread, saying that atomic_t
> operations dealt with synchronization may not be entirely precise.

atomic_t is often used as the basis for implementing more sophisticated 
synchronization mechanisms, such as rwlocks.  Whether or not they are designed 
for that purpose, the atomic_* operations are de facto synchronization primitives.

	-- Chris


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-18  1:24                                 ` Christoph Lameter
  2007-08-18  1:41                                   ` Satyam Sharma
  2007-08-18 21:56                                   ` Paul E. McKenney
@ 2007-08-20 13:31                                   ` Chris Snook
  2007-08-20 22:04                                     ` Segher Boessenkool
  2 siblings, 1 reply; 1546+ messages in thread
From: Chris Snook @ 2007-08-20 13:31 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, Herbert Xu, Linus Torvalds, Nick Piggin,
	Paul Mackerras, Segher Boessenkool, heiko.carstens, horms,
	linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
	linux-arch, zlynx, satyam, schwidefsky, davem, wensong, wjiang

Christoph Lameter wrote:
> On Fri, 17 Aug 2007, Paul E. McKenney wrote:
> 
>> On Sat, Aug 18, 2007 at 08:09:13AM +0800, Herbert Xu wrote:
>>> On Fri, Aug 17, 2007 at 04:59:12PM -0700, Paul E. McKenney wrote:
>>>> gcc bugzilla bug #33102, for whatever that ends up being worth.  ;-)
>>> I had totally forgotten that I'd already filed that bug more
>>> than six years ago until they just closed yours as a duplicate
>>> of mine :)
>>>
>>> Good luck in getting it fixed!
>> Well, just got done re-opening it for the third time.  And a local
>> gcc community member advised me not to give up too easily.  But I
>> must admit that I am impressed with the speed that it was identified
>> as duplicate.
>>
>> Should be entertaining!  ;-)
> 
> Right. ROTFL... volatile actually breaks atomic_t instead of making it 
> safe. x++ becomes a register load, increment and a register store. Without 
> volatile we can increment the memory directly. It seems that volatile 
> requires that the variable is loaded into a register first and then 
> operated upon. Understandable when you think about volatile being used to 
> access memory mapped I/O registers where a RMW operation could be 
> problematic.

So, if we want consistent behavior, we're pretty much screwed unless we use 
inline assembler everywhere?

	-- Chris


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-20 13:15                                                               ` Chris Snook
@ 2007-08-20 13:32                                                                 ` Herbert Xu
  2007-08-20 13:38                                                                   ` Chris Snook
  2007-08-21  5:46                                                                 ` Linus Torvalds
  1 sibling, 1 reply; 1546+ messages in thread
From: Herbert Xu @ 2007-08-20 13:32 UTC (permalink / raw)
  To: Chris Snook
  Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Paul Mackerras,
	Christoph Lameter, Ilpo Jarvinen, Paul E. McKenney,
	Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
	Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
	wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
	segher

On Mon, Aug 20, 2007 at 09:15:11AM -0400, Chris Snook wrote:
> Linus Torvalds wrote:
> >So the only reason to add back "volatile" to the atomic_read() sequence is 
> >not to fix bugs, but to _hide_ the bugs better. They're still there, they 
> >are just a lot harder to trigger, and tend to be a lot subtler.
> 
> What about barrier removal?  With consistent semantics we could optimize a 
> fair amount of code.  Whether or not that constitutes "premature" 
> optimization is open to debate, but there's no question we could reduce our 
> register wiping in some places.

If you've been reading all of Linus's emails you should be
thinking about adding memory barriers, and not removing
compiler barriers.

He's just told you that code of the kind

	while (!atomic_read(cond))
		;

	do_something()

probably needs a memory barrier (not just compiler) so that
do_something() doesn't see stale cache content that occurred
before cond flipped.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-20 13:32                                                                 ` Herbert Xu
@ 2007-08-20 13:38                                                                   ` Chris Snook
  2007-08-20 22:07                                                                     ` Segher Boessenkool
  0 siblings, 1 reply; 1546+ messages in thread
From: Chris Snook @ 2007-08-20 13:38 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Paul Mackerras,
	Christoph Lameter, Ilpo Jarvinen, Paul E. McKenney,
	Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
	Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
	wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
	segher

Herbert Xu wrote:
> On Mon, Aug 20, 2007 at 09:15:11AM -0400, Chris Snook wrote:
>> Linus Torvalds wrote:
>>> So the only reason to add back "volatile" to the atomic_read() sequence is 
>>> not to fix bugs, but to _hide_ the bugs better. They're still there, they 
>>> are just a lot harder to trigger, and tend to be a lot subtler.
>> What about barrier removal?  With consistent semantics we could optimize a 
>> fair amount of code.  Whether or not that constitutes "premature" 
>> optimization is open to debate, but there's no question we could reduce our 
>> register wiping in some places.
> 
> If you've been reading all of Linus's emails you should be
> thinking about adding memory barriers, and not removing
> compiler barriers.
> 
> He's just told you that code of the kind
> 
> 	while (!atomic_read(cond))
> 		;
> 
> 	do_something()
> 
> probably needs a memory barrier (not just compiler) so that
> do_something() doesn't see stale cache content that occurred
> before cond flipped.

Such code generally doesn't care precisely when it gets the update, just that 
the update is atomic, and it doesn't loop forever.  Regardless, I'm convinced we 
just need to do it all in assembly.

	-- Chris


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-20 13:31                                   ` Chris Snook
@ 2007-08-20 22:04                                     ` Segher Boessenkool
  2007-08-20 22:48                                       ` Russell King
  0 siblings, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-20 22:04 UTC (permalink / raw)
  To: Chris Snook
  Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
	linux-kernel, Paul E. McKenney, ak, netdev, cfriesen, akpm,
	rpjday, Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx,
	schwidefsky, Herbert Xu, davem, Linus Torvalds, wensong, wjiang

>> Right. ROTFL... volatile actually breaks atomic_t instead of making 
>> it safe. x++ becomes a register load, increment and a register store. 
>> Without volatile we can increment the memory directly. It seems that 
>> volatile requires that the variable is loaded into a register first 
>> and then operated upon. Understandable when you think about volatile 
>> being used to access memory mapped I/O registers where a RMW 
>> operation could be problematic.
>
> So, if we want consistent behavior, we're pretty much screwed unless 
> we use inline assembler everywhere?

Nah, this whole argument is flawed -- "without volatile" we still
*cannot* "increment the memory directly".  On x86, you need a lock
prefix; on other archs, some other mechanism to make the memory
increment an *atomic* memory increment.

And no, RMW on MMIO isn't "problematic" at all, either.

An RMW op is a read op, a modify op, and a write op, all rolled
into one opcode.  But three actual operations.


The advantages of asm code for atomic_{read,set} are:
1) all the other atomic ops are implemented that way already;
2) you have full control over the asm insns selected, in particular,
    you can guarantee you *do* get an atomic op;
3) you don't need to use "volatile <data>" which generates
    not-all-that-good code on archs like x86, and we want to get rid
    of it anyway since it is problematic in many ways;
4) you don't need to use *(volatile <type>*)&<data>, which a) doesn't
    exist in C; b) isn't documented or supported in GCC; c) has a recent
    history of bugginess; d) _still uses volatile objects_; e) _still_
    is problematic in almost all those same ways as in 3);
5) you can mix atomic and non-atomic accesses to the atomic_t, which
    you cannot with the other alternatives.

The only disadvantage I know of is potentially slightly worse
instruction scheduling.  This is a generic asm() problem: GCC
cannot see what actual insns are inside the asm() block.


Segher



* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-20 13:38                                                                   ` Chris Snook
@ 2007-08-20 22:07                                                                     ` Segher Boessenkool
  0 siblings, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-20 22:07 UTC (permalink / raw)
  To: Chris Snook
  Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, Stefan Richter,
	horms, Satyam Sharma, Ilpo Jarvinen, Linux Kernel Mailing List,
	David Miller, Paul E. McKenney, ak, Netdev, cfriesen, rpjday,
	jesper.juhl, linux-arch, Andrew Morton, zlynx, schwidefsky,
	Herbert Xu, Linus Torvalds, wensong, Nick Piggin, wjiang

> Such code generally doesn't care precisely when it gets the update, 
> just that the update is atomic, and it doesn't loop forever.

Yes, it _does_ care that it gets the update _at all_, and preferably
as early as possible.

> Regardless, I'm convinced we just need to do it all in assembly.

So do you want "volatile asm" or "plain asm", for atomic_read()?
The asm version has two ways to go about it too...


Segher



* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-20 22:04                                     ` Segher Boessenkool
@ 2007-08-20 22:48                                       ` Russell King
  2007-08-20 23:02                                         ` Segher Boessenkool
  0 siblings, 1 reply; 1546+ messages in thread
From: Russell King @ 2007-08-20 22:48 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Chris Snook, Christoph Lameter, Paul Mackerras, heiko.carstens,
	horms, linux-kernel, Paul E. McKenney, ak, netdev, cfriesen, akpm,
	rpjday, Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx,
	schwidefsky, Herbert Xu, davem, Linus Torvalds, wensong, wjiang

On Tue, Aug 21, 2007 at 12:04:17AM +0200, Segher Boessenkool wrote:
> And no, RMW on MMIO isn't "problematic" at all, either.
> 
> An RMW op is a read op, a modify op, and a write op, all rolled
> into one opcode.  But three actual operations.

Maybe for some CPUs, but not all.  ARM for instance can't use the
load exclusive and store exclusive instructions to MMIO space.

This means placing atomic_t or bitops into MMIO space is a definite
no-go on ARM.  It breaks.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-20 22:48                                       ` Russell King
@ 2007-08-20 23:02                                         ` Segher Boessenkool
  2007-08-21  0:05                                           ` Paul E. McKenney
  2007-08-21  7:05                                           ` Russell King
  0 siblings, 2 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-20 23:02 UTC (permalink / raw)
  To: Russell King
  Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
	linux-kernel, Paul E. McKenney, ak, netdev, cfriesen, akpm,
	rpjday, Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
	wensong, wjiang

>> And no, RMW on MMIO isn't "problematic" at all, either.
>>
>> An RMW op is a read op, a modify op, and a write op, all rolled
>> into one opcode.  But three actual operations.
>
> Maybe for some CPUs, but not all.  ARM for instance can't use the
> load exclusive and store exclusive instructions to MMIO space.

Sure, your CPU doesn't have RMW instructions -- how to emulate
those if you don't have them is a totally different thing.


Segher



* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-20 23:02                                         ` Segher Boessenkool
@ 2007-08-21  0:05                                           ` Paul E. McKenney
  2007-08-21  7:08                                             ` Russell King
  2007-08-21  7:05                                           ` Russell King
  1 sibling, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-21  0:05 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Russell King, Christoph Lameter, Paul Mackerras, heiko.carstens,
	horms, linux-kernel, ak, netdev, cfriesen, akpm, rpjday,
	Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx, schwidefsky,
	Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang

On Tue, Aug 21, 2007 at 01:02:01AM +0200, Segher Boessenkool wrote:
> >>And no, RMW on MMIO isn't "problematic" at all, either.
> >>
> >>An RMW op is a read op, a modify op, and a write op, all rolled
> >>into one opcode.  But three actual operations.
> >
> >Maybe for some CPUs, but not all.  ARM for instance can't use the
> >load exclusive and store exclusive instructions to MMIO space.
> 
> Sure, your CPU doesn't have RMW instructions -- how to emulate
> those if you don't have them is a totally different thing.

I thought that ARM's load exclusive and store exclusive instructions
were its equivalent of LL and SC, which RISC machines typically use to
build atomic sequences of instructions -- and which normally cannot be
applied to MMIO space.

						Thanx, Paul


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-20 13:15                                                               ` Chris Snook
  2007-08-20 13:32                                                                 ` Herbert Xu
@ 2007-08-21  5:46                                                                 ` Linus Torvalds
  2007-08-21  7:04                                                                   ` David Miller
  1 sibling, 1 reply; 1546+ messages in thread
From: Linus Torvalds @ 2007-08-21  5:46 UTC (permalink / raw)
  To: Chris Snook
  Cc: Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
	Christoph Lameter, Ilpo Jarvinen, Paul E. McKenney,
	Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
	Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
	wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
	segher



On Mon, 20 Aug 2007, Chris Snook wrote:
>
> What about barrier removal?  With consistent semantics we could optimize a
> fair amount of code.  Whether or not that constitutes "premature" optimization
> is open to debate, but there's no question we could reduce our register wiping
> in some places.

Why do people think that barriers are expensive? They really aren't. 
Especially the regular compiler barrier is basically zero cost. Any 
reasonable compiler will just flush the stuff it holds in registers that 
isn't already automatic local variables, and for regular kernel code, that 
tends to basically be nothing at all.

Ie a "barrier()" is likely _cheaper_ than the code generation downside 
from using "volatile".

		Linus


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-21  5:46                                                                 ` Linus Torvalds
@ 2007-08-21  7:04                                                                   ` David Miller
  2007-08-21 13:50                                                                     ` Chris Snook
  0 siblings, 1 reply; 1546+ messages in thread
From: David Miller @ 2007-08-21  7:04 UTC (permalink / raw)
  To: torvalds
  Cc: csnook, piggin, satyam, herbert, paulus, clameter, ilpo.jarvinen,
	paulmck, stefanr, linux-kernel, linux-arch, netdev, akpm, ak,
	heiko.carstens, schwidefsky, wensong, horms, wjiang, cfriesen,
	zlynx, rpjday, jesper.juhl, segher

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 20 Aug 2007 22:46:47 -0700 (PDT)

> Ie a "barrier()" is likely _cheaper_ than the code generation downside 
> from using "volatile".

Assuming GCC were ever better about the code generation badness
with volatile that has been discussed here, I much prefer
we tell GCC "this memory piece changed" rather than "every
piece of memory has changed" which is what the barrier() does.

I happened to have been scanning a lot of assembler lately to
track down a gcc-4.2 miscompilation on sparc64, and the barriers
do hurt quite a bit in some places.  Instead of keeping unrelated
variables around cached in local registers, it reloads everything.


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-20 23:02                                         ` Segher Boessenkool
  2007-08-21  0:05                                           ` Paul E. McKenney
@ 2007-08-21  7:05                                           ` Russell King
  2007-08-21  9:33                                             ` Paul Mackerras
  2007-08-21 14:39                                             ` Segher Boessenkool
  1 sibling, 2 replies; 1546+ messages in thread
From: Russell King @ 2007-08-21  7:05 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
	linux-kernel, Paul E. McKenney, ak, netdev, cfriesen, akpm,
	rpjday, Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
	wensong, wjiang

On Tue, Aug 21, 2007 at 01:02:01AM +0200, Segher Boessenkool wrote:
> >>And no, RMW on MMIO isn't "problematic" at all, either.
> >>
> >>An RMW op is a read op, a modify op, and a write op, all rolled
> >>into one opcode.  But three actual operations.
> >
> >Maybe for some CPUs, but not all.  ARM for instance can't use the
> >load exclusive and store exclusive instructions to MMIO space.
> 
> Sure, your CPU doesn't have RMW instructions -- how to emulate
> those if you don't have them is a totally different thing.

Let me say it more clearly: On ARM, it is impossible to perform atomic
operations on MMIO space.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-21  0:05                                           ` Paul E. McKenney
@ 2007-08-21  7:08                                             ` Russell King
  0 siblings, 0 replies; 1546+ messages in thread
From: Russell King @ 2007-08-21  7:08 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Segher Boessenkool, Christoph Lameter, Paul Mackerras,
	heiko.carstens, horms, linux-kernel, ak, netdev, cfriesen, akpm,
	rpjday, Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
	wensong, wjiang

On Mon, Aug 20, 2007 at 05:05:18PM -0700, Paul E. McKenney wrote:
> On Tue, Aug 21, 2007 at 01:02:01AM +0200, Segher Boessenkool wrote:
> > >>And no, RMW on MMIO isn't "problematic" at all, either.
> > >>
> > >>An RMW op is a read op, a modify op, and a write op, all rolled
> > >>into one opcode.  But three actual operations.
> > >
> > >Maybe for some CPUs, but not all.  ARM for instance can't use the
> > >load exclusive and store exclusive instructions to MMIO space.
> > 
> > Sure, your CPU doesn't have RMW instructions -- how to emulate
> > those if you don't have them is a totally different thing.
> 
> I thought that ARM's load exclusive and store exclusive instructions
> were its equivalent of LL and SC, which RISC machines typically use to
> build atomic sequences of instructions -- and which normally cannot be
> applied to MMIO space.

Absolutely correct.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-21  7:05                                           ` Russell King
@ 2007-08-21  9:33                                             ` Paul Mackerras
  2007-08-21 11:37                                               ` Andi Kleen
  2007-08-21 14:48                                               ` Segher Boessenkool
  2007-08-21 14:39                                             ` Segher Boessenkool
  1 sibling, 2 replies; 1546+ messages in thread
From: Paul Mackerras @ 2007-08-21  9:33 UTC (permalink / raw)
  To: Russell King
  Cc: Segher Boessenkool, Christoph Lameter, heiko.carstens, horms,
	linux-kernel, Paul E. McKenney, ak, netdev, cfriesen, akpm,
	rpjday, Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
	wensong, wjiang

Russell King writes:

> Let me say it more clearly: On ARM, it is impossible to perform atomic
> operations on MMIO space.

Actually, no one is suggesting that we try to do that at all.

The discussion about RMW ops on MMIO space started with a comment
attributed to the gcc developers that one reason why gcc on x86
doesn't use instructions that do RMW ops on volatile variables is that
volatile is used to mark MMIO addresses, and there was some
uncertainty about whether (non-atomic) RMW ops on x86 could be used on
MMIO.  This is in regard to the question about why gcc on x86 always
moves a volatile variable into a register before doing anything to it.

So the whole discussion is irrelevant to ARM, PowerPC and any other
architecture except x86[-64].

Paul.


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-21  9:33                                             ` Paul Mackerras
@ 2007-08-21 11:37                                               ` Andi Kleen
  2007-08-21 14:48                                               ` Segher Boessenkool
  1 sibling, 0 replies; 1546+ messages in thread
From: Andi Kleen @ 2007-08-21 11:37 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Russell King, Segher Boessenkool, Christoph Lameter,
	heiko.carstens, horms, linux-kernel, Paul E. McKenney, ak, netdev,
	cfriesen, akpm, rpjday, Nick Piggin, linux-arch, jesper.juhl,
	satyam, zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
	Linus Torvalds, wensong, wjiang

On Tue, Aug 21, 2007 at 07:33:49PM +1000, Paul Mackerras wrote:
> So the whole discussion is irrelevant to ARM, PowerPC and any other
> architecture except x86[-64].

It's even irrelevant on x86 because all modifying operations on atomic_t 
are coded in inline assembler and will always be RMW no matter
if atomic_t is volatile or not.

[ignoring atomic_set(x, atomic_read(x) + 1) which nobody should do]

The only issue is whether atomic_t should have an implicit barrier or not.
My personal opinion is yes -- better safe than sorry. And any code
impact it may have is typically dwarfed by the next cache miss anyway,
so it doesn't matter much.

-Andi



* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-21  7:04                                                                   ` David Miller
@ 2007-08-21 13:50                                                                     ` Chris Snook
  2007-08-21 14:59                                                                       ` Segher Boessenkool
                                                                                         ` (2 more replies)
  0 siblings, 3 replies; 1546+ messages in thread
From: Chris Snook @ 2007-08-21 13:50 UTC (permalink / raw)
  To: David Miller
  Cc: torvalds, piggin, satyam, herbert, paulus, clameter,
	ilpo.jarvinen, paulmck, stefanr, linux-kernel, linux-arch, netdev,
	akpm, ak, heiko.carstens, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

David Miller wrote:
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Mon, 20 Aug 2007 22:46:47 -0700 (PDT)
> 
>> Ie a "barrier()" is likely _cheaper_ than the code generation downside 
>> from using "volatile".
> 
> Assuming GCC were ever better about the code generation badness
> with volatile that has been discussed here, I much prefer
> we tell GCC "this memory piece changed" rather than "every
> piece of memory has changed" which is what the barrier() does.
> 
> I happened to have been scanning a lot of assembler lately to
> track down a gcc-4.2 miscompilation on sparc64, and the barriers
> do hurt quite a bit in some places.  Instead of keeping unrelated
> variables around cached in local registers, it reloads everything.

Moore's law is definitely working against us here.  Register counts, 
pipeline depths, core counts, and clock multipliers are all increasing 
in the long run.  At some point in the future, barrier() will be 
universally regarded as a hammer too big for most purposes.  Whether or 
not removing it now constitutes premature optimization is arguable, but 
I think we should allow such optimization to happen (or not happen) in 
architecture-dependent code, and provide a consistent API that doesn't 
require the use of such things in arch-independent code where it might 
turn into a totally superfluous performance killer depending on what 
hardware it gets compiled for.

	-- Chris


* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-21  7:05                                           ` Russell King
  2007-08-21  9:33                                             ` Paul Mackerras
@ 2007-08-21 14:39                                             ` Segher Boessenkool
  1 sibling, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-21 14:39 UTC (permalink / raw)
  To: Russell King
  Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
	linux-kernel, Paul E. McKenney, ak, netdev, cfriesen, akpm,
	rpjday, Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
	wensong, wjiang

>>>> And no, RMW on MMIO isn't "problematic" at all, either.
>>>>
>>>> An RMW op is a read op, a modify op, and a write op, all rolled
>>>> into one opcode.  But three actual operations.
>>>
>>> Maybe for some CPUs, but not all.  ARM for instance can't use the
>>> load exclusive and store exclusive instructions to MMIO space.
>>
>> Sure, your CPU doesn't have RMW instructions -- how to emulate
>> those if you don't have them is a totally different thing.
>
> Let me say it more clearly: On ARM, it is impossible to perform atomic
> operations on MMIO space.

It's all completely beside the point, see the other subthread, but...

Yeah, you can't do LL/SC to MMIO space; ARM isn't alone in that.
You could still implement atomic operations on MMIO space by taking
a lock elsewhere, in normal cacheable memory space.  Why you would
do this is a separate question, you probably don't want it :-)


Segher



* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-21  9:33                                             ` Paul Mackerras
  2007-08-21 11:37                                               ` Andi Kleen
@ 2007-08-21 14:48                                               ` Segher Boessenkool
  2007-08-21 16:16                                                 ` Paul E. McKenney
  1 sibling, 1 reply; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-21 14:48 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Russell King, Christoph Lameter, heiko.carstens, horms,
	linux-kernel, Paul E. McKenney, ak, netdev, cfriesen, akpm,
	rpjday, Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
	wensong, wjiang

>> Let me say it more clearly: On ARM, it is impossible to perform atomic
>> operations on MMIO space.
>
> Actually, no one is suggesting that we try to do that at all.
>
> The discussion about RMW ops on MMIO space started with a comment
> attributed to the gcc developers that one reason why gcc on x86
> doesn't use instructions that do RMW ops on volatile variables is that
> volatile is used to mark MMIO addresses, and there was some
> uncertainty about whether (non-atomic) RMW ops on x86 could be used on
> MMIO.  This is in regard to the question about why gcc on x86 always
> moves a volatile variable into a register before doing anything to it.

This question is GCC PR33102, which was incorrectly closed as a
duplicate of PR3506 -- and *that* PR was closed because its reporter
seemed to claim that the GCC-generated code for an increment on a
volatile (namely, three machine instructions: load, modify, store)
was incorrect, and that it has to be one machine instruction.

> So the whole discussion is irrelevant to ARM, PowerPC and any other
> architecture except x86[-64].

And even there, it's not something the kernel can take advantage of
before GCC 4.4 is in widespread use, if then.  Let's move on.


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-21 13:50                                                                     ` Chris Snook
@ 2007-08-21 14:59                                                                       ` Segher Boessenkool
  2007-08-21 16:31                                                                       ` Satyam Sharma
  2007-08-21 16:43                                                                       ` Linus Torvalds
  2 siblings, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-08-21 14:59 UTC (permalink / raw)
  To: Chris Snook
  Cc: paulmck, heiko.carstens, ilpo.jarvinen, horms, linux-kernel,
	David Miller, rpjday, netdev, ak, piggin, akpm, torvalds,
	cfriesen, jesper.juhl, linux-arch, paulus, herbert, satyam,
	clameter, stefanr, schwidefsky, zlynx, wensong, wjiang

> At some point in the future, barrier() will be universally regarded as 
> a hammer too big for most purposes.  Whether or not removing it now

You can't just remove it, it is needed in some places; you want to
replace it in most places with a more fine-grained "compiler barrier",
I presume?

> constitutes premature optimization is arguable, but I think we should 
> allow such optimization to happen (or not happen) in 
> architecture-dependent code, and provide a consistent API that doesn't 
> require the use of such things in arch-independent code where it might 
> turn into a totally superfluous performance killer depending on what 
> hardware it gets compiled for.

Explicit barrier()s won't be too hard to replace -- but what to do
about the implicit barrier()s in rmb() etc. etc. -- *those* will be
hard to get rid of, if only because it is hard enough to teach driver
authors about how to use those primitives *already*.  It is far from
clear what a good interface like that would look like, anyway.

Probably we should first start experimenting with a forget()-style
micro-barrier (but please, find a better name), and see if a nice
usage pattern shows up that can be turned into an API.


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-21 14:48                                               ` Segher Boessenkool
@ 2007-08-21 16:16                                                 ` Paul E. McKenney
  2007-08-21 22:51                                                   ` Valdis.Kletnieks
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-21 16:16 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Paul Mackerras, Russell King, Christoph Lameter, heiko.carstens,
	horms, linux-kernel, ak, netdev, cfriesen, akpm, rpjday,
	Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx, schwidefsky,
	Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang

On Tue, Aug 21, 2007 at 04:48:51PM +0200, Segher Boessenkool wrote:
> >>Let me say it more clearly: On ARM, it is impossible to perform atomic
> >>operations on MMIO space.
> >
> >Actually, no one is suggesting that we try to do that at all.
> >
> >The discussion about RMW ops on MMIO space started with a comment
> >attributed to the gcc developers that one reason why gcc on x86
> >doesn't use instructions that do RMW ops on volatile variables is that
> >volatile is used to mark MMIO addresses, and there was some
> >uncertainty about whether (non-atomic) RMW ops on x86 could be used on
> >MMIO.  This is in regard to the question about why gcc on x86 always
> >moves a volatile variable into a register before doing anything to it.
> 
> This question is GCC PR33102, which was incorrectly closed as a
> duplicate of PR3506 -- and *that* PR was closed because its reporter
> seemed to claim that the GCC-generated code for an increment on a
> volatile (namely, three machine instructions: load, modify, store)
> was incorrect, and that it has to be one machine instruction.
> 
> >So the whole discussion is irrelevant to ARM, PowerPC and any other
> >architecture except x86[-64].
> 
> And even there, it's not something the kernel can take advantage of
> before GCC 4.4 is in widespread use, if then.  Let's move on.

I agree that instant gratification is hard to come by when synching
up compiler and kernel versions.  Nonetheless, it should be possible
to create APIs that are conditioned on the compiler version.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-21 13:50                                                                     ` Chris Snook
  2007-08-21 14:59                                                                       ` Segher Boessenkool
@ 2007-08-21 16:31                                                                       ` Satyam Sharma
  2007-08-21 16:43                                                                       ` Linus Torvalds
  2 siblings, 0 replies; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-21 16:31 UTC (permalink / raw)
  To: Chris Snook
  Cc: David Miller, Linus Torvalds, piggin, herbert, paulus, clameter,
	ilpo.jarvinen, paulmck, stefanr, Linux Kernel Mailing List,
	linux-arch, netdev, Andrew Morton, ak, heiko.carstens,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher



On Tue, 21 Aug 2007, Chris Snook wrote:

> David Miller wrote:
> > From: Linus Torvalds <torvalds@linux-foundation.org>
> > Date: Mon, 20 Aug 2007 22:46:47 -0700 (PDT)
> > 
> > > Ie a "barrier()" is likely _cheaper_ than the code generation downside
> > > from using "volatile".
> > 
> > Assuming GCC were ever better about the code generation badness
> > with volatile that has been discussed here, I much prefer
> > we tell GCC "this memory piece changed" rather than "every
> > piece of memory has changed" which is what the barrier() does.
> > 
> > I happened to have been scanning a lot of assembler lately to
> > track down a gcc-4.2 miscompilation on sparc64, and the barriers
> > do hurt quite a bit in some places.  Instead of keeping unrelated
> > variables around cached in local registers, it reloads everything.
> 
> Moore's law is definitely working against us here.  Register counts, pipeline
> depths, core counts, and clock multipliers are all increasing in the long run.
> At some point in the future, barrier() will be universally regarded as a
> hammer too big for most purposes.

I do agree, and the important point to note is that the benefits of a
/lighter/ compiler barrier, such as what David referred to above, _can_
be had without having to do anything with the "volatile" keyword at all.
And such a primitive has already been mentioned/proposed on this thread.


But this is all tangential to the core question at hand -- whether to have
implicit (compiler, possibly "light-weight" of the kind referred above)
barrier semantics in atomic ops that do not have them, or not.

I was lately looking in the kernel for _actual_ code that uses atomic_t
and benefits from the lack of any implicit barrier, with the compiler
being free to cache the atomic_t in a register. Now that often does _not_
happen, because all other ops (implemented in asm with LOCK prefix on x86)
_must_ therefore constrain the atomic_t to memory anyway. So typically all
atomic ops code sequences end up operating on memory.

Then I did locate sched.c:select_nohz_load_balancer() -- it repeatedly
references the same atomic_t object, and the code that I saw generated
(with CC_OPTIMIZE_FOR_SIZE=y) did cache it in a register for a sequence of
instructions. It uses atomic_cmpxchg, thereby not requiring explicit
memory barriers anywhere in the code, and is an example of an atomic_t
user that is safe, and yet benefits from its memory loads/stores being
elided/coalesced by the compiler.


# at this point, %eax holds num_online_cpus() and
# %ebx holds cpus_weight(nohz.cpu_mask)
# the variable "cpu" is in %esi

0xc1018e1d:      cmp    %eax,%ebx		# if No.A.
0xc1018e1f:      mov    0xc134d900,%eax		# first atomic_read()
0xc1018e24:      jne    0xc1018e36
0xc1018e26:      cmp    %esi,%eax		# if No.B.
0xc1018e28:      jne    0xc1018e80		# returns with 0
0xc1018e2a:      movl   $0xffffffff,0xc134d900	# atomic_set(-1), and ...
0xc1018e34:      jmp    0xc1018e80		# ... returns with 0
0xc1018e36:      cmp    $0xffffffff,%eax	# if No.C. (NOTE!)
0xc1018e39:      jne    0xc1018e46
0xc1018e3b:      lock cmpxchg %esi,0xc134d900	# atomic_cmpxchg()
0xc1018e43:      inc    %eax
0xc1018e44:      jmp    0xc1018e48
0xc1018e46:      cmp    %esi,%eax		# if No.D. (NOTE!)
0xc1018e48:      jne    0xc1018e80		# if !=, default return 0 (if No.E.)
0xc1018e4a:      jmp    0xc1018e84		# otherwise (==) returns with 1

The above is:

	if (cpus_weight(nohz.cpu_mask) == num_online_cpus()) {	/* if No.A. */
		if (atomic_read(&nohz.load_balancer) == cpu)	/* if No.B. */
			atomic_set(&nohz.load_balancer, -1);	/* XXX */
		return 0;
	}
	if (atomic_read(&nohz.load_balancer) == -1) {		/* if No.C. */
		/* make me the ilb owner */
		if (atomic_cmpxchg(&nohz.load_balancer, -1, cpu) == -1)	/* if No.E. */
			return 1;
	} else if (atomic_read(&nohz.load_balancer) == cpu)	/* if No.D. */
		return 1;
	...
	...
	return 0; /* default return from function */

As you can see, the atomic_read()'s of "if"s Nos. B, C, and D were _all_
coalesced into a single memory reference "mov    0xc134d900,%eax" at the
top of the function, and then "if"s Nos. C and D simply used the value
from %eax itself. But that's perfectly safe; such is the logic of this
function. It uses cmpxchg _whenever_ updating the value in the memory
atomic_t and then returns appropriately. The _only_ point that a casual
reader may find racy is that marked /* XXX */ above -- atomic_read()
followed by atomic_set() with no barrier in between. But even that is ok,
because if one thread ever finds that condition to succeed, it is 100%
guaranteed no other thread on any other CPU will find _any_ condition
to be true, thereby avoiding any race in the modification of that value.


BTW it does sound reasonable that a lot of atomic_t users that want a
compiler barrier probably also want a memory barrier. Do we make _that_
implicit too? Quite clearly, making _either_ one of those implicit in
atomic_{read,set} (in any form of implementation -- a forget() macro
based, *(volatile int *)& based, or inline asm based) would end up
harming code such as that cited above.

Lastly, the most obvious reason that should be considered against implicit
barriers in atomic ops is that it isn't "required" -- atomicity does not
imply any barrier after all, and making such a distinction would actually
be a healthy separation that helps people think more clearly when writing
lockless code.

[ But the "authors' expectations" / heisenbugs argument also holds some
  water ... for that, we can have a _variant_ in the API for atomic ops
  that has implicit compiler/memory barriers, to make it easier on those
  who want that behaviour. But let us not penalize code that knows what
  it is doing by changing the default to that, please. ]


Satyam

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-21 13:50                                                                     ` Chris Snook
  2007-08-21 14:59                                                                       ` Segher Boessenkool
  2007-08-21 16:31                                                                       ` Satyam Sharma
@ 2007-08-21 16:43                                                                       ` Linus Torvalds
  2 siblings, 0 replies; 1546+ messages in thread
From: Linus Torvalds @ 2007-08-21 16:43 UTC (permalink / raw)
  To: Chris Snook
  Cc: David Miller, piggin, satyam, herbert, paulus, clameter,
	ilpo.jarvinen, paulmck, stefanr, linux-kernel, linux-arch, netdev,
	akpm, ak, heiko.carstens, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher



On Tue, 21 Aug 2007, Chris Snook wrote:
> 
> Moore's law is definitely working against us here.  Register counts, pipeline
> depths, core counts, and clock multipliers are all increasing in the long run.
> At some point in the future, barrier() will be universally regarded as a
> hammer too big for most purposes.

Note that "barrier()" is purely a compiler barrier. It has zero impact on 
the CPU pipeline itself, and also has zero impact on anything that gcc 
knows isn't visible in memory (ie local variables that don't have their 
address taken), so barrier() really is pretty cheap.

Now, it's possible that gcc messes up in some circumstances, and that the 
memory clobber will cause gcc to also do things like flush local registers 
unnecessarily to their stack slots, but quite frankly, if that happens, 
it's a gcc problem, and I also have to say that I've not seen that myself.

So in a very real sense, "barrier()" will just make sure that there is a 
stronger sequence point for the compiler where things are stable. In most 
cases it has absolutely zero performance impact - apart from the 
-intended- impact of making sure that the compiler doesn't re-order or 
cache stuff around it.

And sure, we could make it more finegrained, and also introduce a 
per-variable barrier, but the fact is, people _already_ have problems with 
thinking about these kinds of things, and adding new abstraction issues 
with subtle semantics is the last thing we want.

So I really think you'd want to show a real example of real code that 
actually gets noticeably slower or bigger.

In removing "volatile", we have shown that. It may not have made a big 
difference on powerpc, but it makes a real difference on x86 - and more 
importantly, it removes something whose workings people clearly don't 
understand, and which they incorrectly expect to just fix bugs.

[ There are *other* barriers - the ones that actually add memory barriers 
  to the CPU - that really can be quite expensive. The good news is that 
  the expense is going down rather than up: both Intel and AMD are not 
  only removing the need for some of them (ie "smp_rmb()" will become a 
  compiler-only barrier), but we're _also_ seeing the whole "pipeline 
  flush" approach go away, and be replaced by the CPU itself actually 
  being better - so even the actual CPU pipeline barriers are getting
  cheaper, not more expensive. ]

For example, did anybody even _test_ how expensive "barrier()" is? Just 
as a lark, I did

	#undef barrier
	#define barrier() do { } while (0)

in kernel/sched.c (which only has three of them in it, but hey, that's 
more than most files), and there were _zero_ code generation downsides. 
One instruction was moved (and a few line numbers changed), so it wasn't 
like the assembly language was identical, but the point is, barrier() 
simply doesn't have the same kinds of downsides that "volatile" has.

(That may not be true on other architectures or in other source files, of 
course. This *does* depend on code generation details. But anybody who 
thinks that "barrier()" is fundamentally expensive is simply incorrect. It 
is *fundamentally* a no-op.)

		Linus

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-21 16:16                                                 ` Paul E. McKenney
@ 2007-08-21 22:51                                                   ` Valdis.Kletnieks
  2007-08-22  0:50                                                     ` Paul E. McKenney
  2007-08-22 21:38                                                     ` Adrian Bunk
  0 siblings, 2 replies; 1546+ messages in thread
From: Valdis.Kletnieks @ 2007-08-21 22:51 UTC (permalink / raw)
  To: paulmck
  Cc: Segher Boessenkool, Paul Mackerras, Russell King,
	Christoph Lameter, heiko.carstens, horms, linux-kernel, ak,
	netdev, cfriesen, akpm, rpjday, Nick Piggin, linux-arch,
	jesper.juhl, satyam, zlynx, schwidefsky, Chris Snook, Herbert Xu,
	davem, Linus Torvalds, wensong, wjiang

On Tue, 21 Aug 2007 09:16:43 PDT, "Paul E. McKenney" said:

> I agree that instant gratification is hard to come by when synching
> up compiler and kernel versions.  Nonetheless, it should be possible
> to create APIs that are conditioned on the compiler version.

We've tried that, sort of.  See the mess surrounding the whole
extern/static/inline/__whatever boondoggle, which seems to have
changed semantics in every single gcc release since 2.95 or so.

And recently mention was made that gcc4.4 will have *new* semantics
in this area. Yee. Hah.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-21 22:51                                                   ` Valdis.Kletnieks
@ 2007-08-22  0:50                                                     ` Paul E. McKenney
  2007-08-22 21:38                                                     ` Adrian Bunk
  1 sibling, 0 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-08-22  0:50 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Segher Boessenkool, Paul Mackerras, Russell King,
	Christoph Lameter, heiko.carstens, horms, linux-kernel, ak,
	netdev, cfriesen, akpm, rpjday, Nick Piggin, linux-arch,
	jesper.juhl, satyam, zlynx, schwidefsky, Chris Snook, Herbert Xu,
	davem, Linus Torvalds, wensong, wjiang

On Tue, Aug 21, 2007 at 06:51:16PM -0400, Valdis.Kletnieks@vt.edu wrote:
> On Tue, 21 Aug 2007 09:16:43 PDT, "Paul E. McKenney" said:
> 
> > I agree that instant gratification is hard to come by when synching
> > up compiler and kernel versions.  Nonetheless, it should be possible
> > to create APIs that are conditioned on the compiler version.
> 
> We've tried that, sort of.  See the mess surrounding the whole
> > extern/static/inline/__whatever boondoggle, which seems to have
> changed semantics in every single gcc release since 2.95 or so.
> 
> And recently mention was made that gcc4.4 will have *new* semantics
> in this area. Yee. Hah.

;-)

						Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-21 22:51                                                   ` Valdis.Kletnieks
  2007-08-22  0:50                                                     ` Paul E. McKenney
@ 2007-08-22 21:38                                                     ` Adrian Bunk
  1 sibling, 0 replies; 1546+ messages in thread
From: Adrian Bunk @ 2007-08-22 21:38 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: paulmck, Segher Boessenkool, Paul Mackerras, Russell King,
	Christoph Lameter, heiko.carstens, horms, linux-kernel, ak,
	netdev, cfriesen, akpm, rpjday, Nick Piggin, linux-arch,
	jesper.juhl, satyam, zlynx, schwidefsky, Chris Snook, Herbert Xu,
	davem, Linus Torvalds, wensong, wjiang

On Tue, Aug 21, 2007 at 06:51:16PM -0400, Valdis.Kletnieks@vt.edu wrote:
> On Tue, 21 Aug 2007 09:16:43 PDT, "Paul E. McKenney" said:
> 
> > I agree that instant gratification is hard to come by when synching
> > up compiler and kernel versions.  Nonetheless, it should be possible
> > to create APIs that are conditioned on the compiler version.
> 
> We've tried that, sort of.  See the mess surrounding the whole
> > extern/static/inline/__whatever boondoggle, which seems to have
> changed semantics in every single gcc release since 2.95 or so.
>...

There is exactly one semantics change in gcc in this area, and that is 
the change of the "extern inline" semantics in gcc 4.3 to the
C99 semantics.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
  2007-08-16  0:39           ` [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert() Satyam Sharma
@ 2007-08-24 11:59             ` Denys Vlasenko
  2007-08-24 12:07               ` Andi Kleen
                                 ` (3 more replies)
  0 siblings, 4 replies; 1546+ messages in thread
From: Denys Vlasenko @ 2007-08-24 11:59 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Heiko Carstens, Herbert Xu, Chris Snook, clameter,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Thursday 16 August 2007 01:39, Satyam Sharma wrote:
>
>  static inline void wait_for_init_deassert(atomic_t *deassert)
>  {
> -	while (!atomic_read(deassert));
> +	while (!atomic_read(deassert))
> +		cpu_relax();
>  	return;
>  }

For less-than-brilliant people like me, it's totally non-obvious that
cpu_relax() is needed for correctness here, not just to make P4 happy.

IOW: the name "atomic_read" quite unambiguously means "I will read
this variable from main memory", which is not true, and creates
potential for confusion and bugs.
--
vda

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
  2007-08-24 11:59             ` Denys Vlasenko
@ 2007-08-24 12:07               ` Andi Kleen
  2007-08-24 12:12                 ` Kenn Humborg
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 1546+ messages in thread
From: Andi Kleen @ 2007-08-24 12:07 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Satyam Sharma, Heiko Carstens, Herbert Xu, Chris Snook, clameter,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Friday 24 August 2007 13:59:32 Denys Vlasenko wrote:
> On Thursday 16 August 2007 01:39, Satyam Sharma wrote:
> >
> >  static inline void wait_for_init_deassert(atomic_t *deassert)
> >  {
> > -	while (!atomic_read(deassert));
> > +	while (!atomic_read(deassert))
> > +		cpu_relax();
> >  	return;
> >  }
> 
> For less-than-brilliant people like me, it's totally non-obvious that
> cpu_relax() is needed for correctness here, not just to make P4 happy.

I find it non-obvious too. It would really be better to have a barrier
or equivalent (a volatile access or variable clobber) in atomic_read()
 
> IOW: the name "atomic_read" quite unambiguously means "I will read
> this variable from main memory", which is not true, and creates
> potential for confusion and bugs.

Agreed.

-Andi

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
  2007-08-24 11:59             ` Denys Vlasenko
@ 2007-08-24 12:12                 ` Kenn Humborg
  2007-08-24 12:12                 ` Kenn Humborg
                                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 1546+ messages in thread
From: Kenn Humborg @ 2007-08-24 12:12 UTC (permalink / raw)
  To: Denys Vlasenko, Satyam Sharma
  Cc: Heiko Carstens, Herbert Xu, Chris Snook, clameter,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

> On Thursday 16 August 2007 01:39, Satyam Sharma wrote:
> >
> >  static inline void wait_for_init_deassert(atomic_t *deassert)
> >  {
> > -	while (!atomic_read(deassert));
> > +	while (!atomic_read(deassert))
> > +		cpu_relax();
> >  	return;
> >  }
> 
> For less-than-brilliant people like me, it's totally non-obvious that
> cpu_relax() is needed for correctness here, not just to make P4 happy.
> 
> IOW: the name "atomic_read" quite unambiguously means "I will read
> this variable from main memory", which is not true, and creates
> potential for confusion and bugs.

To me, "atomic_read" means a read which is synchronized with other 
changes to the variable (using the atomic_XXX functions) in such 
a way that I will always only see the "before" or "after"
state of the variable - never an intermediate state while a 
modification is happening.  It doesn't imply that I have to 
see the "after" state immediately after another thread modifies
it.

Perhaps the Linux atomic_XXX functions work like that, or used
to work like that, but it's counter-intuitive to me that "atomic"
should imply a memory read.

Later,
Kenn


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-18  4:13                                     ` Linus Torvalds
  2007-08-18 13:36                                       ` Satyam Sharma
  2007-08-18 21:54                                       ` Paul E. McKenney
@ 2007-08-24 12:19                                       ` Denys Vlasenko
  2007-08-24 17:19                                         ` Linus Torvalds
  2 siblings, 1 reply; 1546+ messages in thread
From: Denys Vlasenko @ 2007-08-24 12:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Satyam Sharma, Christoph Lameter, Paul E. McKenney, Herbert Xu,
	Nick Piggin, Paul Mackerras, Segher Boessenkool, heiko.carstens,
	horms, linux-kernel, rpjday, ak, netdev, cfriesen, akpm,
	jesper.juhl, linux-arch, zlynx, schwidefsky, Chris Snook, davem,
	wensong, wjiang

On Saturday 18 August 2007 05:13, Linus Torvalds wrote:
> On Sat, 18 Aug 2007, Satyam Sharma wrote:
> > No code does (or would do, or should do):
> >
> > 	x.counter++;
> >
> > on an "atomic_t x;" anyway.
>
> That's just an example of a general problem.
>
> No, you don't use "x.counter++". But you *do* use
>
> 	if (atomic_read(&x) <= 1)
>
> and loading into a register is stupid and pointless, when you could just
> do it as a regular memory-operand to the cmp instruction.

It doesn't mean that the (volatile int *) cast is bad; it means that current
gcc is bad (or "not good enough"). IOW: instead of avoiding the volatile
cast, it's better to fix the compiler.

> And as far as the compiler is concerned, the problem is the 100% same:
> combining operations with the volatile memop.
>
> The fact is, a compiler that thinks that
>
> 	movl mem,reg
> 	cmpl $val,reg
>
> is any better than
>
> 	cmpl $val,mem
>
> is just not a very good compiler.

Linus, in all honesty gcc has many more cases of suboptimal code;
the case of "volatile" is just one of many.

Off the top of my head:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417

unsigned v;
void f(unsigned A) { v = ((unsigned long long)A) * 365384439 >> (27+32); }

gcc-4.1.1 -S -Os -fomit-frame-pointer t.c

f:
        movl    $365384439, %eax
        mull    4(%esp)
        movl    %edx, %eax <===== ?
        shrl    $27, %eax
        movl    %eax, v
        ret

Why is it moving %edx to %eax?

gcc-4.2.1 -S -Os -fomit-frame-pointer t.c

f:
        movl    $365384439, %eax
        mull    4(%esp)
        movl    %edx, %eax <===== ?
        xorl    %edx, %edx <===== ??!
        shrl    $27, %eax
        movl    %eax, v
        ret

Progress... Now we also zero out %edx afterwards for no apparent reason.
--
vda

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-15 23:22         ` Paul Mackerras
  2007-08-16  0:26           ` Christoph Lameter
@ 2007-08-24 12:50           ` Denys Vlasenko
  2007-08-24 17:15             ` Christoph Lameter
  1 sibling, 1 reply; 1546+ messages in thread
From: Denys Vlasenko @ 2007-08-24 12:50 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Satyam Sharma, Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu, Paul E. McKenney

On Thursday 16 August 2007 00:22, Paul Mackerras wrote:
> Satyam Sharma writes:
> In the kernel we use atomic variables in precisely those situations
> where a variable is potentially accessed concurrently by multiple
> CPUs, and where each CPU needs to see updates done by other CPUs in a
> timely fashion.  That is what they are for.  Therefore the compiler
> must not cache values of atomic variables in registers; each
> atomic_read must result in a load and each atomic_set must result in a
> store.  Anything else will just lead to subtle bugs.

Amen.
--
vda

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
  2007-08-24 11:59             ` Denys Vlasenko
  2007-08-24 12:07               ` Andi Kleen
  2007-08-24 12:12                 ` Kenn Humborg
@ 2007-08-24 13:30               ` Satyam Sharma
  2007-08-24 17:06                 ` Christoph Lameter
  2007-08-24 16:19                 ` Luck, Tony
  3 siblings, 1 reply; 1546+ messages in thread
From: Satyam Sharma @ 2007-08-24 13:30 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Heiko Carstens, Herbert Xu, Chris Snook, clameter,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher, Nick Piggin

Hi Denys,


On Fri, 24 Aug 2007, Denys Vlasenko wrote:

> On Thursday 16 August 2007 01:39, Satyam Sharma wrote:
> >
> >  static inline void wait_for_init_deassert(atomic_t *deassert)
> >  {
> > -	while (!atomic_read(deassert));
> > +	while (!atomic_read(deassert))
> > +		cpu_relax();
> >  	return;
> >  }
> 
> For less-than-brilliant people like me, it's totally non-obvious that
> cpu_relax() is needed for correctness here, not just to make P4 happy.

This thread has been round-and-round with exactly the same discussions
:-) I had proposed few such variants to make a compiler barrier implicit
in atomic_{read,set} myself, but frankly, at least personally speaking
(now that I know better), I'm not so much in favour of implicit barriers
(compiler, memory or both) in atomic_{read,set}.

This might sound like an about-turn if you read my own postings to Nick
Piggin from a week back, but I do agree with most his opinions on the
matter now -- separation of barriers from atomic ops is actually good,
beneficial to certain code that knows what it's doing, explicit usage
of barriers stands out more clearly (most people here who deal with it
do know cpu_relax() is an explicit compiler barrier) compared to an
implicit usage in an atomic_read() or such variant ...


> IOW: "atomic_read" name quite unambiguously means "I will read
> this variable from main memory". Which is not true and creates
> potential for confusion and bugs.

I'd have to disagree here -- atomic ops are all about _atomicity_ of
memory accesses, not _making_ them happen (or visible to other CPUs)
_then and there_ itself. The latter are the job of barriers.

The behaviour (and expectations) are quite comprehensively covered in
atomic_ops.txt -- let alone atomic_{read,set}, even atomic_{inc,dec}
are permitted by archs' implementations to _not_ have any memory
barriers, for that matter. [It is unrelated that on x86 making them
SMP-safe requires the use of the LOCK prefix that also happens to be
an implicit memory barrier.]

An argument was also made about consistency of atomic_{read,set} w.r.t.
the other atomic ops -- but clearly, they are all already consistent!
All of them are atomic :-) The fact that atomic_{read,set} do _not_
require any inline asm or LOCK prefix whereas the others do, has to do
with the fact that unlike all others, atomic_{read,set} are not RMW ops
and hence guaranteed to be atomic just as they are in plain & simple C.

But if people do seem to have a mixed / confused notion of atomicity
and barriers, and if there's consensus, then as I'd said earlier, I
have no issues in going with the consensus (eg. having API variants).
Linus would be more difficult to convince, however, I suspect :-)


Satyam

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
  2007-08-24 12:12                 ` Kenn Humborg
  (?)
@ 2007-08-24 14:25                 ` Denys Vlasenko
  2007-08-24 17:34                   ` Linus Torvalds
  -1 siblings, 1 reply; 1546+ messages in thread
From: Denys Vlasenko @ 2007-08-24 14:25 UTC (permalink / raw)
  To: Kenn Humborg
  Cc: Satyam Sharma, Heiko Carstens, Herbert Xu, Chris Snook, clameter,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Friday 24 August 2007 13:12, Kenn Humborg wrote:
> > On Thursday 16 August 2007 01:39, Satyam Sharma wrote:
> > >  static inline void wait_for_init_deassert(atomic_t *deassert)
> > >  {
> > > -	while (!atomic_read(deassert));
> > > +	while (!atomic_read(deassert))
> > > +		cpu_relax();
> > >  	return;
> > >  }
> >
> > For less-than-brilliant people like me, it's totally non-obvious that
> > cpu_relax() is needed for correctness here, not just to make P4 happy.
> >
> > IOW: "atomic_read" name quite unambiguously means "I will read
> > this variable from main memory". Which is not true and creates
> > potential for confusion and bugs.
>
> To me, "atomic_read" means a read which is synchronized with other
> changes to the variable (using the atomic_XXX functions) in such
> a way that I will always only see the "before" or "after"
> state of the variable - never an intermediate state while a
> modification is happening.  It doesn't imply that I have to
> see the "after" state immediately after another thread modifies
> it.

So you are ok with compiler propagating n1 to n2 here:

n1 += atomic_read(x);
other_variable++;
n2 += atomic_read(x);

without accessing x a second time. What's the point? Any sane coder
will say that explicitly anyway:

tmp = atomic_read(x);
n1 += tmp;
other_variable++;
n2 += tmp;

if only for the sake of code readability. Because the first code
is definitely hinting that it reads RAM twice, and it's actively *bad*
for code readability when in fact it's not the case!

Locking, compiler and CPU barriers are complicated enough already,
please don't make them even harder to understand.
--
vda

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
  2007-08-24 11:59             ` Denys Vlasenko
@ 2007-08-24 16:19                 ` Luck, Tony
  2007-08-24 12:12                 ` Kenn Humborg
                                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 1546+ messages in thread
From: Luck, Tony @ 2007-08-24 16:19 UTC (permalink / raw)
  To: Denys Vlasenko, Satyam Sharma
  Cc: Heiko Carstens, Herbert Xu, Chris Snook, clameter,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

>>  static inline void wait_for_init_deassert(atomic_t *deassert)
>>  {
>> -	while (!atomic_read(deassert));
>> +	while (!atomic_read(deassert))
>> +		cpu_relax();
>>  	return;
>>  }
>
> For less-than-brilliant people like me, it's totally non-obvious that
> cpu_relax() is needed for correctness here, not just to make P4 happy.

Not just P4 ... there are other threaded cpus where it is useful to
let the core know that this is a busy loop so it would be a good thing
to let other threads have priority.

Even on a non-threaded cpu the cpu_relax() could be useful in the
future to hint to the cpu that it could drop into a lower-power
state.

But I agree with your main point that the loop without the cpu_relax()
looks like it ought to work because atomic_read() ought to actually
go out and read memory each time around the loop.

-Tony

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
@ 2007-08-24 16:19                 ` Luck, Tony
  0 siblings, 0 replies; 1546+ messages in thread
From: Luck, Tony @ 2007-08-24 16:19 UTC (permalink / raw)
  To: Denys Vlasenko, Satyam Sharma
  Cc: Heiko Carstens, Herbert Xu, Chris Snook, clameter,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

>>  static inline void wait_for_init_deassert(atomic_t *deassert)
>>  {
>> -	while (!atomic_read(deassert));
>> +	while (!atomic_read(deassert))
>> +		cpu_relax();
>>  	return;
>>  }
>
> For less-than-brilliant people like me, it's totally non-obvious that
> cpu_relax() is needed for correctness here, not just to make P4 happy.

Not just P4 ... there are other threaded cpus where it is useful to
let the core know that this is a busy loop so it would be a good thing
to let other threads have priority.

Even on a non-threaded cpu the cpu_relax() could be useful in the
future to hint to the cpu that it could drop into a lower-power
state.

But I agree with your main point that the loop without the cpu_relax()
looks like it ought to work because atomic_read() ought to actually
go out and read memory each time around the loop.

-Tony

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
  2007-08-24 13:30               ` Satyam Sharma
@ 2007-08-24 17:06                 ` Christoph Lameter
  2007-08-24 20:26                   ` Denys Vlasenko
  0 siblings, 1 reply; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-24 17:06 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Denys Vlasenko, Heiko Carstens, Herbert Xu, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher, Nick Piggin

On Fri, 24 Aug 2007, Satyam Sharma wrote:

> But if people do seem to have a mixed / confused notion of atomicity
> and barriers, and if there's consensus, then as I'd said earlier, I
> have no issues in going with the consensus (eg. having API variants).
> Linus would be more difficult to convince, however, I suspect :-)

The confusion may be the result of us having barrier semantics in 
atomic_read. If we take that out then we may avoid future confusion.


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-24 12:50           ` Denys Vlasenko
@ 2007-08-24 17:15             ` Christoph Lameter
  2007-08-24 20:21               ` Denys Vlasenko
  0 siblings, 1 reply; 1546+ messages in thread
From: Christoph Lameter @ 2007-08-24 17:15 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu, Paul E. McKenney

On Fri, 24 Aug 2007, Denys Vlasenko wrote:

> On Thursday 16 August 2007 00:22, Paul Mackerras wrote:
> > Satyam Sharma writes:
> > In the kernel we use atomic variables in precisely those situations
> > where a variable is potentially accessed concurrently by multiple
> > CPUs, and where each CPU needs to see updates done by other CPUs in a
> > timely fashion.  That is what they are for.  Therefore the compiler
> > must not cache values of atomic variables in registers; each
> > atomic_read must result in a load and each atomic_set must result in a
> > store.  Anything else will just lead to subtle bugs.
> 
> Amen.

A "timely" fashion? One cannot rely on something like that when coding. 
The visibility of updates is ensured by barriers and not by some fuzzy 
notion of "timeliness".

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-24 12:19                                       ` Denys Vlasenko
@ 2007-08-24 17:19                                         ` Linus Torvalds
  0 siblings, 0 replies; 1546+ messages in thread
From: Linus Torvalds @ 2007-08-24 17:19 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Satyam Sharma, Christoph Lameter, Paul E. McKenney, Herbert Xu,
	Nick Piggin, Paul Mackerras, Segher Boessenkool, heiko.carstens,
	horms, linux-kernel, rpjday, ak, netdev, cfriesen, akpm,
	jesper.juhl, linux-arch, zlynx, schwidefsky, Chris Snook, davem,
	wensong, wjiang



On Fri, 24 Aug 2007, Denys Vlasenko wrote:
>
> > No, you don't use "x.counter++". But you *do* use
> >
> > 	if (atomic_read(&x) <= 1)
> >
> > and loading into a register is stupid and pointless, when you could just
> > do it as a regular memory-operand to the cmp instruction.
> 
> It doesn't mean that (volatile int*) cast is bad, it means that current gcc
> is bad (or "not good enough"). IOW: instead of avoiding volatile cast,
> it's better to fix the compiler.

I would agree that fixing the compiler in this case would be a good thing, 
even quite regardless of any "atomic_read()" discussion.

I just have a strong suspicion that "volatile" performance is so low down 
the list of any C compiler persons interest, that it's never going to 
happen. And quite frankly, I cannot blame the gcc guys for it.

That's especially as "volatile" really isn't a very good feature of the C 
language, and is likely to get *less* interesting rather than more (as 
user space starts to be more and more threaded, "volatile" gets less and 
less useful).

[ Ie, currently, I think you can validly use "volatile" in a "sigatomic_t" 
  kind of way, where there is a single thread, but with asynchronous 
  events. In that kind of situation, I think it's probably useful. But 
  once you get multiple threads, it gets pointless.

  Sure: you could use "volatile" together with something like Dekker's or 
  Peterson's algorithm that doesn't depend on cache coherency (that's 
  basically what the C "volatile" keyword approximates: not atomic 
  accesses, but *uncached* accesses!) But let's face it, that's way past 
  insane. ]

So I wouldn't expect "volatile" to ever really generate better code. It 
might happen as a side effect of other improvements (eg, I might hope that 
the SSA work would eventually lead to gcc having a much better defined 
model of valid optimizations, and maybe better code generation for 
volatile accesses fall out cleanly out of that), but in the end, it's such 
an ugly special case in C, and so seldom used, that I wouldn't depend on 
it.

> Linus, in all honesty gcc has many more cases of suboptimal code,
> case of "volatile" is just one of many.

Well, the thing is, quite often, many of those "suboptimal code" 
generations fall into two distinct classes:

 - complex C code. I can't really blame the compiler too much for this. 
   Some things are *hard* to optimize, and for various scalability 
   reasons, you often end up having limits in the compiler where it 
   doesn't even _try_ doing certain optimizations if you have excessive 
   complexity.

 - bad register allocation. Register allocation really is hard, and 
   sometimes gcc just does the "obviously wrong" thing, and you end up 
   having totally unnecessary spills.

> Off the top of my head:

Yes, "unsigned long long" with x86 has always generated atrocious code. In 
fact, I would say that historically it was really *really* bad. These 
days, gcc actually does a pretty good job, but I'm not surprised that it's 
still quite possible to find cases where it did some optimization (in this 
case, apparently noticing that "shift by >= 32 bits" causes the low 
register to be pointless) and then missed *another* optimization (better 
register use) because that optimization had been done *before* the first 
optimization was done.

That's a *classic* example of compiler code generation issues, and quite 
frankly, I think that's very different from the issue of "volatile".

Quite frankly, I'd like there to be more competition in the open source 
compiler game, and that might cause some upheavals, but on the whole, gcc 
actually does a pretty damn good job. 

			Linus

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
  2007-08-24 14:25                 ` Denys Vlasenko
@ 2007-08-24 17:34                   ` Linus Torvalds
  0 siblings, 0 replies; 1546+ messages in thread
From: Linus Torvalds @ 2007-08-24 17:34 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Kenn Humborg, Satyam Sharma, Heiko Carstens, Herbert Xu,
	Chris Snook, clameter, Linux Kernel Mailing List, linux-arch,
	netdev, Andrew Morton, ak, davem, schwidefsky, wensong, horms,
	wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher



On Fri, 24 Aug 2007, Denys Vlasenko wrote:
> 
> So you are ok with compiler propagating n1 to n2 here:
> 
> n1 += atomic_read(x);
> other_variable++;
> n2 += atomic_read(x);
> 
> without accessing x second time. What's the point? Any sane coder
> will say that explicitly anyway:

No.

This is a common mistake, and it's total crap.

Any "sane coder" will often use inline functions, macros, etc helpers to 
do certain abstract things. Those things may contain "atomic_read()" 
calls.

The biggest reason for compilers doing CSE is exactly the fact that many 
opportunities for CSE simply *are*not*visible* on a source code level. 

That is true of things like atomic_read() equally as to things like shared 
offsets inside structure member accesses. No difference what-so-ever.

Yes, we have, traditionally, tried to make it *easy* for the compiler to 
generate good code. So when we can, and when we look at performance for 
some really hot path, we *will* write the source code so that the compiler 
doesn't even have the option to screw it up, and that includes things like 
doing CSE at a source code level so that we don't see the compiler 
re-doing accesses unnecessarily.

And I'm not saying we shouldn't do that. But "performance" is not an 
either-or kind of situation, and we should:

 - spend the time at a source code level: make it reasonably easy for the 
   compiler to generate good code, and use the right algorithms at a 
   higher level (and order structures etc so that they have good cache 
   behaviour).

 - .. *and* expect the compiler to handle the cases we didn't do by hand
   pretty well anyway. In particular, quite often, abstraction levels at a 
   software level means that we give compilers "stupid" code, because some 
   function may have a certain high-level abstraction rule, but then on a 
   particular architecture it's actually a no-op, and the compiler should 
   get to "untangle" our stupid code and generate good end results.

 - .. *and* expect the hardware to be sane and do a good job even when the 
   compiler didn't generate perfect code or there were unlucky cache miss
   patterns etc.

and if we do all of that, we'll get good performance. But you really do 
want all three levels. It's not enough to be good at any one level (or 
even any two).

			Linus

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-24 17:15             ` Christoph Lameter
@ 2007-08-24 20:21               ` Denys Vlasenko
  0 siblings, 0 replies; 1546+ messages in thread
From: Denys Vlasenko @ 2007-08-24 20:21 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu, Paul E. McKenney

On Friday 24 August 2007 18:15, Christoph Lameter wrote:
> On Fri, 24 Aug 2007, Denys Vlasenko wrote:
> > On Thursday 16 August 2007 00:22, Paul Mackerras wrote:
> > > Satyam Sharma writes:
> > > In the kernel we use atomic variables in precisely those situations
> > > where a variable is potentially accessed concurrently by multiple
> > > CPUs, and where each CPU needs to see updates done by other CPUs in a
> > > timely fashion.  That is what they are for.  Therefore the compiler
> > > must not cache values of atomic variables in registers; each
> > > atomic_read must result in a load and each atomic_set must result in a
> > > store.  Anything else will just lead to subtle bugs.
> >
> > Amen.
>
> A "timely" fashion? One cannot rely on something like that when coding.
> The visibility of updates is ensured by barriers and not by some fuzzy
> notion of "timeliness".

But here you do have some notion of time:

	while (atomic_read(&x))
		continue;

"continue when other CPU(s) decrement it down to zero".
If "read" includes an insn which accesses RAM, you will
see the "new" value sometime after the other CPU decrements it.
"Sometime after" is on the order of nanoseconds here.
It is a valid concept of time, right?

The whole confusion is about whether atomic_read implies
"read from RAM" or not. I am in a camp which thinks it does.
You are in an opposite one.

We just need a less ambiguous name.

What about this:

/**
 * atomic_read - read atomic variable
 * @v: pointer of type atomic_t
 *
 * Atomically reads the value of @v.
 * No compiler barrier implied.
 */
#define atomic_read(v)          ((v)->counter)

+/**
+ * atomic_read_uncached - read atomic variable from memory
+ * @v: pointer of type atomic_t
+ *
+ * Atomically reads the value of @v. This is guaranteed to emit an insn
+ * which accesses memory, atomically. No ordering guarantees!
+ */
+#define atomic_read_uncached(v)  asm_or_volatile_ptr_magic(v)

I was thinking of s/atomic_read/atomic_get/ too, but it implies "taking"
atomic a-la get_cpu()...
--
vda

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
  2007-08-24 17:06                 ` Christoph Lameter
@ 2007-08-24 20:26                   ` Denys Vlasenko
  2007-08-24 20:34                     ` Chris Snook
  0 siblings, 1 reply; 1546+ messages in thread
From: Denys Vlasenko @ 2007-08-24 20:26 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Satyam Sharma, Heiko Carstens, Herbert Xu, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher, Nick Piggin

On Friday 24 August 2007 18:06, Christoph Lameter wrote:
> On Fri, 24 Aug 2007, Satyam Sharma wrote:
> > But if people do seem to have a mixed / confused notion of atomicity
> > and barriers, and if there's consensus, then as I'd said earlier, I
> > have no issues in going with the consensus (eg. having API variants).
> > Linus would be more difficult to convince, however, I suspect :-)
>
> atomic_read. If we take that out then we may avoid future confusions.

I think better name may help. Nuke atomic_read() altogether.

n = atomic_value(x);	// doesn't hint as strongly at reading as "atomic_read"
n = atomic_fetch(x);	// yes, we _do_ touch RAM
n = atomic_read_uncached(x); // or this

How does that sound?
--
vda

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
  2007-08-24 20:26                   ` Denys Vlasenko
@ 2007-08-24 20:34                     ` Chris Snook
  0 siblings, 0 replies; 1546+ messages in thread
From: Chris Snook @ 2007-08-24 20:34 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Christoph Lameter, Satyam Sharma, Heiko Carstens, Herbert Xu,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher, Nick Piggin

Denys Vlasenko wrote:
> On Friday 24 August 2007 18:06, Christoph Lameter wrote:
>> On Fri, 24 Aug 2007, Satyam Sharma wrote:
>>> But if people do seem to have a mixed / confused notion of atomicity
>>> and barriers, and if there's consensus, then as I'd said earlier, I
>>> have no issues in going with the consensus (eg. having API variants).
>>> Linus would be more difficult to convince, however, I suspect :-)
>> The confusion may be the result of us having barrier semantics in
>> atomic_read. If we take that out then we may avoid future confusions.
> 
> I think better name may help. Nuke atomic_read() altogether.
> 
> n = atomic_value(x);	// doesnt hint as strongly at reading as "atomic_read"
> n = atomic_fetch(x);	// yes, we _do_ touch RAM
> n = atomic_read_uncached(x); // or this
> 
> How does that sound?

atomic_value() vs. atomic_fetch() should be rather unambiguous. 
atomic_read_uncached() begs the question of precisely which cache we are 
avoiding, and could itself cause confusion.

So, if I were writing atomic.h from scratch, knowing what I know now, I think 
I'd use atomic_value() and atomic_fetch().  The problem is that there are a lot 
of existing users of atomic_read(), and we can't write a script to correctly 
guess their intent.  I'm not sure auditing all uses of atomic_read() is really 
worth the comparatively minuscule benefits.

We could play it safe and convert them all to atomic_fetch(), or we could 
acknowledge that changing the semantics 8 months ago was not at all disastrous, 
and make them all atomic_value(), allowing people to use atomic_fetch() where 
they really care.

	-- Chris

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 16:48                                                             ` Linus Torvalds
  2007-08-17 18:50                                                               ` Chris Friesen
  2007-08-20 13:15                                                               ` Chris Snook
@ 2007-09-09 18:02                                                               ` Denys Vlasenko
  2007-09-09 18:18                                                                 ` Arjan van de Ven
  2 siblings, 1 reply; 1546+ messages in thread
From: Denys Vlasenko @ 2007-09-09 18:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
	Christoph Lameter, Chris Snook, Ilpo Jarvinen, Paul E. McKenney,
	Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
	Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
	wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
	segher

On Friday 17 August 2007 17:48, Linus Torvalds wrote:
> 
> On Fri, 17 Aug 2007, Nick Piggin wrote:
> > 
> > That's not obviously just taste to me. Not when the primitive has many
> > (perhaps, the majority) of uses that do not require said barriers. And
> > this is not solely about the code generation (which, as Paul says, is
> > relatively minor even on x86). I prefer people to think explicitly
> > about barriers in their lockless code.
> 
> Indeed.
> 
> I think the important issues are:
> 
>  - "volatile" itself is simply a badly/weakly defined issue. The semantics 
>    of it as far as the compiler is concerned are really not very good, and 
>    in practice tends to boil down to "I will generate so bad code that 
>    nobody can accuse me of optimizing anything away".
> 
>  - "volatile" - regardless of how well or badly defined it is - is purely 
>    a compiler thing. It has absolutely no meaning for the CPU itself, so 
>    it at no point implies any CPU barriers. As a result, even if the 
>    compiler generates crap code and doesn't re-order anything, there's 
>    nothing that says what the CPU will do.
> 
>  - in other words, the *only* possible meaning for "volatile" is a purely 
>    single-CPU meaning. And if you only have a single CPU involved in the 
>    process, the "volatile" is by definition pointless (because even 
>    without a volatile, the compiler is required to make the C code appear 
>    consistent as far as a single CPU is concerned).
> 
> So, let's take the example *buggy* code where we use "volatile" to wait 
> for other CPU's:
> 
> 	atomic_set(&var, 0);
> 	while (!atomic_read(&var))
> 		/* nothing */;
> 
> 
> which generates an endless loop if we don't have atomic_read() imply 
> volatile.
> 
> The point here is that it's buggy whether the volatile is there or not! 
> Exactly because the user expects multi-processing behaviour, but 
> "volatile" doesn't actually give any real guarantees about it. Another CPU 
> may have done:
> 
> 	external_ptr = kmalloc(..);
> 	/* Setup is now complete, inform the waiter */
> 	atomic_inc(&var);
> 
> but the fact is, since the other CPU isn't serialized in any way, the 
> "while-loop" (even in the presence of "volatile") doesn't actually work 
> right! Whatever the "atomic_read()" was waiting for may not have 
> completed, because we have no barriers!

Why is all this fixation on "volatile"? I don't think
people want "volatile" keyword per se, they want atomic_read(&x) to
_always_ compile into a memory-accessing instruction, not register access.
--
vda

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-09 18:02                                                               ` Denys Vlasenko
@ 2007-09-09 18:18                                                                 ` Arjan van de Ven
  2007-09-10 10:56                                                                   ` Denys Vlasenko
  0 siblings, 1 reply; 1546+ messages in thread
From: Arjan van de Ven @ 2007-09-09 18:18 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Herbert Xu,
	Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
	Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
	David Miller, schwidefsky, wensong, horms, wjiang, cfriesen,
	zlynx, rpjday, jesper.juhl, segher

On Sun, 9 Sep 2007 19:02:54 +0100
Denys Vlasenko <vda.linux@googlemail.com> wrote:

> Why is all this fixation on "volatile"? I don't think
> people want "volatile" keyword per se, they want atomic_read(&x) to
> _always_ compile into a memory-accessing instruction, not register
> access.

and ... why is that?
is there any valid, non-buggy code sequence that makes that a
reasonable requirement?

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-09 18:18                                                                 ` Arjan van de Ven
@ 2007-09-10 10:56                                                                   ` Denys Vlasenko
  2007-09-10 11:15                                                                     ` Herbert Xu
                                                                                       ` (2 more replies)
  0 siblings, 3 replies; 1546+ messages in thread
From: Denys Vlasenko @ 2007-09-10 10:56 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Herbert Xu,
	Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
	Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
	David Miller, schwidefsky, wensong, horms, wjiang, cfriesen,
	zlynx, rpjday, jesper.juhl, segher

On Sunday 09 September 2007 19:18, Arjan van de Ven wrote:
> On Sun, 9 Sep 2007 19:02:54 +0100
> Denys Vlasenko <vda.linux@googlemail.com> wrote:
> 
> > Why is all this fixation on "volatile"? I don't think
> > people want "volatile" keyword per se, they want atomic_read(&x) to
> > _always_ compile into a memory-accessing instruction, not register
> > access.
> 
> and ... why is that?
> is there any valid, non-buggy code sequence that makes that a
> reasonable requirement?

Well, if you insist on having it again:

Waiting for atomic value to be zero:

        while (atomic_read(&x))
                continue;

gcc may happily convert it into:

        reg = atomic_read(&x);
        while (reg)
                continue;

Expecting every driver writer to remember that atomic_read is not in fact
a "read from memory" is naive. That won't happen. Face it, the majority of
driver authors are a bit less talented than Ingo Molnar or Arjan van de Ven ;)
The name of the macro is saying that it's a read.
We are confusing users here.

It's doubly confusing that cpu_relax(), which says _nothing_ about barriers
in its name, is actually a barrier you need to insert here.
--
vda

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-10 10:56                                                                   ` Denys Vlasenko
@ 2007-09-10 11:15                                                                     ` Herbert Xu
  2007-09-10 12:22                                                                     ` Kyle Moffett
  2007-09-10 14:51                                                                     ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Arjan van de Ven
  2 siblings, 0 replies; 1546+ messages in thread
From: Herbert Xu @ 2007-09-10 11:15 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Arjan van de Ven, Linus Torvalds, Nick Piggin, Satyam Sharma,
	Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
	Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
	David Miller, schwidefsky, wensong, horms, wjiang, cfriesen,
	zlynx, rpjday, jesper.juhl, segher

On Mon, Sep 10, 2007 at 11:56:29AM +0100, Denys Vlasenko wrote:
> 
> Expecting every driver writer to remember that atomic_read() is not in fact
> a "read from memory" is naive. That won't happen. Face it, the majority of
> driver authors are a bit less talented than Ingo Molnar or Arjan van de Ven ;)
> The name of the macro says that it's a read.
> We are confusing users here.

For driver authors who're too busy to learn the intricacies
of atomic operations, we have the plain old spin lock which
then lets you use normal data structures such as u32 safely.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-10 10:56                                                                   ` Denys Vlasenko
  2007-09-10 11:15                                                                     ` Herbert Xu
@ 2007-09-10 12:22                                                                     ` Kyle Moffett
  2007-09-10 13:38                                                                       ` Denys Vlasenko
  2007-09-10 14:51                                                                     ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Arjan van de Ven
  2 siblings, 1 reply; 1546+ messages in thread
From: Kyle Moffett @ 2007-09-10 12:22 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Arjan van de Ven, Linus Torvalds, Nick Piggin, Satyam Sharma,
	Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
	Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Sep 10, 2007, at 06:56:29, Denys Vlasenko wrote:
> On Sunday 09 September 2007 19:18, Arjan van de Ven wrote:
>> On Sun, 9 Sep 2007 19:02:54 +0100
>> Denys Vlasenko <vda.linux@googlemail.com> wrote:
>>
>>> Why is all this fixation on "volatile"? I don't think people want  
>>> the "volatile" keyword per se, they want atomic_read(&x) to _always_  
>>> compile into a memory-accessing instruction, not a register access.
>>
>> and ... why is that?  is there any valid, non-buggy code sequence  
>> that makes that a reasonable requirement?
>
> Well, if you insist on having it again:
>
> Waiting for atomic value to be zero:
>
>         while (atomic_read(&x))
>                 continue;
>
> gcc may happily convert it into:
>
>         reg = atomic_read(&x);
>         while (reg)
>                 continue;

Bzzt.  Even if you fixed gcc to actually convert it to a busy loop on
a memory variable, you STILL HAVE A BUG, as it may *NOT* be gcc that
does the conversion; it may be the CPU that does the caching of the
memory value.  GCC has no mechanism to do cache-flushes or
memory-barriers except through our custom inline assembly.  Also, you
probably want a cpu_relax() in there somewhere to avoid overheating
the CPU.  Thirdly, on a large system it may take some arbitrarily
large amount of time for cache-propagation to update the value of the
variable in your local CPU cache.  Finally, if atomics are based on
spinlock+interrupt-disable then you will sit in a tight busy-loop of
spin_lock_irqsave()->spin_unlock_irqrestore().  Depending on your
system's internal model this may practically lock up your core
because the spin_lock() will take the cacheline for exclusive access
and doing that in a loop can prevent any other CPU from doing any
operation on it!  Since your IRQs are disabled, you don't even have
the small window in which an IRQ could come along and free it up long
enough for the update to take place.

The earlier code segment of:
> while(atomic_read(&x) > 0)
> 	atomic_dec(&x);
is *completely* buggy because you could very easily have 4 CPUs doing  
this on an atomic variable with a value of 1 and end up with it at  
negative 3 by the time you are done.  Moreover all the alternatives  
are also buggy, with the sole exception of this rather obvious- 
seeming one:
> atomic_set(&x, 0);

You simply CANNOT use an atomic_t as your sole synchronizing  
primitive, it doesn't work!  You virtually ALWAYS want to use an  
atomic_t in the following types of situations:

(A) As an object refcount.  The value is never read except as part of  
an atomic_dec_return().  Why aren't you using "struct kref"?

(B) As an atomic value counter (number of processes, for example).   
Just "reading" the value is racy anyways, if you want to enforce a  
limit or something then use atomic_inc_return(), check the result,  
and use atomic_dec() if it's too big.  If you just want to return the  
statistics then you are going to be instantaneous-point-in-time anyways.

(C) As an optimization value (statistics-like, but exact accuracy  
isn't important).

Atomics are NOT A REPLACEMENT for the proper kernel subsystems, like
completions, mutexes, semaphores, spinlocks, krefs, etc.  They're not
useful for synchronization, only for keeping track of simple integer
RMW values.  Note that atomic_read() and atomic_set() aren't very
useful RMW primitives (read-nomodify-nowrite and
read-set-zero-write).  Code which assumes anything else is probably
buggy in other ways too.

So while I see no real reason for the "volatile" on the atomics, I  
also see no real reason why it's terribly harmful.  Regardless of the  
"volatile" on the operation the CPU is perfectly happy to cache it  
anyways so it doesn't buy you any actual "always-access-memory"  
guarantees.  If you are just interested in it as an optimization you  
could probably just read the properly-aligned integer counter  
directly, an atomic read on most CPUs.

If you really need it to hit main memory *every* *single* *time*  
(Why?  Are you using it instead of the proper kernel subsystem?)   
then you probably need a custom inline assembly helper anyways.

Cheers,
Kyle Moffett


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-10 12:22                                                                     ` Kyle Moffett
@ 2007-09-10 13:38                                                                       ` Denys Vlasenko
  2007-09-10 14:16                                                                         ` Denys Vlasenko
  0 siblings, 1 reply; 1546+ messages in thread
From: Denys Vlasenko @ 2007-09-10 13:38 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Arjan van de Ven, Linus Torvalds, Nick Piggin, Satyam Sharma,
	Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
	Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Monday 10 September 2007 13:22, Kyle Moffett wrote:
> On Sep 10, 2007, at 06:56:29, Denys Vlasenko wrote:
> > On Sunday 09 September 2007 19:18, Arjan van de Ven wrote:
> >> On Sun, 9 Sep 2007 19:02:54 +0100
> >> Denys Vlasenko <vda.linux@googlemail.com> wrote:
> >>
> >>> Why is all this fixation on "volatile"? I don't think people want  
> >>> the "volatile" keyword per se, they want atomic_read(&x) to _always_  
> >>> compile into a memory-accessing instruction, not a register access.
> >>
> >> and ... why is that?  is there any valid, non-buggy code sequence  
> >> that makes that a reasonable requirement?
> >
> > Well, if you insist on having it again:
> >
> > Waiting for atomic value to be zero:
> >
> >         while (atomic_read(&x))
> >                 continue;
> >
> > gcc may happily convert it into:
> >
> >         reg = atomic_read(&x);
> >         while (reg)
> >                 continue;
> 
> Bzzt.  Even if you fixed gcc to actually convert it to a busy loop on  
> a memory variable, you STILL HAVE A BUG as it may *NOT* be gcc that  
> does the conversion, it may be that the CPU does the caching of the  
> memory value.  GCC has no mechanism to do cache-flushes or memory- 
> barriers except through our custom inline assembly.

The CPU can cache the value all right, but it cannot use that cached value
*forever*; it has to react to invalidate cycles on the shared bus
and re-fetch new data.

IOW: an atomic_read(&x) which compiles down to a memory accessor
will work properly.

> the CPU.  Thirdly, on a large system it may take some arbitrarily  
> large amount of time for cache-propagation to update the value of the  
> variable in your local CPU cache.

Yes, but "arbitrarily large amount of time" is actually measured
in nanoseconds here. Let's say 1000ns max for hundreds of CPUs?

> Also, you   
> probably want a cpu_relax() in there somewhere to avoid overheating  
> the CPU.

Yes, but 
1. The CPU shouldn't overheat (in the sense that it gets damaged),
   it will only use more power than needed.
2. cpu_relax() just throttles down my CPU, so it's a performance
   optimization only. Wait, it isn't, it's a barrier too.
   Wow, "cpu_relax" is a barrier? How am I supposed to know
   that without reading lkml flamewars and/or header files?

Let's try reading headers. asm-x86_64/processor.h:

#define cpu_relax()   rep_nop()

So, is it a barrier? No clue yet.

/* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
static inline void rep_nop(void)
{
        __asm__ __volatile__("rep;nop": : :"memory");
}

The comment explicitly says that it is "a good thing" (it doesn't say
that it is mandatory) and says NOTHING about barriers!

Barrier-ness is not mentioned and is hidden in "memory" clobber.

Do you think it's obvious enough for the average driver writer?
I think not, especially since it's unlikely they would even start
suspecting that it is a memory barrier based on the "cpu_relax"
name.

> You simply CANNOT use an atomic_t as your sole synchronizing
> primitive, it doesn't work!  You virtually ALWAYS want to use an  
> atomic_t in the following types of situations:
> 
> (A) As an object refcount.  The value is never read except as part of  
> an atomic_dec_return().  Why aren't you using "struct kref"?
> 
> (B) As an atomic value counter (number of processes, for example).   
> Just "reading" the value is racy anyways, if you want to enforce a  
> limit or something then use atomic_inc_return(), check the result,  
> and use atomic_dec() if it's too big.  If you just want to return the  
> statistics then you are going to be instantaneous-point-in-time anyways.
> 
> (C) As an optimization value (statistics-like, but exact accuracy  
> isn't important).
> 
> Atomics are NOT A REPLACEMENT for the proper kernel subsystems, like
> completions, mutexes, semaphores, spinlocks, krefs, etc.  They're not
> useful for synchronization, only for keeping track of simple integer
> RMW values.  Note that atomic_read() and atomic_set() aren't very
> useful RMW primitives (read-nomodify-nowrite and
> read-set-zero-write).  Code which assumes anything else is probably
> buggy in other ways too.

You are basically trying to educate me on how to use atomics properly.
You don't need to, as I am (currently) not a driver author.

I am saying that people who are already using atomic_read()
(and who unfortunately did not read your explanation above)
will still sometimes use atomic_read() as a way to read an atomic value
*from memory*, and will create nasty heisenbugs for you to debug.
--
vda

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-10 13:38                                                                       ` Denys Vlasenko
@ 2007-09-10 14:16                                                                         ` Denys Vlasenko
  2007-09-10 15:09                                                                           ` Linus Torvalds
  0 siblings, 1 reply; 1546+ messages in thread
From: Denys Vlasenko @ 2007-09-10 14:16 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Arjan van de Ven, Linus Torvalds, Nick Piggin, Satyam Sharma,
	Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
	Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Monday 10 September 2007 14:38, Denys Vlasenko wrote:
> You are basically trying to educate me on how to use atomics properly.
> You don't need to, as I am (currently) not a driver author.
> 
> I am saying that people who are already using atomic_read()
> (and who unfortunately did not read your explanation above)
> will still sometimes use atomic_read() as a way to read an atomic value
> *from memory*, and will create nasty heisenbugs for you to debug.

static inline int
qla2x00_wait_for_loop_ready(scsi_qla_host_t *ha)
{
        int      return_status = QLA_SUCCESS;
        unsigned long loop_timeout ;
        scsi_qla_host_t *pha = to_qla_parent(ha);

        /* wait for 5 min at the max for loop to be ready */
        loop_timeout = jiffies + (MAX_LOOP_TIMEOUT * HZ);

        while ((!atomic_read(&pha->loop_down_timer) &&
            atomic_read(&pha->loop_state) == LOOP_DOWN) ||
            atomic_read(&pha->loop_state) != LOOP_READY) {
                if (atomic_read(&pha->loop_state) == LOOP_DEAD) {
                        return_status = QLA_FUNCTION_FAILED;
                        break;
                }
                msleep(1000);
                if (time_after_eq(jiffies, loop_timeout)) {
                        return_status = QLA_FUNCTION_FAILED;
                        break;
                }
        }
        return (return_status);
}

Is above correct or buggy? Correct, because msleep is a barrier.
Is it obvious? No.

static void
qla2x00_rst_aen(scsi_qla_host_t *ha)
{
        if (ha->flags.online && !ha->flags.reset_active &&
            !atomic_read(&ha->loop_down_timer) &&
            !(test_bit(ABORT_ISP_ACTIVE, &ha->dpc_flags))) {
                do {
                        clear_bit(RESET_MARKER_NEEDED, &ha->dpc_flags);

                        /*
                         * Issue marker command only when we are going to start
                         * the I/O.
                         */
                        ha->marker_needed = 1;
                } while (!atomic_read(&ha->loop_down_timer) &&
                    (test_bit(RESET_MARKER_NEEDED, &ha->dpc_flags)));
        }
}

Is the above correct? I honestly don't know. Correct, because set_bit is
a barrier on _all_ _memory_? Will it break if set_bit is changed
to be a barrier only on its operand? Probably yes.

drivers/kvm/kvm_main.c

        while (atomic_read(&completed) != needed) {
                cpu_relax();
                barrier();
        }

Obviously the author did not know that cpu_relax() is already a barrier.
See why I think driver authors will be confused?

arch/x86_64/kernel/crash.c

static void nmi_shootdown_cpus(void)
{
...
        msecs = 1000; /* Wait at most a second for the other cpus to stop */
        while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) {
                mdelay(1);
                msecs--;
        }
...
}

Is mdelay(1) a barrier? Yes, because it is a function on x86_64.
Exactly the same code will be buggy on an arch where
mdelay(1) == udelay(1000), and udelay is implemented
as an inline busy-wait.

arch/sparc64/kernel/smp.c

        /* Wait for response */
        while (atomic_read(&data.finished) != cpus)
                cpu_relax();
...later in the same file...
                while (atomic_read(&smp_capture_registry) != ncpus)
                        rmb();

I'm confused. Do we need cpu_relax() or rmb()? Does cpu_relax() imply rmb()?
(No, it doesn't.) Which of those two while loops needs correcting?
--
vda

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-10 14:51                                                                     ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Arjan van de Ven
@ 2007-09-10 14:38                                                                       ` Denys Vlasenko
  2007-09-10 17:02                                                                         ` Arjan van de Ven
  0 siblings, 1 reply; 1546+ messages in thread
From: Denys Vlasenko @ 2007-09-10 14:38 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Herbert Xu,
	Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
	Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
	David Miller, schwidefsky, wensong, horms, wjiang, cfriesen,
	zlynx, rpjday, jesper.juhl, segher

On Monday 10 September 2007 15:51, Arjan van de Ven wrote:
> On Mon, 10 Sep 2007 11:56:29 +0100
> Denys Vlasenko <vda.linux@googlemail.com> wrote:
> 
> > 
> > Well, if you insist on having it again:
> > 
> > Waiting for atomic value to be zero:
> > 
> >         while (atomic_read(&x))
> >                 continue;
> > 
> 
> and this I would say is buggy code all the way.
>
> Not from a pure C level semantics, but from a "busy waiting is buggy"
> semantics level and a "I'm inventing my own locking" semantics level.

After inspecting arch/*, I cannot agree with you.
By that standard, almost all major architectures use
"conceptually buggy busy-waiting":

arch/alpha
arch/i386
arch/ia64
arch/m32r
arch/mips
arch/parisc
arch/powerpc
arch/sh
arch/sparc64
arch/um
arch/x86_64

All of the above contain busy-waiting on atomic_read.

Including these loops without barriers:

arch/mips/kernel/smtc.c
			while (atomic_read(&idle_hook_initialized) < 1000)
				;
arch/mips/sgi-ip27/ip27-nmi.c
	while (atomic_read(&nmied_cpus) != num_online_cpus());

[Well maybe num_online_cpus() is a barrier, I didn't check]

arch/sh/kernel/smp.c
	if (wait)
		while (atomic_read(&smp_fn_call.finished) != (nr_cpus - 1));

Bugs?
--
vda

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-10 10:56                                                                   ` Denys Vlasenko
  2007-09-10 11:15                                                                     ` Herbert Xu
  2007-09-10 12:22                                                                     ` Kyle Moffett
@ 2007-09-10 14:51                                                                     ` Arjan van de Ven
  2007-09-10 14:38                                                                       ` Denys Vlasenko
  2 siblings, 1 reply; 1546+ messages in thread
From: Arjan van de Ven @ 2007-09-10 14:51 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Herbert Xu,
	Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
	Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
	David Miller, schwidefsky, wensong, horms, wjiang, cfriesen,
	zlynx, rpjday, jesper.juhl, segher

On Mon, 10 Sep 2007 11:56:29 +0100
Denys Vlasenko <vda.linux@googlemail.com> wrote:

> 
> Well, if you insist on having it again:
> 
> Waiting for atomic value to be zero:
> 
>         while (atomic_read(&x))
>                 continue;
> 

and this I would say is buggy code all the way.

Not from a pure C level semantics, but from a "busy waiting is buggy"
semantics level and a "I'm inventing my own locking" semantics level.


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-10 14:16                                                                         ` Denys Vlasenko
@ 2007-09-10 15:09                                                                           ` Linus Torvalds
  2007-09-10 16:46                                                                             ` Denys Vlasenko
                                                                                               ` (2 more replies)
  0 siblings, 3 replies; 1546+ messages in thread
From: Linus Torvalds @ 2007-09-10 15:09 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Kyle Moffett, Arjan van de Ven, Nick Piggin, Satyam Sharma,
	Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
	Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher


On Mon, 10 Sep 2007, Denys Vlasenko wrote:
> 
> static inline int
> qla2x00_wait_for_loop_ready(scsi_qla_host_t *ha)
> {
>         int      return_status = QLA_SUCCESS;
>         unsigned long loop_timeout ;
>         scsi_qla_host_t *pha = to_qla_parent(ha);
> 
>         /* wait for 5 min at the max for loop to be ready */
>         loop_timeout = jiffies + (MAX_LOOP_TIMEOUT * HZ);
> 
>         while ((!atomic_read(&pha->loop_down_timer) &&
>             atomic_read(&pha->loop_state) == LOOP_DOWN) ||
>             atomic_read(&pha->loop_state) != LOOP_READY) {
>                 if (atomic_read(&pha->loop_state) == LOOP_DEAD) {
...
> Is above correct or buggy? Correct, because msleep is a barrier.
> Is it obvious? No.

It's *buggy*. But it has nothing to do with any msleep() in the loop, or 
anything else.

And more importantly, it would be equally buggy even *with* a "volatile" 
atomic_read().

Why is this so hard for people to understand? You're all acting like 
morons.

The reason it is buggy has absolutely nothing to do with whether the read 
is done or not, it has to do with the fact that the CPU may re-order the 
reads *regardless* of whether the read is done in some specific order by 
the compiler ot not! In effect, there is zero ordering between all those 
three reads, and if you don't have memory barriers (or a lock or other 
serialization), that code is buggy.

So stop this idiotic discussion thread already. The above kind of code 
needs memory barriers to be non-buggy. The whole "volatile or not" 
discussion is totally idiotic, and pointless, and anybody who doesn't 
understand that by now needs to just shut up and think about it more, 
rather than make this discussion drag out even further.

The fact is, "volatile" *only* makes things worse. It generates worse 
code, and never fixes any real bugs. This is a *fact*.

			Linus

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-10 15:09                                                                           ` Linus Torvalds
@ 2007-09-10 16:46                                                                             ` Denys Vlasenko
  2007-09-10 19:59                                                                               ` Kyle Moffett
  2007-09-10 18:59                                                                             ` Christoph Lameter
  2007-09-10 23:19                                                                             ` [PATCH] Document non-semantics of atomic_read() and atomic_set() Chris Snook
  2 siblings, 1 reply; 1546+ messages in thread
From: Denys Vlasenko @ 2007-09-10 16:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kyle Moffett, Arjan van de Ven, Nick Piggin, Satyam Sharma,
	Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
	Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Monday 10 September 2007 16:09, Linus Torvalds wrote:
> On Mon, 10 Sep 2007, Denys Vlasenko wrote:
> > static inline int
> > qla2x00_wait_for_loop_ready(scsi_qla_host_t *ha)
> > {
> >         int      return_status = QLA_SUCCESS;
> >         unsigned long loop_timeout ;
> >         scsi_qla_host_t *pha = to_qla_parent(ha);
> > 
> >         /* wait for 5 min at the max for loop to be ready */
> >         loop_timeout = jiffies + (MAX_LOOP_TIMEOUT * HZ);
> > 
> >         while ((!atomic_read(&pha->loop_down_timer) &&
> >             atomic_read(&pha->loop_state) == LOOP_DOWN) ||
> >             atomic_read(&pha->loop_state) != LOOP_READY) {
> >                 if (atomic_read(&pha->loop_state) == LOOP_DEAD) {
> ...
> > Is above correct or buggy? Correct, because msleep is a barrier.
> > Is it obvious? No.
> 
> It's *buggy*. But it has nothing to do with any msleep() in the loop, or 
> anything else.
> 
> And more importantly, it would be equally buggy even *with* a "volatile" 
> atomic_read().

I am not saying that this code is okay; that isn't the point.
(The code is in fact awful for several more reasons.)

My point is that people are confused as to what atomic_read()
exactly means, and this is bad. Same for cpu_relax().
The first one says "read", and the second one doesn't say "barrier".

This is real code from current kernel which demonstrates this:

"I don't know that cpu_relax() is a barrier already":

drivers/kvm/kvm_main.c
        while (atomic_read(&completed) != needed) {
                cpu_relax();
                barrier();
        }

"I think that atomic_read() is a read from memory and therefore
I don't need a barrier":

arch/x86_64/kernel/crash.c
        msecs = 1000; /* Wait at most a second for the other cpus to stop */
        while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) {
                mdelay(1);
                msecs--;
        }

Since neither camp seems to give up, I am proposing renaming
them to something less confusing, and make everybody happy.

cpu_relax_barrier()
atomic_value(&x)
atomic_fetch(&x)

I'm not a native English speaker; do these sound better?
--
vda

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-10 14:38                                                                       ` Denys Vlasenko
@ 2007-09-10 17:02                                                                         ` Arjan van de Ven
  0 siblings, 0 replies; 1546+ messages in thread
From: Arjan van de Ven @ 2007-09-10 17:02 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Herbert Xu,
	Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
	Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
	David Miller, schwidefsky, wensong, horms, wjiang, cfriesen,
	zlynx, rpjday, jesper.juhl, segher

On Mon, 10 Sep 2007 15:38:23 +0100
Denys Vlasenko <vda.linux@googlemail.com> wrote:

> On Monday 10 September 2007 15:51, Arjan van de Ven wrote:
> > On Mon, 10 Sep 2007 11:56:29 +0100
> > Denys Vlasenko <vda.linux@googlemail.com> wrote:
> > 
> > > 
> > > Well, if you insist on having it again:
> > > 
> > > Waiting for atomic value to be zero:
> > > 
> > >         while (atomic_read(&x))
> > >                 continue;
> > > 
> > 
> > and this I would say is buggy code all the way.
> >
> > Not from a pure C level semantics, but from a "busy waiting is
> > buggy" semantics level and a "I'm inventing my own locking"
> > semantics level.
> 
> After inspecting arch/*, I cannot agree with you.

the arch/ people obviously are allowed to do their own locking stuff...
BECAUSE THEY HAVE TO IMPLEMENT THAT!


the arch maintainers know EXACTLY how their hw behaves (well, we hope)
so they tend to be the exception to many rules in the kernel....

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-08-17 17:41                                         ` Segher Boessenkool
  2007-08-17 18:38                                           ` Satyam Sharma
@ 2007-09-10 18:59                                           ` Christoph Lameter
  2007-09-10 20:54                                             ` Paul E. McKenney
  2007-09-11  2:27                                             ` Segher Boessenkool
  1 sibling, 2 replies; 1546+ messages in thread
From: Christoph Lameter @ 2007-09-10 18:59 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Paul Mackerras, heiko.carstens, horms, Stefan Richter,
	Satyam Sharma, Linux Kernel Mailing List, David Miller,
	Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
	Netdev, jesper.juhl, linux-arch, Andrew Morton, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
	wjiang

On Fri, 17 Aug 2007, Segher Boessenkool wrote:

> "volatile" has nothing to do with reordering.  atomic_dec() writes
> to memory, so it _does_ have "volatile semantics", implicitly, as
> long as the compiler cannot optimise the atomic variable away
> completely -- any store counts as a side effect.

Stores can be reordered. Only x86 has (mostly) implicit write ordering. So 
no, atomic_dec() has no volatile semantics and may be reordered on a variety 
of processors. Writes to memory may not follow code order on several 
processors.



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-10 15:09                                                                           ` Linus Torvalds
  2007-09-10 16:46                                                                             ` Denys Vlasenko
@ 2007-09-10 18:59                                                                             ` Christoph Lameter
  2007-09-10 23:19                                                                             ` [PATCH] Document non-semantics of atomic_read() and atomic_set() Chris Snook
  2 siblings, 0 replies; 1546+ messages in thread
From: Christoph Lameter @ 2007-09-10 18:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Denys Vlasenko, Kyle Moffett, Arjan van de Ven, Nick Piggin,
	Satyam Sharma, Herbert Xu, Paul Mackerras, Chris Snook,
	Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Mon, 10 Sep 2007, Linus Torvalds wrote:

> The fact is, "volatile" *only* makes things worse. It generates worse 
> code, and never fixes any real bugs. This is a *fact*.

Yes, let's just drop the volatiles now! We need a patch that gets rid of
them... Volunteers?



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-10 16:46                                                                             ` Denys Vlasenko
@ 2007-09-10 19:59                                                                               ` Kyle Moffett
  0 siblings, 0 replies; 1546+ messages in thread
From: Kyle Moffett @ 2007-09-10 19:59 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Linus Torvalds, Arjan van de Ven, Nick Piggin, Satyam Sharma,
	Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
	Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Sep 10, 2007, at 12:46:33, Denys Vlasenko wrote:
> My point is that people are confused as to what atomic_read()
> exactly means, and this is bad. Same for cpu_relax().  The first one
> says "read", and the second one doesn't say "barrier".

Q&A:

Q:  When is it OK to use atomic_read()?
A:  You are asking the question, so never.

Q:  But I need to check the value of the atomic at this point in time...
A:  Your code is buggy if it needs to do that on an atomic_t for  
anything other than debugging or optimization.  Use either  
atomic_*_return() or a lock and some normal integers.
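
The safe pattern can be sketched in userspace C.  This is a stand-in
built on GCC's __sync builtins, not the kernel's real implementation;
the names mirror the kernel API for illustration only:

```c
/* Illustrative userspace stand-in for the kernel's atomic_t. */
typedef struct { int counter; } atomic_t;

/* Atomically add i and return the new value: one indivisible
 * read-modify-write, with no window for another CPU between the
 * read and the write, unlike atomic_read() followed by a store. */
static int atomic_add_return(int i, atomic_t *v)
{
	return __sync_add_and_fetch(&v->counter, i);
}

/* Safe "drop a reference" pattern: the decision is made from the
 * value the read-modify-write itself returned, so two CPUs can
 * never both conclude they dropped the last reference. */
static int put_ref(atomic_t *refcount)
{
	return atomic_add_return(-1, refcount) == 0;
}
```

The buggy alternative, checking atomic_read() and then acting on the
result, leaves a window between the read and the action in which
another CPU can change the count.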

Q:  "So why can't the atomic_read DTRT magically?"
A:  Because "the right thing" depends on the situation and is usually  
best done with something other than atomic_t.

If somebody can post some non-buggy code which is correctly using  
atomic_read() *and* depends on the compiler generating extra  
nonsensical loads due to "volatile" then the issue *might* be  
reconsidered.  This also includes samples of code which uses  
atomic_read() and needs memory barriers (so that we can fix the buggy  
code, not so we can change atomic_read()).  So far the only code  
samples anybody has posted are buggy regardless of whether or not the
value and/or accessors are flagged "volatile".  And hey, maybe the
volatile ops *should* be implemented in inline asm for future-proofness,
but that's a separate issue.

Cheers,
Kyle Moffett


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-10 18:59                                           ` Christoph Lameter
@ 2007-09-10 20:54                                             ` Paul E. McKenney
  2007-09-10 21:36                                               ` Christoph Lameter
  2007-09-11  2:27                                             ` Segher Boessenkool
  1 sibling, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2007-09-10 20:54 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Segher Boessenkool, Paul Mackerras, heiko.carstens, horms,
	Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
	David Miller, Ilpo Järvinen, ak, cfriesen, rpjday, Netdev,
	jesper.juhl, linux-arch, Andrew Morton, zlynx, schwidefsky,
	Chris Snook, Herbert Xu, Linus Torvalds, wensong, wjiang

On Mon, Sep 10, 2007 at 11:59:29AM -0700, Christoph Lameter wrote:
> On Fri, 17 Aug 2007, Segher Boessenkool wrote:
> 
> > "volatile" has nothing to do with reordering.  atomic_dec() writes
> > to memory, so it _does_ have "volatile semantics", implicitly, as
> > long as the compiler cannot optimise the atomic variable away
> > completely -- any store counts as a side effect.
> 
> Stores can be reordered. Only x86 has (mostly) implicit write ordering. So 
> no, atomic_dec has no volatile semantics and may be reordered on a variety 
> of processors. Writes to memory may not follow code order on several 
> processors.

The one exception to this being the case where process-level code is
communicating with an interrupt handler running on that same CPU -- on
all CPUs that I am aware of, a given CPU always sees its own writes
in order.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-10 20:54                                             ` Paul E. McKenney
@ 2007-09-10 21:36                                               ` Christoph Lameter
  2007-09-10 21:50                                                 ` Paul E. McKenney
  0 siblings, 1 reply; 1546+ messages in thread
From: Christoph Lameter @ 2007-09-10 21:36 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Segher Boessenkool, Paul Mackerras, heiko.carstens, horms,
	Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
	David Miller, Ilpo Järvinen, ak, cfriesen, rpjday, Netdev,
	jesper.juhl, linux-arch, Andrew Morton, zlynx, schwidefsky,
	Chris Snook, Herbert Xu, Linus Torvalds, wensong, wjiang

On Mon, 10 Sep 2007, Paul E. McKenney wrote:

> The one exception to this being the case where process-level code is
> communicating with an interrupt handler running on that same CPU -- on
> all CPUs that I am aware of, a given CPU always sees its own writes
> in order.

Yes, but that is due to the code path effectively continuing in the
interrupt handler. The CPU makes sure that the opcodes being executed
always see memory in a consistent way. The basic ordering problem with
out-of-order writes therefore comes from other processors concurrently
executing code and holding variables in registers that are modified
elsewhere. The only solutions that I know of are one form of barrier or
another.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-10 21:36                                               ` Christoph Lameter
@ 2007-09-10 21:50                                                 ` Paul E. McKenney
  0 siblings, 0 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-09-10 21:50 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Segher Boessenkool, Paul Mackerras, heiko.carstens, horms,
	Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
	David Miller, Ilpo Järvinen, ak, cfriesen, rpjday, Netdev,
	jesper.juhl, linux-arch, Andrew Morton, zlynx, schwidefsky,
	Chris Snook, Herbert Xu, Linus Torvalds, wensong, wjiang

On Mon, Sep 10, 2007 at 02:36:26PM -0700, Christoph Lameter wrote:
> On Mon, 10 Sep 2007, Paul E. McKenney wrote:
> 
> > The one exception to this being the case where process-level code is
> communicating with an interrupt handler running on that same CPU -- on
> > all CPUs that I am aware of, a given CPU always sees its own writes
> > in order.
> 
> Yes, but that is due to the code path effectively continuing in the
> interrupt handler. The CPU makes sure that the opcodes being executed
> always see memory in a consistent way. The basic ordering problem with
> out-of-order writes therefore comes from other processors concurrently
> executing code and holding variables in registers that are modified
> elsewhere. The only solutions that I know of are one form of barrier or
> another.

So we are agreed then -- volatile accesses may be of some assistance when
interacting with interrupt handlers running on the same CPU (presumably
when using per-CPU variables), but are generally useless when sharing
variables among CPUs.  Correct?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* [PATCH] Document non-semantics of atomic_read() and atomic_set()
  2007-09-10 15:09                                                                           ` Linus Torvalds
  2007-09-10 16:46                                                                             ` Denys Vlasenko
  2007-09-10 18:59                                                                             ` Christoph Lameter
@ 2007-09-10 23:19                                                                             ` Chris Snook
  2007-09-10 23:44                                                                               ` Paul E. McKenney
  2007-09-11 19:35                                                                               ` Christoph Lameter
  2 siblings, 2 replies; 1546+ messages in thread
From: Chris Snook @ 2007-09-10 23:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Denys Vlasenko, Kyle Moffett, Arjan van de Ven, Nick Piggin,
	Satyam Sharma, Herbert Xu, Paul Mackerras, Christoph Lameter,
	Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

From: Chris Snook <csnook@redhat.com>

Unambiguously document the fact that atomic_read() and atomic_set()
do not imply any ordering or memory access, and that callers are
obligated to explicitly invoke barriers as needed to ensure that
changes to atomic variables are visible in all contexts that need
to see them.

Signed-off-by: Chris Snook <csnook@redhat.com>

--- a/Documentation/atomic_ops.txt	2007-07-08 19:32:17.000000000 -0400
+++ b/Documentation/atomic_ops.txt	2007-09-10 19:02:50.000000000 -0400
@@ -12,7 +12,11 @@
 C integer type will fail.  Something like the following should
 suffice:
 
-	typedef struct { volatile int counter; } atomic_t;
+	typedef struct { int counter; } atomic_t;
+
+	Historically, counter has been declared volatile.  This is now
+discouraged.  See Documentation/volatile-considered-harmful.txt for the
+complete rationale.
 
 	The first operations to implement for atomic_t's are the
 initializers and plain reads.
@@ -42,6 +46,22 @@
 
 which simply reads the current value of the counter.
 
+*** WARNING: atomic_read() and atomic_set() DO NOT IMPLY BARRIERS! ***
+
+Some architectures may choose to use the volatile keyword, barriers, or
+inline assembly to guarantee some degree of immediacy for atomic_read()
+and atomic_set().  This is not uniformly guaranteed, and may change in
+the future, so all users of atomic_t should treat atomic_read() and
+atomic_set() as simple C assignment statements that may be reordered or
+optimized away entirely by the compiler or processor, and explicitly
+invoke the appropriate compiler and/or memory barrier for each use case.
+Failure to do so will result in code that may suddenly break when used with
+different architectures or compiler optimizations, or even changes in
+unrelated code which changes how the compiler optimizes the section
+accessing atomic_t variables.
+
+*** YOU HAVE BEEN WARNED! ***
+
 Now, we move onto the actual atomic operation interfaces.
 
 	void atomic_add(int i, atomic_t *v);
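
A userspace sketch of the rule the warning states: treat atomic_read()
as a plain load and supply the compiler barrier yourself.  The helper
names here are illustrative stand-ins, not the kernel's definitions:

```c
/* Illustrative userspace stand-ins; not the kernel's definitions. */
typedef struct { int counter; } atomic_t;

#define atomic_read(v)	((v)->counter)
#define barrier()	__asm__ __volatile__("" ::: "memory")

/* Without barrier(), the compiler is free to hoist the load out of
 * the loop and spin forever on a stale register copy; the barrier
 * forces each iteration to reload the counter from memory. */
static int wait_for_zero(atomic_t *v, int max_spins)
{
	while (max_spins-- > 0) {
		if (atomic_read(v) == 0)
			return 1;
		barrier();
	}
	return 0;
}
```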

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH] Document non-semantics of atomic_read() and atomic_set()
  2007-09-10 23:19                                                                             ` [PATCH] Document non-semantics of atomic_read() and atomic_set() Chris Snook
@ 2007-09-10 23:44                                                                               ` Paul E. McKenney
  2007-09-11 19:35                                                                               ` Christoph Lameter
  1 sibling, 0 replies; 1546+ messages in thread
From: Paul E. McKenney @ 2007-09-10 23:44 UTC (permalink / raw)
  To: Chris Snook
  Cc: Linus Torvalds, Denys Vlasenko, Kyle Moffett, Arjan van de Ven,
	Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
	Christoph Lameter, Ilpo Jarvinen, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

On Mon, Sep 10, 2007 at 07:19:44PM -0400, Chris Snook wrote:
> From: Chris Snook <csnook@redhat.com>
> 
> Unambiguously document the fact that atomic_read() and atomic_set()
> do not imply any ordering or memory access, and that callers are
> obligated to explicitly invoke barriers as needed to ensure that
> changes to atomic variables are visible in all contexts that need
> to see them.

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> Signed-off-by: Chris Snook <csnook@redhat.com>
> 
> --- a/Documentation/atomic_ops.txt	2007-07-08 19:32:17.000000000 -0400
> +++ b/Documentation/atomic_ops.txt	2007-09-10 19:02:50.000000000 -0400
> @@ -12,7 +12,11 @@
>  C integer type will fail.  Something like the following should
>  suffice:
> 
> -	typedef struct { volatile int counter; } atomic_t;
> +	typedef struct { int counter; } atomic_t;
> +
> +	Historically, counter has been declared volatile.  This is now
> +discouraged.  See Documentation/volatile-considered-harmful.txt for the
> +complete rationale.
> 
>  	The first operations to implement for atomic_t's are the
>  initializers and plain reads.
> @@ -42,6 +46,22 @@
> 
>  which simply reads the current value of the counter.
> 
> +*** WARNING: atomic_read() and atomic_set() DO NOT IMPLY BARRIERS! ***
> +
> +Some architectures may choose to use the volatile keyword, barriers, or
> +inline assembly to guarantee some degree of immediacy for atomic_read()
> +and atomic_set().  This is not uniformly guaranteed, and may change in
> +the future, so all users of atomic_t should treat atomic_read() and
> +atomic_set() as simple C assignment statements that may be reordered or
> +optimized away entirely by the compiler or processor, and explicitly
> +invoke the appropriate compiler and/or memory barrier for each use case.
> +Failure to do so will result in code that may suddenly break when used with
> +different architectures or compiler optimizations, or even changes in
> +unrelated code which changes how the compiler optimizes the section
> +accessing atomic_t variables.
> +
> +*** YOU HAVE BEEN WARNED! ***
> +
>  Now, we move onto the actual atomic operation interfaces.
> 
>  	void atomic_add(int i, atomic_t *v);

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
  2007-09-10 18:59                                           ` Christoph Lameter
  2007-09-10 20:54                                             ` Paul E. McKenney
@ 2007-09-11  2:27                                             ` Segher Boessenkool
  1 sibling, 0 replies; 1546+ messages in thread
From: Segher Boessenkool @ 2007-09-11  2:27 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul Mackerras, heiko.carstens, horms, Stefan Richter,
	Satyam Sharma, Linux Kernel Mailing List, David Miller,
	Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
	Netdev, jesper.juhl, linux-arch, Andrew Morton, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
	wjiang

>> "volatile" has nothing to do with reordering.  atomic_dec() writes
>> to memory, so it _does_ have "volatile semantics", implicitly, as
>> long as the compiler cannot optimise the atomic variable away
>> completely -- any store counts as a side effect.
>
> Stores can be reordered. Only x86 has (mostly) implicit write ordering.
> So no, atomic_dec has no volatile semantics

Read again: I said the C "volatile" construct has nothing to do
with CPU memory access reordering.

> and may be reordered on a variety
> of processors. Writes to memory may not follow code order on several
> processors.

The _compiler_ isn't allowed to reorder things here.  Yes, of course
you do need stronger barriers for many purposes; volatile isn't all
that useful, you know.
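
The compiler-versus-CPU distinction can be sketched in userspace C11:
volatile constrains what the compiler emits, while an explicit fence
(one of those stronger barriers) also orders the stores for other
processors.  The names below are illustrative:

```c
#include <stdatomic.h>

static int data;
static volatile int ready;	/* volatile: compiler ordering only */

/* Publish data, then raise the flag.  volatile alone keeps the
 * compiler from swapping the two stores, but on a weakly ordered
 * CPU they could still become visible out of order; the release
 * fence orders them for other processors as well. */
void publish(int value)
{
	data = value;
	atomic_thread_fence(memory_order_release);
	ready = 1;
}

/* Returns -1 if not yet published, else the published value. */
int try_consume(void)
{
	if (!ready)
		return -1;
	atomic_thread_fence(memory_order_acquire);
	return data;
}
```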


Segher


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: [PATCH] Document non-semantics of atomic_read() and atomic_set()
  2007-09-10 23:19                                                                             ` [PATCH] Document non-semantics of atomic_read() and atomic_set() Chris Snook
  2007-09-10 23:44                                                                               ` Paul E. McKenney
@ 2007-09-11 19:35                                                                               ` Christoph Lameter
  1 sibling, 0 replies; 1546+ messages in thread
From: Christoph Lameter @ 2007-09-11 19:35 UTC (permalink / raw)
  To: Chris Snook
  Cc: Linus Torvalds, Denys Vlasenko, Kyle Moffett, Arjan van de Ven,
	Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
	Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher

Acked-by: Christoph Lameter <clameter@sgi.com>


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2011-09-26  4:52 ` NeilBrown
@ 2011-09-26  7:03   ` Roman Mamedov
  2011-09-26 23:23     ` Re: Kenn
  2011-09-26  7:42   ` Re: Kenn
  1 sibling, 1 reply; 1546+ messages in thread
From: Roman Mamedov @ 2011-09-26  7:03 UTC (permalink / raw)
  To: NeilBrown; +Cc: kenn, linux-raid

[-- Attachment #1: Type: text/plain, Size: 917 bytes --]

On Mon, 26 Sep 2011 14:52:48 +1000
NeilBrown <neilb@suse.de> wrote:

> On Sun, 25 Sep 2011 21:23:31 -0700 "Kenn" <kenn@kenn.us> wrote:
> 
> > I have a raid5 array that had a drive drop out, and resilvered the wrong
> > drive when I put it back in, corrupting and destroying the raid.  I
> > stopped the array at less than 1% resilvering and I'm in the process of
> > making a dd-copy of the drive to recover the files.
> 
> I don't know what you mean by "resilvered".

At first I thought the initial poster had just invented some peculiar funny word of his own, but it looks like it's from ZFS circles:
https://encrypted.google.com/search?q=resilver+zfs
@Kenn: you probably mean 'resync' or 'rebuild'.  No one ever calls those processes 'resilver' here, so you'll get no Google results, and blank/unknowing/funny looks from people, when using that term in relation to mdadm.

-- 
With respect,
Roman

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2011-09-26  4:52 ` NeilBrown
  2011-09-26  7:03   ` Re: Roman Mamedov
@ 2011-09-26  7:42   ` Kenn
  2011-09-26  8:04     ` Re: NeilBrown
  1 sibling, 1 reply; 1546+ messages in thread
From: Kenn @ 2011-09-26  7:42 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

Replying.  I realize, and I apologize, that I didn't include a subject
line.  I hope this doesn't confuse majordomo.

> On Sun, 25 Sep 2011 21:23:31 -0700 "Kenn" <kenn@kenn.us> wrote:
>
>> I have a raid5 array that had a drive drop out, and resilvered the wrong
>> drive when I put it back in, corrupting and destroying the raid.  I
>> stopped the array at less than 1% resilvering and I'm in the process of
>> making a dd-copy of the drive to recover the files.
>
> I don't know what you mean by "resilvered".

Resilvering -- rebuilding the array.  Lesser-used term, sorry!

>
>>
>> (1) Is there anything diagnostic I can contribute to add more
>> wrong-drive-resilvering protection to mdadm?  I have the command history
>> showing everything I did, I have the five drives available for reading
>> sectors, I haven't touched anything yet.
>
> Yes, report the command history, and any relevant kernel logs, and the
> output
> of "mdadm --examine" on all relevant devices.
>
> NeilBrown

Awesome!  I hope this is useful.  It's really long, so I edited down the
logs and command history to what I thought were the important bits.  If
you want more, I can post unedited versions; please let me know.

### Command History ###

# The start of the sequence, removing sde from array
mdadm --examine /dev/sde
mdadm --detail /dev/md3
cat /proc/mdstat
mdadm /dev/md3 --remove /dev/sde1
mdadm /dev/md3 --remove /dev/sde
mdadm /dev/md3 --fail /dev/sde1
cat /proc/mdstat
mdadm --examine /dev/sde1
fdisk -l | grep 750
mdadm --examine /dev/sde1
mdadm --remove /dev/sde
mdadm /dev/md3 --remove /dev/sde
mdadm /dev/md3 --fail /dev/sde
fdisk /dev/sde
ls
vi /var/log/syslog
reboot
vi /var/log/syslog
reboot
mdadm --detail /dev/md3
mdadm --examine /dev/sde1
# Wiping sde
fdisk /dev/sde
newfs -t ext3 /dev/sde1
mkfs -t ext3 /dev/sde1
mkfs -t ext3 /dev/sde2
fdisk /dev/sde
mdadm --stop /dev/md3
# Putting sde back into array
mdadm --examine /dev/sde
mdadm --help
mdadm --misc --help
mdadm --zero-superblock /dev/sde
mdadm --query /dev/sde
mdadm --examine /dev/sde
mdadm --detail /dev/sde
mdadm --detail /dev/sde1
fdisk /dev/sde
mdadm --assemble --no-degraded /dev/md3  /dev/hde1 /dev/hdi1 /dev/sde1
/dev/hdk1 /dev/hdg1
cat /proc/mdstat
mdadm --stop /dev/md3
mdadm --create /dev/md3 --level=5 --raid-devices=5  /dev/hde1 /dev/hdi1
missing /dev/hdk1 /dev/hdg1
mount -o ro /raid53
ls /raid53
umount /raid53
mdadm --stop /dev/md3
# The command that did the bad rebuild
mdadm --create /dev/md3 --level=5 --raid-devices=5  /dev/hde1 /dev/hdi1
/dev/sde1 /dev/hdk1 /dev/hdg1
cat /proc/mdstat
mdadm --examine /dev/md3
mdadm --query /dev/md3
mdadm --detail /dev/md3
mount /raid53
mdadm --stop /dev/md3
# Trying to get the corrupted disk back up
mdadm --create /dev/md3 --level=5 --raid-devices=5  /dev/hde1 /dev/hdi1
missing /dev/hdk1 /dev/hdg1
cat /proc/mdstat
mount /raid53
fsck -n /dev/md3



### KERNEL LOGS ###

# Me messing around with fdisk and mdadm creating new partitions to wipe
out sde
Sep 22 15:56:39 teresa kernel: [ 7897.778204] sd 5:0:0:0: [sde] 1465149168
512-byte hardware sectors (750156 MB)
Sep 22 15:56:39 teresa kernel: [ 7897.778204] sd 5:0:0:0: [sde] Write
Protect is off
Sep 22 15:56:39 teresa kernel: [ 7897.778204] sd 5:0:0:0: [sde] Mode
Sense: 00 3a 00 00
Sep 22 15:56:39 teresa kernel: [ 7897.778204] sd 5:0:0:0: [sde] Write
cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 22 15:56:39 teresa kernel: [ 7897.778204]  sde: sde1 sde2
Sep 22 15:56:41 teresa kernel: [ 7899.848026] sd 5:0:0:0: [sde] 1465149168
512-byte hardware sectors (750156 MB)
Sep 22 15:56:41 teresa kernel: [ 7899.848026] sd 5:0:0:0: [sde] Write
Protect is off
Sep 22 15:56:41 teresa kernel: [ 7899.848026] sd 5:0:0:0: [sde] Mode
Sense: 00 3a 00 00
Sep 22 15:56:41 teresa kernel: [ 7899.848026] sd 5:0:0:0: [sde] Write
cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 22 15:56:41 teresa kernel: [ 7899.848026]  sde: sde1 sde2
Sep 22 16:01:49 teresa kernel: [ 8207.733821] sd 5:0:0:0: [sde] 1465149168
512-byte hardware sectors (750156 MB)
Sep 22 16:01:49 teresa kernel: [ 8207.733919] sd 5:0:0:0: [sde] Write
Protect is off
Sep 22 16:01:49 teresa kernel: [ 8207.733943] sd 5:0:0:0: [sde] Mode
Sense: 00 3a 00 00
Sep 22 16:01:49 teresa kernel: [ 8207.734039] sd 5:0:0:0: [sde] Write
cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 22 16:01:49 teresa kernel: [ 8207.734083]  sde: sde1
Sep 22 16:01:51 teresa kernel: [ 8209.777260] sd 5:0:0:0: [sde] 1465149168
512-byte hardware sectors (750156 MB)
Sep 22 16:01:51 teresa kernel: [ 8209.777260] sd 5:0:0:0: [sde] Write
Protect is off
Sep 22 16:01:51 teresa kernel: [ 8209.777260] sd 5:0:0:0: [sde] Mode
Sense: 00 3a 00 00
Sep 22 16:01:51 teresa kernel: [ 8209.777260] sd 5:0:0:0: [sde] Write
cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 22 16:01:51 teresa kernel: [ 8209.777260]  sde: sde1
Sep 22 16:02:09 teresa mdadm[2694]: DeviceDisappeared event detected on md
device /dev/md3
Sep 22 16:02:09 teresa kernel: [ 8227.781860] md: md3 stopped.
Sep 22 16:02:09 teresa kernel: [ 8227.781908] md: unbind<hde1>
Sep 22 16:02:09 teresa kernel: [ 8227.781937] md: export_rdev(hde1)
Sep 22 16:02:09 teresa kernel: [ 8227.782261] md: unbind<hdg1>
Sep 22 16:02:09 teresa kernel: [ 8227.782292] md: export_rdev(hdg1)
Sep 22 16:02:09 teresa kernel: [ 8227.782561] md: unbind<hdk1>
Sep 22 16:02:09 teresa kernel: [ 8227.782590] md: export_rdev(hdk1)
Sep 22 16:02:09 teresa kernel: [ 8227.782855] md: unbind<hdi1>
Sep 22 16:02:09 teresa kernel: [ 8227.782885] md: export_rdev(hdi1)
Sep 22 16:15:32 teresa smartd[2657]: Device: /dev/hda, Failed SMART usage
Attribute: 194 Temperature_Celsius.
Sep 22 16:15:33 teresa smartd[2657]: Device: /dev/hdk, SMART Usage
Attribute: 194 Temperature_Celsius changed from 110 to 111
Sep 22 16:15:33 teresa smartd[2657]: Device: /dev/sdb, SMART Usage
Attribute: 194 Temperature_Celsius changed from 113 to 116
Sep 22 16:15:33 teresa smartd[2657]: Device: /dev/sdc, SMART Usage
Attribute: 190 Airflow_Temperature_Cel changed from 52 to 51
Sep 22 16:17:01 teresa /USR/SBIN/CRON[2965]: (root) CMD (   cd / &&
run-parts --report /etc/cron.hourly)
Sep 22 16:18:42 teresa kernel: [ 9220.400915] md: md3 stopped.
Sep 22 16:18:42 teresa kernel: [ 9220.411525] md: bind<hdi1>
Sep 22 16:18:42 teresa kernel: [ 9220.411884] md: bind<sde1>
Sep 22 16:18:42 teresa kernel: [ 9220.412577] md: bind<hdk1>
Sep 22 16:18:42 teresa kernel: [ 9220.413162] md: bind<hdg1>
Sep 22 16:18:42 teresa kernel: [ 9220.413750] md: bind<hde1>
Sep 22 16:18:42 teresa kernel: [ 9220.413855] md: kicking non-fresh sde1
from array!
Sep 22 16:18:42 teresa kernel: [ 9220.413887] md: unbind<sde1>
Sep 22 16:18:42 teresa kernel: [ 9220.413915] md: export_rdev(sde1)
Sep 22 16:18:42 teresa kernel: [ 9220.477393] raid5: device hde1
operational as raid disk 0
Sep 22 16:18:42 teresa kernel: [ 9220.477420] raid5: device hdg1
operational as raid disk 4
Sep 22 16:18:42 teresa kernel: [ 9220.477438] raid5: device hdk1
operational as raid disk 3
Sep 22 16:18:42 teresa kernel: [ 9220.477456] raid5: device hdi1
operational as raid disk 1
Sep 22 16:18:42 teresa kernel: [ 9220.478236] raid5: allocated 5252kB for md3
Sep 22 16:18:42 teresa kernel: [ 9220.478265] raid5: raid level 5 set md3
active with 4 out of 5 devices, algorithm 2
Sep 22 16:18:42 teresa kernel: [ 9220.478294] RAID5 conf printout:
Sep 22 16:18:42 teresa kernel: [ 9220.478309]  --- rd:5 wd:4
Sep 22 16:18:42 teresa kernel: [ 9220.478324]  disk 0, o:1, dev:hde1
Sep 22 16:18:42 teresa kernel: [ 9220.478339]  disk 1, o:1, dev:hdi1
Sep 22 16:18:42 teresa kernel: [ 9220.478354]  disk 3, o:1, dev:hdk1
Sep 22 16:18:42 teresa kernel: [ 9220.478369]  disk 4, o:1, dev:hdg1
# Me stopping md3
Sep 22 16:18:53 teresa mdadm[2694]: DeviceDisappeared event detected on md
device /dev/md3
Sep 22 16:18:53 teresa kernel: [ 9231.572348] md: md3 stopped.
Sep 22 16:18:53 teresa kernel: [ 9231.572394] md: unbind<hde1>
Sep 22 16:18:53 teresa kernel: [ 9231.572423] md: export_rdev(hde1)
Sep 22 16:18:53 teresa kernel: [ 9231.572728] md: unbind<hdg1>
Sep 22 16:18:53 teresa kernel: [ 9231.572758] md: export_rdev(hdg1)
Sep 22 16:18:53 teresa kernel: [ 9231.572988] md: unbind<hdk1>
Sep 22 16:18:53 teresa kernel: [ 9231.573015] md: export_rdev(hdk1)
Sep 22 16:18:53 teresa kernel: [ 9231.573243] md: unbind<hdi1>
Sep 22 16:18:53 teresa kernel: [ 9231.573270] md: export_rdev(hdi1)
# Me creating md3 with sde1 missing
Sep 22 16:19:51 teresa kernel: [ 9289.621646] md: bind<hde1>
Sep 22 16:19:51 teresa kernel: [ 9289.665268] md: bind<hdi1>
Sep 22 16:19:51 teresa kernel: [ 9289.695676] md: bind<hdk1>
Sep 22 16:19:51 teresa kernel: [ 9289.726906] md: bind<hdg1>
Sep 22 16:19:51 teresa kernel: [ 9289.809030] raid5: device hdg1
operational as raid disk 4
Sep 22 16:19:51 teresa kernel: [ 9289.809057] raid5: device hdk1
operational as raid disk 3
Sep 22 16:19:51 teresa kernel: [ 9289.809075] raid5: device hdi1
operational as raid disk 1
Sep 22 16:19:51 teresa kernel: [ 9289.809093] raid5: device hde1
operational as raid disk 0
Sep 22 16:19:51 teresa kernel: [ 9289.809821] raid5: allocated 5252kB for md3
Sep 22 16:19:51 teresa kernel: [ 9289.809850] raid5: raid level 5 set md3
active with 4 out of 5 devices, algorithm 2
Sep 22 16:19:51 teresa kernel: [ 9289.809877] RAID5 conf printout:
Sep 22 16:19:51 teresa kernel: [ 9289.809891]  --- rd:5 wd:4
Sep 22 16:19:51 teresa kernel: [ 9289.809907]  disk 0, o:1, dev:hde1
Sep 22 16:19:51 teresa kernel: [ 9289.809922]  disk 1, o:1, dev:hdi1
Sep 22 16:19:51 teresa kernel: [ 9289.809937]  disk 3, o:1, dev:hdk1
Sep 22 16:19:51 teresa kernel: [ 9289.809953]  disk 4, o:1, dev:hdg1
Sep 22 16:20:20 teresa kernel: [ 9318.486512] kjournald starting.  Commit
interval 5 seconds
Sep 22 16:20:20 teresa kernel: [ 9318.486512] EXT3-fs: mounted filesystem
with ordered data mode.
# Me stopping md3 again
Sep 22 16:20:42 teresa mdadm[2694]: DeviceDisappeared event detected on md
device /dev/md3
Sep 22 16:20:42 teresa kernel: [ 9340.300590] md: md3 stopped.
Sep 22 16:20:42 teresa kernel: [ 9340.300639] md: unbind<hdg1>
Sep 22 16:20:42 teresa kernel: [ 9340.300668] md: export_rdev(hdg1)
Sep 22 16:20:42 teresa kernel: [ 9340.300921] md: unbind<hdk1>
Sep 22 16:20:42 teresa kernel: [ 9340.300950] md: export_rdev(hdk1)
Sep 22 16:20:42 teresa kernel: [ 9340.301183] md: unbind<hdi1>
Sep 22 16:20:42 teresa kernel: [ 9340.301211] md: export_rdev(hdi1)
Sep 22 16:20:42 teresa kernel: [ 9340.301438] md: unbind<hde1>
Sep 22 16:20:42 teresa kernel: [ 9340.301465] md: export_rdev(hde1)
# This is me doing the fatal create, that recovers the wrong disk
Sep 22 16:21:39 teresa kernel: [ 9397.609864] md: bind<hde1>
Sep 22 16:21:39 teresa kernel: [ 9397.652426] md: bind<hdi1>
Sep 22 16:21:39 teresa kernel: [ 9397.673203] md: bind<sde1>
Sep 22 16:21:39 teresa kernel: [ 9397.699373] md: bind<hdk1>
Sep 22 16:21:39 teresa kernel: [ 9397.739372] md: bind<hdg1>
Sep 22 16:21:39 teresa kernel: [ 9397.801729] raid5: device hdk1
operational as raid disk 3
Sep 22 16:21:39 teresa kernel: [ 9397.801756] raid5: device sde1
operational as raid disk 2
Sep 22 16:21:39 teresa kernel: [ 9397.801774] raid5: device hdi1
operational as raid disk 1
Sep 22 16:21:39 teresa kernel: [ 9397.801793] raid5: device hde1
operational as raid disk 0
Sep 22 16:21:39 teresa kernel: [ 9397.802531] raid5: allocated 5252kB for md3
Sep 22 16:21:39 teresa kernel: [ 9397.802559] raid5: raid level 5 set md3
active with 4 out of 5 devices, algorithm 2
Sep 22 16:21:39 teresa kernel: [ 9397.802586] RAID5 conf printout:
Sep 22 16:21:39 teresa kernel: [ 9397.802600]  --- rd:5 wd:4
Sep 22 16:21:39 teresa kernel: [ 9397.802615]  disk 0, o:1, dev:hde1
Sep 22 16:21:39 teresa kernel: [ 9397.802631]  disk 1, o:1, dev:hdi1
Sep 22 16:21:39 teresa kernel: [ 9397.802646]  disk 2, o:1, dev:sde1
Sep 22 16:21:39 teresa kernel: [ 9397.802661]  disk 3, o:1, dev:hdk1
Sep 22 16:21:39 teresa kernel: [ 9397.838429] RAID5 conf printout:
Sep 22 16:21:39 teresa kernel: [ 9397.838454]  --- rd:5 wd:4
Sep 22 16:21:39 teresa kernel: [ 9397.838471]  disk 0, o:1, dev:hde1
Sep 22 16:21:39 teresa kernel: [ 9397.838486]  disk 1, o:1, dev:hdi1
Sep 22 16:21:39 teresa kernel: [ 9397.838502]  disk 2, o:1, dev:sde1
Sep 22 16:21:39 teresa kernel: [ 9397.838518]  disk 3, o:1, dev:hdk1
Sep 22 16:21:39 teresa kernel: [ 9397.838533]  disk 4, o:1, dev:hdg1
Sep 22 16:21:39 teresa mdadm[2694]: RebuildStarted event detected on md
device /dev/md3
Sep 22 16:21:39 teresa kernel: [ 9397.841822] md: recovery of RAID array md3
Sep 22 16:21:39 teresa kernel: [ 9397.841848] md: minimum _guaranteed_ 
speed: 1000 KB/sec/disk.
Sep 22 16:21:39 teresa kernel: [ 9397.841868] md: using maximum available
idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
Sep 22 16:21:39 teresa kernel: [ 9397.841908] md: using 128k window, over
a total of 732571904 blocks.
Sep 22 16:22:33 teresa kernel: [ 9451.640192] EXT3-fs error (device md3):
ext3_check_descriptors: Block bitmap for group 3968 not in group (block
0)!
Sep 22 16:22:33 teresa kernel: [ 9451.750241] EXT3-fs: group descriptors
corrupted!
Sep 22 16:22:39 teresa kernel: [ 9458.079151] md: md_do_sync() got signal
... exiting
Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: md3 stopped.
Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: unbind<hdg1>
Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: export_rdev(hdg1)
Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: unbind<hdk1>
Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: export_rdev(hdk1)
Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: unbind<sde1>
Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: export_rdev(sde1)
Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: unbind<hdi1>
Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: export_rdev(hdi1)
Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: unbind<hde1>
Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: export_rdev(hde1)
Sep 22 16:22:39 teresa mdadm[2694]: DeviceDisappeared event detected on md
device /dev/md3
# Me trying to recreate md3 without sde
Sep 22 16:23:50 teresa kernel: [ 9529.065477] md: bind<hde1>
Sep 22 16:23:50 teresa kernel: [ 9529.107767] md: bind<hdi1>
Sep 22 16:23:50 teresa kernel: [ 9529.137743] md: bind<hdk1>
Sep 22 16:23:50 teresa kernel: [ 9529.177990] md: bind<hdg1>
Sep 22 16:23:51 teresa mdadm[2694]: RebuildFinished event detected on md
device /dev/md3
Sep 22 16:23:51 teresa kernel: [ 9529.240814] raid5: device hdg1
operational as raid disk 4
Sep 22 16:23:51 teresa kernel: [ 9529.241734] raid5: device hdk1
operational as raid disk 3
Sep 22 16:23:51 teresa kernel: [ 9529.241752] raid5: device hdi1
operational as raid disk 1
Sep 22 16:23:51 teresa kernel: [ 9529.241770] raid5: device hde1
operational as raid disk 0
Sep 22 16:23:51 teresa kernel: [ 9529.242520] raid5: allocated 5252kB for md3
Sep 22 16:23:51 teresa kernel: [ 9529.242547] raid5: raid level 5 set md3
active with 4 out of 5 devices, algorithm 2
Sep 22 16:23:51 teresa kernel: [ 9529.242574] RAID5 conf printout:
Sep 22 16:23:51 teresa kernel: [ 9529.242588]  --- rd:5 wd:4
Sep 22 16:23:51 teresa kernel: [ 9529.242603]  disk 0, o:1, dev:hde1
Sep 22 16:23:51 teresa kernel: [ 9529.242618]  disk 1, o:1, dev:hdi1
Sep 22 16:23:51 teresa kernel: [ 9529.242633]  disk 3, o:1, dev:hdk1
Sep 22 16:23:51 teresa kernel: [ 9529.242649]  disk 4, o:1, dev:hdg1
# And me trying a fsck -n or a mount
Sep 22 16:24:07 teresa kernel: [ 9545.326343] EXT3-fs error (device md3):
ext3_check_descriptors: Block bitmap for group 3968 not in group (block
0)!
Sep 22 16:24:07 teresa kernel: [ 9545.369071] EXT3-fs: group descriptors
corrupted!


### EXAMINES OF PARTITIONS ###

=== --examine /dev/hde1 ===
/dev/hde1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : ed1e6357:74e32684:47f7b12e:9c2b2218 (local to host teresa)
  Creation Time : Thu Sep 22 16:23:50 2011
     Raid Level : raid5
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
   Raid Devices : 5
  Total Devices : 4
Preferred Minor : 3

    Update Time : Sun Sep 25 22:11:22 2011
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 0
       Checksum : b7f6a3c0 - correct
         Events : 10

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0      33        1        0      active sync   /dev/hde1

   0     0      33        1        0      active sync   /dev/hde1
   1     1      56        1        1      active sync   /dev/hdi1
   2     2       0        0        2      faulty removed
   3     3      57        1        3      active sync   /dev/hdk1
   4     4      34        1        4      active sync   /dev/hdg1

=== --examine /dev/hdi1 ===
/dev/hdi1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : ed1e6357:74e32684:47f7b12e:9c2b2218 (local to host teresa)
  Creation Time : Thu Sep 22 16:23:50 2011
     Raid Level : raid5
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
   Raid Devices : 5
  Total Devices : 4
Preferred Minor : 3

    Update Time : Sun Sep 25 22:11:22 2011
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 0
       Checksum : b7f6a3d9 - correct
         Events : 10

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1      56        1        1      active sync   /dev/hdi1

   0     0      33        1        0      active sync   /dev/hde1
   1     1      56        1        1      active sync   /dev/hdi1
   2     2       0        0        2      faulty removed
   3     3      57        1        3      active sync   /dev/hdk1
   4     4      34        1        4      active sync   /dev/hdg1

=== --examine /dev/sde1 ===
/dev/sde1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : e6e3df36:1195239f:47f7b12e:9c2b2218 (local to host teresa)
  Creation Time : Thu Sep 22 16:21:39 2011
     Raid Level : raid5
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 3

    Update Time : Thu Sep 22 16:22:39 2011
          State : clean
 Active Devices : 4
Working Devices : 5
 Failed Devices : 1
  Spare Devices : 1
       Checksum : 4e69d679 - correct
         Events : 8

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       65        2      active sync   /dev/sde1

   0     0      33        1        0      active sync   /dev/hde1
   1     1      56        1        1      active sync   /dev/hdi1
   2     2       8       65        2      active sync   /dev/sde1
   3     3      57        1        3      active sync   /dev/hdk1
   4     4       0        0        4      faulty removed
   5     5      34        1        5      spare   /dev/hdg1

=== --examine /dev/hdk1 ===
/dev/hdk1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : ed1e6357:74e32684:47f7b12e:9c2b2218 (local to host teresa)
  Creation Time : Thu Sep 22 16:23:50 2011
     Raid Level : raid5
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
   Raid Devices : 5
  Total Devices : 4
Preferred Minor : 3

    Update Time : Sun Sep 25 22:11:22 2011
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 0
       Checksum : b7f6a3de - correct
         Events : 10

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3      57        1        3      active sync   /dev/hdk1

   0     0      33        1        0      active sync   /dev/hde1
   1     1      56        1        1      active sync   /dev/hdi1
   2     2       0        0        2      faulty removed
   3     3      57        1        3      active sync   /dev/hdk1
   4     4      34        1        4      active sync   /dev/hdg1

=== --examine /dev/hdg1 ===
/dev/hdg1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : ed1e6357:74e32684:47f7b12e:9c2b2218 (local to host teresa)
  Creation Time : Thu Sep 22 16:23:50 2011
     Raid Level : raid5
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
   Raid Devices : 5
  Total Devices : 4
Preferred Minor : 3

    Update Time : Sun Sep 25 22:11:22 2011
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 0
       Checksum : b7f6a3c9 - correct
         Events : 10

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4      34        1        4      active sync   /dev/hdg1

   0     0      33        1        0      active sync   /dev/hde1
   1     1      56        1        1      active sync   /dev/hdi1
   2     2       0        0        2      faulty removed
   3     3      57        1        3      active sync   /dev/hdk1
   4     4      34        1        4      active sync   /dev/hdg1




>
>
>>
>> (2) Can I suggest improvements into resilvering?  Can I contribute code
>> to
>> implement them?  Such as resilver from the end of the drive back to the
>> front, so if you notice the wrong drive resilvering, you can stop and
>> not
>> lose the MBR and the directory format structure that's stored in the
>> first
>> few sectors?  I'd also like to take a look at adding a raid mode where
>> there's checksum in every stripe block so the system can detect
>> corrupted
>> disks and not resilver.  I'd also like to add a raid option where a
>> resilvering need will be reported by email and needs to be started
>> manually.  All to prevent what happened to me from happening again.
>>
>> Thanks for your time.
>>
>> Kenn Frank
>>
>> P.S.  Setup:
>>
>> # uname -a
>> Linux teresa 2.6.26-2-686 #1 SMP Sat Jun 11 14:54:10 UTC 2011 i686
>> GNU/Linux
>>
>> # mdadm --version
>> mdadm - v2.6.7.2 - 14th November 2008
>>
>> # mdadm --detail /dev/md3
>> /dev/md3:
>>         Version : 00.90
>>   Creation Time : Thu Sep 22 16:23:50 2011
>>      Raid Level : raid5
>>      Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
>>   Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
>>    Raid Devices : 5
>>   Total Devices : 4
>> Preferred Minor : 3
>>     Persistence : Superblock is persistent
>>
>>     Update Time : Thu Sep 22 20:19:09 2011
>>           State : clean, degraded
>>  Active Devices : 4
>> Working Devices : 4
>>  Failed Devices : 0
>>   Spare Devices : 0
>>
>>          Layout : left-symmetric
>>      Chunk Size : 64K
>>
>>            UUID : ed1e6357:74e32684:47f7b12e:9c2b2218 (local to host
>> teresa)
>>          Events : 0.6
>>
>>     Number   Major   Minor   RaidDevice State
>>        0      33        1        0      active sync   /dev/hde1
>>        1      56        1        1      active sync   /dev/hdi1
>>        2       0        0        2      removed
>>        3      57        1        3      active sync   /dev/hdk1
>>        4      34        1        4      active sync   /dev/hdg1
>>
>>
>
>




* Re:
  2011-09-26  7:42   ` Re: Kenn
@ 2011-09-26  8:04     ` NeilBrown
  2011-09-26 18:04       ` Re: Kenn
  0 siblings, 1 reply; 1546+ messages in thread
From: NeilBrown @ 2011-09-26  8:04 UTC (permalink / raw)
  To: kenn; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 26202 bytes --]

On Mon, 26 Sep 2011 00:42:23 -0700 "Kenn" <kenn@kenn.us> wrote:

> Replying.  I realize and I apologize I didn't create a subject.  I hope
> this doesn't confuse majordomo.
> 
> > On Sun, 25 Sep 2011 21:23:31 -0700 "Kenn" <kenn@kenn.us> wrote:
> >
> >> I have a raid5 array that had a drive drop out, and resilvered the wrong
> >> drive when I put it back in, corrupting and destroying the raid.  I
> >> stopped the array at less than 1% resilvering and I'm in the process of
> >> making a dd-copy of the drive to recover the files.
> >
> > I don't know what you mean by "resilvered".
> 
> Resilvering -- Rebuilding the array.  Lesser used term, sorry!

I see..

I guess that looking-glass mirrors have a silver backing and when it becomes
tarnished you might re-silver the mirror to make it better again.
So the name works as a poor pun for RAID1.  But I don't see how it applies
to RAID5....
No matter.

Basically you have messed up badly.
Recreating arrays should only be done as a last-ditch attempt to get data
back, and preferably with expert advice...

When you created the array with all devices present it effectively started
copying the corruption that you had deliberately (why??) placed on device 2
(sde) onto device 4 (counting from 0).
So now you have two devices that are corrupt in the early blocks.
There is not much you can do to fix that.
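The mechanism is easy to see in miniature: a missing RAID5 member is rebuilt as the XOR of the surviving blocks in each stripe, so garbage on any surviving member is folded straight into the rebuilt one. A toy illustration with single-byte "blocks" (ignoring chunking and the left-symmetric parity rotation):

```shell
# Toy stripe: four data "blocks" (one byte each) plus their XOR parity.
d0=1; d1=2; d2=3; d3=4
parity=$(( d0 ^ d1 ^ d2 ^ d3 ))
echo "clean parity: $parity"               # 4

# Corrupt "disk 2" (as the mkfs on sde did), then rebuild "disk 4"
# from the survivors: the corruption lands in the rebuilt block.
d2=255
rebuilt=$(( d0 ^ d1 ^ d2 ^ d3 ))
echo "rebuilt with corruption: $rebuilt"   # 248, not 4
```

The same arithmetic is why stopping the recovery at under 1% only limited how far the damage had spread.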

There is some chance that 'fsck' could find a backup superblock somewhere and
try to put the pieces back together.  But the 'mkfs' probably made a
substantial mess of important data structures so I don't consider your chances
very high.
Keeping sde out and just working with the remaining 4 is certainly your best
bet.
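A read-only salvage attempt along those lines might look like the following sketch (device names taken from this thread; the backup-superblock offset shown is only illustrative — use one of the offsets that `mke2fs -n` actually prints):

```shell
# Assemble only the four members that were never written with garbage;
# --readonly keeps md from starting a resync or writing anything.
mdadm --assemble --run --readonly /dev/md3 \
    /dev/hde1 /dev/hdi1 /dev/hdk1 /dev/hdg1

# Ask mke2fs where a filesystem of this geometry keeps its backup
# superblocks; with -n it only prints what it would do, writing nothing.
mke2fs -n /dev/md3

# Then try a read-only fsck against one of the reported backups
# (32768 is typical for a 4K-block filesystem, but verify above first).
e2fsck -n -b 32768 /dev/md3
```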

What made you think it would be a good idea to re-create the array when all
you wanted to do was trigger a resync/recovery??
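For the record, the usual way to return a wiped disk to an existing array is to assemble the surviving members and hot-add the disk, so that md recovers onto the new member rather than re-creating everyone's metadata. A sketch with the device names from this thread:

```shell
# Assemble the existing degraded array from its surviving members;
# this reuses the old superblocks instead of writing new ones.
mdadm --assemble --run /dev/md3 \
    /dev/hde1 /dev/hdi1 /dev/hdk1 /dev/hdg1

# Hot-add the wiped disk: md rebuilds *onto* sde1 from parity,
# never the other way around.
mdadm /dev/md3 --add /dev/sde1

# Confirm the recovery direction before letting it run to completion.
cat /proc/mdstat
```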

NeilBrown


> 
> >
> >>
> >> (1) Is there anything diagnostic I can contribute to add more
> >> wrong-drive-resilvering protection to mdadm?  I have the command history
> >> showing everything I did, I have the five drives available for reading
> >> sectors, I haven't touched anything yet.
> >
> > Yes, report the command history, and any relevant kernel logs, and the
> > output
> > of "mdadm --examine" on all relevant devices.
> >
> > NeilBrown
> 
> Awesome!  I hope this is useful.  It's really long, so I edited down the
> logs and command history to what I thought were the important bits.  If
> you want more, I can post unedited versions, please let me know.
> 
> ### Command History ###
> 
> # The start of the sequence, removing sde from array
> mdadm --examine /dev/sde
> mdadm --detail /dev/md3
> cat /proc/mdstat
> mdadm /dev/md3 --remove /dev/sde1
> mdadm /dev/md3 --remove /dev/sde
> mdadm /dev/md3 --fail /dev/sde1
> cat /proc/mdstat
> mdadm --examine /dev/sde1
> fdisk -l | grep 750
> mdadm --examine /dev/sde1
> mdadm --remove /dev/sde
> mdadm /dev/md3 --remove /dev/sde
> mdadm /dev/md3 --fail /dev/sde
> fdisk /dev/sde
> ls
> vi /var/log/syslog
> reboot
> vi /var/log/syslog
> reboot
> mdadm --detail /dev/md3
> mdadm --examine /dev/sde1
> # Wiping sde
> fdisk /dev/sde
> newfs -t ext3 /dev/sde1
> mkfs -t ext3 /dev/sde1
> mkfs -t ext3 /dev/sde2
> fdisk /dev/sde
> mdadm --stop /dev/md3
> # Putting sde back into array
> mdadm --examine /dev/sde
> mdadm --help
> mdadm --misc --help
> mdadm --zero-superblock /dev/sde
> mdadm --query /dev/sde
> mdadm --examine /dev/sde
> mdadm --detail /dev/sde
> mdadm --detail /dev/sde1
> fdisk /dev/sde
> mdadm --assemble --no-degraded /dev/md3  /dev/hde1 /dev/hdi1 /dev/sde1
> /dev/hdk1 /dev/hdg1
> cat /proc/mdstat
> mdadm --stop /dev/md3
> mdadm --create /dev/md3 --level=5 --raid-devices=5  /dev/hde1 /dev/hdi1
> missing /dev/hdk1 /dev/hdg1
> mount -o ro /raid53
> ls /raid53
> umount /raid53
> mdadm --stop /dev/md3
> # The command that did the bad rebuild
> mdadm --create /dev/md3 --level=5 --raid-devices=5  /dev/hde1 /dev/hdi1
> /dev/sde1 /dev/hdk1 /dev/hdg1
> cat /proc/mdstat
> mdadm --examine /dev/md3
> mdadm --query /dev/md3
> mdadm --detail /dev/md3
> mount /raid53
> mdadm --stop /dev/md3
> # Trying to get the corrupted disk back up
> mdadm --create /dev/md3 --level=5 --raid-devices=5  /dev/hde1 /dev/hdi1
> missing /dev/hdk1 /dev/hdg1
> cat /proc/mdstat
> mount /raid53
> fsck -n /dev/md3
> 
> 
> 
> ### KERNEL LOGS ###
> 
> # Me messing around with fdisk and mdadm creating new partitions to wipe
> out sde
> Sep 22 15:56:39 teresa kernel: [ 7897.778204] sd 5:0:0:0: [sde] 1465149168
> 512-byte hardware sectors (750156 MB)
> Sep 22 15:56:39 teresa kernel: [ 7897.778204] sd 5:0:0:0: [sde] Write
> Protect is off
> Sep 22 15:56:39 teresa kernel: [ 7897.778204] sd 5:0:0:0: [sde] Mode
> Sense: 00 3a 00 00
> Sep 22 15:56:39 teresa kernel: [ 7897.778204] sd 5:0:0:0: [sde] Write
> cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Sep 22 15:56:39 teresa kernel: [ 7897.778204]  sde: sde1 sde2
> Sep 22 15:56:41 teresa kernel: [ 7899.848026] sd 5:0:0:0: [sde] 1465149168
> 512-byte hardware sectors (750156 MB)
> Sep 22 15:56:41 teresa kernel: [ 7899.848026] sd 5:0:0:0: [sde] Write
> Protect is off
> Sep 22 15:56:41 teresa kernel: [ 7899.848026] sd 5:0:0:0: [sde] Mode
> Sense: 00 3a 00 00
> Sep 22 15:56:41 teresa kernel: [ 7899.848026] sd 5:0:0:0: [sde] Write
> cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Sep 22 15:56:41 teresa kernel: [ 7899.848026]  sde: sde1 sde2
> Sep 22 16:01:49 teresa kernel: [ 8207.733821] sd 5:0:0:0: [sde] 1465149168
> 512-byte hardware sectors (750156 MB)
> Sep 22 16:01:49 teresa kernel: [ 8207.733919] sd 5:0:0:0: [sde] Write
> Protect is off
> Sep 22 16:01:49 teresa kernel: [ 8207.733943] sd 5:0:0:0: [sde] Mode
> Sense: 00 3a 00 00
> Sep 22 16:01:49 teresa kernel: [ 8207.734039] sd 5:0:0:0: [sde] Write
> cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Sep 22 16:01:49 teresa kernel: [ 8207.734083]  sde: sde1
> Sep 22 16:01:51 teresa kernel: [ 8209.777260] sd 5:0:0:0: [sde] 1465149168
> 512-byte hardware sectors (750156 MB)
> Sep 22 16:01:51 teresa kernel: [ 8209.777260] sd 5:0:0:0: [sde] Write
> Protect is off
> Sep 22 16:01:51 teresa kernel: [ 8209.777260] sd 5:0:0:0: [sde] Mode
> Sense: 00 3a 00 00
> Sep 22 16:01:51 teresa kernel: [ 8209.777260] sd 5:0:0:0: [sde] Write
> cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Sep 22 16:01:51 teresa kernel: [ 8209.777260]  sde: sde1
> Sep 22 16:02:09 teresa mdadm[2694]: DeviceDisappeared event detected on md
> device /dev/md3
> Sep 22 16:02:09 teresa kernel: [ 8227.781860] md: md3 stopped.
> Sep 22 16:02:09 teresa kernel: [ 8227.781908] md: unbind<hde1>
> Sep 22 16:02:09 teresa kernel: [ 8227.781937] md: export_rdev(hde1)
> Sep 22 16:02:09 teresa kernel: [ 8227.782261] md: unbind<hdg1>
> Sep 22 16:02:09 teresa kernel: [ 8227.782292] md: export_rdev(hdg1)
> Sep 22 16:02:09 teresa kernel: [ 8227.782561] md: unbind<hdk1>
> Sep 22 16:02:09 teresa kernel: [ 8227.782590] md: export_rdev(hdk1)
> Sep 22 16:02:09 teresa kernel: [ 8227.782855] md: unbind<hdi1>
> Sep 22 16:02:09 teresa kernel: [ 8227.782885] md: export_rdev(hdi1)
> Sep 22 16:15:32 teresa smartd[2657]: Device: /dev/hda, Failed SMART usage
> Attribute: 194 Temperature_Celsius.
> Sep 22 16:15:33 teresa smartd[2657]: Device: /dev/hdk, SMART Usage
> Attribute: 194 Temperature_Celsius changed from 110 to 111
> Sep 22 16:15:33 teresa smartd[2657]: Device: /dev/sdb, SMART Usage
> Attribute: 194 Temperature_Celsius changed from 113 to 116
> Sep 22 16:15:33 teresa smartd[2657]: Device: /dev/sdc, SMART Usage
> Attribute: 190 Airflow_Temperature_Cel changed from 52 to 51
> Sep 22 16:17:01 teresa /USR/SBIN/CRON[2965]: (root) CMD (   cd / &&
> run-parts --report /etc/cron.hourly)
> Sep 22 16:18:42 teresa kernel: [ 9220.400915] md: md3 stopped.
> Sep 22 16:18:42 teresa kernel: [ 9220.411525] md: bind<hdi1>
> Sep 22 16:18:42 teresa kernel: [ 9220.411884] md: bind<sde1>
> Sep 22 16:18:42 teresa kernel: [ 9220.412577] md: bind<hdk1>
> Sep 22 16:18:42 teresa kernel: [ 9220.413162] md: bind<hdg1>
> Sep 22 16:18:42 teresa kernel: [ 9220.413750] md: bind<hde1>
> Sep 22 16:18:42 teresa kernel: [ 9220.413855] md: kicking non-fresh sde1
> from array!
> Sep 22 16:18:42 teresa kernel: [ 9220.413887] md: unbind<sde1>
> Sep 22 16:18:42 teresa kernel: [ 9220.413915] md: export_rdev(sde1)
> Sep 22 16:18:42 teresa kernel: [ 9220.477393] raid5: device hde1
> operational as raid disk 0
> Sep 22 16:18:42 teresa kernel: [ 9220.477420] raid5: device hdg1
> operational as raid disk 4
> Sep 22 16:18:42 teresa kernel: [ 9220.477438] raid5: device hdk1
> operational as raid disk 3
> Sep 22 16:18:42 teresa kernel: [ 9220.477456] raid5: device hdi1
> operational as raid disk 1
> Sep 22 16:18:42 teresa kernel: [ 9220.478236] raid5: allocated 5252kB for md3
> Sep 22 16:18:42 teresa kernel: [ 9220.478265] raid5: raid level 5 set md3
> active with 4 out of 5 devices, algorithm 2
> Sep 22 16:18:42 teresa kernel: [ 9220.478294] RAID5 conf printout:
> Sep 22 16:18:42 teresa kernel: [ 9220.478309]  --- rd:5 wd:4
> Sep 22 16:18:42 teresa kernel: [ 9220.478324]  disk 0, o:1, dev:hde1
> Sep 22 16:18:42 teresa kernel: [ 9220.478339]  disk 1, o:1, dev:hdi1
> Sep 22 16:18:42 teresa kernel: [ 9220.478354]  disk 3, o:1, dev:hdk1
> Sep 22 16:18:42 teresa kernel: [ 9220.478369]  disk 4, o:1, dev:hdg1
> # Me stopping md3
> Sep 22 16:18:53 teresa mdadm[2694]: DeviceDisappeared event detected on md
> device /dev/md3
> Sep 22 16:18:53 teresa kernel: [ 9231.572348] md: md3 stopped.
> Sep 22 16:18:53 teresa kernel: [ 9231.572394] md: unbind<hde1>
> Sep 22 16:18:53 teresa kernel: [ 9231.572423] md: export_rdev(hde1)
> Sep 22 16:18:53 teresa kernel: [ 9231.572728] md: unbind<hdg1>
> Sep 22 16:18:53 teresa kernel: [ 9231.572758] md: export_rdev(hdg1)
> Sep 22 16:18:53 teresa kernel: [ 9231.572988] md: unbind<hdk1>
> Sep 22 16:18:53 teresa kernel: [ 9231.573015] md: export_rdev(hdk1)
> Sep 22 16:18:53 teresa kernel: [ 9231.573243] md: unbind<hdi1>
> Sep 22 16:18:53 teresa kernel: [ 9231.573270] md: export_rdev(hdi1)
> # Me creating md3 with sde1 missing
> Sep 22 16:19:51 teresa kernel: [ 9289.621646] md: bind<hde1>
> Sep 22 16:19:51 teresa kernel: [ 9289.665268] md: bind<hdi1>
> Sep 22 16:19:51 teresa kernel: [ 9289.695676] md: bind<hdk1>
> Sep 22 16:19:51 teresa kernel: [ 9289.726906] md: bind<hdg1>
> Sep 22 16:19:51 teresa kernel: [ 9289.809030] raid5: device hdg1
> operational as raid disk 4
> Sep 22 16:19:51 teresa kernel: [ 9289.809057] raid5: device hdk1
> operational as raid disk 3
> Sep 22 16:19:51 teresa kernel: [ 9289.809075] raid5: device hdi1
> operational as raid disk 1
> Sep 22 16:19:51 teresa kernel: [ 9289.809093] raid5: device hde1
> operational as raid disk 0
> Sep 22 16:19:51 teresa kernel: [ 9289.809821] raid5: allocated 5252kB for md3
> Sep 22 16:19:51 teresa kernel: [ 9289.809850] raid5: raid level 5 set md3
> active with 4 out of 5 devices, algorithm 2
> Sep 22 16:19:51 teresa kernel: [ 9289.809877] RAID5 conf printout:
> Sep 22 16:19:51 teresa kernel: [ 9289.809891]  --- rd:5 wd:4
> Sep 22 16:19:51 teresa kernel: [ 9289.809907]  disk 0, o:1, dev:hde1
> Sep 22 16:19:51 teresa kernel: [ 9289.809922]  disk 1, o:1, dev:hdi1
> Sep 22 16:19:51 teresa kernel: [ 9289.809937]  disk 3, o:1, dev:hdk1
> Sep 22 16:19:51 teresa kernel: [ 9289.809953]  disk 4, o:1, dev:hdg1
> Sep 22 16:20:20 teresa kernel: [ 9318.486512] kjournald starting.  Commit
> interval 5 seconds
> Sep 22 16:20:20 teresa kernel: [ 9318.486512] EXT3-fs: mounted filesystem
> with ordered data mode.
> # Me stopping md3 again
> Sep 22 16:20:42 teresa mdadm[2694]: DeviceDisappeared event detected on md
> device /dev/md3
> Sep 22 16:20:42 teresa kernel: [ 9340.300590] md: md3 stopped.
> Sep 22 16:20:42 teresa kernel: [ 9340.300639] md: unbind<hdg1>
> Sep 22 16:20:42 teresa kernel: [ 9340.300668] md: export_rdev(hdg1)
> Sep 22 16:20:42 teresa kernel: [ 9340.300921] md: unbind<hdk1>
> Sep 22 16:20:42 teresa kernel: [ 9340.300950] md: export_rdev(hdk1)
> Sep 22 16:20:42 teresa kernel: [ 9340.301183] md: unbind<hdi1>
> Sep 22 16:20:42 teresa kernel: [ 9340.301211] md: export_rdev(hdi1)
> Sep 22 16:20:42 teresa kernel: [ 9340.301438] md: unbind<hde1>
> Sep 22 16:20:42 teresa kernel: [ 9340.301465] md: export_rdev(hde1)
> # This is me doing the fatal create, that recovers the wrong disk
> Sep 22 16:21:39 teresa kernel: [ 9397.609864] md: bind<hde1>
> Sep 22 16:21:39 teresa kernel: [ 9397.652426] md: bind<hdi1>
> Sep 22 16:21:39 teresa kernel: [ 9397.673203] md: bind<sde1>
> Sep 22 16:21:39 teresa kernel: [ 9397.699373] md: bind<hdk1>
> Sep 22 16:21:39 teresa kernel: [ 9397.739372] md: bind<hdg1>
> Sep 22 16:21:39 teresa kernel: [ 9397.801729] raid5: device hdk1
> operational as raid disk 3
> Sep 22 16:21:39 teresa kernel: [ 9397.801756] raid5: device sde1
> operational as raid disk 2
> Sep 22 16:21:39 teresa kernel: [ 9397.801774] raid5: device hdi1
> operational as raid disk 1
> Sep 22 16:21:39 teresa kernel: [ 9397.801793] raid5: device hde1
> operational as raid disk 0
> Sep 22 16:21:39 teresa kernel: [ 9397.802531] raid5: allocated 5252kB for md3
> Sep 22 16:21:39 teresa kernel: [ 9397.802559] raid5: raid level 5 set md3
> active with 4 out of 5 devices, algorithm 2
> Sep 22 16:21:39 teresa kernel: [ 9397.802586] RAID5 conf printout:
> Sep 22 16:21:39 teresa kernel: [ 9397.802600]  --- rd:5 wd:4
> Sep 22 16:21:39 teresa kernel: [ 9397.802615]  disk 0, o:1, dev:hde1
> Sep 22 16:21:39 teresa kernel: [ 9397.802631]  disk 1, o:1, dev:hdi1
> Sep 22 16:21:39 teresa kernel: [ 9397.802646]  disk 2, o:1, dev:sde1
> Sep 22 16:21:39 teresa kernel: [ 9397.802661]  disk 3, o:1, dev:hdk1
> Sep 22 16:21:39 teresa kernel: [ 9397.838429] RAID5 conf printout:
> Sep 22 16:21:39 teresa kernel: [ 9397.838454]  --- rd:5 wd:4
> Sep 22 16:21:39 teresa kernel: [ 9397.838471]  disk 0, o:1, dev:hde1
> Sep 22 16:21:39 teresa kernel: [ 9397.838486]  disk 1, o:1, dev:hdi1
> Sep 22 16:21:39 teresa kernel: [ 9397.838502]  disk 2, o:1, dev:sde1
> Sep 22 16:21:39 teresa kernel: [ 9397.838518]  disk 3, o:1, dev:hdk1
> Sep 22 16:21:39 teresa kernel: [ 9397.838533]  disk 4, o:1, dev:hdg1
> Sep 22 16:21:39 teresa mdadm[2694]: RebuildStarted event detected on md
> device /dev/md3
> Sep 22 16:21:39 teresa kernel: [ 9397.841822] md: recovery of RAID array md3
> Sep 22 16:21:39 teresa kernel: [ 9397.841848] md: minimum _guaranteed_ 
> speed: 1000 KB/sec/disk.
> Sep 22 16:21:39 teresa kernel: [ 9397.841868] md: using maximum available
> idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
> Sep 22 16:21:39 teresa kernel: [ 9397.841908] md: using 128k window, over
> a total of 732571904 blocks.
> Sep 22 16:22:33 teresa kernel: [ 9451.640192] EXT3-fs error (device md3):
> ext3_check_descriptors: Block bitmap for group 3968 not in group (block
> 0)!
> Sep 22 16:22:33 teresa kernel: [ 9451.750241] EXT3-fs: group descriptors
> corrupted!
> Sep 22 16:22:39 teresa kernel: [ 9458.079151] md: md_do_sync() got signal
> ... exiting
> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: md3 stopped.
> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: unbind<hdg1>
> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: export_rdev(hdg1)
> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: unbind<hdk1>
> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: export_rdev(hdk1)
> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: unbind<sde1>
> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: export_rdev(sde1)
> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: unbind<hdi1>
> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: export_rdev(hdi1)
> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: unbind<hde1>
> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: export_rdev(hde1)
> Sep 22 16:22:39 teresa mdadm[2694]: DeviceDisappeared event detected on md
> device /dev/md3
> # Me trying to recreate md3 without sde
> Sep 22 16:23:50 teresa kernel: [ 9529.065477] md: bind<hde1>
> Sep 22 16:23:50 teresa kernel: [ 9529.107767] md: bind<hdi1>
> Sep 22 16:23:50 teresa kernel: [ 9529.137743] md: bind<hdk1>
> Sep 22 16:23:50 teresa kernel: [ 9529.177990] md: bind<hdg1>
> Sep 22 16:23:51 teresa mdadm[2694]: RebuildFinished event detected on md
> device /dev/md3
> Sep 22 16:23:51 teresa kernel: [ 9529.240814] raid5: device hdg1
> operational as raid disk 4
> Sep 22 16:23:51 teresa kernel: [ 9529.241734] raid5: device hdk1
> operational as raid disk 3
> Sep 22 16:23:51 teresa kernel: [ 9529.241752] raid5: device hdi1
> operational as raid disk 1
> Sep 22 16:23:51 teresa kernel: [ 9529.241770] raid5: device hde1
> operational as raid disk 0
> Sep 22 16:23:51 teresa kernel: [ 9529.242520] raid5: allocated 5252kB for md3
> Sep 22 16:23:51 teresa kernel: [ 9529.242547] raid5: raid level 5 set md3
> active with 4 out of 5 devices, algorithm 2
> Sep 22 16:23:51 teresa kernel: [ 9529.242574] RAID5 conf printout:
> Sep 22 16:23:51 teresa kernel: [ 9529.242588]  --- rd:5 wd:4
> Sep 22 16:23:51 teresa kernel: [ 9529.242603]  disk 0, o:1, dev:hde1
> Sep 22 16:23:51 teresa kernel: [ 9529.242618]  disk 1, o:1, dev:hdi1
> Sep 22 16:23:51 teresa kernel: [ 9529.242633]  disk 3, o:1, dev:hdk1
> Sep 22 16:23:51 teresa kernel: [ 9529.242649]  disk 4, o:1, dev:hdg1
> # And me trying a fsck -n or a mount
> Sep 22 16:24:07 teresa kernel: [ 9545.326343] EXT3-fs error (device md3):
> ext3_check_descriptors: Block bitmap for group 3968 not in group (block
> 0)!
> Sep 22 16:24:07 teresa kernel: [ 9545.369071] EXT3-fs: group descriptors
> corrupted!
> 
> 
> ### EXAMINES OF PARTITIONS ###
> 
> === --examine /dev/hde1 ===
> /dev/hde1:
>           Magic : a92b4efc
>         Version : 00.90.00
>            UUID : ed1e6357:74e32684:47f7b12e:9c2b2218 (local to host teresa)
>   Creation Time : Thu Sep 22 16:23:50 2011
>      Raid Level : raid5
>   Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
>      Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
>    Raid Devices : 5
>   Total Devices : 4
> Preferred Minor : 3
> 
>     Update Time : Sun Sep 25 22:11:22 2011
>           State : clean
>  Active Devices : 4
> Working Devices : 4
>  Failed Devices : 1
>   Spare Devices : 0
>        Checksum : b7f6a3c0 - correct
>          Events : 10
> 
>          Layout : left-symmetric
>      Chunk Size : 64K
> 
>       Number   Major   Minor   RaidDevice State
> this     0      33        1        0      active sync   /dev/hde1
> 
>    0     0      33        1        0      active sync   /dev/hde1
>    1     1      56        1        1      active sync   /dev/hdi1
>    2     2       0        0        2      faulty removed
>    3     3      57        1        3      active sync   /dev/hdk1
>    4     4      34        1        4      active sync   /dev/hdg1
> 
> === --examine /dev/hdi1 ===
> /dev/hdi1:
>           Magic : a92b4efc
>         Version : 00.90.00
>            UUID : ed1e6357:74e32684:47f7b12e:9c2b2218 (local to host teresa)
>   Creation Time : Thu Sep 22 16:23:50 2011
>      Raid Level : raid5
>   Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
>      Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
>    Raid Devices : 5
>   Total Devices : 4
> Preferred Minor : 3
> 
>     Update Time : Sun Sep 25 22:11:22 2011
>           State : clean
>  Active Devices : 4
> Working Devices : 4
>  Failed Devices : 1
>   Spare Devices : 0
>        Checksum : b7f6a3d9 - correct
>          Events : 10
> 
>          Layout : left-symmetric
>      Chunk Size : 64K
> 
>       Number   Major   Minor   RaidDevice State
> this     1      56        1        1      active sync   /dev/hdi1
> 
>    0     0      33        1        0      active sync   /dev/hde1
>    1     1      56        1        1      active sync   /dev/hdi1
>    2     2       0        0        2      faulty removed
>    3     3      57        1        3      active sync   /dev/hdk1
>    4     4      34        1        4      active sync   /dev/hdg1
> 
> === --examine /dev/sde1 ===
> /dev/sde1:
>           Magic : a92b4efc
>         Version : 00.90.00
>            UUID : e6e3df36:1195239f:47f7b12e:9c2b2218 (local to host teresa)
>   Creation Time : Thu Sep 22 16:21:39 2011
>      Raid Level : raid5
>   Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
>      Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
>    Raid Devices : 5
>   Total Devices : 5
> Preferred Minor : 3
> 
>     Update Time : Thu Sep 22 16:22:39 2011
>           State : clean
>  Active Devices : 4
> Working Devices : 5
>  Failed Devices : 1
>   Spare Devices : 1
>        Checksum : 4e69d679 - correct
>          Events : 8
> 
>          Layout : left-symmetric
>      Chunk Size : 64K
> 
>       Number   Major   Minor   RaidDevice State
> this     2       8       65        2      active sync   /dev/sde1
> 
>    0     0      33        1        0      active sync   /dev/hde1
>    1     1      56        1        1      active sync   /dev/hdi1
>    2     2       8       65        2      active sync   /dev/sde1
>    3     3      57        1        3      active sync   /dev/hdk1
>    4     4       0        0        4      faulty removed
>    5     5      34        1        5      spare   /dev/hdg1
> 
> === --examine /dev/hdk1 ===
> /dev/hdk1:
>           Magic : a92b4efc
>         Version : 00.90.00
>            UUID : ed1e6357:74e32684:47f7b12e:9c2b2218 (local to host teresa)
>   Creation Time : Thu Sep 22 16:23:50 2011
>      Raid Level : raid5
>   Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
>      Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
>    Raid Devices : 5
>   Total Devices : 4
> Preferred Minor : 3
> 
>     Update Time : Sun Sep 25 22:11:22 2011
>           State : clean
>  Active Devices : 4
> Working Devices : 4
>  Failed Devices : 1
>   Spare Devices : 0
>        Checksum : b7f6a3de - correct
>          Events : 10
> 
>          Layout : left-symmetric
>      Chunk Size : 64K
> 
>       Number   Major   Minor   RaidDevice State
> this     3      57        1        3      active sync   /dev/hdk1
> 
>    0     0      33        1        0      active sync   /dev/hde1
>    1     1      56        1        1      active sync   /dev/hdi1
>    2     2       0        0        2      faulty removed
>    3     3      57        1        3      active sync   /dev/hdk1
>    4     4      34        1        4      active sync   /dev/hdg1
> 
> === --examine /dev/hdg1 ===
> /dev/hdg1:
>           Magic : a92b4efc
>         Version : 00.90.00
>            UUID : ed1e6357:74e32684:47f7b12e:9c2b2218 (local to host teresa)
>   Creation Time : Thu Sep 22 16:23:50 2011
>      Raid Level : raid5
>   Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
>      Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
>    Raid Devices : 5
>   Total Devices : 4
> Preferred Minor : 3
> 
>     Update Time : Sun Sep 25 22:11:22 2011
>           State : clean
>  Active Devices : 4
> Working Devices : 4
>  Failed Devices : 1
>   Spare Devices : 0
>        Checksum : b7f6a3c9 - correct
>          Events : 10
> 
>          Layout : left-symmetric
>      Chunk Size : 64K
> 
>       Number   Major   Minor   RaidDevice State
> this     4      34        1        4      active sync   /dev/hdg1
> 
>    0     0      33        1        0      active sync   /dev/hde1
>    1     1      56        1        1      active sync   /dev/hdi1
>    2     2       0        0        2      faulty removed
>    3     3      57        1        3      active sync   /dev/hdk1
>    4     4      34        1        4      active sync   /dev/hdg1
> 
> 
> 
> 
> >
> >
> >>
> >> (2) Can I suggest improvements into resilvering?  Can I contribute code
> >> to
> >> implement them?  Such as resilver from the end of the drive back to the
> >> front, so if you notice the wrong drive resilvering, you can stop and
> >> not
> >> lose the MBR and the directory format structure that's stored in the
> >> first
> >> few sectors?  I'd also like to take a look at adding a raid mode where
> >> there's checksum in every stripe block so the system can detect
> >> corrupted
> >> disks and not resilver.  I'd also like to add a raid option where a
> >> resilvering need will be reported by email and needs to be started
> >> manually.  All to prevent what happened to me from happening again.
> >>
> >> Thanks for your time.
> >>
> >> Kenn Frank
> >>
> >> P.S.  Setup:
> >>
> >> # uname -a
> >> Linux teresa 2.6.26-2-686 #1 SMP Sat Jun 11 14:54:10 UTC 2011 i686
> >> GNU/Linux
> >>
> >> # mdadm --version
> >> mdadm - v2.6.7.2 - 14th November 2008
> >>
> >> # mdadm --detail /dev/md3
> >> /dev/md3:
> >>         Version : 00.90
> >>   Creation Time : Thu Sep 22 16:23:50 2011
> >>      Raid Level : raid5
> >>      Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
> >>   Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
> >>    Raid Devices : 5
> >>   Total Devices : 4
> >> Preferred Minor : 3
> >>     Persistence : Superblock is persistent
> >>
> >>     Update Time : Thu Sep 22 20:19:09 2011
> >>           State : clean, degraded
> >>  Active Devices : 4
> >> Working Devices : 4
> >>  Failed Devices : 0
> >>   Spare Devices : 0
> >>
> >>          Layout : left-symmetric
> >>      Chunk Size : 64K
> >>
> >>            UUID : ed1e6357:74e32684:47f7b12e:9c2b2218 (local to host
> >> teresa)
> >>          Events : 0.6
> >>
> >>     Number   Major   Minor   RaidDevice State
> >>        0      33        1        0      active sync   /dev/hde1
> >>        1      56        1        1      active sync   /dev/hdi1
> >>        2       0        0        2      removed
> >>        3      57        1        3      active sync   /dev/hdk1
> >>        4      34        1        4      active sync   /dev/hdg1
> >>
> >>
> >
> >
> 


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2011-09-26  8:04     ` Re: NeilBrown
@ 2011-09-26 18:04       ` Kenn
  2011-09-26 19:56         ` Re: David Brown
  0 siblings, 1 reply; 1546+ messages in thread
From: Kenn @ 2011-09-26 18:04 UTC (permalink / raw)
  To: linux-raid; +Cc: neilb

> On Mon, 26 Sep 2011 00:42:23 -0700 "Kenn" <kenn@kenn.us> wrote:
>
>> Replying.  I realize and I apologize I didn't create a subject.  I hope
>> this doesn't confuse majordomo.
>>
>> > On Sun, 25 Sep 2011 21:23:31 -0700 "Kenn" <kenn@kenn.us> wrote:
>> >
>> >> I have a raid5 array that had a drive drop out, and resilvered the
>> wrong
>> >> drive when I put it back in, corrupting and destroying the raid.  I
>> >> stopped the array at less than 1% resilvering and I'm in the process
>> of
>> >> making a dd-copy of the drive to recover the files.
>> >
>> > I don't know what you mean by "resilvered".
>>
>> Resilvering -- Rebuilding the array.  Lesser used term, sorry!
>
> I see..
>
> I guess that looking-glass mirrors have a silver backing and when it
> becomes
> tarnished you might re-silver the mirror to make it better again.
> So the name works as a poor pun for RAID1.  But I don't see how it applies
> to RAID5....
> No matter.
>
> Basically you have messed up badly.
> Recreating arrays should only be done as a last-ditch attempt to get data
> back, and preferably with expert advice...
>
> When you created the array with all devices present it effectively started
> copying the corruption that you had deliberately (why??) placed on device
> 2
> (sde) onto device 4 (counting from 0).
> So now you have two devices that are corrupt in the early blocks.
> There is not much you can do to fix that.
>
> There is some chance that 'fsck' could find a backup superblock somewhere
> and
> try to put the pieces back together.  But the 'mkfs' probably made a
> substantial mess of important data structures so I don't consider you
> chances
> very high.
> Keeping sde out and just working with the remaining 4 is certainly your
> best
> bet.
>
> What made you think it would be a good idea to re-create the array when
> all
> you wanted to do was trigger a resync/recovery??
>
> NeilBrown

Originally I had failed and removed sde from the array and then added it
back in, but no resilvering happened; it was just placed as raid device
#5, an active (faulty?) spare, with no rebuilding.  So I thought I'd
have to recreate the array to get it to rebuild.

Because my sde disk was only questionably healthy (the problem may have
been just a loose cable), I wanted to test it by having a complete
rebuild written onto it.  I was confident in all the other drives
because when I mounted the array without sde, I ran a complete md5sum
scan and every checksum was correct.  So I wanted to force a complete
rebuild of the array onto sde, and --zero-superblock was supposed to
render sde "new" to the array and force that rebuild.  I did the fsck
and mkfs for good measure instead of spending the time to zero every
byte on the drive with dd.  At the time I thought that if
--zero-superblock went wrong, md would reject a blank drive as a data
source for rebuilding and prevent resilvering.

So that brings up another point -- I've been reading through your blog,
and I acknowledge your view that checksums on every block offer little
benefit (http://neil.brown.name/blog/20110227114201), but sometimes
people like having that extra lock on their door even though it takes
more effort to go in and out of their home.  In my five-drive array, if
the last five words of each block held the checksums of the
corresponding blocks on every drive, the checksums from each drive
could vote on trusting the blocks of every other drive during the
rebuild process, and prevent an idiot (me) from killing his data.  It
would waste space on the drive, and perhaps harm performance by
squeezing 2+n bytes out of each sector, but for someone who wants to
protect their data as much as possible, it would be a welcome option
where performance is not a priority.
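
The voting scheme described above can be sketched as a toy simulation
(this is not md code; all names and the CRC32 choice are my own
invention for illustration):

```python
import zlib

NDRIVES = 5

def checksum(block):
    # Any per-block checksum would do; CRC32 is just a convenient stand-in.
    return zlib.crc32(block)

def build_stripe(blocks):
    # Each drive stores its own data block plus a copy of every drive's
    # block checksum for this stripe (the extra "words" proposed above).
    table = [checksum(b) for b in blocks]
    return [{"data": b, "table": list(table)} for b in blocks]

def vote_corrupt(stripe):
    # A drive is voted out if a majority of the stored checksum tables
    # disagree with the checksum of the data it currently holds.
    bad = []
    for i, d in enumerate(stripe):
        actual = checksum(d["data"])
        disagree = sum(1 for other in stripe if other["table"][i] != actual)
        if disagree > NDRIVES // 2:
            bad.append(i)
    return bad

stripe = build_stripe([bytes([i]) * 64 for i in range(NDRIVES)])
stripe[2]["data"] = b"\xff" * 64   # simulate a wiped sde
print(vote_corrupt(stripe))        # -> [2]
```

With that vote available before a rebuild starts, md could refuse to
use the wiped drive as a data source instead of copying its garbage
onto a healthy disk.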

Also, the checksums do provide some protection: first, against partial
media failure, which is a major flaw in raid 456 design according
to http://www.miracleas.com/BAARF/RAID5_versus_RAID10.txt , and checksum
voting could protect against the Atomicity/write-in-place flaw outlined in
http://en.wikipedia.org/wiki/RAID#Problems_with_RAID .
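
The partial-media-failure point can be illustrated with plain XOR
parity (a toy sketch, not md's actual implementation): parity alone can
detect that a stripe is inconsistent, but it cannot say which block is
wrong, which is exactly the gap per-block checksums would fill.

```python
from functools import reduce

def parity(blocks):
    # RAID5 parity is the bytewise XOR of the data blocks.
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

data = [bytes([i]) * 8 for i in range(4)]
p = parity(data)

# Silent corruption of one data block:
data[1] = b"\x7f" * 8

# The stripe no longer checks out...
print(parity(data) == p)   # False -> corruption detected
# ...but XOR alone cannot say WHICH of the five blocks is wrong: any
# single block can be "reconstructed" from the other four, so md must
# simply trust whichever blocks it is told are good.
```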

What do you think?

Kenn

>
>
>>
>> >
>> >>
>> >> (1) Is there anything diagnostic I can contribute to add more
>> >> wrong-drive-resilvering protection to mdadm?  I have the command
>> history
>> >> showing everything I did, I have the five drives available for
>> reading
>> >> sectors, I haven't touched anything yet.
>> >
>> > Yes, report the command history, and any relevant kernel logs, and the
>> > output
>> > of "mdadm --examine" on all relevant devices.
>> >
>> > NeilBrown
>>
>> Awesome!  I hope this is useful.  It's really long, so I edited down the
>> logs and command history to what I thought were the important bits.  If
>> you want more, I can post unedited versions, please let me know.
>>
>> ### Command History ###
>>
>> # The start of the sequence, removing sde from array
>> mdadm --examine /dev/sde
>> mdadm --detail /dev/md3
>> cat /proc/mdstat
>> mdadm /dev/md3 --remove /dev/sde1
>> mdadm /dev/md3 --remove /dev/sde
>> mdadm /dev/md3 --fail /dev/sde1
>> cat /proc/mdstat
>> mdadm --examine /dev/sde1
>> fdisk -l | grep 750
>> mdadm --examine /dev/sde1
>> mdadm --remove /dev/sde
>> mdadm /dev/md3 --remove /dev/sde
>> mdadm /dev/md3 --fail /dev/sde
>> fdisk /dev/sde
>> ls
>> vi /var/log/syslog
>> reboot
>> vi /var/log/syslog
>> reboot
>> mdadm --detail /dev/md3
>> mdadm --examine /dev/sde1
>> # Wiping sde
>> fdisk /dev/sde
>> newfs -t ext3 /dev/sde1
>> mkfs -t ext3 /dev/sde1
>> mkfs -t ext3 /dev/sde2
>> fdisk /dev/sde
>> mdadm --stop /dev/md3
>> # Putting sde back into array
>> mdadm --examine /dev/sde
>> mdadm --help
>> mdadm --misc --help
>> mdadm --zero-superblock /dev/sde
>> mdadm --query /dev/sde
>> mdadm --examine /dev/sde
>> mdadm --detail /dev/sde
>> mdadm --detail /dev/sde1
>> fdisk /dev/sde
>> mdadm --assemble --no-degraded /dev/md3  /dev/hde1 /dev/hdi1 /dev/sde1
>> /dev/hdk1 /dev/hdg1
>> cat /proc/mdstat
>> mdadm --stop /dev/md3
>> mdadm --create /dev/md3 --level=5 --raid-devices=5  /dev/hde1 /dev/hdi1
>> missing /dev/hdk1 /dev/hdg1
>> mount -o ro /raid53
>> ls /raid53
>> umount /raid53
>> mdadm --stop /dev/md3
>> # The command that did the bad rebuild
>> mdadm --create /dev/md3 --level=5 --raid-devices=5  /dev/hde1 /dev/hdi1
>> /dev/sde1 /dev/hdk1 /dev/hdg1
>> cat /proc/mdstat
>> mdadm --examine /dev/md3
>> mdadm --query /dev/md3
>> mdadm --detail /dev/md3
>> mount /raid53
>> mdadm --stop /dev/md3
>> # Trying to get the corrupted disk back up
>> mdadm --create /dev/md3 --level=5 --raid-devices=5  /dev/hde1 /dev/hdi1
>> missing /dev/hdk1 /dev/hdg1
>> cat /proc/mdstat
>> mount /raid53
>> fsck -n /dev/md3
>>
>>
>>
>> ### KERNEL LOGS ###
>>
>> # Me messing around with fdisk and mdadm creating new partitions to wipe
>> out sde
>> Sep 22 15:56:39 teresa kernel: [ 7897.778204] sd 5:0:0:0: [sde]
>> 1465149168
>> 512-byte hardware sectors (750156 MB)
>> Sep 22 15:56:39 teresa kernel: [ 7897.778204] sd 5:0:0:0: [sde] Write
>> Protect is off
>> Sep 22 15:56:39 teresa kernel: [ 7897.778204] sd 5:0:0:0: [sde] Mode
>> Sense: 00 3a 00 00
>> Sep 22 15:56:39 teresa kernel: [ 7897.778204] sd 5:0:0:0: [sde] Write
>> cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> Sep 22 15:56:39 teresa kernel: [ 7897.778204]  sde: sde1 sde2
>> Sep 22 15:56:41 teresa kernel: [ 7899.848026] sd 5:0:0:0: [sde]
>> 1465149168
>> 512-byte hardware sectors (750156 MB)
>> Sep 22 15:56:41 teresa kernel: [ 7899.848026] sd 5:0:0:0: [sde] Write
>> Protect is off
>> Sep 22 15:56:41 teresa kernel: [ 7899.848026] sd 5:0:0:0: [sde] Mode
>> Sense: 00 3a 00 00
>> Sep 22 15:56:41 teresa kernel: [ 7899.848026] sd 5:0:0:0: [sde] Write
>> cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> Sep 22 15:56:41 teresa kernel: [ 7899.848026]  sde: sde1 sde2
>> Sep 22 16:01:49 teresa kernel: [ 8207.733821] sd 5:0:0:0: [sde]
>> 1465149168
>> 512-byte hardware sectors (750156 MB)
>> Sep 22 16:01:49 teresa kernel: [ 8207.733919] sd 5:0:0:0: [sde] Write
>> Protect is off
>> Sep 22 16:01:49 teresa kernel: [ 8207.733943] sd 5:0:0:0: [sde] Mode
>> Sense: 00 3a 00 00
>> Sep 22 16:01:49 teresa kernel: [ 8207.734039] sd 5:0:0:0: [sde] Write
>> cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> Sep 22 16:01:49 teresa kernel: [ 8207.734083]  sde: sde1
>> Sep 22 16:01:51 teresa kernel: [ 8209.777260] sd 5:0:0:0: [sde]
>> 1465149168
>> 512-byte hardware sectors (750156 MB)
>> Sep 22 16:01:51 teresa kernel: [ 8209.777260] sd 5:0:0:0: [sde] Write
>> Protect is off
>> Sep 22 16:01:51 teresa kernel: [ 8209.777260] sd 5:0:0:0: [sde] Mode
>> Sense: 00 3a 00 00
>> Sep 22 16:01:51 teresa kernel: [ 8209.777260] sd 5:0:0:0: [sde] Write
>> cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> Sep 22 16:01:51 teresa kernel: [ 8209.777260]  sde: sde1
>> Sep 22 16:02:09 teresa mdadm[2694]: DeviceDisappeared event detected on
>> md
>> device /dev/md3
>> Sep 22 16:02:09 teresa kernel: [ 8227.781860] md: md3 stopped.
>> Sep 22 16:02:09 teresa kernel: [ 8227.781908] md: unbind<hde1>
>> Sep 22 16:02:09 teresa kernel: [ 8227.781937] md: export_rdev(hde1)
>> Sep 22 16:02:09 teresa kernel: [ 8227.782261] md: unbind<hdg1>
>> Sep 22 16:02:09 teresa kernel: [ 8227.782292] md: export_rdev(hdg1)
>> Sep 22 16:02:09 teresa kernel: [ 8227.782561] md: unbind<hdk1>
>> Sep 22 16:02:09 teresa kernel: [ 8227.782590] md: export_rdev(hdk1)
>> Sep 22 16:02:09 teresa kernel: [ 8227.782855] md: unbind<hdi1>
>> Sep 22 16:02:09 teresa kernel: [ 8227.782885] md: export_rdev(hdi1)
>> Sep 22 16:15:32 teresa smartd[2657]: Device: /dev/hda, Failed SMART
>> usage
>> Attribute: 194 Temperature_Celsius.
>> Sep 22 16:15:33 teresa smartd[2657]: Device: /dev/hdk, SMART Usage
>> Attribute: 194 Temperature_Celsius changed from 110 to 111
>> Sep 22 16:15:33 teresa smartd[2657]: Device: /dev/sdb, SMART Usage
>> Attribute: 194 Temperature_Celsius changed from 113 to 116
>> Sep 22 16:15:33 teresa smartd[2657]: Device: /dev/sdc, SMART Usage
>> Attribute: 190 Airflow_Temperature_Cel changed from 52 to 51
>> Sep 22 16:17:01 teresa /USR/SBIN/CRON[2965]: (root) CMD (   cd / &&
>> run-parts --report /etc/cron.hourly)
>> Sep 22 16:18:42 teresa kernel: [ 9220.400915] md: md3 stopped.
>> Sep 22 16:18:42 teresa kernel: [ 9220.411525] md: bind<hdi1>
>> Sep 22 16:18:42 teresa kernel: [ 9220.411884] md: bind<sde1>
>> Sep 22 16:18:42 teresa kernel: [ 9220.412577] md: bind<hdk1>
>> Sep 22 16:18:42 teresa kernel: [ 9220.413162] md: bind<hdg1>
>> Sep 22 16:18:42 teresa kernel: [ 9220.413750] md: bind<hde1>
>> Sep 22 16:18:42 teresa kernel: [ 9220.413855] md: kicking non-fresh sde1
>> from array!
>> Sep 22 16:18:42 teresa kernel: [ 9220.413887] md: unbind<sde1>
>> Sep 22 16:18:42 teresa kernel: [ 9220.413915] md: export_rdev(sde1)
>> Sep 22 16:18:42 teresa kernel: [ 9220.477393] raid5: device hde1
>> operational as raid disk 0
>> Sep 22 16:18:42 teresa kernel: [ 9220.477420] raid5: device hdg1
>> operational as raid disk 4
>> Sep 22 16:18:42 teresa kernel: [ 9220.477438] raid5: device hdk1
>> operational as raid disk 3
>> Sep 22 16:18:42 teresa kernel: [ 9220.477456] raid5: device hdi1
>> operational as raid disk 1
>> Sep 22 16:18:42 teresa kernel: [ 9220.478236] raid5: allocated 5252kB
>> for md3
>> Sep 22 16:18:42 teresa kernel: [ 9220.478265] raid5: raid level 5 set
>> md3
>> active with 4 out of 5 devices, algorithm 2
>> Sep 22 16:18:42 teresa kernel: [ 9220.478294] RAID5 conf printout:
>> Sep 22 16:18:42 teresa kernel: [ 9220.478309]  --- rd:5 wd:4
>> Sep 22 16:18:42 teresa kernel: [ 9220.478324]  disk 0, o:1, dev:hde1
>> Sep 22 16:18:42 teresa kernel: [ 9220.478339]  disk 1, o:1, dev:hdi1
>> Sep 22 16:18:42 teresa kernel: [ 9220.478354]  disk 3, o:1, dev:hdk1
>> Sep 22 16:18:42 teresa kernel: [ 9220.478369]  disk 4, o:1, dev:hdg1
>> # Me stopping md3
>> Sep 22 16:18:53 teresa mdadm[2694]: DeviceDisappeared event detected on
>> md
>> device /dev/md3
>> Sep 22 16:18:53 teresa kernel: [ 9231.572348] md: md3 stopped.
>> Sep 22 16:18:53 teresa kernel: [ 9231.572394] md: unbind<hde1>
>> Sep 22 16:18:53 teresa kernel: [ 9231.572423] md: export_rdev(hde1)
>> Sep 22 16:18:53 teresa kernel: [ 9231.572728] md: unbind<hdg1>
>> Sep 22 16:18:53 teresa kernel: [ 9231.572758] md: export_rdev(hdg1)
>> Sep 22 16:18:53 teresa kernel: [ 9231.572988] md: unbind<hdk1>
>> Sep 22 16:18:53 teresa kernel: [ 9231.573015] md: export_rdev(hdk1)
>> Sep 22 16:18:53 teresa kernel: [ 9231.573243] md: unbind<hdi1>
>> Sep 22 16:18:53 teresa kernel: [ 9231.573270] md: export_rdev(hdi1)
>> # Me creating md3 with sde1 missing
>> Sep 22 16:19:51 teresa kernel: [ 9289.621646] md: bind<hde1>
>> Sep 22 16:19:51 teresa kernel: [ 9289.665268] md: bind<hdi1>
>> Sep 22 16:19:51 teresa kernel: [ 9289.695676] md: bind<hdk1>
>> Sep 22 16:19:51 teresa kernel: [ 9289.726906] md: bind<hdg1>
>> Sep 22 16:19:51 teresa kernel: [ 9289.809030] raid5: device hdg1
>> operational as raid disk 4
>> Sep 22 16:19:51 teresa kernel: [ 9289.809057] raid5: device hdk1
>> operational as raid disk 3
>> Sep 22 16:19:51 teresa kernel: [ 9289.809075] raid5: device hdi1
>> operational as raid disk 1
>> Sep 22 16:19:51 teresa kernel: [ 9289.809093] raid5: device hde1
>> operational as raid disk 0
>> Sep 22 16:19:51 teresa kernel: [ 9289.809821] raid5: allocated 5252kB
>> for md3
>> Sep 22 16:19:51 teresa kernel: [ 9289.809850] raid5: raid level 5 set
>> md3
>> active with 4 out of 5 devices, algorithm 2
>> Sep 22 16:19:51 teresa kernel: [ 9289.809877] RAID5 conf printout:
>> Sep 22 16:19:51 teresa kernel: [ 9289.809891]  --- rd:5 wd:4
>> Sep 22 16:19:51 teresa kernel: [ 9289.809907]  disk 0, o:1, dev:hde1
>> Sep 22 16:19:51 teresa kernel: [ 9289.809922]  disk 1, o:1, dev:hdi1
>> Sep 22 16:19:51 teresa kernel: [ 9289.809937]  disk 3, o:1, dev:hdk1
>> Sep 22 16:19:51 teresa kernel: [ 9289.809953]  disk 4, o:1, dev:hdg1
>> Sep 22 16:20:20 teresa kernel: [ 9318.486512] kjournald starting.
>> Commit
>> interval 5 seconds
>> Sep 22 16:20:20 teresa kernel: [ 9318.486512] EXT3-fs: mounted
>> filesystem
>> with ordered data mode.
>> # Me stopping md3 again
>> Sep 22 16:20:42 teresa mdadm[2694]: DeviceDisappeared event detected on
>> md
>> device /dev/md3
>> Sep 22 16:20:42 teresa kernel: [ 9340.300590] md: md3 stopped.
>> Sep 22 16:20:42 teresa kernel: [ 9340.300639] md: unbind<hdg1>
>> Sep 22 16:20:42 teresa kernel: [ 9340.300668] md: export_rdev(hdg1)
>> Sep 22 16:20:42 teresa kernel: [ 9340.300921] md: unbind<hdk1>
>> Sep 22 16:20:42 teresa kernel: [ 9340.300950] md: export_rdev(hdk1)
>> Sep 22 16:20:42 teresa kernel: [ 9340.301183] md: unbind<hdi1>
>> Sep 22 16:20:42 teresa kernel: [ 9340.301211] md: export_rdev(hdi1)
>> Sep 22 16:20:42 teresa kernel: [ 9340.301438] md: unbind<hde1>
>> Sep 22 16:20:42 teresa kernel: [ 9340.301465] md: export_rdev(hde1)
>> # This is me doing the fatal create, that recovers the wrong disk
>> Sep 22 16:21:39 teresa kernel: [ 9397.609864] md: bind<hde1>
>> Sep 22 16:21:39 teresa kernel: [ 9397.652426] md: bind<hdi1>
>> Sep 22 16:21:39 teresa kernel: [ 9397.673203] md: bind<sde1>
>> Sep 22 16:21:39 teresa kernel: [ 9397.699373] md: bind<hdk1>
>> Sep 22 16:21:39 teresa kernel: [ 9397.739372] md: bind<hdg1>
>> Sep 22 16:21:39 teresa kernel: [ 9397.801729] raid5: device hdk1
>> operational as raid disk 3
>> Sep 22 16:21:39 teresa kernel: [ 9397.801756] raid5: device sde1
>> operational as raid disk 2
>> Sep 22 16:21:39 teresa kernel: [ 9397.801774] raid5: device hdi1
>> operational as raid disk 1
>> Sep 22 16:21:39 teresa kernel: [ 9397.801793] raid5: device hde1
>> operational as raid disk 0
>> Sep 22 16:21:39 teresa kernel: [ 9397.802531] raid5: allocated 5252kB
>> for md3
>> Sep 22 16:21:39 teresa kernel: [ 9397.802559] raid5: raid level 5 set
>> md3
>> active with 4 out of 5 devices, algorithm 2
>> Sep 22 16:21:39 teresa kernel: [ 9397.802586] RAID5 conf printout:
>> Sep 22 16:21:39 teresa kernel: [ 9397.802600]  --- rd:5 wd:4
>> Sep 22 16:21:39 teresa kernel: [ 9397.802615]  disk 0, o:1, dev:hde1
>> Sep 22 16:21:39 teresa kernel: [ 9397.802631]  disk 1, o:1, dev:hdi1
>> Sep 22 16:21:39 teresa kernel: [ 9397.802646]  disk 2, o:1, dev:sde1
>> Sep 22 16:21:39 teresa kernel: [ 9397.802661]  disk 3, o:1, dev:hdk1
>> Sep 22 16:21:39 teresa kernel: [ 9397.838429] RAID5 conf printout:
>> Sep 22 16:21:39 teresa kernel: [ 9397.838454]  --- rd:5 wd:4
>> Sep 22 16:21:39 teresa kernel: [ 9397.838471]  disk 0, o:1, dev:hde1
>> Sep 22 16:21:39 teresa kernel: [ 9397.838486]  disk 1, o:1, dev:hdi1
>> Sep 22 16:21:39 teresa kernel: [ 9397.838502]  disk 2, o:1, dev:sde1
>> Sep 22 16:21:39 teresa kernel: [ 9397.838518]  disk 3, o:1, dev:hdk1
>> Sep 22 16:21:39 teresa kernel: [ 9397.838533]  disk 4, o:1, dev:hdg1
>> Sep 22 16:21:39 teresa mdadm[2694]: RebuildStarted event detected on md
>> device /dev/md3
>> Sep 22 16:21:39 teresa kernel: [ 9397.841822] md: recovery of RAID array
>> md3
>> Sep 22 16:21:39 teresa kernel: [ 9397.841848] md: minimum _guaranteed_
>> speed: 1000 KB/sec/disk.
>> Sep 22 16:21:39 teresa kernel: [ 9397.841868] md: using maximum
>> available
>> idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
>> Sep 22 16:21:39 teresa kernel: [ 9397.841908] md: using 128k window,
>> over
>> a total of 732571904 blocks.
>> Sep 22 16:22:33 teresa kernel: [ 9451.640192] EXT3-fs error (device
>> md3):
>> ext3_check_descriptors: Block bitmap for group 3968 not in group (block
>> 0)!
>> Sep 22 16:22:33 teresa kernel: [ 9451.750241] EXT3-fs: group descriptors
>> corrupted!
>> Sep 22 16:22:39 teresa kernel: [ 9458.079151] md: md_do_sync() got
>> signal
>> ... exiting
>> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: md3 stopped.
>> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: unbind<hdg1>
>> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: export_rdev(hdg1)
>> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: unbind<hdk1>
>> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: export_rdev(hdk1)
>> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: unbind<sde1>
>> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: export_rdev(sde1)
>> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: unbind<hdi1>
>> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: export_rdev(hdi1)
>> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: unbind<hde1>
>> Sep 22 16:22:39 teresa kernel: [ 9458.114590] md: export_rdev(hde1)
>> Sep 22 16:22:39 teresa mdadm[2694]: DeviceDisappeared event detected on
>> md
>> device /dev/md3
>> # Me trying to recreate md3 without sde
>> Sep 22 16:23:50 teresa kernel: [ 9529.065477] md: bind<hde1>
>> Sep 22 16:23:50 teresa kernel: [ 9529.107767] md: bind<hdi1>
>> Sep 22 16:23:50 teresa kernel: [ 9529.137743] md: bind<hdk1>
>> Sep 22 16:23:50 teresa kernel: [ 9529.177990] md: bind<hdg1>
>> Sep 22 16:23:51 teresa mdadm[2694]: RebuildFinished event detected on md
>> device /dev/md3
>> Sep 22 16:23:51 teresa kernel: [ 9529.240814] raid5: device hdg1
>> operational as raid disk 4
>> Sep 22 16:23:51 teresa kernel: [ 9529.241734] raid5: device hdk1
>> operational as raid disk 3
>> Sep 22 16:23:51 teresa kernel: [ 9529.241752] raid5: device hdi1
>> operational as raid disk 1
>> Sep 22 16:23:51 teresa kernel: [ 9529.241770] raid5: device hde1
>> operational as raid disk 0
>> Sep 22 16:23:51 teresa kernel: [ 9529.242520] raid5: allocated 5252kB
>> for md3
>> Sep 22 16:23:51 teresa kernel: [ 9529.242547] raid5: raid level 5 set
>> md3
>> active with 4 out of 5 devices, algorithm 2
>> Sep 22 16:23:51 teresa kernel: [ 9529.242574] RAID5 conf printout:
>> Sep 22 16:23:51 teresa kernel: [ 9529.242588]  --- rd:5 wd:4
>> Sep 22 16:23:51 teresa kernel: [ 9529.242603]  disk 0, o:1, dev:hde1
>> Sep 22 16:23:51 teresa kernel: [ 9529.242618]  disk 1, o:1, dev:hdi1
>> Sep 22 16:23:51 teresa kernel: [ 9529.242633]  disk 3, o:1, dev:hdk1
>> Sep 22 16:23:51 teresa kernel: [ 9529.242649]  disk 4, o:1, dev:hdg1
>> # And me trying a fsck -n or a mount
>> Sep 22 16:24:07 teresa kernel: [ 9545.326343] EXT3-fs error (device
>> md3):
>> ext3_check_descriptors: Block bitmap for group 3968 not in group (block
>> 0)!
>> Sep 22 16:24:07 teresa kernel: [ 9545.369071] EXT3-fs: group descriptors
>> corrupted!
>>
>>
>> ### EXAMINES OF PARTITIONS ###
>>
>> === --examine /dev/hde1 ===
>> /dev/hde1:
>>           Magic : a92b4efc
>>         Version : 00.90.00
>>            UUID : ed1e6357:74e32684:47f7b12e:9c2b2218 (local to host
>> teresa)
>>   Creation Time : Thu Sep 22 16:23:50 2011
>>      Raid Level : raid5
>>   Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
>>      Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
>>    Raid Devices : 5
>>   Total Devices : 4
>> Preferred Minor : 3
>>
>>     Update Time : Sun Sep 25 22:11:22 2011
>>           State : clean
>>  Active Devices : 4
>> Working Devices : 4
>>  Failed Devices : 1
>>   Spare Devices : 0
>>        Checksum : b7f6a3c0 - correct
>>          Events : 10
>>
>>          Layout : left-symmetric
>>      Chunk Size : 64K
>>
>>       Number   Major   Minor   RaidDevice State
>> this     0      33        1        0      active sync   /dev/hde1
>>
>>    0     0      33        1        0      active sync   /dev/hde1
>>    1     1      56        1        1      active sync   /dev/hdi1
>>    2     2       0        0        2      faulty removed
>>    3     3      57        1        3      active sync   /dev/hdk1
>>    4     4      34        1        4      active sync   /dev/hdg1
>>
>> === --examine /dev/hdi1 ===
>> /dev/hdi1:
>>           Magic : a92b4efc
>>         Version : 00.90.00
>>            UUID : ed1e6357:74e32684:47f7b12e:9c2b2218 (local to host
>> teresa)
>>   Creation Time : Thu Sep 22 16:23:50 2011
>>      Raid Level : raid5
>>   Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
>>      Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
>>    Raid Devices : 5
>>   Total Devices : 4
>> Preferred Minor : 3
>>
>>     Update Time : Sun Sep 25 22:11:22 2011
>>           State : clean
>>  Active Devices : 4
>> Working Devices : 4
>>  Failed Devices : 1
>>   Spare Devices : 0
>>        Checksum : b7f6a3d9 - correct
>>          Events : 10
>>
>>          Layout : left-symmetric
>>      Chunk Size : 64K
>>
>>       Number   Major   Minor   RaidDevice State
>> this     1      56        1        1      active sync   /dev/hdi1
>>
>>    0     0      33        1        0      active sync   /dev/hde1
>>    1     1      56        1        1      active sync   /dev/hdi1
>>    2     2       0        0        2      faulty removed
>>    3     3      57        1        3      active sync   /dev/hdk1
>>    4     4      34        1        4      active sync   /dev/hdg1
>>
>> === --examine /dev/sde1 ===
>> /dev/sde1:
>>           Magic : a92b4efc
>>         Version : 00.90.00
>>            UUID : e6e3df36:1195239f:47f7b12e:9c2b2218 (local to host
>> teresa)
>>   Creation Time : Thu Sep 22 16:21:39 2011
>>      Raid Level : raid5
>>   Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
>>      Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
>>    Raid Devices : 5
>>   Total Devices : 5
>> Preferred Minor : 3
>>
>>     Update Time : Thu Sep 22 16:22:39 2011
>>           State : clean
>>  Active Devices : 4
>> Working Devices : 5
>>  Failed Devices : 1
>>   Spare Devices : 1
>>        Checksum : 4e69d679 - correct
>>          Events : 8
>>
>>          Layout : left-symmetric
>>      Chunk Size : 64K
>>
>>       Number   Major   Minor   RaidDevice State
>> this     2       8       65        2      active sync   /dev/sde1
>>
>>    0     0      33        1        0      active sync   /dev/hde1
>>    1     1      56        1        1      active sync   /dev/hdi1
>>    2     2       8       65        2      active sync   /dev/sde1
>>    3     3      57        1        3      active sync   /dev/hdk1
>>    4     4       0        0        4      faulty removed
>>    5     5      34        1        5      spare   /dev/hdg1
>>
>> === --examine /dev/hdk1 ===
>> /dev/hdk1:
>>           Magic : a92b4efc
>>         Version : 00.90.00
>>            UUID : ed1e6357:74e32684:47f7b12e:9c2b2218 (local to host
>> teresa)
>>   Creation Time : Thu Sep 22 16:23:50 2011
>>      Raid Level : raid5
>>   Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
>>      Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
>>    Raid Devices : 5
>>   Total Devices : 4
>> Preferred Minor : 3
>>
>>     Update Time : Sun Sep 25 22:11:22 2011
>>           State : clean
>>  Active Devices : 4
>> Working Devices : 4
>>  Failed Devices : 1
>>   Spare Devices : 0
>>        Checksum : b7f6a3de - correct
>>          Events : 10
>>
>>          Layout : left-symmetric
>>      Chunk Size : 64K
>>
>>       Number   Major   Minor   RaidDevice State
>> this     3      57        1        3      active sync   /dev/hdk1
>>
>>    0     0      33        1        0      active sync   /dev/hde1
>>    1     1      56        1        1      active sync   /dev/hdi1
>>    2     2       0        0        2      faulty removed
>>    3     3      57        1        3      active sync   /dev/hdk1
>>    4     4      34        1        4      active sync   /dev/hdg1
>>
>> === --examine /dev/hdg1 ===
>> /dev/hdg1:
>>           Magic : a92b4efc
>>         Version : 00.90.00
>>            UUID : ed1e6357:74e32684:47f7b12e:9c2b2218 (local to host
>> teresa)
>>   Creation Time : Thu Sep 22 16:23:50 2011
>>      Raid Level : raid5
>>   Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
>>      Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
>>    Raid Devices : 5
>>   Total Devices : 4
>> Preferred Minor : 3
>>
>>     Update Time : Sun Sep 25 22:11:22 2011
>>           State : clean
>>  Active Devices : 4
>> Working Devices : 4
>>  Failed Devices : 1
>>   Spare Devices : 0
>>        Checksum : b7f6a3c9 - correct
>>          Events : 10
>>
>>          Layout : left-symmetric
>>      Chunk Size : 64K
>>
>>       Number   Major   Minor   RaidDevice State
>> this     4      34        1        4      active sync   /dev/hdg1
>>
>>    0     0      33        1        0      active sync   /dev/hde1
>>    1     1      56        1        1      active sync   /dev/hdi1
>>    2     2       0        0        2      faulty removed
>>    3     3      57        1        3      active sync   /dev/hdk1
>>    4     4      34        1        4      active sync   /dev/hdg1
>>
>>
>>
>>
>> >
>> >
>> >>
>> >> (2) Can I suggest improvements into resilvering?  Can I contribute
>> code
>> >> to
>> >> implement them?  Such as resilver from the end of the drive back to
>> the
>> >> front, so if you notice the wrong drive resilvering, you can stop and
>> >> not
>> >> lose the MBR and the directory format structure that's stored in the
>> >> first
>> >> few sectors?  I'd also like to take a look at adding a raid mode
>> where
>> >> there's checksum in every stripe block so the system can detect
>> >> corrupted
>> >> disks and not resilver.  I'd also like to add a raid option where a
>> >> resilvering need will be reported by email and needs to be started
>> >> manually.  All to prevent what happened to me from happening again.
>> >>
>> >> Thanks for your time.
>> >>
>> >> Kenn Frank
>> >>
>> >> P.S.  Setup:
>> >>
>> >> # uname -a
>> >> Linux teresa 2.6.26-2-686 #1 SMP Sat Jun 11 14:54:10 UTC 2011 i686
>> >> GNU/Linux
>> >>
>> >> # mdadm --version
>> >> mdadm - v2.6.7.2 - 14th November 2008
>> >>
>> >> # mdadm --detail /dev/md3
>> >> /dev/md3:
>> >>         Version : 00.90
>> >>   Creation Time : Thu Sep 22 16:23:50 2011
>> >>      Raid Level : raid5
>> >>      Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
>> >>   Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
>> >>    Raid Devices : 5
>> >>   Total Devices : 4
>> >> Preferred Minor : 3
>> >>     Persistence : Superblock is persistent
>> >>
>> >>     Update Time : Thu Sep 22 20:19:09 2011
>> >>           State : clean, degraded
>> >>  Active Devices : 4
>> >> Working Devices : 4
>> >>  Failed Devices : 0
>> >>   Spare Devices : 0
>> >>
>> >>          Layout : left-symmetric
>> >>      Chunk Size : 64K
>> >>
>> >>            UUID : ed1e6357:74e32684:47f7b12e:9c2b2218 (local to host
>> >> teresa)
>> >>          Events : 0.6
>> >>
>> >>     Number   Major   Minor   RaidDevice State
>> >>        0      33        1        0      active sync   /dev/hde1
>> >>        1      56        1        1      active sync   /dev/hdi1
>> >>        2       0        0        2      removed
>> >>        3      57        1        3      active sync   /dev/hdk1
>> >>        4      34        1        4      active sync   /dev/hdg1
>> >>
>> >>
>> >
>> >
>>
>
>



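The `mdadm --examine` output quoted above carries the key diagnostic: the stale member is the one with the lowest Events counter, and it is the disk that should be rebuilt, never used as a rebuild source. A minimal sketch of that comparison (device names and counter values are illustrative, loosely modelled on the quoted output; this is not mdadm code):

```python
import re

# Per-device "Events" text as reported by `mdadm --examine` (hypothetical
# sample values; on a real system you would parse the command's output).
examine = {
    "/dev/hde1": "Events : 10",
    "/dev/hdi1": "Events : 10",
    "/dev/hdg1": "Events : 6",
}

def events(text):
    # Pull the integer after "Events :" out of the --examine text.
    return int(re.search(r"Events\s*:\s*(\d+)", text).group(1))

# The member with the lowest event count missed updates and is stale.
stale = min(examine, key=lambda dev: events(examine[dev]))
print(stale)
```

Run against real `--examine` output, the same comparison tells you which member md would treat as out of date before you re-add anything.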

* Re:
  2011-09-26 18:04       ` Re: Kenn
@ 2011-09-26 19:56         ` David Brown
  0 siblings, 0 replies; 1546+ messages in thread
From: David Brown @ 2011-09-26 19:56 UTC (permalink / raw)
  To: linux-raid

On 26/09/11 20:04, Kenn wrote:
>> On Mon, 26 Sep 2011 00:42:23 -0700 "Kenn"<kenn@kenn.us>  wrote:
>>
>>> Replying.  I realize, and apologize, that I didn't create a subject.
>>> I hope this doesn't confuse majordomo.
>>>
>>>> On Sun, 25 Sep 2011 21:23:31 -0700 "Kenn"<kenn@kenn.us>  wrote:
>>>>
>>>>> I have a raid5 array that had a drive drop out, and resilvered the
>>> wrong
>>>>> drive when I put it back in, corrupting and destroying the raid.  I
>>>>> stopped the array at less than 1% resilvering and I'm in the process
>>> of
>>>>> making a dd-copy of the drive to recover the files.
>>>>
>>>> I don't know what you mean by "resilvered".
>>>
>>> Resilvering -- Rebuilding the array.  Lesser used term, sorry!
>>
>> I see..
>>
>> I guess that looking-glass mirrors have a silver backing and when it
>> becomes
>> tarnished you might re-silver the mirror to make it better again.
>> So the name works as a poor pun for RAID1.  But I don't see how it applies
>> to RAID5....
>> No matter.
>>
>> Basically you have messed up badly.
>> Recreating arrays should only be done as a last-ditch attempt to get data
>> back, and preferably with expert advice...
>>
>> When you created the array with all devices present it effectively started
>> copying the corruption that you had deliberately (why??) placed on device
>> 2
>> (sde) onto device 4 (counting from 0).
>> So now you have two devices that are corrupt in the early blocks.
>> There is not much you can do to fix that.
>>
>> There is some chance that 'fsck' could find a backup superblock somewhere
>> and
>> try to put the pieces back together.  But the 'mkfs' probably made a
>> substantial mess of important data structures, so I don't consider
>> your chances very high.
>> Keeping sde out and just working with the remaining 4 is certainly your
>> best
>> bet.
>>
>> What made you think it would be a good idea to re-create the array when
>> all
>> you wanted to do was trigger a resync/recovery??
>>
>> NeilBrown
>
> Originally I had failed & removed sde from the array and then added it
> back in, but no resilvering happened: it was just placed as raid device
> #5 as an active (faulty?) spare, with no rebuilding.  So I thought I'd
> have to recreate the array to get it to rebuild.
>
> Because my sde disk was only questionably healthy (the problem may have
> been just a loose cable), I wanted to test it by having a complete
> rebuild put onto it.  I was confident in all the other drives because
> when I mounted the array without sde, I ran a complete md5sum scan and
> everything's checksum was correct.  So I wanted to force a complete
> rebuild of the array onto sde, and --zero-superblock was supposed to
> render sde "new" to the array to force that.  I just did the fsck and
> mkfs for good measure instead of spending the time to zero every byte
> on the drive with dd.  At the time I thought that if --zero-superblock
> went wrong, md would reject a blank drive as a data source for
> rebuilding and prevent resilvering.
>
> So that brings up another point: I've been reading through your blog,
> and I acknowledge your argument that per-block checksums offer little
> benefit (http://neil.brown.name/blog/20110227114201), but sometimes
> people like having that extra lock on their door even though it takes
> more effort to go in and out of their home.  In my five-drive array, if
> the last five words of every sector held the checksums of the
> corresponding blocks on every drive, the checksums from each drive
> could vote on trusting the blocks of every other drive during the
> rebuild process, and prevent an idiot (me) from killing his data.  It
> would waste sectors on the drive, and perhaps harm performance by
> squeezing 2+n bytes out of each sector, but for someone who wants to
> protect their data as much as possible it would be a welcome option
> where performance is not a priority.
>
> Also, the checksums do provide some protection: first, against partial
> media failure, which is a major flaw in raid 456 design according to
> http://www.miracleas.com/BAARF/RAID5_versus_RAID10.txt , and second,
> checksum voting could protect against the atomicity/write-in-place flaw
> outlined in http://en.wikipedia.org/wiki/RAID#Problems_with_RAID .
>
> What do you think?
>
> Kenn

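The per-stripe checksum vote Kenn proposes is not an existing md/mdadm feature; purely as an illustration of the idea, a vote among checksums recorded on the other members could look like this (all names and data hypothetical):

```python
import hashlib
from collections import Counter

def csum(chunk):
    # Checksum of one chunk; md5 is just a stand-in here.
    return hashlib.md5(chunk).hexdigest()

# Checksums of member 0's chunk as recorded on three other members.
recorded = [csum(b"good"), csum(b"good"), csum(b"good")]
# Chunk actually read back from member 0 -- silently bit-rotted.
candidate = b"gooe"

# Majority vote among the recorded checksums decides which content to
# trust before a rebuild overwrites anything.
majority, _ = Counter(recorded).most_common(1)[0]
verdict = "rebuild" if csum(candidate) != majority else "trust"
print(verdict)
```

In this toy case the vote flags member 0's chunk as corrupt, which is exactly the rebuild-time safety check Kenn is asking for.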
/raid/ protects against partial media flaws.  If one disk in a raid5 
stripe has a bad sector, that sector will be ignored and the missing 
data will be re-created from the other disks using the raid recovery 
algorithm.  If you want to have such protection even when doing a resync 
(as many people do), then use raid6 - it has two parity blocks.

As Neil points out in his blog, it is impossible to fully recover from a 
failure part way through a write - checksum voting or majority voting 
/may/ give you the right answer, but it may not.  If you need protection 
against that, you have to have filesystem level control (data logging 
and journalling as well as metadata journalling), or perhaps use raid 
systems with battery backed write caches.
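David's point that a missing sector is re-created from the other disks rests on RAID5's XOR parity: the parity chunk is the byte-wise XOR of the data chunks, so any single lost chunk equals the XOR of the survivors. A toy illustration (not mdadm code; chunk contents are made up):

```python
from functools import reduce

def xor(blocks):
    # Byte-wise XOR across equal-sized chunks -- how RAID5 computes
    # parity and how it regenerates a missing chunk.
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]  # chunks on four data disks
parity = xor(data)                           # chunk on the parity disk

# Lose chunk 2 (bad sector or failed disk), rebuild from the rest.
survivors = [c for i, c in enumerate(data) if i != 2] + [parity]
rebuilt = xor(survivors)
print(rebuilt)
```

With raid6 there is a second, independently computed parity chunk, which is why it survives a bad sector discovered during a single-disk rebuild.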




* Re:
  2011-09-26  7:03   ` Re: Roman Mamedov
@ 2011-09-26 23:23     ` Kenn
  0 siblings, 0 replies; 1546+ messages in thread
From: Kenn @ 2011-09-26 23:23 UTC (permalink / raw)
  To: linux-raid, rm

> On Mon, 26 Sep 2011 14:52:48 +1000
> NeilBrown <neilb@suse.de> wrote:
>
>> On Sun, 25 Sep 2011 21:23:31 -0700 "Kenn" <kenn@kenn.us> wrote:
>>
>> > I have a raid5 array that had a drive drop out, and resilvered the
>> wrong
>> > drive when I put it back in, corrupting and destroying the raid.  I
>> > stopped the array at less than 1% resilvering and I'm in the process
>> of
>> > making a dd-copy of the drive to recover the files.
>>
>> I don't know what you mean by "resilvered".
>
> At first I thought the initial poster just invented some peculiar funny
> word of his own, but it looks like it's from the ZFS circles:
> https://encrypted.google.com/search?q=resilver+zfs
> @Kenn: you probably mean 'resync' or 'rebuild'; no one ever calls
> those processes 'resilver' here, so you'll get no google results and
> blank/unknowing/funny looks from people when using that term in
> relation to mdadm.

Good point.  I am a very old unix user and my RAID terminology hasn't
been properly updated since college.  "Resilver" is mentioned in the
Wikipedia article on disk mirroring
(http://en.wikipedia.org/wiki/Disk_mirroring), and I've always used the
word, but it's not in the RAID page, so I'll switch to "rebuilding".

Thanks,
Kenn


>
> --
> With respect,
> Roman
>




* Re:
@ 2011-10-20  0:40 Wayne Johnson
  0 siblings, 0 replies; 1546+ messages in thread
From: Wayne Johnson @ 2011-10-20  0:40 UTC (permalink / raw)
  To: linux-fbdev

--1006711009-1319071212=:21299
Content-transfer-encoding: quoted-printable
Content-Type: text/plain; charset=UTF-8


--1006711009-1319071212=:21299
Content-Disposition: attachment; filename="bn1840572.pdf"
Content-Transfer-Encoding: base64
Content-Type: application/pdf; charset=UTF-8; name="bn1840572.pdf"
Content-Length: 32299

[-- base64-encoded PDF attachment data omitted --]
11VS5zNyAC0k9dkqv0V9LybrYqSMVET8RLNS4CzccSTOejG5LovklLRj8sh65tDntEKqIFgN6k2V
7sLHNW+sis1fhXTnSg4c/l42GxxGsRxWb81Czk80Cz9X0yliCZYWhkl3gxHduyrFZE5qJsPkuCu+
smkEdUuy9obNjzq54zY9lqyzJYzz6W5QVlM1M86Ow0cXLBokmhUpdz+XUoRbVnjpiLUlZ4ZL/3si
3YmCf9/36lLA0V8AAAD//wJzf4Oz3m8iyjqUoWBEvQvPzN/PoxYD38HFy3ciTEYZCkbUu/Aewwd7
bCa/H61TRk3GCu7jNPn3b3iF8/d//W+E6t+Iuu49SvV0H9qrBKp4/xc2h5EPU/XvPtxk5AT7nyST
/4NMrqmx/1cnz48wGe5mpBz0G8bbj5D4h+zjv4jWACS3ApsUn3fv//H33rv7MFVgNT9+SdQAAAAA
//+EmUFrE0EUx/UgvVh6q4LafgLpzbaENJ/Dk972IDGUQGKzyU5BbD+EbfIRWpF2LZtkWgPNRbOe
moZtdgyB5pBsJ1KbSZ3MPN9WIxUpLjzYZeHHsizv9/77ntio0g9QXzt5ZT3Qx4J+7Vbad8kwcz81
u16dmV0DrBMyXJrZaQ0XyOvWTosePoxNmGhnsXbcV9ymWjbPjknfn6n2B7HJ3dgk9py6eZRKRfK7
gF3TKHV1NUX8TtRzSGevvsySuw2e0VgeEaVaMi9KYGLbpLaZF4bw3VSmaiourshVwp+eRdxl/+Ld
yQVOBa4RBL1QE0Vls/dCFeIQ8xVNA7NdoydzoodJq6dscM8DbrlyhCk1RqxOQzOxSeOqkJZfsG0L
LyjQgt1T1GhIw5PhvJEIuqiQNJSUw2yh6AsgvmImhJvSQJZFoDSWA0wgORynSIKQMOFRtHMcn0HW
fpExlGICxlQpE2NyqL0kkvf257/foQaSb7srsD8fkvMYajXW3hX51picRzIZeKtXZJ8/jwqnieTt
SbU6JvMy8+qo0kswZZZyLq26hhKcjoCGt4Q7h187VhavPT8c1LT2NXxmtiIXdYITARdlRsL/EARN
H65rw8WwuCK/9VClCtCcNOFKCyNySQ40hNtdh/Mt7xywLCQ7+P6RPPJHELC0oheuhTeQ7IPkaccC
K+EJi9mnzO7AZbhobaBKPch14xtiS1oNMIudlwFs8LK7UvOMaFdixcHvRPykLxTaOKObNbNZQDs3
O0je/ATyNBmJQslY5IvMOWKOCz9Yrm8XUaWHWk+3K8NnA1WECWv2UZtUWG57tLTQ1/eGGGzb5KAv
6NQBH2FKfaMjrWHERTuXpxUTlTYcfBsOKvCxP5hdp9nHNDv3J8Ne6v9l1Bsya+dvCY7P3XGvU+fq
P0nyhmQZCvc6+Xd74YUxeTQl/3HpteMnAAAA///Cwkfq8WIx+UH9aG01vE3+SwWTgQmnvB6+GOn7
dEi/9e//73+hCQneVQVXXH/vguXB4DaoJsTr5vr6f8j57TfYzc9/YjEZKPEXUSv/JmTyv38Iz8PG
hIEmf4XUmv9Q8ziwYP2OyM/3sa7H+fH7G6hOfNvfo6N/I07u+D8z+/YHb0BjwhXV8rL/rz9++LH9
TT1/j/3jwx9qbPsPV+TG799mZ7BhP0he/sPt97zb225UH8ZusvW7wvtvyvPevb9795z13433qu+B
x4TLLd7d/X/99Z239x/ve573/7bt++o55XvL19yv33h39+79IPn776Pvfd0OrGNt/wMAAAD//8ya
QUsCQRiGiQ4dOnjr2D/o7EHMnyKdvEgSQouabtC9P5DlXwgEDQYdyLt7K2TVwUPtQd0pxB1h3f36
vikqYooOHbotfPAMDOw8876MgYzybOTEAAOUYM1ZGKK5MroTdprzCO7dfcEdnEI3aniDoOagV3Ku
svQcGmwQ5qNGy7TRK0VORCtVkdyehxGVqrq5xeAZQbNdk5ymgDqRKDXhzqEwIrKOvIyFaMKfyelX
8lrONt/JVWDJmkfk9Gl/44M8fCPbnFU0ecNElpSaMUhf4NahxeIsalB3wrheCirH0cKmKTwAvYHC
qDqNRffQ1/OMXVaBiykazGRVJbJLZKYgywLQnTBGWgblAGXMLeHCEqVMRXDSiwUnMkXeWlk9omON
z8BWHjkRbyCM1VUupQKn5FSoEx7VpWtByS+27avFJYO81+lb/nLclrFo+X6R5nanFN6NpumZZyJr
J9qFoWX1/Ccln6/XmS3qhFlP3PhxKBPncLA6sfDSeDaRYntSkBE/GokEzeE2XO91d3rjXeM/+CUk
8k/fjsmt4tcnkvxe0cLkVv4XZG4i2//u5H8BAAD//xoSJgMAAAD//wMAEI/DSgplbmRzdHJlYW0K
ZW5kb2JqCjEzIDAgb2JqCjw8L0RlY29kZVBhcm1zIFtudWxsIF0KL0ZpbHRlciBbL0ZsYXRlRGVj
b2RlIF0KL0xlbmd0aCAyNgo+PgpzdHJlYW0KeNpiAAAAAP//YmD4//8/AAAA//8DAAYAAv4KZW5k
c3RyZWFtCmVuZG9iagoxMiAwIG9iagpbL0luZGV4ZWQgL0RldmljZVJHQiAxIDEzIDAgUiBdCmVu
ZG9iagoxMSAwIG9iago8PC9CYXNlRm9udCAvQXJpYWwKL0Rlc2NlbmRhbnRGb250cyBbMTAgMCBS
IF0KL0VuY29kaW5nIC9JZGVudGl0eS1ICi9OYW1lIC9GMgovU3VidHlwZSAvVHlwZTAKL1RvVW5p
Y29kZSAvSWRlbnRpdHktSAovVHlwZSAvRm9udAo+PgplbmRvYmoKMTAgMCBvYmoKPDwvQmFzZUZv
bnQgL0FyaWFsCi9DSURTeXN0ZW1JbmZvIDw8L09yZGVyaW5nIChJZGVudGl0eSkKL1JlZ2lzdHJ5
IChBZG9iZSkKL1N1cHBsZW1lbnQgMQo+PgovQ0lEVG9HSURNYXAgMjAgMCBSCi9Gb250RGVzY3Jp
cHRvciA5IDAgUgovU3VidHlwZSAvQ0lERm9udFR5cGUyCi9UeXBlIC9Gb250Ci9XIDE4IDAgUgo+
PgplbmRvYmoKOSAwIG9iago8PC9Bc2NlbnQgMTAwNQovQ0lEU2V0IDIxIDAgUgovQ2FwSGVpZ2h0
IDAKL0Rlc2NlbnQgLTMyNAovRmxhZ3MgMzIKL0ZvbnRCQm94IFstNjY0IC0zMjQgMjAwMCAxMDA1
IF0KL0ZvbnRGaWxlMiAxOSAwIFIKL0ZvbnROYW1lIC9BcmlhbAovSXRhbGljQW5nbGUgMAovU3Rl
bVYgMAovVHlwZSAvRm9udERlc2NyaXB0b3IKPj4KZW5kb2JqCjggMCBvYmoKPDwvRGVjb2RlUGFy
bXMgW251bGwgXQovRmlsdGVyIFsvRmxhdGVEZWNvZGUgXQovTGVuZ3RoIDMxMQo+PgpzdHJlYW0K
eNrMUT1PAzEMvTm/wmNvuNR2YidBiAGVj7JVnMSAWEBQkHpAERKiv54ktNIhYGNAUV7iF9vvyUE4
MQRLszaLvNcmsEXnnI8QKNqkicXDMKKVnSVUjQlWRgktInt0v/KjNitzAY9ZQ1woFIoA5pXERmVR
Dz+q3AxmOh+WBLOn7HFRfaIlRhSFN4PVfO6TBREJlNhydF4IQsLcWckxvNxW7cV/Sk/JRkJJCv3G
HPZmesyQbPISPEF/l7+lTIdGU5EQrCYKgaEfzOWkOWhdgS4WvOrPzFH/Fy6IbIoceewiRetYhPS7
i/u2k0nz2nZuh8+V2av36RZ10lxX+qGlXZ6tuCrEext3uZ/Vs+a02RTyvO0ox/P6BjnQ0blfarfw
dQAfAAAA//8iPtYCwakfAAAA//8DAHqml/0KZW5kc3RyZWFtCmVuZG9iago3IDAgb2JqCjw8L0Jh
c2VGb250IC9Db3VyaWVyTmV3Ci9EZXNjZW5kYW50Rm9udHMgWzYgMCBSIF0KL0VuY29kaW5nIC9J
ZGVudGl0eS1ICi9OYW1lIC9GMQovU3VidHlwZSAvVHlwZTAKL1RvVW5pY29kZSAvSWRlbnRpdHkt
SAovVHlwZSAvRm9udAo+PgplbmRvYmoKNiAwIG9iago8PC9CYXNlRm9udCAvQ291cmllck5ldwov
Q0lEU3lzdGVtSW5mbyA8PC9PcmRlcmluZyAoSWRlbnRpdHkpCi9SZWdpc3RyeSAoQWRvYmUpCi9T
dXBwbGVtZW50IDEKPj4KL0NJRFRvR0lETWFwIDE3IDAgUgovRm9udERlc2NyaXB0b3IgNSAwIFIK
L1N1YnR5cGUgL0NJREZvbnRUeXBlMgovVHlwZSAvRm9udAovVyAxNSAwIFIKPj4KZW5kb2JqCjUg
MCBvYmoKPDwvQXNjZW50IDEwMjAKL0NhcEhlaWdodCAwCi9EZXNjZW50IC02NzkKL0ZsYWdzIDMy
Ci9Gb250QkJveCBbLTEyMSAtNjc5IDYyMiAxMDIwIF0KL0ZvbnRGaWxlMiAxNiAwIFIKL0ZvbnRO
YW1lIC9Db3VyaWVyTmV3Ci9JdGFsaWNBbmdsZSAwCi9TdGVtViAwCi9UeXBlIC9Gb250RGVzY3Jp
cHRvcgo+PgplbmRvYmoKNCAwIG9iago8PC9Db250ZW50cyA4IDAgUgovTWVkaWFCb3ggWzAgMCA2
MTIuMjgzNDY1IDc5MC44NjYxNDIgXQovUGFyZW50IDEgMCBSCi9UeXBlIC9QYWdlCj4+CmVuZG9i
agozIDAgb2JqCjw8L0NvbG9yU3BhY2UgPDwvQ1MxIDEyIDAgUgo+PgovRm9udCA8PC9GMSA3IDAg
UgovRjIgMTEgMCBSCj4+Ci9YT2JqZWN0IDw8L0ltZzEgMTQgMCBSCj4+Cj4+CmVuZG9iagoyIDAg
b2JqCjw8L01ldGFkYXRhIDIzIDAgUgovT3BlbkFjdGlvbiBbNCAwIFIgL1hZWiAtMzI3NjggLTMy
NzY4IDEgXQovUGFnZUxheW91dCAvU2luZ2xlUGFnZQovUGFnZU1vZGUgL1VzZU5vbmUKL1BhZ2Vz
IDEgMCBSCi9UeXBlIC9DYXRhbG9nCi9WaWV3ZXJQcmVmZXJlbmNlcyA8PC9DZW50ZXJXaW5kb3cg
ZmFsc2UKL0RpcmVjdGlvbiAvTDJSCi9EaXNwbGF5RG9jVGl0bGUgZmFsc2UKL0ZpdFdpbmRvdyBm
YWxzZQovSGlkZU1lbnViYXIgZmFsc2UKL0hpZGVUb29sYmFyIGZhbHNlCi9IaWRlV2luZG93VUkg
ZmFsc2UKL05vbkZ1bGxTY3JlZW5QYWdlTW9kZSAvVXNlTm9uZQovUHJpbnRBcmVhIC9Dcm9wQm94
Ci9QcmludENsaXAgL0Nyb3BCb3gKL1ByaW50U2NhbGluZyAvQXBwRGVmYXVsdAovVmlld0FyZWEg
L0Nyb3BCb3gKL1ZpZXdDbGlwIC9Dcm9wQm94Cj4+Cj4+CmVuZG9iagoxIDAgb2JqCjw8L0NvdW50
IDEKL0tpZHMgWzQgMCBSIF0KL01lZGlhQm94IFswIDAgNTk1LjI3NTU5MSA4NDEuODg5NzY0IF0K
L1Jlc291cmNlcyAzIDAgUgovVHlwZSAvUGFnZXMKPj4KZW5kb2JqCnhyZWYNCjAgMjQNCjAwMDAw
MDAwMDAgNjU1MzUgZg0KMDAwMDAyMzE4MyAwMDAwMCBuDQowMDAwMDIyNzM5IDAwMDAwIG4NCjAw
MDAwMjI2MzEgMDAwMDAgbg0KMDAwMDAyMjUzMCAwMDAwMCBuDQowMDAwMDIyMzQyIDAwMDAwIG4N
CjAwMDAwMjIxNDAgMDAwMDAgbg0KMDAwMDAyMTk4OSAwMDAwMCBuDQowMDAwMDIxNTgzIDAwMDAw
IG4NCjAwMDAwMjEzODQgMDAwMDAgbg0KMDAwMDAyMTE4NiAwMDAwMCBuDQowMDAwMDIxMDM4IDAw
MDAwIG4NCjAwMDAwMjA5OTAgMDAwMDAgbg0KMDAwMDAyMDg2OSAwMDAwMCBuDQowMDAwMDE2OTIy
IDAwMDAwIG4NCjAwMDAwMTY5MDMgMDAwMDAgbg0KMDAwMDAxMjMzMCAwMDAwMCBuDQowMDAwMDEy
MjEzIDAwMDAwIG4NCjAwMDAwMTIwMDAgMDAwMDAgbg0KMDAwMDAwMzE1NiAwMDAwMCBuDQowMDAw
MDAyOTkxIDAwMDAwIG4NCjAwMDAwMDI4NjAgMDAwMDAgbg0KMDAwMDAwMjU5NSAwMDAwMCBuDQow
MDAwMDAwMDE1IDAwMDAwIG4NCnRyYWlsZXINCjw8L0lEIFsoFsP4ubwbSGi6mi5Cr0tFZUVlKSAo
eOBy/frUTDeS0erDJi6kg6SDKSBdCi9JbmZvIDIyIDAgUgovUm9vdCAyIDAgUgovU2l6ZSAyNAo+
Pg0Kc3RhcnR4cmVmDQoyMzI5Ng0KJSVFT0Y--1006711009-1319071212=:21299--

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2011-10-26 20:51 bfeely
  0 siblings, 0 replies; 1546+ messages in thread
From: bfeely @ 2011-10-26 20:51 UTC (permalink / raw)
  To: lighth7015, linux-kernel, listserv, literature, lpulsifer

..Fulfill your life with only positive emotions due to it!  
http://www.cavexpert.com/m.friends.page.php?ahaid_hotmail=60b6



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:..
@ 2011-10-28 15:55 Young Chang
  0 siblings, 0 replies; 1546+ messages in thread
From: Young Chang @ 2011-10-28 15:55 UTC (permalink / raw)


May I ask if you would be eligible to pursue a Business Proposal of  
$19.7m with
me if you don't mind? Let me know if you are interested?




^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:..
@ 2011-10-28 16:03 Young Chang
  0 siblings, 0 replies; 1546+ messages in thread
From: Young Chang @ 2011-10-28 16:03 UTC (permalink / raw)


May I ask if you would be eligible to pursue a Business Proposal of  
$19.7m with
me if you don't mind? Let me know if you are interested?




^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] ` <CAPu47WTjxrrF+tHGRJOgKohD-sijBvX8iC-gBUnbsRw_KS4K5g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-11-01 11:52   ` Harald Hoyer
  0 siblings, 0 replies; 1546+ messages in thread
From: Harald Hoyer @ 2011-11-01 11:52 UTC (permalink / raw)
  To: Renjun Qu; +Cc: initramfs-u79uwXL29TY76Z2rM5mHXA

On 25.10.2011 07:55, Renjun Qu wrote:
> Hello everyone:
>          I am building a small embedded Linux system right now. I am
> confused about why the default rootfs must have a "dev/console" node. I
> have made an experiment as follows:
>         1st, configure the kernel (2.6.32) to support the initramfs
> and initrd, and set the source of the initramfs to a directory which
> only has a "root" node.
>         2nd, configure the kernel to support the root file system on
> NFS, and rebuild the kernel. I have created all necessary files and
> directories in the directory exported by the NFS server.
>         3rd, set the appropriate boot parameters, and boot my kernel.
>         The result is, the kernel can successfully mount the root
> file system, but the "/sbin/init" program cannot print any
> information. My "sbin/init" program is just a hello-world
> program which only prints some greeting information. There is a
> "dev/console" device node in the NFS-exported directory.
>         But if I set the source of the initramfs to empty, and redo
> my experiment from the 2nd step, everything will be OK. So I think
> there must be a "dev/console" node in the rootfs. The real root file
> system already has a "dev/console" node, so why must the default
> rootfs have one too? Please help me.
> 
>                          Best regards,
> 
>                             Ren jun Qu
> --
> To unsubscribe from this list: send the line "unsubscribe initramfs" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Your small initramfs should have something like this in sbin/init:

mknod -m 0666 /dev/null c 1 3

mount -t proc  -o nosuid,noexec,nodev proc /proc
mount -t sysfs -o nosuid,noexec,nodev sysfs /sys

ismounted() {
    while read a m a; do
        [ "$m" = "$1" ] && return 0
    done < /proc/mounts
    return 1
}

_opt="-o mode=0755,nosuid,exec"; _fs="-t devtmpfs"
ismounted /dev && { _opt="$_opt,remount"; unset _fs; }
if ! mount $_fs $_opt devtmpfs /dev >/dev/null 2>&1; then
    # if it failed (remount can't fail - no need to redo $_opt), fall back to
    # normal tmpfs
    mount -t tmpfs $_opt tmpfs /dev >/dev/null 2>&1
    # Make some basic devices first, let udev handle the rest
    mknod -m 0666 /dev/null c 1 3
    mknod -m 0666 /dev/ptmx c 5 2
    mknod -m 0600 /dev/console c 5 1
    mknod -m 0660 /dev/kmsg c 1 11
fi

# prepare the /dev directory (note: newer udevd takes care of it automatically)
[ ! -h /dev/fd ] && ln -s /proc/self/fd /dev/fd >/dev/null 2>&1
[ ! -h /dev/stdin ] && ln -s /proc/self/fd/0 /dev/stdin >/dev/null 2>&1
[ ! -h /dev/stdout ] && ln -s /proc/self/fd/1 /dev/stdout >/dev/null 2>&1
[ ! -h /dev/stderr ] && ln -s /proc/self/fd/2 /dev/stderr >/dev/null 2>&1
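The `ismounted` helper above parses /proc/mounts by field position. Here is a standalone sketch of the same idea (the variable names `dev`, `mnt`, `rest` and the sample input are mine, not from the original script) that can be exercised against /proc/mounts-formatted text on stdin:

```shell
# ismounted MOUNTPOINT: succeed when MOUNTPOINT appears as the second
# field (the mount point) of any /proc/mounts-formatted line on stdin.
ismounted() {
    while read dev mnt rest; do
        [ "$mnt" = "$1" ] && return 0
    done
    return 1
}

# Sample input standing in for /proc/mounts:
printf 'devtmpfs /dev devtmpfs rw 0 0\nproc /proc proc rw 0 0\n' \
    | ismounted /dev && echo "/dev mounted" || echo "/dev not mounted"
```

The original's `while read a m a` reuses one variable for both the device field and the trailing fields; the effect is identical.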

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2011-11-08  2:26 ` (unknown) Wu Fengguang
@ 2011-11-08  4:40   ` Stephen Rothwell
  0 siblings, 0 replies; 1546+ messages in thread
From: Stephen Rothwell @ 2011-11-08  4:40 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: linux-next, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 471 bytes --]

Hi,

On Tue, 8 Nov 2011 10:26:31 +0800 Wu Fengguang <fengguang.wu@intel.com> wrote:
>
> I'm moving back to kernel.org and would you please switch
> 
>         git://github.com/fengguang/linux.git#writeback-for-next
> 
> to
> 
>         git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux.git#writeback-for-next

OK, I have switched to that now.
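For anyone reproducing such a switch, an existing remote can be repointed with `git remote set-url`. A sketch in a scratch repository (the remote name `wfg` and the scratch setup are my assumptions, not how linux-next actually stores its remotes):

```shell
set -e
# Create a throwaway repository to demonstrate repointing a remote.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" remote add wfg git://github.com/fengguang/linux.git
# Repoint the existing remote at its new canonical home:
git -C "$repo" remote set-url wfg \
    git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux.git
git -C "$repo" config --get remote.wfg.url   # prints the new URL
```

The `#branch` suffix in the message above is linux-next's own URL#branch notation, not a git argument; the branch is named separately when fetching.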

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2011-11-09 11:58 ` pradeep Annavarapu
  0 siblings, 0 replies; 1546+ messages in thread
From: pradeep Annavarapu @ 2011-11-09 11:58 UTC (permalink / raw)
  To: lavi2905, leelaratnam, lillian.gonzalez, linux-kernel,
	linux-newbie, linux-serial, lucky, manch

http://www.passionchapel.org/group.php?id=53&top=49&page=21

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2011-11-21 15:22 No subject Jimmy Pan
@ 2011-11-22 16:41 ` Jimmy Pan
  0 siblings, 0 replies; 1546+ messages in thread
From: Jimmy Pan @ 2011-11-22 16:41 UTC (permalink / raw)
  To: kernelnewbies

Sorry for the spam here, my email account was stolen.


On Mon, Nov 21, 2011 at 11:22 PM, Jimmy Pan <dspjm1@gmail.com> wrote:
> ..Do you want to feel something new? Do you want to feel new
> unforgettable sensations? This is for you!
> http://un-ocean.fr/p.g.php?wellink_friend_id\x14ox0
>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2011-12-11  8:41 James Brown
  0 siblings, 0 replies; 1546+ messages in thread
From: James Brown @ 2011-12-11  8:41 UTC (permalink / raw)
  To: mail1

https://docs.google.com/document/d/1yAkUys2osN7co_KbzphWLLsoe-TPq7ELZhoySYvzjF0/edit

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2011-12-13  2:58 ` Matt Shaw
  0 siblings, 0 replies; 1546+ messages in thread
From: Matt Shaw @ 2011-12-13  2:58 UTC (permalink / raw)
  To: doshoes1990

https://docs.google.com/document/d/1-MgXERW0_TNnd0VK2caXYhDXtT58z1DJ0laAmJajrEs/edit

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found]   ` <CAOzFzEhVs=sm26wspdAH1rcc-S9nVW1xLok9ho--LnzxJXnNsw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-01-16 15:49     ` Joseph Glanville
  0 siblings, 0 replies; 1546+ messages in thread
From: Joseph Glanville @ 2012-01-16 15:49 UTC (permalink / raw)
  To: Duane Griffin; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA

Ick, HTML mode in GMail breaks Majordomo. :(

Sorry for spam.

Joseph.

On 17 January 2012 02:47, Joseph Glanville
<joseph.glanville-2MxvZkOi9dvvnOemgxGiVw@public.gmane.org> wrote:
>
> I guess it worked. :)
>
>
> On 17 January 2012 02:46, Duane Griffin <duaneg-E2rlibRYLgQAvxtiuMwx3w@public.gmane.org> wrote:
>>
>> subscribe
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
>
>
> --
> Founder | Director | VP Research
> Orion Virtualisation Solutions | www.orionvm.com.au | Phone: 1300 56 99 52 | Mobile: 0428 754 846




--
Founder | Director | VP Research
Orion Virtualisation Solutions | www.orionvm.com.au | Phone: 1300 56
99 52 | Mobile: 0428 754 846

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-01-30 19:43 Laurent Bonnans
@ 2012-01-31  5:58 ` Mohammed Shafi
  2012-02-01 11:14   ` Re: Mohammed Shafi
  0 siblings, 1 reply; 1546+ messages in thread
From: Mohammed Shafi @ 2012-01-31  5:58 UTC (permalink / raw)
  To: Laurent Bonnans; +Cc: linux-wireless, Felix Fietkau

[-- Attachment #1: Type: text/plain, Size: 2813 bytes --]

On Tue, Jan 31, 2012 at 1:13 AM, Laurent Bonnans <bonnans.l@gmail.com> wrote:
> Since the update from linux 3.2.1 to 3.2.2, dhcp stopped working on
> some APs on my laptop with an AR9285 Wireless card.
>
> dhcp works fine on an open wifi network but receives no response on a
> wep network I use. I haven't been able to test it on a third network
> for now.

 reverting "ath9k_hw: fix interpretation of the rx KeyMiss flag" does
help. I need to analyze whether it exposes some real issue which needs
to be fixed.


> Reverting to 3.2.1 solved the issue which is still there in the latest
> git revision as of today.
>
> DHCPDISCOVER requests are still sent but no ACK is received (nothing
> in Wireshark).
>
> dhcp failure may be one particular instance of the problem but I
> haven't been able to connect with a static ip (my ap doesn't like it)
> so this is the
> only result I know.
>
>
> ver_linux output (latest git kernel) :
>
> Linux litbox 3.3.0-rc1-hack-00383-g0a96265 #1 SMP PREEMPT Mon Jan 30
> 02:22:54 CET 2012 x86_64 Intel(R) Core(TM) i3 CPU M 370 @ 2.40GHz
> GenuineIntel GNU/Linux
>
> Gnu C                  4.6.2
> Gnu make               3.82
> binutils               2.22.0.20111227
> util-linux             2.20.1
> mount                  support
> module-init-tools      4
> e2fsprogs              1.42
> jfsutils               1.1.15
> reiserfsprogs          3.6.21
> xfsprogs               3.1.7
> pcmciautils            018
> PPP                    2.4.5
> Linux C Library        2.15
> Dynamic linker (ldd)   2.15
> Linux C++ Library      so.6.0
> Procps                 3.2.8
> Net-tools              1.60
> Kbd                    1.15.3
> Sh-utils               8.15
> wireless-tools         29
> Modules Loaded         ipv6 cpufreq_ondemand fuse xts gf128mul
> uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev
> v4l2_compat_ioctl32 media arc4 ath9k ath9k_common ath9k_hw nouveau
> ehci_hcd usbcore i915 snd_hda_codec_hdmi snd_hda_codec_realtek joydev
> ath ttm intel_ips mac80211 cfg80211 asus_laptop sparse_keymap rfkill
> drm_kms_helper drm snd_hda_intel snd_hda_codec snd_hwdep mxm_wmi
> psmouse i2c_algo_bit serio_raw i2c_core pcspkr input_polldev mei
> iTCO_wdt usb_common evdev intel_agp iTCO_vendor_support intel_gtt
> atl1c wmi video snd_pcm snd_page_alloc snd_timer snd soundcore thermal
> battery ac button tun kvm_intel kvm aes_x86_64 aes_generic
> acpi_cpufreq mperf processor freq_table ext4 crc16 jbd2 mbcache
> dm_crypt dm_mod sd_mod ahci libahci libata scsi_mod
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
shafi

[-- Attachment #2: 0001-Revert-ath9k_hw-fix-interpretation-of-the-rx-KeyMiss.patch --]
[-- Type: text/x-diff, Size: 1813 bytes --]

From 171ef4d092d47bf63b33b1e4d5eafd4320e6bb1d Mon Sep 17 00:00:00 2001
From: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com>
Date: Tue, 31 Jan 2012 10:36:47 +0530
Subject: [PATCH] Revert "ath9k_hw: fix interpretation of the rx KeyMiss flag"

This reverts commit 7a532fe7131216a02c81a6c1b1f8632da1195a58.
---
 drivers/net/wireless/ath/ath9k/ar9003_mac.c |    5 ++---
 drivers/net/wireless/ath/ath9k/mac.c        |    5 ++---
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/ar9003_mac.c b/drivers/net/wireless/ath/ath9k/ar9003_mac.c
index 09b8c9d..88c81c5 100644
--- a/drivers/net/wireless/ath/ath9k/ar9003_mac.c
+++ b/drivers/net/wireless/ath/ath9k/ar9003_mac.c
@@ -557,11 +557,10 @@ int ath9k_hw_process_rxdesc_edma(struct ath_hw *ah, struct ath_rx_status *rxs,
 			rxs->rs_status |= ATH9K_RXERR_DECRYPT;
 		else if (rxsp->status11 & AR_MichaelErr)
 			rxs->rs_status |= ATH9K_RXERR_MIC;
+		if (rxsp->status11 & AR_KeyMiss)
+			rxs->rs_status |= ATH9K_RXERR_KEYMISS;
 	}
 
-	if (rxsp->status11 & AR_KeyMiss)
-		rxs->rs_status |= ATH9K_RXERR_KEYMISS;
-
 	return 0;
 }
 EXPORT_SYMBOL(ath9k_hw_process_rxdesc_edma);
diff --git a/drivers/net/wireless/ath/ath9k/mac.c b/drivers/net/wireless/ath/ath9k/mac.c
index e196aba..fd3f19c 100644
--- a/drivers/net/wireless/ath/ath9k/mac.c
+++ b/drivers/net/wireless/ath/ath9k/mac.c
@@ -618,11 +618,10 @@ int ath9k_hw_rxprocdesc(struct ath_hw *ah, struct ath_desc *ds,
 			rs->rs_status |= ATH9K_RXERR_DECRYPT;
 		else if (ads.ds_rxstatus8 & AR_MichaelErr)
 			rs->rs_status |= ATH9K_RXERR_MIC;
+		if (ads.ds_rxstatus8 & AR_KeyMiss)
+			rs->rs_status |= ATH9K_RXERR_KEYMISS;
 	}
 
-	if (ads.ds_rxstatus8 & AR_KeyMiss)
-		rs->rs_status |= ATH9K_RXERR_KEYMISS;
-
 	return 0;
 }
 EXPORT_SYMBOL(ath9k_hw_rxprocdesc);
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] 1546+ messages in thread

* Re:
  2011-12-22  9:43   ` Malwina Bartoszynska
@ 2012-01-31 15:53     ` Max
  0 siblings, 0 replies; 1546+ messages in thread
From: Max @ 2012-01-31 15:53 UTC (permalink / raw)
  To: linux-btrfs

Malwina Bartoszynska <m.bartoszynska <at> rootbox.com> writes:

> 
> W dniu 2011-12-21 20:06, Chris Mason pisze:
> > On Wed, Dec 21, 2011 at 01:54:06PM +0000, Malwina Bartoszynska wrote:
> >> Hello,
> >> after unmounting btrfs partition, I can't mount it again.
> >>
> >> root <at> xxx:~# btrfs device scan
> >> Scanning for Btrfs filesystems
> >> root <at> xxx:~# mount /dev/sdb /data/osd.0/
> >> mount: wrong fs type, bad option, bad superblock on /dev/sdb,
> >>         missing codepage or helper program, or other error
> >>         In some cases useful info is found in syslog - try
> >>         dmesg | tail  or so
> >>
> >> root <at> xxxx:~# dmesg|tail
> >> [57192.607912] device fsid ed25c604-3e11-4459-85b5-e4090c4d22d0 devid
> >> 2 transid14429 /dev/sda
> >> [57204.796573] end_request: I/O error, dev fd0, sector 0
> >> [57231.660913] device fsid ed25c604-3e11-4459-85b5-e4090c4d22d0 devid 1
> >>   transid 14429 /dev/sdb
> >> [57231.680387] parent transid verify failed on 424308420608 wanted 6970
> >>   found 8959
> >> [57231.680546] parent transid verify failed on 424308420608 wanted 6970
> >> found 8959
> >> [57231.680705] parent transid verify failed on 424308420608 wanted 6970
> >> found 8959
> >> [57231.680861] parent transid verify failed on 424308420608 wanted 6970
> >> found 8959
> >> [57231.680869] parent transid verify failed on 424308420608 wanted 6970
> >> found 8959
> >> [57231.680875] Failed to read block groups: -5
> >> [57231.704165] btrfs: open_ctree failed
> > Can you tell us more about this filesystem?  Was there an unclean
> > shutdown or did you just unmount, mount again?
> >
> > The confusing thing is that all of your disks seem to have the same copy
> > of the block, so it looks like things were written properly.
> >
> > -chris
> There was no shutdown before this, filesystem was just unmounted(which 
> looked as properly done - no errors). Then tried to mount it again.
> Is there way of fixing it?
> --
> Malwina Bartoszynska
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo <at> vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

I have the same problem. In my case the failure happens while writing
several files in parallel: I ran wget from several sources at once.

'ls /srv/shared/Downloads/xxx/xxx/' blocked,
and dmesg gave:
[112920.940110] INFO: task btrfs-transacti:719 blocked for more than 120 
seconds.
[112920.965833] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[112920.988255] btrfs-transacti D ffffffff81805120     0   719      2 0x00000000
[112920.988266]  ffff880b857e3d10 0000000000000046 ffff880b80b08198 
0000000000000000
[112920.988273]  ffff880b857e3fd8 ffff880b857e3fd8 ffff880b857e3fd8 
0000000000012a40
[112920.988279]  ffffffff81c0b020 ffff880b8c0c0000 ffff880b857e3d10 
ffff880945187a40
[112920.988298] Call Trace:
[112920.988315]  [<ffffffff8160492f>] schedule+0x3f/0x60
[112920.988326]  [<ffffffff81604f75>] schedule_timeout+0x2a5/0x320
[112920.988338]  [<ffffffff810329a9>] ? default_spin_lock_flags+0x9/0x10
[112920.988371]  [<ffffffffa003ba15>] btrfs_commit_transaction+0x245/0x860 
[btrfs]
[112920.988384]  [<ffffffff81081660>] ? add_wait_queue+0x60/0x60
[112920.988414]  [<ffffffffa00347b5>] transaction_kthread+0x275/0x290 [btrfs]
[112920.988437]  [<ffffffffa0034540>] ? btrfs_congested_fn+0xb0/0xb0 [btrfs]
[112920.988448]  [<ffffffff81080bbc>] kthread+0x8c/0xa0
[112920.988458]  [<ffffffff8160fca4>] kernel_thread_helper+0x4/0x10
[112920.988469]  [<ffffffff81080b30>] ? flush_kthread_worker+0xa0/0xa0
[112920.988479]  [<ffffffff8160fca0>] ? gs_change+0x13/0x13

After reboot the disk was not mounted at all.

I tried to fix it.
The original btrfsck didn't work at all:
~$ btrfsck /dev/vdc
Could not open /dev/vdc

After a manual update to btrfs-tools_0.19+20111105-2_amd64.deb
it gave me:

~$ sudo btrfsck /dev/vdc
parent transid verify failed on 20971520 wanted 1347 found 3121
parent transid verify failed on 20971520 wanted 1347 found 3121
parent transid verify failed on 20971520 wanted 1347 found 3121
parent transid verify failed on 20971520 wanted 1347 found 3121
Ignoring transid failure
parent transid verify failed on 29470720 wanted 1357 found 3231
parent transid verify failed on 29470720 wanted 1357 found 3231
parent transid verify failed on 29470720 wanted 1357 found 3231
parent transid verify failed on 29470720 wanted 1357 found 3231
Ignoring transid failure
parent transid verify failed on 29470720 wanted 1357 found 3231
Ignoring transid failure
parent transid verify failed on 29487104 wanted 1357 found 3235
parent transid verify failed on 29487104 wanted 1357 found 3235
parent transid verify failed on 29487104 wanted 1357 found 3235
parent transid verify failed on 29487104 wanted 1357 found 3235
Ignoring transid failure
leaf 29487104 items 1 free space 3454 generation 3235 owner 7
fs uuid c5ce4702-2dbf-4b57-8067-bd6129fc124b
chunk uuid 0ffa84fe-33a3-4b8e-95a4-de5f93e88163
	item 0 key (EXTENT_CSUM EXTENT_CSUM 64343257088) itemoff 3479 itemsize 
516
		extent csum item
failed to find block number 150802432

Is it possible to fix it?
I don't want to download 500 GB of data again.

Regards, 
    Max



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-01-31  5:58 ` Mohammed Shafi
@ 2012-02-01 11:14   ` Mohammed Shafi
  2012-02-01 16:27     ` Re: John W. Linville
  0 siblings, 1 reply; 1546+ messages in thread
From: Mohammed Shafi @ 2012-02-01 11:14 UTC (permalink / raw)
  To: Laurent Bonnans; +Cc: linux-wireless, Felix Fietkau

On Tue, Jan 31, 2012 at 11:28 AM, Mohammed Shafi
<shafi.wireless@gmail.com> wrote:
> On Tue, Jan 31, 2012 at 1:13 AM, Laurent Bonnans <bonnans.l@gmail.com> wrote:
>> Since the update from linux 3.2.1 to 3.2.2, dhcp stopped working on
>> some APs on my laptop with an AR9285 Wireless card.
>>
>> dhcp works fine on an open wifi network but receives no response on a
>> wep network I use. I haven't been able to test it on a third network
>> for now.
>
>  reverting  "ath9k_hw: fix interpretation of the rx KeyMiss flag" does
> helps.  i  need to analyze
> if it exposes some real issue which need to be fixed.
>

this seems to be a problem in WEP alone, where key miss is always
set for this case and RX_FLAG_DECRYPTED is not set. mac80211 tries to
decrypt, but fails due to an ICV mismatch.

-- 
shafi

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-02-01 11:14   ` Re: Mohammed Shafi
@ 2012-02-01 16:27     ` John W. Linville
  2012-02-01 17:04       ` Re: Felix Fietkau
  0 siblings, 1 reply; 1546+ messages in thread
From: John W. Linville @ 2012-02-01 16:27 UTC (permalink / raw)
  To: Mohammed Shafi; +Cc: Laurent Bonnans, linux-wireless, Felix Fietkau

On Wed, Feb 01, 2012 at 04:44:08PM +0530, Mohammed Shafi wrote:
> On Tue, Jan 31, 2012 at 11:28 AM, Mohammed Shafi
> <shafi.wireless@gmail.com> wrote:
> > On Tue, Jan 31, 2012 at 1:13 AM, Laurent Bonnans <bonnans.l@gmail.com> wrote:
> >> Since the update from linux 3.2.1 to 3.2.2, dhcp stopped working on
> >> some APs on my laptop with an AR9285 Wireless card.
> >>
> >> dhcp works fine on an open wifi network but receives no response on a
> >> wep network I use. I haven't been able to test it on a third network
> >> for now.
> >
> >  reverting  "ath9k_hw: fix interpretation of the rx KeyMiss flag" does
> > helps.  i  need to analyze
> > if it exposes some real issue which need to be fixed.
> >
> 
> this seems to be a problem in WEP alone, where the key miss is always
> set for this case and RX_FLAG_DECRYPTED is not set. mac80211 trys to
> decrypt,  but fails due to ICV mismatch.

OK...any way to differentiate this case at that point in the code?

John
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-02-01 16:27     ` Re: John W. Linville
@ 2012-02-01 17:04       ` Felix Fietkau
  2012-02-02  5:37         ` Re: Mohammed Shafi
  0 siblings, 1 reply; 1546+ messages in thread
From: Felix Fietkau @ 2012-02-01 17:04 UTC (permalink / raw)
  To: John W. Linville; +Cc: Mohammed Shafi, Laurent Bonnans, linux-wireless

On 2012-02-01 5:27 PM, John W. Linville wrote:
> On Wed, Feb 01, 2012 at 04:44:08PM +0530, Mohammed Shafi wrote:
>> On Tue, Jan 31, 2012 at 11:28 AM, Mohammed Shafi
>> <shafi.wireless@gmail.com> wrote:
>> > On Tue, Jan 31, 2012 at 1:13 AM, Laurent Bonnans <bonnans.l@gmail.com> wrote:
>> >> Since the update from linux 3.2.1 to 3.2.2, dhcp stopped working on
>> >> some APs on my laptop with an AR9285 Wireless card.
>> >>
>> >> dhcp works fine on an open wifi network but receives no response on a
>> >> wep network I use. I haven't been able to test it on a third network
>> >> for now.
>> >
>> >  reverting  "ath9k_hw: fix interpretation of the rx KeyMiss flag" does
>> > helps.  i  need to analyze
>> > if it exposes some real issue which need to be fixed.
>> >
>> 
>> this seems to be a problem in WEP alone, where the key miss is always
>> set for this case and RX_FLAG_DECRYPTED is not set. mac80211 trys to
>> decrypt,  but fails due to ICV mismatch.
> 
> OK...any way to differentiate this case at that point in the code?
> 
> John
Please try this patch:

---
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -823,6 +823,15 @@ static bool ath9k_rx_accept(struct ath_c
 		(ATH9K_RXERR_DECRYPT | ATH9K_RXERR_CRC | ATH9K_RXERR_MIC |
 		 ATH9K_RXERR_KEYMISS));
 
+	/*
+	 * First 4 slots are reserved for WEP, and for packets using them,
+	 * ATH9K_RXERR_KEYMISS can be reported even though decryption was
+	 * successful, since no MAC address based key cache lookup was
+	 * performed.
+	 */
+	if (rx_stats->rs_keyix < 4)
+		rx_stats->rs_status &= ~ATH9K_RXERR_KEYMISS;
+
 	if (!rx_stats->rs_datalen)
 		return false;
         /*

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-02-01 17:04       ` Re: Felix Fietkau
@ 2012-02-02  5:37         ` Mohammed Shafi
  2012-02-02 12:28           ` Re: Felix Fietkau
  0 siblings, 1 reply; 1546+ messages in thread
From: Mohammed Shafi @ 2012-02-02  5:37 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: John W. Linville, Laurent Bonnans, linux-wireless

Hi Felix,

On Wed, Feb 1, 2012 at 10:34 PM, Felix Fietkau <nbd@openwrt.org> wrote:
> On 2012-02-01 5:27 PM, John W. Linville wrote:
>> On Wed, Feb 01, 2012 at 04:44:08PM +0530, Mohammed Shafi wrote:
>>> On Tue, Jan 31, 2012 at 11:28 AM, Mohammed Shafi
>>> <shafi.wireless@gmail.com> wrote:
>>> > On Tue, Jan 31, 2012 at 1:13 AM, Laurent Bonnans <bonnans.l@gmail.com> wrote:
>>> >> Since the update from linux 3.2.1 to 3.2.2, dhcp stopped working on
>>> >> some APs on my laptop with an AR9285 Wireless card.
>>> >>
>>> >> dhcp works fine on an open wifi network but receives no response on a
>>> >> wep network I use. I haven't been able to test it on a third network
>>> >> for now.
>>> >
>>> > Reverting "ath9k_hw: fix interpretation of the rx KeyMiss flag" does
>>> > help. I need to analyze
>>> > if it exposes some real issue which needs to be fixed.
>>> >
>>>
>>> This seems to be a problem in WEP alone, where the key miss is always
>>> set for this case and RX_FLAG_DECRYPTED is not set. mac80211 tries to
>>> decrypt, but fails due to ICV mismatch.
>>
>> OK...any way to differentiate this case at that point in the code?
>>
>> John
> Please try this patch:
>
> ---
> --- a/drivers/net/wireless/ath/ath9k/recv.c
> +++ b/drivers/net/wireless/ath/ath9k/recv.c
> @@ -823,6 +823,15 @@ static bool ath9k_rx_accept(struct ath_c
>                (ATH9K_RXERR_DECRYPT | ATH9K_RXERR_CRC | ATH9K_RXERR_MIC |
>                 ATH9K_RXERR_KEYMISS));
>
> +       /*
> +        * First 4 slots are reserved for WEP, and for packets using them,
> +        * ATH9K_RXERR_KEYMISS can be reported even though decryption was
> +        * successful, since no MAC address based key cache lookup was
> +        * performed.
> +        */
> +       if (rx_stats->rs_keyix < 4)
> +               rx_stats->rs_status &= ~ATH9K_RXERR_KEYMISS;
> +
>        if (!rx_stats->rs_datalen)
>                return false;
>         /*


Unfortunately, as rs_keyix is always 'INVALID' (as obtained from
the descriptor), this check does not seem to help.

-- 
shafi

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-02-02  5:37         ` Re: Mohammed Shafi
@ 2012-02-02 12:28           ` Felix Fietkau
  2012-02-03 10:12             ` Re: Mohammed Shafi
  2012-02-03 14:44             ` Re: Laurent Bonnans
  0 siblings, 2 replies; 1546+ messages in thread
From: Felix Fietkau @ 2012-02-02 12:28 UTC (permalink / raw)
  To: Mohammed Shafi; +Cc: John W. Linville, Laurent Bonnans, linux-wireless

On 2012-02-02 6:37 AM, Mohammed Shafi wrote:
> Hi Felix,
> 
> On Wed, Feb 1, 2012 at 10:34 PM, Felix Fietkau <nbd@openwrt.org> wrote:
>> On 2012-02-01 5:27 PM, John W. Linville wrote:
>>> On Wed, Feb 01, 2012 at 04:44:08PM +0530, Mohammed Shafi wrote:
>>>> On Tue, Jan 31, 2012 at 11:28 AM, Mohammed Shafi
>>>> <shafi.wireless@gmail.com> wrote:
>>>> > On Tue, Jan 31, 2012 at 1:13 AM, Laurent Bonnans <bonnans.l@gmail.com> wrote:
>>>> >> Since the update from linux 3.2.1 to 3.2.2, dhcp stopped working on
>>>> >> some APs on my laptop with an AR9285 Wireless card.
>>>> >>
>>>> >> dhcp works fine on an open wifi network but receives no response on a
>>>> >> wep network I use. I haven't been able to test it on a third network
>>>> >> for now.
>>>> >
>>>> > Reverting "ath9k_hw: fix interpretation of the rx KeyMiss flag" does
>>>> > help. I need to analyze
>>>> > if it exposes some real issue which needs to be fixed.
>>>> >
>>>>
>>>> This seems to be a problem in WEP alone, where the key miss is always
>>>> set for this case and RX_FLAG_DECRYPTED is not set. mac80211 tries to
>>>> decrypt, but fails due to ICV mismatch.
>>>
>>> OK...any way to differentiate this case at that point in the code?
>>>
>>> John
>> Please try this patch:
>>
>> ---
>> --- a/drivers/net/wireless/ath/ath9k/recv.c
>> +++ b/drivers/net/wireless/ath/ath9k/recv.c
>> @@ -823,6 +823,15 @@ static bool ath9k_rx_accept(struct ath_c
>>                (ATH9K_RXERR_DECRYPT | ATH9K_RXERR_CRC | ATH9K_RXERR_MIC |
>>                 ATH9K_RXERR_KEYMISS));
>>
>> +       /*
>> +        * First 4 slots are reserved for WEP, and for packets using them,
>> +        * ATH9K_RXERR_KEYMISS can be reported even though decryption was
>> +        * successful, since no MAC address based key cache lookup was
>> +        * performed.
>> +        */
>> +       if (rx_stats->rs_keyix < 4)
>> +               rx_stats->rs_status &= ~ATH9K_RXERR_KEYMISS;
>> +
>>        if (!rx_stats->rs_datalen)
>>                return false;
>>         /*
> 
> 
> Unfortunately, as rs_keyix is always 'INVALID' (as obtained from
> the descriptor), this check does not seem to help.
You're right. I read up on what the other codebases do here, and I have
a better patch here:

--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -823,6 +823,14 @@ static bool ath9k_rx_accept(struct ath_c
 		(ATH9K_RXERR_DECRYPT | ATH9K_RXERR_CRC | ATH9K_RXERR_MIC |
 		 ATH9K_RXERR_KEYMISS));
 
+	/*
+	 * Key miss events are only relevant for pairwise keys where the
+	 * descriptor does contain a valid key index. This has been observed
+	 * mostly with CCMP encryption.
+	 */
+	if (rx_stats->rs_keyix == ATH9K_RXKEYIX_INVALID)
+		rx_stats->rs_status &= ~ATH9K_RXERR_KEYMISS;
+
 	if (!rx_stats->rs_datalen)
 		return false;
         /*
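
For anyone reading along without the driver source, the check above boils down to: clear the key-miss flag whenever the descriptor reports no valid key index, since no key cache lookup took place in that case. A minimal standalone sketch of that logic (the flag value, the invalid-index value, and the struct layout are simplified assumptions here, not the real ath9k definitions):

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the driver definitions; the real values
 * and struct layout live in ath9k and are assumptions here. */
#define ATH9K_RXERR_KEYMISS   0x20u
#define ATH9K_RXKEYIX_INVALID 0xffu

struct rx_stats {
	uint8_t  rs_keyix;   /* key cache index from the rx descriptor */
	uint32_t rs_status;  /* error flags reported by the hardware */
};

/* A key miss is only meaningful when the descriptor carries a valid
 * key index; with an invalid index no key cache lookup happened, so
 * the flag is cleared before the error checks that follow. */
static void filter_keymiss(struct rx_stats *rs)
{
	if (rs->rs_keyix == ATH9K_RXKEYIX_INVALID)
		rs->rs_status &= ~ATH9K_RXERR_KEYMISS;
}
```

With this ordering, the WEP case (invalid key index, spurious key miss) no longer trips the error path, while pairwise keys with a valid index keep their key-miss reporting.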


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-02-02 12:28           ` Re: Felix Fietkau
@ 2012-02-03 10:12             ` Mohammed Shafi
  2012-02-03 14:44             ` Re: Laurent Bonnans
  1 sibling, 0 replies; 1546+ messages in thread
From: Mohammed Shafi @ 2012-02-03 10:12 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: John W. Linville, Laurent Bonnans, linux-wireless

On Thu, Feb 2, 2012 at 5:58 PM, Felix Fietkau <nbd@openwrt.org> wrote:
> On 2012-02-02 6:37 AM, Mohammed Shafi wrote:
>> Hi Felix,
>>
>> On Wed, Feb 1, 2012 at 10:34 PM, Felix Fietkau <nbd@openwrt.org> wrote:
>>> On 2012-02-01 5:27 PM, John W. Linville wrote:
>>>> On Wed, Feb 01, 2012 at 04:44:08PM +0530, Mohammed Shafi wrote:
>>>>> On Tue, Jan 31, 2012 at 11:28 AM, Mohammed Shafi
>>>>> <shafi.wireless@gmail.com> wrote:
>>>>> > On Tue, Jan 31, 2012 at 1:13 AM, Laurent Bonnans <bonnans.l@gmail.com> wrote:
>>>>> >> Since the update from linux 3.2.1 to 3.2.2, dhcp stopped working on
>>>>> >> some APs on my laptop with an AR9285 Wireless card.
>>>>> >>
>>>>> >> dhcp works fine on an open wifi network but receives no response on a
>>>>> >> wep network I use. I haven't been able to test it on a third network
>>>>> >> for now.
>>>>> >
>>>>> > Reverting "ath9k_hw: fix interpretation of the rx KeyMiss flag" does
>>>>> > help. I need to analyze
>>>>> > if it exposes some real issue which needs to be fixed.
>>>>> >
>>>>>
>>>>> This seems to be a problem in WEP alone, where the key miss is always
>>>>> set for this case and RX_FLAG_DECRYPTED is not set. mac80211 tries to
>>>>> decrypt, but fails due to ICV mismatch.
>>>>
>>>> OK...any way to differentiate this case at that point in the code?
>>>>
>>>> John
>>> Please try this patch:
>>>
>>> ---
>>> --- a/drivers/net/wireless/ath/ath9k/recv.c
>>> +++ b/drivers/net/wireless/ath/ath9k/recv.c
>>> @@ -823,6 +823,15 @@ static bool ath9k_rx_accept(struct ath_c
>>>                (ATH9K_RXERR_DECRYPT | ATH9K_RXERR_CRC | ATH9K_RXERR_MIC |
>>>                 ATH9K_RXERR_KEYMISS));
>>>
>>> +       /*
>>> +        * First 4 slots are reserved for WEP, and for packets using them,
>>> +        * ATH9K_RXERR_KEYMISS can be reported even though decryption was
>>> +        * successful, since no MAC address based key cache lookup was
>>> +        * performed.
>>> +        */
>>> +       if (rx_stats->rs_keyix < 4)
>>> +               rx_stats->rs_status &= ~ATH9K_RXERR_KEYMISS;
>>> +
>>>        if (!rx_stats->rs_datalen)
>>>                return false;
>>>         /*
>>
>>
>> Unfortunately, as rs_keyix is always 'INVALID' (as obtained from
>> the descriptor), this check does not seem to help.
> You're right. I read up on what the other codebases do here, and I have
> a better patch here:
>
> --- a/drivers/net/wireless/ath/ath9k/recv.c
> +++ b/drivers/net/wireless/ath/ath9k/recv.c
> @@ -823,6 +823,14 @@ static bool ath9k_rx_accept(struct ath_c
>                (ATH9K_RXERR_DECRYPT | ATH9K_RXERR_CRC | ATH9K_RXERR_MIC |
>                 ATH9K_RXERR_KEYMISS));
>
> +       /*
> +        * Key miss events are only relevant for pairwise keys where the
> +        * descriptor does contain a valid key index. This has been observed
> +        * mostly with CCMP encryption.
> +        */
> +       if (rx_stats->rs_keyix == ATH9K_RXKEYIX_INVALID)
> +               rx_stats->rs_status &= ~ATH9K_RXERR_KEYMISS;
> +
>        if (!rx_stats->rs_datalen)
>                return false;
>         /*
>

This works for me (WEP key configured).

-- 
shafi

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-02-02 12:28           ` Re: Felix Fietkau
  2012-02-03 10:12             ` Re: Mohammed Shafi
@ 2012-02-03 14:44             ` Laurent Bonnans
  1 sibling, 0 replies; 1546+ messages in thread
From: Laurent Bonnans @ 2012-02-03 14:44 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: Mohammed Shafi, John W. Linville, linux-wireless

It works for me too.

On Thu, Feb 2, 2012 at 1:28 PM, Felix Fietkau <nbd@openwrt.org> wrote:
> On 2012-02-02 6:37 AM, Mohammed Shafi wrote:
>> Hi Felix,
>>
>> On Wed, Feb 1, 2012 at 10:34 PM, Felix Fietkau <nbd@openwrt.org> wrote:
>>> On 2012-02-01 5:27 PM, John W. Linville wrote:
>>>> On Wed, Feb 01, 2012 at 04:44:08PM +0530, Mohammed Shafi wrote:
>>>>> On Tue, Jan 31, 2012 at 11:28 AM, Mohammed Shafi
>>>>> <shafi.wireless@gmail.com> wrote:
>>>>> > On Tue, Jan 31, 2012 at 1:13 AM, Laurent Bonnans <bonnans.l@gmail.com> wrote:
>>>>> >> Since the update from linux 3.2.1 to 3.2.2, dhcp stopped working on
>>>>> >> some APs on my laptop with an AR9285 Wireless card.
>>>>> >>
>>>>> >> dhcp works fine on an open wifi network but receives no response on a
>>>>> >> wep network I use. I haven't been able to test it on a third network
>>>>> >> for now.
>>>>> >
>>>>> > Reverting "ath9k_hw: fix interpretation of the rx KeyMiss flag" does
>>>>> > help. I need to analyze
>>>>> > if it exposes some real issue which needs to be fixed.
>>>>> >
>>>>>
>>>>> This seems to be a problem in WEP alone, where the key miss is always
>>>>> set for this case and RX_FLAG_DECRYPTED is not set. mac80211 tries to
>>>>> decrypt, but fails due to ICV mismatch.
>>>>
>>>> OK...any way to differentiate this case at that point in the code?
>>>>
>>>> John
>>> Please try this patch:
>>>
>>> ---
>>> --- a/drivers/net/wireless/ath/ath9k/recv.c
>>> +++ b/drivers/net/wireless/ath/ath9k/recv.c
>>> @@ -823,6 +823,15 @@ static bool ath9k_rx_accept(struct ath_c
>>>                (ATH9K_RXERR_DECRYPT | ATH9K_RXERR_CRC | ATH9K_RXERR_MIC |
>>>                 ATH9K_RXERR_KEYMISS));
>>>
>>> +       /*
>>> +        * First 4 slots are reserved for WEP, and for packets using them,
>>> +        * ATH9K_RXERR_KEYMISS can be reported even though decryption was
>>> +        * successful, since no MAC address based key cache lookup was
>>> +        * performed.
>>> +        */
>>> +       if (rx_stats->rs_keyix < 4)
>>> +               rx_stats->rs_status &= ~ATH9K_RXERR_KEYMISS;
>>> +
>>>        if (!rx_stats->rs_datalen)
>>>                return false;
>>>         /*
>>
>>
>> Unfortunately, as rs_keyix is always 'INVALID' (as obtained from
>> the descriptor), this check does not seem to help.
> You're right. I read up on what the other codebases do here, and I have
> a better patch here:
>
> --- a/drivers/net/wireless/ath/ath9k/recv.c
> +++ b/drivers/net/wireless/ath/ath9k/recv.c
> @@ -823,6 +823,14 @@ static bool ath9k_rx_accept(struct ath_c
>                (ATH9K_RXERR_DECRYPT | ATH9K_RXERR_CRC | ATH9K_RXERR_MIC |
>                 ATH9K_RXERR_KEYMISS));
>
> +       /*
> +        * Key miss events are only relevant for pairwise keys where the
> +        * descriptor does contain a valid key index. This has been observed
> +        * mostly with CCMP encryption.
> +        */
> +       if (rx_stats->rs_keyix == ATH9K_RXKEYIX_INVALID)
> +               rx_stats->rs_status &= ~ATH9K_RXERR_KEYMISS;
> +
>        if (!rx_stats->rs_datalen)
>                return false;
>         /*
>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-02-22  6:50 Vlatka Petričec
@ 2012-02-22 15:28 ` Larry Finger
  0 siblings, 0 replies; 1546+ messages in thread
From: Larry Finger @ 2012-02-22 15:28 UTC (permalink / raw)
  To: Vlatka Petričec; +Cc: linux-wireless

On 02/22/2012 12:50 AM, Vlatka Petričec wrote:
> Hi,
> I have Linux Evolution and I had a normal wireless connection until
> something changed, maybe in the network settings, and it just stopped
> working. For example, it says that my network is active, but I cannot
> open any web address; or it shows as active, but every function on the
> computer says it is not active. Thank you in advance for your time.

Vlatka,

I have some suggestions for you.

First, never submit an E-mail to anyone, and especially to a mailing list 
without a subject that is a good description of your problem. Most experts will 
have their mail filters set up to direct a subject-less message directly to the 
spam bucket. I think I need to do that too.

Second, you give no information that would let anyone help you. "It just stopped 
working" does no good. Knowing what changed is critical. If the kernel changed 
when it stopped, then this list might be the right place to ask. If something 
was updated by the distro, then get your help there, as very few of us would know 
how Evolution works. If nothing changed, then your settings just got corrupted, 
and you definitely need to contact Evolution support.

If you think that it is a kernel or driver problem, then you absolutely must 
state what hardware you have, what driver it uses, and the PCI or USB ID that 
describes it. You also need to state what kernel works, and what version fails.

Larry


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-02-23 15:39 Pierre Frenkiel
@ 2012-02-23 16:34 ` Brad Midgley
  0 siblings, 0 replies; 1546+ messages in thread
From: Brad Midgley @ 2012-02-23 16:34 UTC (permalink / raw)
  To: Pierre Frenkiel; +Cc: linux-bluetooth

Jack,

The safari "reader" feature happens inside the web browser and is not
part of the site you are looking at. So it would work on generic web
pages through an extension to the browser or something that people
load somehow. The newspaper wouldn't coordinate it.

Brad

On Thu, Feb 23, 2012 at 8:39 AM, Pierre Frenkiel
<pierre.frenkiel@gmail.com> wrote:
> following the tutorial at
>   http://bluetooth-alsa.sourceforge.net/build.html
>
> I was able to use my Bluetooth headset with some programs like gxine,
> kaffeine, and mplayer, but I had a problem trying to install plugz.
> ./configure fails with the message:
>   configure: error: Package requirements \
>       (dbus-1 >= 0.36, dbus-glib-1 >= 0.36) were not met
> The installed package is libdbus-glib-1-2, and I suppose that the problem is
> just in naming, but I don't know how to overcome it.
>
> --
> Pierre Frenkiel
> --
> To unsubscribe from this list: send the line "unsubscribe linux-bluetooth"
> in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Brad Midgley

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-05-08  0:54 (unknown), Tim Flavin
@ 2012-05-17 21:10 ` Josh Durgin
  0 siblings, 0 replies; 1546+ messages in thread
From: Josh Durgin @ 2012-05-17 21:10 UTC (permalink / raw)
  To: Tim Flavin; +Cc: ceph-devel

On 05/07/2012 05:54 PM, Tim Flavin wrote:
> The new site is great!  I like the Ceph documentation; however, I found
> a couple of typos.  Is this the best place to address them?  (Some of the
> apparent typos may be due to my not understanding what is going on.)
>
>
>
> http://ceph.com/docs/master/config-cluster/ceph-conf/
>
> The  "Hardware Recommendations" link near the bottom of the page gives
> a 404.  Did you want to point to
> http://ceph.com/docs/master/install/hardware-recommendations/ ?
>
>
> http://ceph.com/docs/master/config-ref/osd-config
>
> For "osd client message size cap", the default value is 500 MB but
> the description lists it as 200 MB.
>
>
> http://ceph.com/docs/master/api/librbdpy/
>
> The line of code "size = 4 * 1024 * 1024  # 4 GiB" appears to be
> missing a * 1024, and the next line
> is "rbd_inst.create('myimage', 4)" when it probably should be
> "rbd_inst.create('myimage', size)". This is repeated several times.

Thanks for the notes - I've fixed these in the master branch.

All the docs are in git under the doc directory - if you find other
problems, feel free to send a patch or a github pull request. You can
even edit it in a browser on github if you like.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2012-05-20 22:20 Mr. Peter Wong
  0 siblings, 0 replies; 1546+ messages in thread
From: Mr. Peter Wong @ 2012-05-20 22:20 UTC (permalink / raw)


Good-Day Friend,

I Mr. Peter Wong, I Need Your Assistance

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-06-06 10:33 Sascha Hauer
@ 2012-06-06 14:39 ` Artem Bityutskiy
  2012-06-07 10:11   ` Re: Sascha Hauer
  0 siblings, 1 reply; 1546+ messages in thread
From: Artem Bityutskiy @ 2012-06-06 14:39 UTC (permalink / raw)
  To: Sascha Hauer; +Cc: Shawn Guo, linux-mtd, linux-arm-kernel

[-- Attachment #1: Type: text/plain, Size: 1679 bytes --]

On Wed, 2012-06-06 at 12:33 +0200, Sascha Hauer wrote:
> The following adds i.MX53 NAND support and, more generally, devicetree-based
> probing for i.MX5 boards. The first three patches should go
> via mtd, the last patch optionally as well if all agree.
> 
> Sascha
> 
> The following changes since commit f8f5701bdaf9134b1f90e5044a82c66324d2073f:
> 
>   Linux 3.5-rc1 (2012-06-02 18:29:26 -0700)
> 
> are available in the git repository at:
> 
>   git://git.pengutronix.de/git/imx/linux-2.6.git imx/nand-mx53
> 
> for you to fetch changes up to d55d1479a3bfaedbb9f0c6c956f4dff6bb6d6d61:
> 
>   ARM i.MX5: Add nand oftree support (2012-06-06 12:20:24 +0200)

Do you want this to go via the MTD tree? Would you be able to collect
acks for the arch/arm bits? Meanwhile, please take a look at these
sparse warnings added by this patch-set and detected by aiaiai:

--------------------------------------------------------------------------------

Successfully built configuration "arm-mxc-imx_defconfig,arm,arm-unknown-linux-gnueabi-", results:

--- before_patching.log
+++ after_patching.log
@@ @@
+drivers/mtd/nand/mxc_nand.c:1289:26: warning: incorrect type in initializer (different modifiers) [sparse]
+drivers/mtd/nand/mxc_nand.c:1289:26:    expected void *data [sparse]
+drivers/mtd/nand/mxc_nand.c:1289:26:    got struct mxc_nand_devtype_data static const [toplevel] *<noident> [sparse]
+drivers/mtd/nand/mxc_nand.c:1289:3: warning: initialization discards 'const' qualifier from pointer target type [enabled by default]

--------------------------------------------------------------------------------


-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-06-06 14:39 ` Artem Bityutskiy
@ 2012-06-07 10:11   ` Sascha Hauer
  2012-06-07 12:45     ` Re: Artem Bityutskiy
  0 siblings, 1 reply; 1546+ messages in thread
From: Sascha Hauer @ 2012-06-07 10:11 UTC (permalink / raw)
  To: Artem Bityutskiy; +Cc: Shawn Guo, linux-mtd, linux-arm-kernel

Hi Artem,

On Wed, Jun 06, 2012 at 05:39:07PM +0300, Artem Bityutskiy wrote:
> On Wed, 2012-06-06 at 12:33 +0200, Sascha Hauer wrote:
> > The following adds i.MX53 NAND support and, more generally, devicetree-based
> > probing for i.MX5 boards. The first three patches should go
> > via mtd, the last patch optionally as well if all agree.
> > 
> > Sascha
> > 
> > The following changes since commit f8f5701bdaf9134b1f90e5044a82c66324d2073f:
> > 
> >   Linux 3.5-rc1 (2012-06-02 18:29:26 -0700)
> > 
> > are available in the git repository at:
> > 
> >   git://git.pengutronix.de/git/imx/linux-2.6.git imx/nand-mx53
> > 
> > for you to fetch changes up to d55d1479a3bfaedbb9f0c6c956f4dff6bb6d6d61:
> > 
> >   ARM i.MX5: Add nand oftree support (2012-06-06 12:20:24 +0200)
> 
> Do you want this to go via the MTD tree? Would you be able to collect
> acks for the arch/arm bits? Meanwhile, please, take a look at these
> sparse warnings added by this patch-set and detected by aiaiai:
> 
> --------------------------------------------------------------------------------
> 
> Successfully built configuration "arm-mxc-imx_defconfig,arm,arm-unknown-linux-gnueabi-", results:
> 
> --- before_patching.log
> +++ after_patching.log
> @@ @@
> +drivers/mtd/nand/mxc_nand.c:1289:26: warning: incorrect type in initializer (different modifiers) [sparse]
> +drivers/mtd/nand/mxc_nand.c:1289:26:    expected void *data [sparse]
> +drivers/mtd/nand/mxc_nand.c:1289:26:    got struct mxc_nand_devtype_data static const [toplevel] *<noident> [sparse]
> +drivers/mtd/nand/mxc_nand.c:1289:3: warning: initialization discards 'const' qualifier from pointer target type [enabled by default]

Fixing these warnings in the nand driver does not seem to be the correct
approach. Initializing mxc_nand_devtype_data as const seems sane; the
problem is that struct of_device_id expects a void * instead of a const
void *. A patch fixing this is outstanding here:

http://permalink.gmane.org/gmane.linux.drivers.devicetree/15069

(this will also fix the other sparse warnings from this driver)

I asked Uwe to resend this.
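
The const-correctness problem being described is easy to reproduce outside the kernel: with a plain `void *data` member, initializing it from a pointer to `static const` data discards the qualifier, which is exactly what sparse flags. With the member declared `const void *` (the fix in the patch referenced above), the initializer is legal. A simplified illustration (the struct and field names are stand-ins, not the real `struct of_device_id`):

```c
#include <assert.h>
#include <stddef.h>

/* After the referenced fix: the match-data pointer is const-qualified,
 * so pointing it at static const per-device data is legal. */
struct match_entry {
	const char *compatible;
	const void *data;
};

/* Hypothetical per-device data, analogous to mxc_nand_devtype_data. */
struct devtype_data {
	int id;
};

static const struct devtype_data imx53_data = { 53 };

static const struct match_entry match_table[] = {
	{ .compatible = "fsl,imx53-nand", .data = &imx53_data },
	{ NULL, NULL }	/* sentinel */
};
```

With `void *data` instead, the `&imx53_data` initializer would discard the `const` qualifier and trigger the warnings quoted above.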

So I only added Shawn's Ack to the arm-i.MX part; you can pull this into
the mtd tree:


The following changes since commit f8f5701bdaf9134b1f90e5044a82c66324d2073f:

  Linux 3.5-rc1 (2012-06-02 18:29:26 -0700)

are available in the git repository at:

  git://git.pengutronix.de/git/imx/linux-2.6.git tags/mtd-imx53-nand-support

for you to fetch changes up to 25d097d575d7c06b76e4e6e2488718976b70c432:

  ARM i.MX5: Add nand oftree support (2012-06-07 11:59:19 +0200)

----------------------------------------------------------------
Nand support for i.MX53 and devicetree snippets for i.MX5 nand

----------------------------------------------------------------
Sascha Hauer (4):
      mtd nand mxc_nand: Use managed resources
      mtd nand mxc_nand: swap iomem resource order
      mtd nand mxc_nand: add i.MX53 support
      ARM i.MX5: Add nand oftree support

 arch/arm/boot/dts/imx51.dtsi                  |    7 ++
 arch/arm/boot/dts/imx53.dtsi                  |    7 ++
 arch/arm/mach-imx/clk-imx51-imx53.c           |    2 +
 arch/arm/plat-mxc/devices/platform-mxc_nand.c |   11 +-
 drivers/mtd/nand/mxc_nand.c                   |  137 ++++++++++++++-----------
 5 files changed, 97 insertions(+), 67 deletions(-)

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: Re:
  2012-06-07 10:11   ` Re: Sascha Hauer
@ 2012-06-07 12:45     ` Artem Bityutskiy
  0 siblings, 0 replies; 1546+ messages in thread
From: Artem Bityutskiy @ 2012-06-07 12:45 UTC (permalink / raw)
  To: Sascha Hauer; +Cc: Shawn Guo, linux-mtd, linux-arm-kernel

[-- Attachment #1: Type: text/plain, Size: 582 bytes --]

On Thu, 2012-06-07 at 12:11 +0200, Sascha Hauer wrote:
> Fixing these warnings in the nand driver does not seem to be the correct
> approach. Initializing mxc_nand_devtype_data as const seems sane, the
> problem is that struct of_device_id expects a void * instead of a const
> void *. A patch fixing this is outstanding here:
> 
> http://permalink.gmane.org/gmane.linux.drivers.devicetree/15069
> 
> (this will also fix the other sparse warnings from this driver)
> 
> I asked Uwe to resend this.

Pushed to l2-mtd.git, thanks!

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-07-01 20:22         ` Chuanyu
@ 2012-07-02  9:35           ` Chuanyu Tsai
  0 siblings, 0 replies; 1546+ messages in thread
From: Chuanyu Tsai @ 2012-07-02  9:35 UTC (permalink / raw)
  To: ceph-devel

Chuanyu <chuanyu <at> cs.nctu.edu.tw> writes:
> Hi Yehuda, Florian,
> 
> I followed the wiki and the steps you discussed,
> constructed my Ceph system with a RADOS gateway,
> and I can use libs3 to upload files via radosgw (thanks a lot!),
> but I got "405 Method Not Allowed" when I use swift:
> 
> $ swift -v -A http://s3.paca.tw:80/auth -U paca:paca1 -K 
> UoJO4nFgdAoX+9nEftElIY+AMmDIkcrUBkycNKPA stat
> Auth GET failed: http://s3.paca.tw:80/auth/tokens 405 Method Not Allowed
> 
> (Because there is no test step on the wiki,
>  I followed Florian's question and guessed that the test command is the one above.)
> 
> my radosgw-admin config:
> $ radosgw-admin user info --uid=paca
> { "user_id": "paca",
>   "rados_uid": 0,
>   "display_name": "chuanyu",
>   "email": "chuanyu <at> cs.nctu.edu.tw",
>   "suspended": 0,
>   "subusers": [
>         { "id": "paca:paca1",
>           "permissions": "full-control"}],
I've corrected the permissions problem, thanks Florian!
>   "keys": [
>         { "user": "paca",
>           "access_key": "DS932H4EI9HK7I1CTDNF",
>           "secret_key": "Rn\/5FqHzRPZFN6f9R\/LuTqvG0AYjbHtrurrGydVk"}],
>   "swift_keys": [
>         { "user": "paca:paca1",
>           "secret_key": "UoJO4nFgdAoX+9nEftElIY+AMmDIkcrUBkycNKPA"}]}
> 
> ceph.conf:
> [client.radosgw.gateway]
>     host = volume
>     keyring = /etc/ceph/keyring/radosgw.gateway.keyring
>     rgw socket path = /var/run/ceph/rgw.sock
>     log file = ""
>     syslog = true
>     debug rgw = 20
> 
> my log:
> http://pastebin.com/rhGhATmv
Hi,

I've noticed that the log shows I'm using the *POST* method when getting the op:
   req 9:0.000277:swift-auth:POST /auth/tokens::getting op 

But the code shows I'll always get a NULL return:

/ceph/src/rgw/rgw_swift_auth.cc:239
239 RGWOp *RGWHandler_SWIFT_Auth::get_op()
240 {
241   RGWOp *op;
242   switch (s->op) {
243    case OP_GET:
244      op = &rgw_swift_auth_get;
245      break;
246    default:
247      return NULL;
248   }


So the 405 error occurs:
/ceph/src/rgw/rgw_main.cc:273
273   req->log(s, "getting op");
274   op = handler->get_op();
275   if (!op) {
276     abort_early(s, -ERR_METHOD_NOT_ALLOWED);
277     goto done;
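
Put together, the two excerpts above implement a simple dispatch: any verb other than GET makes get_op() return NULL, and the main loop turns a NULL op into 405 Method Not Allowed. A minimal model of that flow (the enum values and function names are illustrative stand-ins, not the actual ceph definitions):

```c
#include <assert.h>
#include <stddef.h>

enum http_op { OP_GET, OP_POST, OP_PUT, OP_DELETE };

/* Mirrors RGWHandler_SWIFT_Auth::get_op(): only GET is dispatched;
 * every other verb yields NULL. */
static const char *swift_auth_get_op(enum http_op op)
{
	switch (op) {
	case OP_GET:
		return "rgw_swift_auth_get";
	default:
		return NULL;
	}
}

/* Mirrors the main loop: a NULL op is aborted early with 405. */
static int handle_request(enum http_op op)
{
	return swift_auth_get_op(op) != NULL ? 200 : 405;
}
```

So any client that POSTs to /auth/tokens, as the log above shows this swift client doing, necessarily lands in the 405 path.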

My swift version (Version: 1.4.8-0ubuntu2, Ubuntu 12.04)
$ swift --version
swift 1.0

Is there a version mismatch, or is something else going wrong?
I'll try a curl connection directly later.

Thanks!
Chuanyu Tsai.

> 
> Any advice would be appreciate!
> Thanks,
> Chuanyu



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-07-06 16:57 Pablo Trujillo
@ 2012-07-07  9:08 ` Vladimir 'φ-coder/phcoder' Serbinenko
  0 siblings, 0 replies; 1546+ messages in thread
From: Vladimir 'φ-coder/phcoder' Serbinenko @ 2012-07-07  9:08 UTC (permalink / raw)
  To: grub-devel

[-- Attachment #1: Type: text/plain, Size: 408 bytes --]

On 06.07.2012 18:57, Pablo Trujillo wrote:

> Hi all,
> 
> I've no idea if this is possible with grub; I hope you can help me:
> 
> I need to parse the output from the command lspci and use it to
> decide which of several different kernels to boot.
> 
> Is it possible to get something similar to grep or awk in grub?

Yes, it is. Write a patch.



-- 
Regards
Vladimir 'φ-coder/phcoder' Serbinenko


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 294 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-07-31 23:52 (unknown), Ricardo Neri
@ 2012-07-31 23:58 ` Ricardo Neri
  0 siblings, 0 replies; 1546+ messages in thread
From: Ricardo Neri @ 2012-07-31 23:58 UTC (permalink / raw)
  To: Ricardo Neri; +Cc: tomi.valkeinen, archit, s-guiriec, linux-omap

On 07/31/2012 06:52 PM, Ricardo Neri wrote:
>  From 8b0f9153d078b7182efd604ef8525d50899ce1a3 Mon Sep 17 00:00:00 2001
> From: Ricardo Neri<ricardo.neri@ti.com>
> Date: Mon, 30 Jul 2012 17:54:59 -0500
> Subject: [PATCH v3] OMAPDSS: DISPC: Improvements to DIGIT sync signal selection

There was a small issue while sending the patch; I have resubmitted it
correctly. Sorry for the spam!

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-08-06 16:59 anish kumar
@ 2012-08-06 17:05 ` Maarten Lankhorst
  0 siblings, 0 replies; 1546+ messages in thread
From: Maarten Lankhorst @ 2012-08-06 17:05 UTC (permalink / raw)
  To: anish kumar
  Cc: cw00.choi, myungjoo.ham, jic23, linux-kernel, linux-iio,
	anish kumar

On 06-08-12 18:59, anish kumar wrote:
> From: anish kumar <anish198519851985@gmail.com>
>
-ESUBJECT

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-08-17 14:59   ` David Sterba
@ 2012-08-17 15:30     ` Liu Bo
  0 siblings, 0 replies; 1546+ messages in thread
From: Liu Bo @ 2012-08-17 15:30 UTC (permalink / raw)
  To: David Sterba; +Cc: Lluís Batlle i Rossell, Btrfs mailing list, andrei.popa

On 08/17/2012 10:59 PM, David Sterba wrote:
> On Fri, Aug 17, 2012 at 09:45:20AM +0800, Liu Bo wrote:
>> On 08/15/2012 06:12 PM, Lluís Batlle i Rossell wrote:
>>> some time ago we discussed on #btrfs that the nocow attribute for files wasn't
>>> working (around 3.3 or 3.4 kernels). That was evident by files fragmenting even
>>> with the attribute set.
>>>
>>> Chris mentioned to find a fix quickly for that, and posted some lines of change
>>> into irc. But recently someone mentioned that 3.6-rc looks like still not
>>> respecting nocow for files.
>>>
>>> Is there really a fix upstream for that? Do nocow attribute on files work for
>>> anyone already?
>>>
>>
>> Dave had posted a patch to fix it, but it only enables NOCOW on zero-sized files.
>>
>> FYI, the patch is http://article.gmane.org/gmane.comp.file-systems.btrfs/17351
>>
>> With the patch, you don't need to mount with nodatacow any more :)
>>
>> And here is why it applies only to zero-sized files:
>> http://permalink.gmane.org/gmane.comp.file-systems.btrfs/18046
> 
> the original patch http://permalink.gmane.org/gmane.comp.file-systems.btrfs/18031
> did two things; the reasoning for why it is not allowed to set nodatasum
> in general applies only to the second hunk, but this
> 
> @@ -139,7 +139,7 @@ void btrfs_inherit_iflags(struct inode *inode, struct inode *dir)
>  	}
> 
>  	if (flags & BTRFS_INODE_NODATACOW)
> -		BTRFS_I(inode)->flags |= BTRFS_INODE_NODATACOW;
> +		BTRFS_I(inode)->flags |= BTRFS_INODE_NODATACOW | BTRFS_INODE_NODATASUM;
> 
>  	btrfs_update_iflags(inode);
>  }
> ---
> 
> is sufficient to create nocow files via a directory with NOCOW attribute
> set, and all new files will inherit it (they are automatically
> zero-sized so it's safe). This usecase is similar to setting the
> COMPRESS attribute on a directory and all new files will inherit the
> flag.
> 
> If Andrei wants to resend just this particular hunk, I'm giving it my ACK.
> 

IMO the following is better; it just makes use of the original check.  If you agree,
I'll send it as a patch :)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 6e8f416..d4e58df 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4721,8 +4721,10 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 		if (btrfs_test_opt(root, NODATASUM))
 			BTRFS_I(inode)->flags |= BTRFS_INODE_NODATASUM;
 		if (btrfs_test_opt(root, NODATACOW) ||
-		    (BTRFS_I(dir)->flags & BTRFS_INODE_NODATACOW))
+		    (BTRFS_I(dir)->flags & BTRFS_INODE_NODATACOW)) {
 			BTRFS_I(inode)->flags |= BTRFS_INODE_NODATACOW;
+			BTRFS_I(inode)->flags |= BTRFS_INODE_NODATASUM;
+		}
 	}
 
 	insert_inode_hash(inode);
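The effect of either hunk can be sketched as plain flag arithmetic. This is an illustration only; the bit values and the helper name are assumed for the example, not btrfs's real definitions:

```python
# Illustrative sketch of the NODATACOW/NODATASUM inheritance discussed
# above; bit values are assumed for the example, not btrfs's real ones.
NODATASUM = 1 << 0
NODATACOW = 1 << 1

def new_inode_flags(dir_flags, mount_nodatacow=False, mount_nodatasum=False):
    """Flags a newly created file would get, btrfs_new_inode()-style."""
    flags = 0
    if mount_nodatasum:
        flags |= NODATASUM
    if mount_nodatacow or (dir_flags & NODATACOW):
        # New files are zero-sized, so dropping checksums here is safe.
        flags |= NODATACOW | NODATASUM
    return flags
```

Either way, a file created in a NOCOW directory ends up with both flags set, which is the behavior both versions of the patch aim for.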


> 
> david
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


^ permalink raw reply related	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <s5hmx1526mg.wl%tiwai@suse.de>
@ 2012-09-06  6:02 ` Markus Trippelsdorf
  2012-09-06  6:33   ` (no subject) Daniel Mack
  0 siblings, 1 reply; 1546+ messages in thread
From: Markus Trippelsdorf @ 2012-09-06  6:02 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: Linus Torvalds, linux-kernel, Daniel Mack, alsa-devel

On 2012.09.04 at 16:40 +0200, Takashi Iwai wrote:
> ----------------------------------------------------------------
> Sound fixes for 3.6-rc5
> 
> There is nothing scary here; it contains only small fixes for HD-audio and
> USB-audio:
> - EPSS regression fix and GPIO fix for HD-audio IDT codecs
> - A series of USB-audio regression fixes that are found since 3.5 kernel
> 
> ----------------------------------------------------------------
> Daniel Mack (4):
>       ALSA: snd-usb: Fix URB cancellation at stream start
>       ALSA: snd-usb: restore delay information
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
The commit fbcfbf5f above causes the following lines to be printed
whenever I start a new song:

delay: estimated 0, actual 352
delay: estimated 353, actual 705

(44.1 * 8 = 352.8)

This happens with a USB DAC that identifies itself as "C-Media USB
Headphone Set".
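The parenthetical arithmetic checks out: at a 44.1 kHz sample rate, 8 ms of audio is about 353 frames, matching the logged deltas (interpreting the 8 as milliseconds is my assumption):

```python
# 44.1 kHz * 8 ms = 352.8 frames, matching the logged actual/estimated
# deltas of 352 and 353.
sample_rate_khz = 44.1
interval_ms = 8  # assumed to be milliseconds of buffered audio
frames = sample_rate_khz * interval_ms
print(frames)  # 352.8
```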

-- 
Markus

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-09-06  6:33   ` (no subject) Daniel Mack
@ 2012-09-06  6:45     ` Markus Trippelsdorf
  2012-09-06  6:48     ` (no subject) Takashi Iwai
  1 sibling, 0 replies; 1546+ messages in thread
From: Markus Trippelsdorf @ 2012-09-06  6:45 UTC (permalink / raw)
  To: Daniel Mack
  Cc: Takashi Iwai, Linus Torvalds, linux-kernel, alsa-devel,
	Pierre-Louis Bossart

On 2012.09.06 at 08:33 +0200, Daniel Mack wrote:
> On 06.09.2012 08:02, Markus Trippelsdorf wrote:
> > On 2012.09.04 at 16:40 +0200, Takashi Iwai wrote:
> >> ----------------------------------------------------------------
> >> Sound fixes for 3.6-rc5
> >>
> >> There is nothing scary here; it contains only small fixes for HD-audio and
> >> USB-audio:
> >> - EPSS regression fix and GPIO fix for HD-audio IDT codecs
> >> - A series of USB-audio regression fixes that are found since 3.5 kernel
> >>
> >> ----------------------------------------------------------------
> >> Daniel Mack (4):
> >>       ALSA: snd-usb: Fix URB cancellation at stream start
> >>       ALSA: snd-usb: restore delay information
> >         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
> > The commit fbcfbf5f above causes the following lines to be printed
> > whenever I start a new song:
> 
> Copied Pierre-Louis Bossart - he wrote the code in 294c4fb8 which this
> patch (fbcfbf5f) brings back now.
> 
> > delay: estimated 0, actual 352
> > delay: estimated 353, actual 705
> > 
> > (44.1 * 8 = 352.8)
> > 
> > This happens with an USB-DAC that identifies itself as "C-Media USB
> > Headphone Set".
> 
> And you didn't see these lines with 3.4?

No.

-- 
Markus

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-09-06  6:48     ` (no subject) Takashi Iwai
@ 2012-09-06  6:53       ` Markus Trippelsdorf
  0 siblings, 0 replies; 1546+ messages in thread
From: Markus Trippelsdorf @ 2012-09-06  6:53 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Daniel Mack, Linus Torvalds, linux-kernel, alsa-devel,
	Pierre-Louis Bossart

On 2012.09.06 at 08:48 +0200, Takashi Iwai wrote:
> At Thu, 06 Sep 2012 08:33:30 +0200,
> Daniel Mack wrote:
> > 
> > On 06.09.2012 08:02, Markus Trippelsdorf wrote:
> > > On 2012.09.04 at 16:40 +0200, Takashi Iwai wrote:
> > >> ----------------------------------------------------------------
> > >> Sound fixes for 3.6-rc5
> > >>
> > >> There is nothing scary here; it contains only small fixes for HD-audio and
> > >> USB-audio:
> > >> - EPSS regression fix and GPIO fix for HD-audio IDT codecs
> > >> - A series of USB-audio regression fixes that are found since 3.5 kernel
> > >>
> > >> ----------------------------------------------------------------
> > >> Daniel Mack (4):
> > >>       ALSA: snd-usb: Fix URB cancellation at stream start
> > >>       ALSA: snd-usb: restore delay information
> > >         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
> > > The commit fbcfbf5f above causes the following lines to be printed
> > > whenever I start a new song:
> > 
> > Copied Pierre-Louis Bossart - he wrote the code in 294c4fb8 which this
> > patch (fbcfbf5f) brings back now.
> > 
> > > delay: estimated 0, actual 352
> > > delay: estimated 353, actual 705
> > > 
> > > (44.1 * 8 = 352.8)
> > > 
> > > This happens with an USB-DAC that identifies itself as "C-Media USB
> > > Headphone Set".
> > 
> > And you didn't see these lines with 3.4?
> 
> Maybe it's a difference in the start condition?
> 
> Markus, does the patch below fix anything?

Unfortunately no.
However reverting the following fixes the problem:

commit 245baf983cc39524cce39c24d01b276e6e653c9e
Author: Daniel Mack <zonque@gmail.com>
Date:   Thu Aug 30 18:52:30 2012 +0200

    ALSA: snd-usb: fix calls to next_packet_size

-- 
Markus

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-10-03 16:02 James M Leddy
@ 2012-10-03 17:53 ` Luis R. Rodriguez
  2012-10-03 18:15   ` Re: James M Leddy
  0 siblings, 1 reply; 1546+ messages in thread
From: Luis R. Rodriguez @ 2012-10-03 17:53 UTC (permalink / raw)
  To: James M Leddy; +Cc: backports

On Wed, Oct 3, 2012 at 9:02 AM, James M Leddy <james.leddy@canonical.com> wrote:
> subscribe backports

return -EOPNOTSUPP;

You want to poke majordomo@vger.kernel.org, not the actual mailing list.

  Luis

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-10-03 17:53 ` Luis R. Rodriguez
@ 2012-10-03 18:15   ` James M Leddy
  0 siblings, 0 replies; 1546+ messages in thread
From: James M Leddy @ 2012-10-03 18:15 UTC (permalink / raw)
  To: Luis R. Rodriguez; +Cc: backports

On 10/03/2012 01:53 PM, Luis R. Rodriguez wrote:
> On Wed, Oct 3, 2012 at 9:02 AM, James M Leddy <james.leddy@canonical.com> wrote:
>> subscribe backports
> 
> return -EOPNOTSUPP;
> 
> You want to poke majordomo@vger.kernel.org, not the actual mailing list.

That's embarrassing. Would you consider taking out the first sentence
after "subscribing to backports" on the wiki? It seems redundant with
the information immediately above it.

https://backports.wiki.kernel.org/index.php/Mailing_list

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-10-06 23:15 (unknown), David Howells
@ 2012-10-07  6:36 ` Geert Uytterhoeven
  2012-10-11  9:57   ` Re: Will Deacon
  0 siblings, 1 reply; 1546+ messages in thread
From: Geert Uytterhoeven @ 2012-10-07  6:36 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, arnd, hpa, catalin.marinas, linux-arch, linux-kernel,
	ralf, ddaney.cavm, Paul Mundt

On Sun, Oct 7, 2012 at 1:15 AM, David Howells <dhowells@redhat.com> wrote:
>  (3) m68k turned out to have a header installation problem due to it lacking a
>      kvm_para.h file.

Sh also.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-10-07  6:36 ` Geert Uytterhoeven
@ 2012-10-11  9:57   ` Will Deacon
  0 siblings, 0 replies; 1546+ messages in thread
From: Will Deacon @ 2012-10-11  9:57 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: David Howells, torvalds@osdl.org, arnd@arndb.de, hpa@zytor.com,
	Catalin Marinas, linux-arch@vger.kernel.org,
	linux-kernel@vger.kernel.org, ralf@linux-mips.org,
	ddaney.cavm@gmail.com, Paul Mundt

On Sun, Oct 07, 2012 at 07:36:20AM +0100, Geert Uytterhoeven wrote:
> On Sun, Oct 7, 2012 at 1:15 AM, David Howells <dhowells@redhat.com> wrote:
> >  (3) m68k turned out to have a header installation problem due to it lacking a
> >      kvm_para.h file.
> 
> Sh also.

and arm64 iirc. It should also affect arm, but we have a horrible dummy
header to get around it (just includes the asm-generic variant).

I posted a fix, but then it got derailed by the wildcarding used to generate
generic headers for kvm (which I was going some way to removing):

  https://lkml.org/lkml/2012/8/2/173

  http://marc.info/?l=linux-kernel&m=134393963216492&w=2

Will

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-10-23  4:12 (unknown), jie sun
@ 2012-10-23 11:50 ` Wido den Hollander
  2012-10-24  5:48   ` Re: jie sun
  0 siblings, 1 reply; 1546+ messages in thread
From: Wido den Hollander @ 2012-10-23 11:50 UTC (permalink / raw)
  To: jie sun; +Cc: ceph-devel

On 10/23/2012 06:12 AM, jie sun wrote:
> Hi,
>
> I created and mounted an RBD for a virtual machine. It can be used
> as a block device normally, but it often prints log messages like those below:

Could you provide us a bit more information?

What kernel are you using?

What does "ceph -s" show you?

Are you running KVM virtual machines and connecting /dev/rbd0 as a 
device to the virtual machine?

Wido

> "Oct 23 10:30:22 ubuntu12 kernel: [321506.941606] libceph: osd3
> 10.100.211.146:6810 socket closed
> Oct 23 10:30:59 ubuntu12 kernel: [321544.337856] libceph: osd9
> 10.100.211.68:6809 socket closed
> Oct 23 10:45:22 ubuntu12 kernel: [322407.233090] libceph: osd3
> 10.100.211.146:6810 socket closed
> Oct 23 10:45:59 ubuntu12 kernel: [322444.766796] libceph: osd9
> 10.100.211.68:6809 socket closed
> Oct 23 11:00:22 ubuntu12 kernel: [323307.529098] libceph: osd3
> 10.100.211.146:6810 socket closed
> Oct 23 11:01:00 ubuntu12 kernel: [323345.241679] libceph: osd9
> 10.100.211.68:6809 socket closed
> Oct 23 11:15:22 ubuntu12 kernel: [324207.821113] libceph: osd3
> 10.100.211.146:6810 socket closed
> Oct 23 11:16:00 ubuntu12 kernel: [324245.717747] libceph: osd9
> 10.100.211.68:6809 socket closed
> Oct 23 11:17:01 ubuntu12 CRON[10529]: (root) CMD (   cd / && run-parts
> --report /etc/cron.hourly)
> Oct 23 11:30:23 ubuntu12 kernel: [325108.117134] libceph: osd3
> 10.100.211.146:6810 socket closed"
>
> These log also can be found in "/var/log/syslog".
> I googled this problem, but didn't understand exactly what
> you wrote in "http://tracker.newdream.net/issues/2260".
> How can I resolve this problem? My ceph version is 0.48.
> Should I change some files, or modify the content of some file?
>
> Thank you !
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-10-23 11:50 ` Wido den Hollander
@ 2012-10-24  5:48   ` jie sun
  2012-10-24  5:58     ` Re: Gregory Farnum
  0 siblings, 1 reply; 1546+ messages in thread
From: jie sun @ 2012-10-24  5:48 UTC (permalink / raw)
  To: Wido den Hollander; +Cc: ceph-devel

My vm kernel version is "Linux ubuntu12 3.2.0-23-generic".

"ceph -s" shows
"  health HEALTH_OK
   monmap e1: 1 mons at {a=10.100.211.146:6789/0}, election epoch 0, quorum 0 a
   osdmap e152: 10 osds: 9 up, 9 in
    pgmap v48479: 2112 pgs: 2112 active+clean; 23161 MB data, 46323 MB
used, 2451 GB / 2514 GB avail
   mdsmap e31: 1/1/1 up {0=a=up:active} "

In my vm, I do the following operations:
I install 4 debs on my vm (libnss3, libnspr4, librados2, and
librbd1), and then execute "modprobe rbd" so that I can map an image to
my vm.
Then "rbd create foo --size 10240 -m $monIP(my ceph mon IP)",
         "rbd map foo -m $monIP" ------ Here a device /dev/rbd0 can be
used as a local device
         "mkfs -t ext4 /dev/rbd0"
         "mount /dev/rbd0 /mnt(or some other directory)"
After the operations above, I can use this device. But it often
prints log messages like "libceph: osd9 10.100.211.68:6809 socket closed".
I just want to mount a device to my vm, so I didn't install a ceph
client. Is it proper to do so?

Thank you for your answer!




2012/10/23 Wido den Hollander <wido@widodh.nl>:
> On 10/23/2012 06:12 AM, jie sun wrote:
>>
>> Hi,
>>
>> I created and mounted a rbd for a virtual machine. And it can be used
>> as a block device normally, but often prompt some log like below:
>
>
> Could you provide us a bit more information?
>
> What kernel are you using?
>
> What does "ceph -s" show you?
>
> Are you running KVM virtual machines and connecting /dev/rbd0 as a device to
> the virtual machine?
>
> Wido
>
>> "Oct 23 10:30:22 ubuntu12 kernel: [321506.941606] libceph: osd3
>> 10.100.211.146:6810 socket closed
>> Oct 23 10:30:59 ubuntu12 kernel: [321544.337856] libceph: osd9
>> 10.100.211.68:6809 socket closed
>> Oct 23 10:45:22 ubuntu12 kernel: [322407.233090] libceph: osd3
>> 10.100.211.146:6810 socket closed
>> Oct 23 10:45:59 ubuntu12 kernel: [322444.766796] libceph: osd9
>> 10.100.211.68:6809 socket closed
>> Oct 23 11:00:22 ubuntu12 kernel: [323307.529098] libceph: osd3
>> 10.100.211.146:6810 socket closed
>> Oct 23 11:01:00 ubuntu12 kernel: [323345.241679] libceph: osd9
>> 10.100.211.68:6809 socket closed
>> Oct 23 11:15:22 ubuntu12 kernel: [324207.821113] libceph: osd3
>> 10.100.211.146:6810 socket closed
>> Oct 23 11:16:00 ubuntu12 kernel: [324245.717747] libceph: osd9
>> 10.100.211.68:6809 socket closed
>> Oct 23 11:17:01 ubuntu12 CRON[10529]: (root) CMD (   cd / && run-parts
>> --report /etc/cron.hourly)
>> Oct 23 11:30:23 ubuntu12 kernel: [325108.117134] libceph: osd3
>> 10.100.211.146:6810 socket closed"
>>
>> These log also can be found in "/var/log/syslog".
>> I google something about this problem,but didn't understand what
>> you've wrote in "http://tracker.newdream.net/issues/2260" exactly.
>> How can I resolve this problem? My ceph version is 0.48.
>> Should I change some files, or modify some content of some file?
>>
>> Thank you !
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-10-24  5:48   ` Re: jie sun
@ 2012-10-24  5:58     ` Gregory Farnum
       [not found]       ` <CAB6Jr7SbbAE=yEVgg+UupTmavKfvFvGj8j7C9M0Ya2FocNmw9w@mail.gmail.com>
  0 siblings, 1 reply; 1546+ messages in thread
From: Gregory Farnum @ 2012-10-24  5:58 UTC (permalink / raw)
  To: jie sun; +Cc: Wido den Hollander, ceph-devel

On Tuesday, October 23, 2012 at 10:48 PM, jie sun wrote:
> My vm kernel version is "Linux ubuntu12 3.2.0-23-generic".
> 
> "ceph-s" shows
> " health HEALTH_OK
> monmap e1: 1 mons at {a=10.100.211.146:6789/0}, election epoch 0, quorum 0 a
> osdmap e152: 10 osds: 9 up, 9 in
> pgmap v48479: 2112 pgs: 2112 active+clean; 23161 MB data, 46323 MB
> used, 2451 GB / 2514 GB avail
> mdsmap e31: 1/1/1 up {0=a=up:active} "
> 
> In my vm, I do operations like:
> I install 4 debs on my vm, such as libnss3, libnspr4, librados2,
> librbd1. And then execute "modprobe rbd" so that I can map a image to
> my vm.
> Then "rbd create foo --size 10240 -m $monIP(my ceph mon IP)",
> "rbd map foo -m $monIP" ------ Here a device /dev/rbd0 can be
> used as a local device
> "mkfs -t ext4 /dev/rbd0"
> "mount /dev/rbd0 /mnt(or some other directory)"
> After the operations above, I can use this device. But it oftern
> prompt some log like "libceph: osd9 10.100.211.68:6809 socket closed".
> I just want to mount a device to my vm, so I didn't install a ceph
> client. Is this proper to do so?

You might consider using the native QEMU/libvirt instead; it offers some more advanced options. But if you're happy with it, this certainly works!

The "socket closed" messages are just noise; it's nothing to be concerned about (you'll notice they're happening every 15 minutes for each OSD; probably you aren't doing any disk accesses). I think these warnings actually got removed from our master branch a few days ago.
-Greg


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found]       ` <CAB6Jr7SbbAE=yEVgg+UupTmavKfvFvGj8j7C9M0Ya2FocNmw9w@mail.gmail.com>
@ 2012-10-25 12:15         ` Gregory Farnum
  2012-10-25 14:36           ` Re: Alex Elder
  2012-10-26  3:08           ` Re: jie sun
  0 siblings, 2 replies; 1546+ messages in thread
From: Gregory Farnum @ 2012-10-25 12:15 UTC (permalink / raw)
  To: jie sun; +Cc: ceph-devel

Sorry, I was unclear — I meant I think[1] it was fixed in our linux branch, for future kernel releases. The messages you're seeing are just logging a perfectly normal event that's part of the Ceph protocol.  
-Greg
[1]: I'd have to check to make sure. Sage, Alex, am I remembering that correctly?


On Wednesday, October 24, 2012 at 11:45 PM, jie sun wrote:

> What is the version of the master branch? I use the stable version 0.48.2.
> Thank you!
> -SunJie
>  
> 2012/10/24 Gregory Farnum <greg@inktank.com>:
> > On Tuesday, October 23, 2012 at 10:48 PM, jie sun wrote:
> > > My vm kernel version is "Linux ubuntu12 3.2.0-23-generic".
> > >  
> > > "ceph-s" shows
> > > " health HEALTH_OK
> > > monmap e1: 1 mons at {a=10.100.211.146:6789/0}, election epoch 0, quorum 0 a
> > > osdmap e152: 10 osds: 9 up, 9 in
> > > pgmap v48479: 2112 pgs: 2112 active+clean; 23161 MB data, 46323 MB
> > > used, 2451 GB / 2514 GB avail
> > > mdsmap e31: 1/1/1 up {0=a=up:active} "
> > >  
> > > In my vm, I do operations like:
> > > I install 4 debs on my vm, such as libnss3, libnspr4, librados2,
> > > librbd1. And then execute "modprobe rbd" so that I can map a image to
> > > my vm.
> > > Then "rbd create foo --size 10240 -m $monIP(my ceph mon IP)",
> > > "rbd map foo -m $monIP" ------ Here a device /dev/rbd0 can be
> > > used as a local device
> > > "mkfs -t ext4 /dev/rbd0"
> > > "mount /dev/rbd0 /mnt(or some other directory)"
> > > After the operations above, I can use this device. But it oftern
> > > prompt some log like "libceph: osd9 10.100.211.68:6809 socket closed".
> > > I just want to mount a device to my vm, so I didn't install a ceph
> > > client. Is this proper to do so?
> >  
> >  
> >  
> > You might consider using the native QEMU/libvirt instead; it offers some more advanced options. But if you're happy with it, this certainly works!
> >  
> > The "socket closed" messages are just noise; it's nothing to be concerned about (you'll notice they're happening every 15 minutes for each OSD; probably you aren't doing any disk accesses). I think these warnings actually got removed from our master branch a few days ago.
> > -Greg
>  




^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-10-25 12:15         ` Re: Gregory Farnum
@ 2012-10-25 14:36           ` Alex Elder
  2012-10-25 15:38             ` Re: Sage Weil
  2012-10-26  3:08           ` Re: jie sun
  1 sibling, 1 reply; 1546+ messages in thread
From: Alex Elder @ 2012-10-25 14:36 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: jie sun, ceph-devel

On 10/25/2012 07:15 AM, Gregory Farnum wrote:
> Sorry, I was unclear — I meant I think[1] it was fixed in our linux
> branch, for future kernel releases. The messages you're seeing are
> just logging a perfectly normal event that's part of the Ceph
> protocol. -Greg [1]: I'd have to check to make sure. Sage, Alex, am I
> remembering that correctly?

I see those too.  I think the socket (the other end?) closes after
a period of inactivity, but it re-opens and reconnects again
whenever necessary, so that should really be fine.

The messages have not gone away yet; I personally think they should.
They originate from here, in net/ceph/messenger.c:

  static void ceph_fault(struct ceph_connection *con)
          __releases(con->mutex)
  {
          pr_err("%s%lld %s %s\n", ENTITY_NAME(con->peer_name),
                 ceph_pr_addr(&con->peer_addr.in_addr), con->error_msg)

Perhaps this should become pr_info() or something.  Sage?

					-Alex

> On Wednesday, October 24, 2012 at 11:45 PM, jie sun wrote:
> 
>> What is the version of the master branch ? I use the stable version
>> 0.48.2 Thank you! -SunJie
>> 
>> 2012/10/24 Gregory Farnum <greg@inktank.com>:
>>> On Tuesday, October 23, 2012 at 10:48 PM, jie sun wrote:
>>>> My vm kernel version is "Linux ubuntu12 3.2.0-23-generic".
>>>> 
>>>> "ceph-s" shows " health HEALTH_OK monmap e1: 1 mons at
>>>> {a=10.100.211.146:6789/0}, election epoch 0, quorum 0 a osdmap
>>>> e152: 10 osds: 9 up, 9 in pgmap v48479: 2112 pgs: 2112
>>>> active+clean; 23161 MB data, 46323 MB used, 2451 GB / 2514 GB
>>>> avail mdsmap e31: 1/1/1 up {0=a=up:active} "
>>>> 
>>>> In my vm, I do operations like: I install 4 debs on my vm, such
>>>> as libnss3, libnspr4, librados2, librbd1. And then execute
>>>> "modprobe rbd" so that I can map a image to my vm. Then "rbd
>>>> create foo --size 10240 -m $monIP(my ceph mon IP)", "rbd map
>>>> foo -m $monIP" ------ Here a device /dev/rbd0 can be used as a
>>>> local device "mkfs -t ext4 /dev/rbd0" "mount /dev/rbd0 /mnt(or
>>>> some other directory)" After the operations above, I can use
>>>> this device. But it oftern prompt some log like "libceph: osd9
>>>> 10.100.211.68:6809 socket closed". I just want to mount a
>>>> device to my vm, so I didn't install a ceph client. Is this
>>>> proper to do so?
>>> 
>>> 
>>> 
>>> You might consider using the native QEMU/libvirt instead; it
>>> offers some more advanced options. But if you're happy with it,
>>> this certainly works!
>>> 
>>> The "socket closed" messages are just noise; it's nothing to be
>>> concerned about (you'll notice they're happening every 15 minutes
>>> for each OSD; probably you aren't doing any disk accesses). I
>>> think these warnings actually got removed from our master branch
>>> a few days ago. -Greg
>> 
> 
> 
> 
> -- To unsubscribe from this list: send the line "unsubscribe
> ceph-devel" in the body of a message to majordomo@vger.kernel.org 
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-10-25 14:36           ` Re: Alex Elder
@ 2012-10-25 15:38             ` Sage Weil
  2012-10-25 21:28               ` Re: Dan Mick
  0 siblings, 1 reply; 1546+ messages in thread
From: Sage Weil @ 2012-10-25 15:38 UTC (permalink / raw)
  To: Alex Elder; +Cc: Gregory Farnum, jie sun, ceph-devel

On Thu, 25 Oct 2012, Alex Elder wrote:
> On 10/25/2012 07:15 AM, Gregory Farnum wrote:
> > Sorry, I was unclear ? I meant I think[1] it was fixed in our linux
> > branch, for future kernel releases. The messages you're seeing are
> > just logging a perfectly normal event that's part of the Ceph
> > protocol. -Greg [1]: I'd have to check to make sure. Sage, Alex, am I
> > remembering that correctly?
> 
> I see those too.  I think the socket (the other end?) closes after
> a period of inactivity, but it does re-open and reconnect again
> whenever necessary so that should really be fine.
> 
> The messages have not gone away yet, I personally think they should.
> They originate from here, in net/ceph/messenger.c:
> 
>   static void ceph_fault(struct ceph_connection *con)
>           __releases(con->mutex)
>   {
>           pr_err("%s%lld %s %s\n", ENTITY_NAME(con->peer_name),
>                  ceph_pr_addr(&con->peer_addr.in_addr), con->error_msg)
> 
> Perhaps this should become pr_info() or something.  Sage?

Yeah, I think pr_info() is probably the right choice.  Do you know if that 
hits the console by default, or just dmesg/kern.log?

sage


> 
> 					-Alex
> 
> > On Wednesday, October 24, 2012 at 11:45 PM, jie sun wrote:
> > 
> >> What is the version of the master branch ? I use the stable version
> >> 0.48.2 Thank you! -SunJie
> >> 
> >> 2012/10/24 Gregory Farnum <greg@inktank.com>:
> >>> On Tuesday, October 23, 2012 at 10:48 PM, jie sun wrote:
> >>>> My vm kernel version is "Linux ubuntu12 3.2.0-23-generic".
> >>>> 
> >>>> "ceph-s" shows " health HEALTH_OK monmap e1: 1 mons at
> >>>> {a=10.100.211.146:6789/0}, election epoch 0, quorum 0 a osdmap
> >>>> e152: 10 osds: 9 up, 9 in pgmap v48479: 2112 pgs: 2112
> >>>> active+clean; 23161 MB data, 46323 MB used, 2451 GB / 2514 GB
> >>>> avail mdsmap e31: 1/1/1 up {0=a=up:active} "
> >>>> 
> >>>> In my vm, I do operations like: I install 4 debs on my vm, such
> >>>> as libnss3, libnspr4, librados2, librbd1. And then execute
> >>>> "modprobe rbd" so that I can map a image to my vm. Then "rbd
> >>>> create foo --size 10240 -m $monIP(my ceph mon IP)", "rbd map
> >>>> foo -m $monIP" ------ Here a device /dev/rbd0 can be used as a
> >>>> local device "mkfs -t ext4 /dev/rbd0" "mount /dev/rbd0 /mnt(or
> >>>> some other directory)" After the operations above, I can use
> >>>> this device. But it oftern prompt some log like "libceph: osd9
> >>>> 10.100.211.68:6809 socket closed". I just want to mount a
> >>>> device to my vm, so I didn't install a ceph client. Is this
> >>>> proper to do so?
> >>> 
> >>> 
> >>> 
> >>> You might consider using the native QEMU/libvirt instead; it
> >>> offers some more advanced options. But if you're happy with it,
> >>> this certainly works!
> >>> 
> >>> The "socket closed" messages are just noise; it's nothing to be
> >>> concerned about (you'll notice they're happening every 15 minutes
> >>> for each OSD; probably you aren't doing any disk accesses). I
> >>> think these warnings actually got removed from our master branch
> >>> a few days ago. -Greg
> >> 
> > 
> > 
> > 
> > -- To unsubscribe from this list: send the line "unsubscribe
> > ceph-devel" in the body of a message to majordomo@vger.kernel.org 
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-10-25 15:38             ` Re: Sage Weil
@ 2012-10-25 21:28               ` Dan Mick
  2012-10-25 22:15                 ` Re: Alex Elder
  0 siblings, 1 reply; 1546+ messages in thread
From: Dan Mick @ 2012-10-25 21:28 UTC (permalink / raw)
  To: Sage Weil; +Cc: Alex Elder, Gregory Farnum, jie sun, ceph-devel


>>    static void ceph_fault(struct ceph_connection *con)
>>            __releases(con->mutex)
>>    {
>>            pr_err("%s%lld %s %s\n", ENTITY_NAME(con->peer_name),
>>                   ceph_pr_addr(&con->peer_addr.in_addr), con->error_msg)
>>
>> Perhaps this should become pr_info() or something.  Sage?
>
> Yeah, I think pr_info() is probably the right choice.  Do you know if that
> hits the console by default, or just dmesg/kern.log?

pr_info is level 6, KERN_INFO; by default, /proc/sys/kernel/printk has
4 4 1 7 in it, and the first 4 means that only levels below 4 reach the
console.  So debug, info, and notice messages are all options for "not console".
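The threshold rule can be sketched as a one-liner (an illustration of the semantics, not kernel source): a message reaches the console only when its level is numerically below the current console loglevel, the first field of /proc/sys/kernel/printk:

```python
# Kernel log level numbers (KERN_ERR=3 ... KERN_DEBUG=7).
KERN_ERR, KERN_WARNING, KERN_NOTICE, KERN_INFO, KERN_DEBUG = 3, 4, 5, 6, 7

def hits_console(msg_level, console_loglevel=4):
    """True if a printk at msg_level reaches the console by default."""
    return msg_level < console_loglevel

# With the default "4 4 1 7", pr_err (3) hits the console but pr_info (6)
# stays in dmesg/kern.log -- the behavior wanted for ceph_fault().
```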


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-10-25 21:28               ` Re: Dan Mick
@ 2012-10-25 22:15                 ` Alex Elder
  0 siblings, 0 replies; 1546+ messages in thread
From: Alex Elder @ 2012-10-25 22:15 UTC (permalink / raw)
  To: Dan Mick; +Cc: Sage Weil, Gregory Farnum, jie sun, ceph-devel

On 10/25/2012 04:28 PM, Dan Mick wrote:
> 
>>>    static void ceph_fault(struct ceph_connection *con)
>>>            __releases(con->mutex)
>>>    {
>>>            pr_err("%s%lld %s %s\n", ENTITY_NAME(con->peer_name),
>>>                   ceph_pr_addr(&con->peer_addr.in_addr), con->error_msg)
>>>
>>> Perhaps this should become pr_info() or something.  Sage?
>>
>> Yeah, I think pr_info() is probably the right choice.  Do you know if
>> that
>> hits the console by default, or just dmesg/kern.log?
> 
> pr_info is level 6, KERN_INFO; by default, /proc/sys/kernel/printk has
> 4 4 1 7 in it, the first 4 of which means 4-and-lower go to console.  So
> debug, info, notice messages all are options for "not console".
> 

Excellent.  So pr_info() it is.  Dan, do you want to implement this?

					-Alex

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-10-25 12:15         ` Re: Gregory Farnum
  2012-10-25 14:36           ` Re: Alex Elder
@ 2012-10-26  3:08           ` jie sun
  1 sibling, 0 replies; 1546+ messages in thread
From: jie sun @ 2012-10-26  3:08 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel

I understood. Thank you.
-SunJie

2012/10/25 Gregory Farnum <greg@inktank.com>:
> Sorry, I was unclear — I meant I think[1] it was fixed in our linux branch, for future kernel releases. The messages you're seeing are just logging a perfectly normal event that's part of the Ceph protocol.
> -Greg
> [1]: I'd have to check to make sure. Sage, Alex, am I remembering that correctly?
>
>
> On Wednesday, October 24, 2012 at 11:45 PM, jie sun wrote:
>
>> What is the version of the master branch ? I use the stable version 0.48.2
>> Thank you!
>> -SunJie
>>
>> 2012/10/24 Gregory Farnum <greg@inktank.com>:
>> > On Tuesday, October 23, 2012 at 10:48 PM, jie sun wrote:
>> > > My vm kernel version is "Linux ubuntu12 3.2.0-23-generic".
>> > >
>> > > "ceph-s" shows
>> > > " health HEALTH_OK
>> > > monmap e1: 1 mons at {a=10.100.211.146:6789/0}, election epoch 0, quorum 0 a
>> > > osdmap e152: 10 osds: 9 up, 9 in
>> > > pgmap v48479: 2112 pgs: 2112 active+clean; 23161 MB data, 46323 MB
>> > > used, 2451 GB / 2514 GB avail
>> > > mdsmap e31: 1/1/1 up {0=a=up:active} "
>> > >
>> > > In my vm, I do operations like:
>> > > I install 4 debs on my vm, such as libnss3, libnspr4, librados2,
>> > > librbd1. And then execute "modprobe rbd" so that I can map a image to
>> > > my vm.
>> > > Then "rbd create foo --size 10240 -m $monIP(my ceph mon IP)",
>> > > "rbd map foo -m $monIP" ------ Here a device /dev/rbd0 can be
>> > > used as a local device
>> > > "mkfs -t ext4 /dev/rbd0"
>> > > "mount /dev/rbd0 /mnt(or some other directory)"
>> > > After the operations above, I can use this device. But it oftern
>> > > prompt some log like "libceph: osd9 10.100.211.68:6809 socket closed".
>> > > I just want to mount a device to my vm, so I didn't install a ceph
>> > > client. Is this proper to do so?
>> >
>> >
>> >
>> > You might consider using the native QEMU/libvirt instead; it offers some more advanced options. But if you're happy with it, this certainly works!
>> >
>> > The "socket closed" messages are just noise; it's nothing to be concerned about (you'll notice they're happening every 15 minutes for each OSD; probably you aren't doing any disk accesses). I think these warnings actually got removed from our master branch a few days ago.
>> > -Greg
>>
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-10-30 17:42 ` (unknown), Yinghai Lu
@ 2012-11-02  0:17   ` Rafael J. Wysocki
  2012-11-05 22:27     ` Re: Bjorn Helgaas
  2012-11-06  5:03     ` RE: Taku Izumi
  1 sibling, 1 reply; 1546+ messages in thread
From: Rafael J. Wysocki @ 2012-11-02  0:17 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Bjorn Helgaas, Len Brown, Taku Izumi, Jiang Liu, linux-pci,
	linux-acpi

Hi,

On Tuesday, October 30, 2012 10:42:37 AM Yinghai Lu wrote:
> Subject: [PATCH resend 0/8] PCI, ACPI, x86: pci root bus hotplug support resources assign and remove path
> 
> 1. add support for assign resource for hot add path.
> 2. stop and remove root bus during acpi root remove.
> 
> 
> could get from
>         git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-pci-root-bus-hotplug
> 
> Yinghai Lu (8):
>   PCI: Separate out pci_assign_unassigned_bus_resources()
>   PCI: Move pci_rescan_bus() back to probe.c
>   PCI: Move out pci_enable_bridges out of assign_unsigned_bus_res
>   PCI, ACPI: assign unassigned resource for hot add root bus
>   PCI: Add pci_stop/remove_root_bus()
>   PCI, ACPI: Make acpi_pci_root_remove stop/remove pci root bus
>   PCI, ACPI: delete root bus prt during hot remove path
>   PCI, ACPI: remove acpi_root_driver in reserse order
> 
>  drivers/acpi/pci_root.c |   21 ++++++++++++++++++++-
>  drivers/pci/probe.c     |   22 ++++++++++++++++++++++
>  drivers/pci/remove.c    |   36 ++++++++++++++++++++++++++++++++++++
>  drivers/pci/setup-bus.c |   22 +---------------------
>  include/linux/pci.h     |    3 +++
>  5 files changed, 82 insertions(+), 22 deletions(-)

Please feel free to add

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

to the ACPI-related patches in this series.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-11-02  0:17   ` Rafael J. Wysocki
@ 2012-11-05 22:27     ` Bjorn Helgaas
  2012-11-05 22:49       ` Re: Yinghai Lu
  0 siblings, 1 reply; 1546+ messages in thread
From: Bjorn Helgaas @ 2012-11-05 22:27 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Yinghai Lu, Len Brown, Taku Izumi, Jiang Liu, linux-pci,
	linux-acpi

On Thu, Nov 1, 2012 at 6:17 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> Hi,
>
> On Tuesday, October 30, 2012 10:42:37 AM Yinghai Lu wrote:
>> Subject: [PATCH resend 0/8] PCI, ACPI, x86: pci root bus hotplug support resources assign and remove path
>>
>> 1. add support for assign resource for hot add path.
>> 2. stop and remove root bus during acpi root remove.
>>
>>
>> could get from
>>         git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-pci-root-bus-hotplug
>>
>> Yinghai Lu (8):
>>   PCI: Separate out pci_assign_unassigned_bus_resources()
>>   PCI: Move pci_rescan_bus() back to probe.c
>>   PCI: Move out pci_enable_bridges out of assign_unsigned_bus_res
>>   PCI, ACPI: assign unassigned resource for hot add root bus
>>   PCI: Add pci_stop/remove_root_bus()
>>   PCI, ACPI: Make acpi_pci_root_remove stop/remove pci root bus
>>   PCI, ACPI: delete root bus prt during hot remove path
>>   PCI, ACPI: remove acpi_root_driver in reserse order
>>
>>  drivers/acpi/pci_root.c |   21 ++++++++++++++++++++-
>>  drivers/pci/probe.c     |   22 ++++++++++++++++++++++
>>  drivers/pci/remove.c    |   36 ++++++++++++++++++++++++++++++++++++
>>  drivers/pci/setup-bus.c |   22 +---------------------
>>  include/linux/pci.h     |    3 +++
>>  5 files changed, 82 insertions(+), 22 deletions(-)
>
> Please feel free to add
>
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> to the ACPI-related patches in this series.

I applied these to my pci/yinghai-for-pci-root-bus-hotplug branch as
v3.8 material.  They should appear in "next" tomorrow.  Thanks!

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-11-05 22:27     ` Re: Bjorn Helgaas
@ 2012-11-05 22:49       ` Yinghai Lu
  0 siblings, 0 replies; 1546+ messages in thread
From: Yinghai Lu @ 2012-11-05 22:49 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Rafael J. Wysocki, Len Brown, Taku Izumi, Jiang Liu, linux-pci,
	linux-acpi

On Mon, Nov 5, 2012 at 2:27 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:

> I applied these to my pci/yinghai-for-pci-root-bus-hotplug branch as
> v3.8 material.  They should appear in "next" tomorrow.  Thanks!

Thanks...

please check batch 2 at

https://patchwork.kernel.org/patch/1693211/

Yinghai

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
  2012-10-30 17:42 ` (unknown), Yinghai Lu
@ 2012-11-06  5:03     ` Taku Izumi
  2012-11-06  5:03     ` RE: Taku Izumi
  1 sibling, 0 replies; 1546+ messages in thread
From: Taku Izumi @ 2012-11-06  5:03 UTC (permalink / raw)
  To: 'Yinghai Lu'
  Cc: linux-pci, linux-acpi, 'Bjorn Helgaas',
	'Len Brown', 'Jiang Liu'

  Reviewed and tested by Taku Izumi <izumi.taku@jp.fujitsu.com>

> -----Original Message-----
> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-owner@vger.kernel.org] On Behalf Of Yinghai Lu
> Sent: Wednesday, October 31, 2012 2:43 AM
> To: Bjorn Helgaas; Len Brown; Taku Izumi; Jiang Liu
> Cc: linux-pci@vger.kernel.org; linux-acpi@vger.kernel.org; Yinghai Lu
> Subject:
> 
> Subject: [PATCH resend 0/8] PCI, ACPI, x86: pci root bus hotplug support resources assign and remove path
> 
> 1. add support for assign resource for hot add path.
> 2. stop and remove root bus during acpi root remove.
> 
> 
> could get from
>         git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-pci-root-bus-hotplug
> 
> Yinghai Lu (8):
>   PCI: Separate out pci_assign_unassigned_bus_resources()
>   PCI: Move pci_rescan_bus() back to probe.c
>   PCI: Move out pci_enable_bridges out of assign_unsigned_bus_res
>   PCI, ACPI: assign unassigned resource for hot add root bus
>   PCI: Add pci_stop/remove_root_bus()
>   PCI, ACPI: Make acpi_pci_root_remove stop/remove pci root bus
>   PCI, ACPI: delete root bus prt during hot remove path
>   PCI, ACPI: remove acpi_root_driver in reserse order
> 
>  drivers/acpi/pci_root.c |   21 ++++++++++++++++++++-
>  drivers/pci/probe.c     |   22 ++++++++++++++++++++++
>  drivers/pci/remove.c    |   36 ++++++++++++++++++++++++++++++++++++
>  drivers/pci/setup-bus.c |   22 +---------------------
>  include/linux/pci.h     |    3 +++
>  5 files changed, 82 insertions(+), 22 deletions(-)
> 
> --
> 1.7.7
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-11-14 10:21 Felipe López
@ 2012-11-14 18:27 ` Pat Erley
  2012-11-17 14:07   ` Re: Hauke Mehrtens
  0 siblings, 1 reply; 1546+ messages in thread
From: Pat Erley @ 2012-11-14 18:27 UTC (permalink / raw)
  To: Felipe López; +Cc: backports

On 11/14/2012 05:21 AM, Felipe López wrote:
> Hello guys,
>
>
> I am Felipe López and I am working with an TL-WN722N that comes with a
> AR9271 chipset. I got it to work in my PC some time ago but now I am
> trying to install it in a ARM CPU.
>
> I selected the driver I need (./scripts/driver-select ath9k) and I
> cross-compiled it against the kernel 2.6.30. I think that everything
> is OK up to here. I get the following *.ko files:
>
> ./drivers/net/wireless/ath/
> ath.ko
> ./drivers/net/wireless/ath/ath9k/ath9k.ko
> ./drivers/net/wireless/ath/ath9k/ath9k_common.ko
> ./drivers/net/wireless/ath/ath9k/ath9k_htc.ko
> ./drivers/net/wireless/ath/ath9k/ath9k_hw.ko
> ./net/mac80211/mac80211.ko
> ./net/rfkill/rfkill_backport.ko
> ./net/wireless/cfg80211.ko
> ./compat/sch_codel.ko
> ./compat/sch_fq_codel.ko
> ./compat/compat.ko
> ./compat/compat_firmware_class.ko
>
> The first thing I do not know is in which order I should do the
> insmod. I suppose that compat.ko should be the first but then comes
> the second problem. This is what the board throws when I do the insmod
> of compat.ko:
>
> compat: Unknown symbol cpufreq_cpu_put
>
>
> I googled for that sentence and I found that that function is used in
> kernels >= 2.6.31 so maybe I should use an older version of
> compat-wireless. What version of compat-wireless is the best for the
> kernel 2.6.30? The only I can imagine I can solve the last problem by
> myself is trial-error.
>
>
> Many thanks in advance
>
>
> Felipe López

You can find the order using the 'modinfo' command (if you're manually 
loading all of the modules) like this:

$ modinfo ./compat/compat.ko | grep depends
depends:

However, it does look like cpufreq_cpu_put may be a separate issue.  Did
you compile your ARM kernel with cpufreq support?  Enabling it may work
as a temporary solution, since the 2.6.30 source shows that function is
exported.
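For illustration, tsort(1) can turn such 'depends' pairs into a valid load order.  The pairs below are assumptions inferred from the module names in this thread, not read from real .ko files; with the real files you would derive each pair from the `modinfo ... | grep depends` output:

```shell
# Each "A B" line asserts that module A must be loaded before module B
# (guessed dependencies -- verify with modinfo on your actual .ko files).
order=$(tsort <<'EOF'
compat cfg80211
cfg80211 mac80211
mac80211 ath
ath ath9k_hw
ath9k_hw ath9k_common
ath9k_common ath9k
ath9k ath9k_htc
compat_firmware_class ath9k_htc
EOF
)
printf '%s\n' "$order"    # one valid insmod order, dependencies first
```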

Pat Erley


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2012-11-17 11:37 UNITED NATION
  0 siblings, 0 replies; 1546+ messages in thread
From: UNITED NATION @ 2012-11-17 11:37 UTC (permalink / raw)
  To: linux-next

Contact Jacek Slotala of  Bank Zachodni WBK Poland via his email address : 1744837202@qq.com for your UN Compensation draft worth $550,000.00

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-11-14 18:27 ` Pat Erley
@ 2012-11-17 14:07   ` Hauke Mehrtens
  2012-11-19 15:24     ` Re: Felipe López
  0 siblings, 1 reply; 1546+ messages in thread
From: Hauke Mehrtens @ 2012-11-17 14:07 UTC (permalink / raw)
  To: Pat Erley; +Cc: Felipe López, backports

[-- Attachment #1: Type: text/plain, Size: 2088 bytes --]

On 11/14/2012 07:27 PM, Pat Erley wrote:
> On 11/14/2012 05:21 AM, Felipe López wrote:
>> Hello guys,
>>
>>
>> I am Felipe López and I am working with an TL-WN722N that comes with a
>> AR9271 chipset. I got it to work in my PC some time ago but now I am
>> trying to install it in a ARM CPU.
>>
>> I selected the driver I need (./scripts/driver-select ath9k) and I
>> cross-compiled it against the kernel 2.6.30. I think that everything
>> is OK up to here. I get the following *.ko files:
>>
>> ./drivers/net/wireless/ath/
>> ath.ko
>> ./drivers/net/wireless/ath/ath9k/ath9k.ko
>> ./drivers/net/wireless/ath/ath9k/ath9k_common.ko
>> ./drivers/net/wireless/ath/ath9k/ath9k_htc.ko
>> ./drivers/net/wireless/ath/ath9k/ath9k_hw.ko
>> ./net/mac80211/mac80211.ko
>> ./net/rfkill/rfkill_backport.ko
>> ./net/wireless/cfg80211.ko
>> ./compat/sch_codel.ko
>> ./compat/sch_fq_codel.ko
>> ./compat/compat.ko
>> ./compat/compat_firmware_class.ko
>>
>> The first thing I do not know is in which order I should do the
>> insmod. I suppose that compat.ko should be the first but then comes
>> the second problem. This is what the board throws when I do the insmod
>> of compat.ko:
>>
>> compat: Unknown symbol cpufreq_cpu_put
>>
>>
>> I googled for that sentence and I found that that function is used in
>> kernels >= 2.6.31 so maybe I should use an older version of
>> compat-wireless. What version of compat-wireless is the best for the
>> kernel 2.6.30? The only I can imagine I can solve the last problem by
>> myself is trial-error.
>>
>>
>> Many thanks in advance
>>
>>
>> Felipe López
> 
> You can find the order using the 'modinfo' command (if you're manually
> loading all of the modules) like this:
> 
> $ modinfo ./compat/compat.ko | grep depends
> depends:
> 
> However, it does look like cpufreq_cpu_put may be a separate issue.  Did
> you compile your arm CPU with cpufreq support?  This may work as a
> temporary solution, as looking in the 2.6.30 source shows that function
> is exported.
> 
> Pat Erley

Hi Felipe,

could you try the attached patch, if it fixes your problem.

Hauke


[-- Attachment #2: 0001-compat-make-compat-load-without-CONFIG_CPU_FREQ.patch --]
[-- Type: text/x-patch, Size: 1832 bytes --]

>From 3ae967cde2c00d5f1f54e9c41cb50c670047498f Mon Sep 17 00:00:00 2001
From: Hauke Mehrtens <hauke@hauke-m.de>
Date: Sat, 17 Nov 2012 15:03:25 +0100
Subject: [PATCH] compat: make compat load without CONFIG_CPU_FREQ

If the kernel was compiled without CONFIG_CPU_FREQ cpufreq_cpu_put() is
not available, this is the case for some ARM kernels. In this case do
not add the backport function compat_cpufreq_quick_get_max to compat.ko.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
---
 compat/compat-3.1.c        |    4 ++--
 include/linux/compat-3.1.h |    2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/compat/compat-3.1.c b/compat/compat-3.1.c
index 03735f6..354a8a3 100644
--- a/compat/compat-3.1.c
+++ b/compat/compat-3.1.c
@@ -18,7 +18,7 @@
  *
  * 	cpufreq: expose a cpufreq_quick_get_max routine
  */
-
+#ifdef CONFIG_CPU_FREQ
 unsigned int compat_cpufreq_quick_get_max(unsigned int cpu)
 {
 	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
@@ -32,7 +32,7 @@ unsigned int compat_cpufreq_quick_get_max(unsigned int cpu)
 	return ret_freq;
 }
 EXPORT_SYMBOL(compat_cpufreq_quick_get_max);
-
+#endif
 
 static DEFINE_SPINLOCK(compat_simple_ida_lock);
 
diff --git a/include/linux/compat-3.1.h b/include/linux/compat-3.1.h
index dfd87a3..fc05245 100644
--- a/include/linux/compat-3.1.h
+++ b/include/linux/compat-3.1.h
@@ -111,10 +111,12 @@ int ida_simple_get(struct ida *ida, unsigned int start, unsigned int end,
 
 void ida_simple_remove(struct ida *ida, unsigned int id);
 
+#ifdef CONFIG_CPU_FREQ
 /* mask cpufreq_quick_get_max as RHEL6 backports this */
 #define cpufreq_quick_get_max(a) compat_cpufreq_quick_get_max(a)
 
 unsigned int cpufreq_quick_get_max(unsigned int cpu);
+#endif
 #endif /* (LINUX_VERSION_CODE < KERNEL_VERSION(3,1,0)) */
 
 #endif /* LINUX_3_1_COMPAT_H */
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 1546+ messages in thread

* Re:
  2012-11-17 14:07   ` Re: Hauke Mehrtens
@ 2012-11-19 15:24     ` Felipe López
  0 siblings, 0 replies; 1546+ messages in thread
From: Felipe López @ 2012-11-19 15:24 UTC (permalink / raw)
  To: Hauke Mehrtens; +Cc: Pat Erley, backports

Hi guys,

I already solved that problem: using an older version of
compat-wireless fixed the cpufreq issue, and I load the modules in the
following order:

compat.ko
compat_firmware_class.ko
rfkill_backport.ko
cfg80211.ko
mac80211.ko
ath.ko
ath9k_hw.ko
ath9k_common.ko
ath9k.ko
ath9k_htc.ko

However, when I plug in the WiFi module, the kernel says that it cannot
find the firmware file. Below is what it says:

usb 1-1.3: new full speed USB device using at91_ohci and address 4
usb 1-1.3: configuration #1 chosen from 1 choice
usb 1-1.3: ath9k_htc: Firmware - htc_9271.fw not found
ath9k_htc: probe of 1-1.3:1.0 failed with error -22

I configured the kernel to look for firmware files under
/lib/firmware/    but it still does not work; it always says that it
cannot find the firmware file. I do not know whether the problem is in
the .config file used before compiling the kernel or in the
filesystem. I am a bit lost because I do not know in what direction I
should go. I should mention that I am using an old version of busybox
that does not have depmod or lsusb, which would help me a lot.
Configuring a new busybox would take me a long time...
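One quick check worth doing (a sketch; /lib/firmware is the default lookup directory, and the FW_DIR override is only there to make the check easy to repoint):

```shell
# ath9k_htc requests "htc_9271.fw" through the kernel firmware loader,
# which resolves names under /lib/firmware by default.
FW_DIR="${FW_DIR:-/lib/firmware}"
fw="$FW_DIR/htc_9271.fw"
if [ -r "$fw" ]; then
    echo "found: $fw"
else
    echo "missing: $fw"
fi
```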

Any idea of what to do? Many thanks again for keeping helping me

Best Regards

Felipe Lopez


2012/11/17 Hauke Mehrtens <hauke@hauke-m.de>:
> On 11/14/2012 07:27 PM, Pat Erley wrote:
>> On 11/14/2012 05:21 AM, Felipe López wrote:
>>> Hello guys,
>>>
>>>
>>> I am Felipe López and I am working with an TL-WN722N that comes with a
>>> AR9271 chipset. I got it to work in my PC some time ago but now I am
>>> trying to install it in a ARM CPU.
>>>
>>> I selected the driver I need (./scripts/driver-select ath9k) and I
>>> cross-compiled it against the kernel 2.6.30. I think that everything
>>> is OK up to here. I get the following *.ko files:
>>>
>>> ./drivers/net/wireless/ath/
>>> ath.ko
>>> ./drivers/net/wireless/ath/ath9k/ath9k.ko
>>> ./drivers/net/wireless/ath/ath9k/ath9k_common.ko
>>> ./drivers/net/wireless/ath/ath9k/ath9k_htc.ko
>>> ./drivers/net/wireless/ath/ath9k/ath9k_hw.ko
>>> ./net/mac80211/mac80211.ko
>>> ./net/rfkill/rfkill_backport.ko
>>> ./net/wireless/cfg80211.ko
>>> ./compat/sch_codel.ko
>>> ./compat/sch_fq_codel.ko
>>> ./compat/compat.ko
>>> ./compat/compat_firmware_class.ko
>>>
>>> The first thing I do not know is in which order I should do the
>>> insmod. I suppose that compat.ko should be the first but then comes
>>> the second problem. This is what the board throws when I do the insmod
>>> of compat.ko:
>>>
>>> compat: Unknown symbol cpufreq_cpu_put
>>>
>>>
>>> I googled for that sentence and I found that that function is used in
>>> kernels >= 2.6.31 so maybe I should use an older version of
>>> compat-wireless. What version of compat-wireless is the best for the
>>> kernel 2.6.30? The only I can imagine I can solve the last problem by
>>> myself is trial-error.
>>>
>>>
>>> Many thanks in advance
>>>
>>>
>>> Felipe López
>>
>> You can find the order using the 'modinfo' command (if you're manually
>> loading all of the modules) like this:
>>
>> $ modinfo ./compat/compat.ko | grep depends
>> depends:
>>
>> However, it does look like cpufreq_cpu_put may be a separate issue.  Did
>> you compile your arm CPU with cpufreq support?  This may work as a
>> temporary solution, as looking in the 2.6.30 source shows that function
>> is exported.
>>
>> Pat Erley
>
> Hi Felipe,
>
> could you try the attached patch, if it fixes your problem.
>
> Hauke
>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-11-30 13:58 Naresh Bhat
@ 2012-11-30 14:27 ` Daniel Mack
  2012-12-14 14:09   ` Re: Naresh Bhat
  0 siblings, 1 reply; 1546+ messages in thread
From: Daniel Mack @ 2012-11-30 14:27 UTC (permalink / raw)
  To: Naresh Bhat; +Cc: magnus.damm, kexec

On 30.11.2012 14:58, Naresh Bhat wrote:
> Hi,
> 
> I am using Versatile express target at Daughterboard Site 1:
> V2P-CA15_A7 Cortex A15
> 
> root@arm-cortex-a15:~# kexec -f zImage --dtb=vexpress.dtb
> --append="root=/dev/nfs rw ip=dhcp
> nfsroot=<Host-IP>:/mnt/sda3/nfs/cortexa15/core-image
> console=ttyAMA0,38400n8 nosmp"
> Starting new kernel
> Bye!
> Uncompressing Linux...

That could be just that the new kernel is missing its bootargs cmdline
with the appropriate console= tag. How are you booting the first kernel?
Does your bootloader add a /chosen node?

Some suggestions:

1. Add a static CMDLINE to the second kernel, so it doesn't rely on that
information being passed from the first one.

2. Try running kexec without the --dtb option. kexec will then walk
/proc/device-tree and build up one dynamically (CONFIG_PROC_DEVICETREE
is needed for that).

3. Try passing --command-line to kexec. Note that this won't work
together with --dtb, as there's currently no code that adds the cmdline
to a dtb binary blob. But with two patches I recently submitted, it
works with the dynamic /proc/device-tree parsing mode.

4. In case you have LEDs connected to GPIOs on your board, configure
them to the heartbeat trigger mode. If that works, you know that the
kernel is actually booting, but just not showing anything on the console.

5. If 4) fails, try to toggle the GPIOs very early in the boot process,
as some sort of interface to trace the control flow, even without a JTAG.

6. In case you have CONFIG_THUMB2_KERNEL set, switch it off. I had no
luck yet booting into a kernel that was compiled in Thumb-2 mode.
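A sketch combining suggestions 2 and 3: drop --dtb so kexec rebuilds the tree from /proc/device-tree, and pass the command line explicitly.  The image name and the "nosmp" extra parameter are illustrative, and the echo only prints the command rather than loading a kernel:

```shell
# Build the command line from the running kernel's, with a fallback
# string for environments where /proc/cmdline is unreadable.
CMDLINE="$(cat /proc/cmdline 2>/dev/null || echo console=ttyAMA0,38400n8) nosmp"
echo "kexec -f zImage --command-line=\"$CMDLINE\""
```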


Daniel


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-11-30 14:27 ` Daniel Mack
@ 2012-12-14 14:09   ` Naresh Bhat
  2012-12-14 14:35     ` Re: Sven Neumann
  0 siblings, 1 reply; 1546+ messages in thread
From: Naresh Bhat @ 2012-12-14 14:09 UTC (permalink / raw)
  To: s.neumann; +Cc: magnus.damm, kexec, Daniel Mack

Hi Sven Neumann,

I can see you have tested a patch from "Daniel Mack"
http://lists.infradead.org/pipermail/kexec/2012-December/007526.html

Can you please help me with the following

1. On which target it has been tested ?
2. Which kernel is used for testing ?
3. What are the --command-line arguments you have passed ?

I really appreciate your help.

Thanks and Regards
-Naresh Bhat

On Fri, Nov 30, 2012 at 7:57 PM, Daniel Mack <zonque@gmail.com> wrote:
> On 30.11.2012 14:58, Naresh Bhat wrote:
>> Hi,
>>
>> I am using Versatile express target at Daughterboard Site 1:
>> V2P-CA15_A7 Cortex A15
>>
>> root@arm-cortex-a15:~# kexec -f zImage --dtb=vexpress.dtb
>> --append="root=/dev/nfs rw ip=dhcp
>> nfsroot=<Host-IP>:/mnt/sda3/nfs/cortexa15/core-image
>> console=ttyAMA0,38400n8 nosmp"
>> Starting new kernel
>> Bye!
>> Uncompressing Linux...
>
> That could be just that the new kernel is missing its bootargs cmdline
> with the appropriate console= tag. How are you booting the first kernel?
> Does you bootloader add a /chosen tag?
>
> Some suggestions:
>
> 1. Add a static CMDLINE to the second kernel, so it doesn't rely on that
> information being passed from the first on.
>
> 2. Try running kexec without the --dtb option. kexec will then walk
> /proc/device-tree and build up one dynamically (CONFIG_PROC_DEVICETREE
> is needed for that).
>
> 3. Try passing --command-line to kexec. Note that this won't work
> together with --dtb, as there's currently no code that adds the cmdline
> to a dtb binary blob. But with two patches I recently submitted, it
> works with the dynamic /proc/device-tree parsing mode.
>
> 4. In case you have LEDs connected to GPIOs on your board, configure
> them to the heartbeat trigger mode. If that works, you know that the
> kernel is actually booting, but just not showing anything on the console.
>
> 5. If 4) fails, try to toggle the GPIOs very early in the boot process,
> as some sort of interface to trace the control flow, even without a JTAG.
>
> 6. In case you have CONFIG_THUMB2_KERNEL set, switch it off. I had no
> luck yet booting into a kernel that was compiled in Thumb-2 mode.
>
>
> Daniel
>

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: Re:
  2012-12-14 14:09   ` Re: Naresh Bhat
@ 2012-12-14 14:35     ` Sven Neumann
  0 siblings, 0 replies; 1546+ messages in thread
From: Sven Neumann @ 2012-12-14 14:35 UTC (permalink / raw)
  To: Naresh Bhat; +Cc: magnus.damm, kexec, Daniel Mack

Hi,

On Fri, 2012-12-14 at 19:39 +0530, Naresh Bhat wrote:

> I can see you have tested a patch from "Daniel Mack"
> http://lists.infradead.org/pipermail/kexec/2012-December/007526.html
> 
> Can you please help me with the following
> 
> 1. On which target it has been tested ?
> 2. Which kernel is used for testing ?
> 3. What are the --command-line arguments you have passed ?

The target is custom hardware based on a TI AM33xx ARM Cortex-A8
processor. The platform is quite similar to a Beaglebone.

We've tested with Linux 3.7-rc8.

We used

 kexec --append="`cat /proc/cmdline` <extra-parameter>" --dtb=<device-tree-blob> --force uImage

where <extra-parameter> is an extra boot parameter that we append to the
currently used cmdline and <device-tree-blob> is the filename of a
device-tree-blob created using dtc.


Regards,
Sven



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-12-17  0:59 (unknown), Maik Purwin
@ 2012-12-17  3:55 ` Phil Turmel
  0 siblings, 0 replies; 1546+ messages in thread
From: Phil Turmel @ 2012-12-17  3:55 UTC (permalink / raw)
  To: maik; +Cc: linux-raid

Hi Maik,

On 12/16/2012 07:59 PM, Maik Purwin wrote:
> Hello,
> i make a misstake and disconnected 2 of my 6 disk in a software raid 5 on
> debian squeeze. After that the two disks reported as missing and spare so
> i have 4 on 4 in raid5.
> 
> after that i tried to add and re-add but without no efforts. Then i do this:
> 
> mdadm --assemble /dev/md2 --scan --force
> mdadm: failed to add /dev/sdd4 to /dev/md2: Device or resource busy
> mdadm: /dev/md2 assembled from 4 drives and 1 spare - not enough to start
> the array.
> 
> and now i didnt know to go on. i have fear to setup the raid new. I hope
> you can help.

You are in the right place.

Before doing anything else, it is vital that you collect and show
critical data about your array.

First, show the output of "mdadm -D /dev/md2"

Then, for all of the partitions involved, show "mdadm -E /dev/sdXN"

Finally, show "cat /proc/mdstat" and "dmesg".
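Those steps can be collected in one go.  A sketch only: the member-partition glob is a guess from this thread, so adjust it to your layout, and `|| true` keeps the report going when a command errors:

```shell
# Gather the requested diagnostics into a single text report suitable
# for pasting into a reply.
{
    echo "== mdadm -D /dev/md2 =="
    mdadm -D /dev/md2 || true
    for part in /dev/sd[a-f]4; do
        echo "== mdadm -E $part =="
        mdadm -E "$part" || true
    done
    echo "== /proc/mdstat =="
    cat /proc/mdstat || true
    echo "== dmesg =="
    dmesg || true
} > raid-report.txt 2>&1
```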

Don't try to post them on a website--just make a big text e-mail.

Phil

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-12-25  0:12 (unknown), bobzer
@ 2012-12-25  5:38 ` Phil Turmel
       [not found]   ` <CADzS=ar9c7hC1Z7HT9pTUEnoPR+jeo8wdexrrsFbVfPnZ9Tbmg@mail.gmail.com>
  0 siblings, 1 reply; 1546+ messages in thread
From: Phil Turmel @ 2012-12-25  5:38 UTC (permalink / raw)
  To: bobzer; +Cc: linux-raid

On 12/24/2012 07:12 PM, bobzer wrote:
> Hi everyone,
> 
> i don't understand what happend (like i did nothing)
> the file look like there are here, i can browse, but can't read or copy

Two of your array members have failed.  Raid5 can only lose one.

> i'm sure the problem is obvious :
> 
> mdadm --detail /dev/md0
> /dev/md0:
>         Version : 1.2
>   Creation Time : Sun Mar  4 22:49:14 2012
>      Raid Level : raid5
>      Array Size : 3907021568 (3726.03 GiB 4000.79 GB)
>   Used Dev Size : 1953510784 (1863.01 GiB 2000.40 GB)
>    Raid Devices : 3
>   Total Devices : 3
>     Persistence : Superblock is persistent
> 
>     Update Time : Mon Dec 24 18:51:53 2012
>           State : clean, FAILED
>  Active Devices : 1
> Working Devices : 1
>  Failed Devices : 2
>   Spare Devices : 0
> 
>          Layout : left-symmetric
>      Chunk Size : 128K
> 
>            Name : debian:0  (local to host debian)
>            UUID : bf3c605b:9699aa55:d45119a2:7ba58d56
>          Events : 409
> 
>     Number   Major   Minor   RaidDevice State
>        3       8       17        0      active sync   /dev/sdb1
>        1       0        0        1      removed
>        2       0        0        2      removed
> 
>        1       8       33        -      faulty spare   /dev/sdc1
>        2       8       49        -      faulty spare   /dev/sdd1

It would be good to know *why* they failed, and in what order.

Please post your "dmesg", and the output of "mdadm -E /dev/sd[bcd]1".

> ls /dev/sd*
> /dev/sda  /dev/sda1  /dev/sda2  /dev/sda5  /dev/sda6  /dev/sda7
> /dev/sdb  /dev/sdb1  /dev/sdc  /dev/sdc1  /dev/sdd  /dev/sdd1
> 
> i thought about :
> mdadm --stop /dev/md0
> mdadm --assemble --force /dev/md0 /dev/sd[bcd]1

It'll be something like this.  Depends on the sequence of failures.

> but i don't know what i should do :-(
> thank you for your help
> 
> merry christmas

And to you. :-)

Phil


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found]   ` <CADzS=ar9c7hC1Z7HT9pTUEnoPR+jeo8wdexrrsFbVfPnZ9Tbmg@mail.gmail.com>
@ 2012-12-26  2:15     ` Phil Turmel
  2012-12-26 11:29       ` Re: bobzer
  0 siblings, 1 reply; 1546+ messages in thread
From: Phil Turmel @ 2012-12-26  2:15 UTC (permalink / raw)
  To: bobzer; +Cc: linux-raid

On 12/25/2012 07:16 PM, bobzer wrote:
> thanks for helping me

No problem, but please *don't* top-post, and *do* trim replies.  Also,
use reply-to-all on kernel.org mailing lists.

> root@debian:~# mdadm -E /dev/sd[bcd]1
> /dev/sdb1:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x0
>      Array UUID : bf3c605b:9699aa55:d45119a2:7ba58d56
>            Name : debian:0  (local to host debian)
>   Creation Time : Sun Mar  4 22:49:14 2012
>      Raid Level : raid5
>    Raid Devices : 3
> 
>  Avail Dev Size : 3907021954 (1863.01 GiB 2000.40 GB)
>      Array Size : 7814043136 (3726.03 GiB 4000.79 GB)
>   Used Dev Size : 3907021568 (1863.01 GiB 2000.40 GB)
>     Data Offset : 2048 sectors
>    Super Offset : 8 sectors
>           State : clean
>     Device UUID : 5e71f69a:a78b0cd7:bbbb7ecb:cf81f9f6
> 
>     Update Time : Tue Dec 25 06:25:02 2012
>   Bad Block Log : 512 entries available at offset 2032 sectors
>        Checksum : 922ddaa8 - correct
>          Events : 413
> 
>          Layout : left-symmetric
>      Chunk Size : 128K
> 
>    Device Role : Active device 0
>    Array State : A.. ('A' == active, '.' == missing)
> mdadm: No md superblock detected on /dev/sdc1.
> mdadm: No md superblock detected on /dev/sdd1.

This is bad.  The two disks are still offline.  You must find and fix
the hardware problem that is keeping these two disks from communicating.
Unless the two disks suffered a simultaneous power surge, the odds are
good that they are OK.  But look at your cables, controller, and power supply.

> I would like to understand too
> dmesg (with the beginning removed) shows a lot of errors:
> http://pastebin.com/D1D8AKF9

I browsed it: all attempts to communicate with those two drives failed.
That must be fixed first.  Then we can help you recover the data.  If
you can plug the three drives into another machine, that would be the
simplest way to isolate the problem.

HTH,

Phil

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2012-12-26  2:15     ` Re: Phil Turmel
@ 2012-12-26 11:29       ` bobzer
  0 siblings, 0 replies; 1546+ messages in thread
From: bobzer @ 2012-12-26 11:29 UTC (permalink / raw)
  To: Phil Turmel; +Cc: linux-raid

Thank you.
I just rebooted to check the status of all my disks
and saw that the superblocks didn't all say the same thing,
so I did:
mdadm --stop /dev/md0
mdadm --assemble --force /dev/md0 /dev/sd[bcd]1

and now it's working perfectly.
Thanks.

I'm currently looking into how to set up good monitoring.
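For ongoing monitoring, mdadm itself can watch arrays and send mail on Fail/DegradedArray events. A minimal sketch — the mail address is a placeholder, and most distributions ship an init script or systemd unit that starts the monitor for you rather than the raw daemon command:

```shell
# /etc/mdadm/mdadm.conf -- placeholder address, adjust for your system
MAILADDR admin@example.com

# Run the monitor as a daemon, polling every 300 seconds:
mdadm --monitor --scan --daemonise --delay=300

# One-shot check suitable for cron: with --detail, --test makes the
# exit status reflect the array state (non-zero when degraded/failed).
mdadm --detail --test /dev/md0 >/dev/null || echo "md0 needs attention"
```

These are admin commands against real devices, so they are shown as a configuration sketch rather than something runnable in isolation.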

On Tue, Dec 25, 2012 at 9:15 PM, Phil Turmel <philip@turmel.org> wrote:
> On 12/25/2012 07:16 PM, bobzer wrote:
>> thanks for helping me
>
> No problem, but please *don't* top-post, and *do* trim replies.  Also,
> use reply-to-all on kernel.org mailing lists.
>
>> root@debian:~# mdadm -E /dev/sd[bcd]1
>> /dev/sdb1:
>>           Magic : a92b4efc
>>         Version : 1.2
>>     Feature Map : 0x0
>>      Array UUID : bf3c605b:9699aa55:d45119a2:7ba58d56
>>            Name : debian:0  (local to host debian)
>>   Creation Time : Sun Mar  4 22:49:14 2012
>>      Raid Level : raid5
>>    Raid Devices : 3
>>
>>  Avail Dev Size : 3907021954 (1863.01 GiB 2000.40 GB)
>>      Array Size : 7814043136 (3726.03 GiB 4000.79 GB)
>>   Used Dev Size : 3907021568 (1863.01 GiB 2000.40 GB)
>>     Data Offset : 2048 sectors
>>    Super Offset : 8 sectors
>>           State : clean
>>     Device UUID : 5e71f69a:a78b0cd7:bbbb7ecb:cf81f9f6
>>
>>     Update Time : Tue Dec 25 06:25:02 2012
>>   Bad Block Log : 512 entries available at offset 2032 sectors
>>        Checksum : 922ddaa8 - correct
>>          Events : 413
>>
>>          Layout : left-symmetric
>>      Chunk Size : 128K
>>
>>    Device Role : Active device 0
>>    Array State : A.. ('A' == active, '.' == missing)
>> mdadm: No md superblock detected on /dev/sdc1.
>> mdadm: No md superblock detected on /dev/sdd1.
>
> This is bad.  The two disks are still offline.  You must find and fix
> the hardware problem that is keeping these two disks from communicating.
> Unless the two disks suffered a simultaneous power surge, the odds are
> good that they are OK.  But look at your cables, controller, and power supply.
>
>> I would like to understand too
>> dmesg (with the beginning removed) shows a lot of errors:
>> http://pastebin.com/D1D8AKF9
>
> I browsed it: all attempts to communicate with those two drives failed.
> That must be fixed first.  Then we can help you recover the data.  If
> you can plug the three drives into another machine, that would be the
> simplest way to isolate the problem.
>
> HTH,
>
> Phil

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2013-02-11 12:38 ` Johannes Berg
@ 2013-02-14 17:40   ` Johannes Berg
  0 siblings, 0 replies; 1546+ messages in thread
From: Johannes Berg @ 2013-02-14 17:40 UTC (permalink / raw)
  To: linux-wireless

On Mon, 2013-02-11 at 13:38 +0100, Johannes Berg wrote:
> These patches improve/fix HT handling, particularly the bandwidth
> change handling that we were missing entirely and HT capability
> handling (which touches many drivers, unfortunately.)
> 
> This should make the VHT handling compliant with D4.0, I hope.

I did a little bit more testing and applied all.

johannes


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2013-02-17 13:21 (unknown), Somchai Smythe
@ 2013-02-17 22:42 ` Eric Sandeen
  2013-02-18  3:59   ` Re: Theodore Ts'o
  0 siblings, 1 reply; 1546+ messages in thread
From: Eric Sandeen @ 2013-02-17 22:42 UTC (permalink / raw)
  To: Somchai Smythe; +Cc: linux-ext4@vger.kernel.org

On Feb 17, 2013, at 7:21 AM, Somchai Smythe <buraphalinuxserver@gmail.com> wrote:

> Hello,
> 
>     I keep getting this (copied by hand):
> 
> e2fsck -v -f /dev/blsvg/tmp
> e2fsck 1.42.7 (21-Jan-2013)
> ext2fs_check_desc: Corrupt group descriptor: bad block for inode table
> e2fsck: Group descriptor look bad... trying backup blocks...
> /tmp: recovering journal
> e2fsck: unable to set superblock flags on /tmp
> 
> /tmp: ********** WARNING: Filesystem still has errors **********
> 
> I thought '-f' would force things to get fixed.  Is there a 'force
> harder' switch?  Since it is /tmp I can just run mke2fs again, but I
> would like to know before I do that why e2fsck cannot fix this.  I can
> still mount the filesystem and read files on it, but I'm afraid to
> really use it.
> 

I haven't looked closely at this, but you could unmount and do "e2image -r" of the fs to copy a metadata image.  If e2fsck fails the same way on the image, you've saved a reproducer, and you could re-make /tmp if you like.
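The e2image reproducer workflow described above might look like the following. The device path follows the thread and the image path is illustrative; this is a sketch to run against the unmounted filesystem, not a definitive recipe:

```shell
# Capture a metadata-only image of the unmounted filesystem.
umount /tmp
e2image -r /dev/blsvg/tmp /var/tmp/tmp.e2i   # -r writes a raw (sparse) image

# If fsck fails the same way on the image, the image is a portable
# reproducer: it compresses well and can be attached to a bug report.
e2fsck -fn /var/tmp/tmp.e2i
```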

Eric

> I'm running vanilla 3.7.8 kernel on an amd64 virtual machine if it matters.
> 
> dumpe2fs 1.42.7 (21-Jan-2013)
> Filesystem volume name:   /tmp
> Last mounted on:          /tmp
> Filesystem UUID:          3268058d-455f-4f65-a6ee-72e236e6bff5
> Filesystem magic number:  0xEF53
> Filesystem revision #:    1 (dynamic)
> Filesystem features:      has_journal ext_attr resize_inode dir_index
> filetype extent flex_bg sparse_super large_file huge_file dir_nlink
> extra_isize
> Filesystem flags:         signed_directory_hash
> Default mount options:    user_xattr acl
> Filesystem state:         clean
> Errors behavior:          Continue
> Filesystem OS type:       Linux
> Inode count:              2719744
> Block count:              2707456
> Reserved block count:     67867
> Free blocks:              1473632
> Free inodes:              2620790
> First block:              0
> Block size:               4096
> Fragment size:            4096
> Reserved GDT blocks:      319
> Blocks per group:         32768
> Fragments per group:      32768
> Inodes per group:         32768
> Inode blocks per group:   2048
> Flex block group size:    16
> Filesystem created:       Sun Feb 17 06:26:55 2013
> Last mount time:          Sun Feb 17 17:51:13 2013
> Last write time:          Sun Feb 17 19:56:52 2013
> Mount count:              2
> Maximum mount count:      28
> Last checked:             Sun Feb 17 12:41:00 2013
> Check interval:           15552000 (6 months)
> Next check after:         Fri Aug 16 12:41:00 2013
> Lifetime writes:          16 GB
> Reserved blocks uid:      0 (user root)
> Reserved blocks gid:      0 (group root)
> First inode:              11
> Inode size:               256
> Required extra isize:     28
> Desired extra isize:      28
> Journal inode:            8
> Default directory hash:   half_md4
> Directory Hash Seed:      75b881a3-6a2a-4006-ac6d-943b23c88b73
> Journal backup:           inode blocks
> Journal features:         journal_incompat_revoke
> Journal size:             128M
> Journal length:           32768
> Journal sequence:         0x00000209
> Journal start:            0
> 
> 
> Group 0: (Blocks 0-32767)
>  Primary superblock at 0, Group descriptors at 1-1
>  Reserved GDT blocks at 2-320
>  Block bitmap at 321 (+321), Inode bitmap at 337 (+337)
>  Inode table at 353-2400 (+353)
>  0 free blocks, 30516 free inodes, 131 directories
>  Free blocks:
>  Free inodes: 2251-2429, 2431-7853, 7855-32768
> Group 1: (Blocks 32768-65535)
>  Backup superblock at 32768, Group descriptors at 32769-32769
>  Reserved GDT blocks at 32770-33088
>  Block bitmap at 322 (bg #0 + 322), Inode bitmap at 338 (bg #0 + 338)
>  Inode table at 2401-4448 (bg #0 + 2401)
>  0 free blocks, 32766 free inodes, 2 directories
>  Free blocks:
>  Free inodes: 32769-56724, 56726-64383, 64385-65536
> Group 2: (Blocks 65536-98303)
>  Block bitmap at 323 (bg #0 + 323), Inode bitmap at 339 (bg #0 + 339)
>  Inode table at 4449-6496 (bg #0 + 4449)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 65537-98304
> Group 3: (Blocks 98304-131071)
>  Backup superblock at 98304, Group descriptors at 98305-98305
>  Reserved GDT blocks at 98306-98624
>  Block bitmap at 324 (bg #0 + 324), Inode bitmap at 340 (bg #0 + 340)
>  Inode table at 6497-8544 (bg #0 + 6497)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 98305-131072
> Group 4: (Blocks 131072-163839)
>  Block bitmap at 325 (bg #0 + 325), Inode bitmap at 341 (bg #0 + 341)
>  Inode table at 8545-10592 (bg #0 + 8545)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 131073-163840
> Group 5: (Blocks 163840-196607)
>  Backup superblock at 163840, Group descriptors at 163841-163841
>  Reserved GDT blocks at 163842-164160
>  Block bitmap at 326 (bg #0 + 326), Inode bitmap at 342 (bg #0 + 342)
>  Inode table at 10593-12640 (bg #0 + 10593)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 163841-196608
> Group 6: (Blocks 196608-229375)
>  Block bitmap at 327 (bg #0 + 327), Inode bitmap at 343 (bg #0 + 343)
>  Inode table at 12641-14688 (bg #0 + 12641)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 196609-229376
> Group 7: (Blocks 229376-262143)
>  Backup superblock at 229376, Group descriptors at 229377-229377
>  Reserved GDT blocks at 229378-229696
>  Block bitmap at 328 (bg #0 + 328), Inode bitmap at 344 (bg #0 + 344)
>  Inode table at 14689-16736 (bg #0 + 14689)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 229377-262144
> Group 8: (Blocks 262144-294911)
>  Block bitmap at 329 (bg #0 + 329), Inode bitmap at 345 (bg #0 + 345)
>  Inode table at 16737-18784 (bg #0 + 16737)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 262145-294912
> Group 9: (Blocks 294912-327679)
>  Backup superblock at 294912, Group descriptors at 294913-294913
>  Reserved GDT blocks at 294914-295232
>  Block bitmap at 330 (bg #0 + 330), Inode bitmap at 346 (bg #0 + 346)
>  Inode table at 18785-20832 (bg #0 + 18785)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 294913-327680
> Group 10: (Blocks 327680-360447)
>  Block bitmap at 331 (bg #0 + 331), Inode bitmap at 347 (bg #0 + 347)
>  Inode table at 20833-22880 (bg #0 + 20833)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 327681-360448
> Group 11: (Blocks 360448-393215)
>  Block bitmap at 332 (bg #0 + 332), Inode bitmap at 348 (bg #0 + 348)
>  Inode table at 22881-24928 (bg #0 + 22881)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 360449-393216
> Group 12: (Blocks 393216-425983)
>  Block bitmap at 333 (bg #0 + 333), Inode bitmap at 349 (bg #0 + 349)
>  Inode table at 24929-26976 (bg #0 + 24929)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 393217-425984
> Group 13: (Blocks 425984-458751)
>  Block bitmap at 334 (bg #0 + 334), Inode bitmap at 350 (bg #0 + 350)
>  Inode table at 26977-29024 (bg #0 + 26977)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 425985-458752
> Group 14: (Blocks 458752-491519)
>  Block bitmap at 335 (bg #0 + 335), Inode bitmap at 351 (bg #0 + 351)
>  Inode table at 29025-31072 (bg #0 + 29025)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 458753-491520
> Group 15: (Blocks 491520-524287)
>  Block bitmap at 336 (bg #0 + 336), Inode bitmap at 352 (bg #0 + 352)
>  Inode table at 33089-35136 (bg #1 + 321)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 491521-524288
> Group 16: (Blocks 524288-557055)
>  Block bitmap at 524288 (+0), Inode bitmap at 524304 (+16)
>  Inode table at 524320-526367 (+32)
>  0 free blocks, 0 free inodes, 2248 directories
>  Free blocks:
>  Free inodes:
> Group 17: (Blocks 557056-589823)
>  Block bitmap at 524289 (bg #16 + 1), Inode bitmap at 524305 (bg #16 + 17)
>  Inode table at 526368-528415 (bg #16 + 2080)
>  0 free blocks, 0 free inodes, 1640 directories
>  Free blocks:
>  Free inodes:
> Group 18: (Blocks 589824-622591)
>  Block bitmap at 524290 (bg #16 + 2), Inode bitmap at 524306 (bg #16 + 18)
>  Inode table at 528416-530463 (bg #16 + 4128)
>  0 free blocks, 31857 free inodes, 226 directories
>  Free blocks:
>  Free inodes: 590736-622592
> Group 19: (Blocks 622592-655359)
>  Block bitmap at 524291 (bg #16 + 3), Inode bitmap at 524307 (bg #16 + 19)
>  Inode table at 530464-532511 (bg #16 + 6176)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 622593-655360
> Group 20: (Blocks 655360-688127)
>  Block bitmap at 524292 (bg #16 + 4), Inode bitmap at 524308 (bg #16 + 20)
>  Inode table at 532512-534559 (bg #16 + 8224)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 655361-688128
> Group 21: (Blocks 688128-720895)
>  Block bitmap at 524293 (bg #16 + 5), Inode bitmap at 524309 (bg #16 + 21)
>  Inode table at 534560-536607 (bg #16 + 10272)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 688129-720896
> Group 22: (Blocks 720896-753663)
>  Block bitmap at 524294 (bg #16 + 6), Inode bitmap at 524310 (bg #16 + 22)
>  Inode table at 536608-538655 (bg #16 + 12320)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 720897-753664
> Group 23: (Blocks 753664-786431)
>  Block bitmap at 524295 (bg #16 + 7), Inode bitmap at 524311 (bg #16 + 23)
>  Inode table at 538656-540703 (bg #16 + 14368)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 753665-786432
> Group 24: (Blocks 786432-819199)
>  Block bitmap at 524296 (bg #16 + 8), Inode bitmap at 524312 (bg #16 + 24)
>  Inode table at 540704-542751 (bg #16 + 16416)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 786433-819200
> Group 25: (Blocks 819200-851967)
>  Backup superblock at 819200, Group descriptors at 819201-819201
>  Reserved GDT blocks at 819202-819520
>  Block bitmap at 524297 (bg #16 + 9), Inode bitmap at 524313 (bg #16 + 25)
>  Inode table at 542752-544799 (bg #16 + 18464)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 819201-851968
> Group 26: (Blocks 851968-884735)
>  Block bitmap at 524298 (bg #16 + 10), Inode bitmap at 524314 (bg #16 + 26)
>  Inode table at 544800-546847 (bg #16 + 20512)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 851969-884736
> Group 27: (Blocks 884736-917503)
>  Backup superblock at 884736, Group descriptors at 884737-884737
>  Reserved GDT blocks at 884738-885056
>  Block bitmap at 524299 (bg #16 + 11), Inode bitmap at 524315 (bg #16 + 27)
>  Inode table at 546848-548895 (bg #16 + 22560)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 884737-917504
> Group 28: (Blocks 917504-950271)
>  Block bitmap at 524300 (bg #16 + 12), Inode bitmap at 524316 (bg #16 + 28)
>  Inode table at 548896-550943 (bg #16 + 24608)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 917505-950272
> Group 29: (Blocks 950272-983039)
>  Block bitmap at 524301 (bg #16 + 13), Inode bitmap at 524317 (bg #16 + 29)
>  Inode table at 550944-552991 (bg #16 + 26656)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 950273-983040
> Group 30: (Blocks 983040-1015807)
>  Block bitmap at 524302 (bg #16 + 14), Inode bitmap at 524318 (bg #16 + 30)
>  Inode table at 552992-555039 (bg #16 + 28704)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 983041-1015808
> Group 31: (Blocks 1015808-1048575)
>  Block bitmap at 524303 (bg #16 + 15), Inode bitmap at 524319 (bg #16 + 31)
>  Inode table at 555040-557087 (bg #16 + 30752)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 1015809-1048576
> Group 32: (Blocks 1048576-1081343)
>  Block bitmap at 1048576 (+0), Inode bitmap at 1048592 (+16)
>  Inode table at 1048608-1050655 (+32)
>  0 free blocks, 1176 free inodes, 2492 directories
>  Free blocks:
>  Free inodes: 1080169-1081344
> Group 33: (Blocks 1081344-1114111)
>  Block bitmap at 1048577 (bg #32 + 1), Inode bitmap at 1048593 (bg #32 + 17)
>  Inode table at 1050656-1052703 (bg #32 + 2080)
>  0 free blocks, 32673 free inodes, 95 directories
>  Free blocks:
>  Free inodes: 1081440-1114112
> Group 34: (Blocks 1114112-1146879)
>  Block bitmap at 1048578 (bg #32 + 2), Inode bitmap at 1048594 (bg #32 + 18)
>  Inode table at 1052704-1054751 (bg #32 + 4128)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 1114113-1146880
> Group 35: (Blocks 1146880-1179647)
>  Block bitmap at 1048579 (bg #32 + 3), Inode bitmap at 1048595 (bg #32 + 19)
>  Inode table at 1054752-1056799 (bg #32 + 6176)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 1146881-1179648
> Group 36: (Blocks 1179648-1212415)
>  Block bitmap at 1048580 (bg #32 + 4), Inode bitmap at 1048596 (bg #32 + 20)
>  Inode table at 1056800-1058847 (bg #32 + 8224)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 1179649-1212416
> Group 37: (Blocks 1212416-1245183)
>  Block bitmap at 1048581 (bg #32 + 5), Inode bitmap at 1048597 (bg #32 + 21)
>  Inode table at 1058848-1060895 (bg #32 + 10272)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 1212417-1245184
> Group 38: (Blocks 1245184-1277951)
>  Block bitmap at 1048582 (bg #32 + 6), Inode bitmap at 1048598 (bg #32 + 22)
>  Inode table at 1060896-1062943 (bg #32 + 12320)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 1245185-1277952
> Group 39: (Blocks 1277952-1310719)
>  Block bitmap at 1048583 (bg #32 + 7), Inode bitmap at 1048599 (bg #32 + 23)
>  Inode table at 1062944-1064991 (bg #32 + 14368)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 1277953-1310720
> Group 40: (Blocks 1310720-1343487)
>  Block bitmap at 1310720 (+0), Inode bitmap at 1310721 (+1)
>  Inode table at 1310722-1312769 (+2)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 1310721-1343488
> Group 41: (Blocks 1343488-1376255)
>  Block bitmap at 1312770 (bg #40 + 2050), Inode bitmap at 1312771 (bg
> #40 + 2051)
>  Inode table at 1312772-1314819 (bg #40 + 2052)
>  0 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 1343489-1376256
> Group 42: (Blocks 1376256-1409023)
>  Block bitmap at 1314820 (bg #40 + 4100), Inode bitmap at 1314821 (bg
> #40 + 4101)
>  Inode table at 1314822-1316869 (bg #40 + 4102)
>  12288 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1396736-1409023
>  Free inodes: 1376257-1409024
> Group 43: (Blocks 1409024-1441791)
>  Block bitmap at 1409024 (+0), Inode bitmap at 1409029 (+5)
>  Inode table at 1409034-1411081 (+10)
>  22518 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1419274-1441791
>  Free inodes: 1409025-1441792
> Group 44: (Blocks 1441792-1474559)
>  Block bitmap at 1409025 (bg #43 + 1), Inode bitmap at 1409030 (bg #43 + 6)
>  Inode table at 1411082-1413129 (bg #43 + 2058)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1441792-1474559
>  Free inodes: 1441793-1474560
> Group 45: (Blocks 1474560-1507327)
>  Block bitmap at 1409026 (bg #43 + 2), Inode bitmap at 1409031 (bg #43 + 7)
>  Inode table at 1413130-1415177 (bg #43 + 4106)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1474560-1507327
>  Free inodes: 1474561-1507328
> Group 46: (Blocks 1507328-1540095)
>  Block bitmap at 1409027 (bg #43 + 3), Inode bitmap at 1409032 (bg #43 + 8)
>  Inode table at 1415178-1417225 (bg #43 + 6154)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1507328-1540095
>  Free inodes: 1507329-1540096
> Group 47: (Blocks 1540096-1572863)
>  Block bitmap at 1409028 (bg #43 + 4), Inode bitmap at 1409033 (bg #43 + 9)
>  Inode table at 1417226-1419273 (bg #43 + 8202)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1540096-1572863
>  Free inodes: 1540097-1572864
> Group 48: (Blocks 1572864-1605631)
>  Block bitmap at 1572864 (+0), Inode bitmap at 1572880 (+16)
>  Inode table at 1572896-1574943 (+32)
>  65504 free blocks, 32768 free inodes, 0 directories
>  Free blocks:
>  Free inodes: 1572865-1605632
> Group 49: (Blocks 1605632-1638399)
>  Backup superblock at 1605632, Group descriptors at 1605633-1605633
>  Reserved GDT blocks at 1605634-1605952
>  Block bitmap at 1572865 (bg #48 + 1), Inode bitmap at 1572881 (bg #48 + 17)
>  Inode table at 1574944-1576991 (bg #48 + 2080)
>  32447 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1605953-1638399
>  Free inodes: 1605633-1638400
> Group 50: (Blocks 1638400-1671167)
>  Block bitmap at 1572866 (bg #48 + 2), Inode bitmap at 1572882 (bg #48 + 18)
>  Inode table at 1576992-1579039 (bg #48 + 4128)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1638400-1671167
>  Free inodes: 1638401-1671168
> Group 51: (Blocks 1671168-1703935)
>  Block bitmap at 1572867 (bg #48 + 3), Inode bitmap at 1572883 (bg #48 + 19)
>  Inode table at 1579040-1581087 (bg #48 + 6176)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1671168-1703935
>  Free inodes: 1671169-1703936
> Group 52: (Blocks 1703936-1736703)
>  Block bitmap at 1572868 (bg #48 + 4), Inode bitmap at 1572884 (bg #48 + 20)
>  Inode table at 1581088-1583135 (bg #48 + 8224)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1703936-1736703
>  Free inodes: 1703937-1736704
> Group 53: (Blocks 1736704-1769471)
>  Block bitmap at 1572869 (bg #48 + 5), Inode bitmap at 1572885 (bg #48 + 21)
>  Inode table at 1583136-1585183 (bg #48 + 10272)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1736704-1769471
>  Free inodes: 1736705-1769472
> Group 54: (Blocks 1769472-1802239)
>  Block bitmap at 1572870 (bg #48 + 6), Inode bitmap at 1572886 (bg #48 + 22)
>  Inode table at 1585184-1587231 (bg #48 + 12320)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1769472-1802239
>  Free inodes: 1769473-1802240
> Group 55: (Blocks 1802240-1835007)
>  Block bitmap at 1572871 (bg #48 + 7), Inode bitmap at 1572887 (bg #48 + 23)
>  Inode table at 1587232-1589279 (bg #48 + 14368)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1802240-1835007
>  Free inodes: 1802241-1835008
> Group 56: (Blocks 1835008-1867775)
>  Block bitmap at 1572872 (bg #48 + 8), Inode bitmap at 1572888 (bg #48 + 24)
>  Inode table at 1589280-1591327 (bg #48 + 16416)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1835008-1867775
>  Free inodes: 1835009-1867776
> Group 57: (Blocks 1867776-1900543)
>  Block bitmap at 1572873 (bg #48 + 9), Inode bitmap at 1572889 (bg #48 + 25)
>  Inode table at 1591328-1593375 (bg #48 + 18464)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1867776-1900543
>  Free inodes: 1867777-1900544
> Group 58: (Blocks 1900544-1933311)
>  Block bitmap at 1572874 (bg #48 + 10), Inode bitmap at 1572890 (bg #48 + 26)
>  Inode table at 1593376-1595423 (bg #48 + 20512)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1900544-1933311
>  Free inodes: 1900545-1933312
> Group 59: (Blocks 1933312-1966079)
>  Block bitmap at 1572875 (bg #48 + 11), Inode bitmap at 1572891 (bg #48 + 27)
>  Inode table at 1595424-1597471 (bg #48 + 22560)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1933312-1966079
>  Free inodes: 1933313-1966080
> Group 60: (Blocks 1966080-1998847)
>  Block bitmap at 1572876 (bg #48 + 12), Inode bitmap at 1572892 (bg #48 + 28)
>  Inode table at 1597472-1599519 (bg #48 + 24608)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1966080-1998847
>  Free inodes: 1966081-1998848
> Group 61: (Blocks 1998848-2031615)
>  Block bitmap at 1572877 (bg #48 + 13), Inode bitmap at 1572893 (bg #48 + 29)
>  Inode table at 1599520-1601567 (bg #48 + 26656)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 1998848-2031615
>  Free inodes: 1998849-2031616
> Group 62: (Blocks 2031616-2064383)
>  Block bitmap at 1572878 (bg #48 + 14), Inode bitmap at 1572894 (bg #48 + 30)
>  Inode table at 1601568-1603615 (bg #48 + 28704)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2031616-2064383
>  Free inodes: 2031617-2064384
> Group 63: (Blocks 2064384-2097151)
>  Block bitmap at 1572879 (bg #48 + 15), Inode bitmap at 1572895 (bg #48 + 31)
>  Inode table at 1603616-1605663 (bg #48 + 30752)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2064384-2097151
>  Free inodes: 2064385-2097152
> Group 64: (Blocks 2097152-2129919)
>  Block bitmap at 2097152 (+0), Inode bitmap at 2097168 (+16)
>  Inode table at 2097184-2099231 (+32)
>  2016 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2127904-2129919
>  Free inodes: 2097153-2129920
> Group 65: (Blocks 2129920-2162687)
>  Block bitmap at 2097153 (bg #64 + 1), Inode bitmap at 2097169 (bg #64 + 17)
>  Inode table at 2099232-2101279 (bg #64 + 2080)
>  30720 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2129920-2162687
>  Free inodes: 2129921-2162688
> Group 66: (Blocks 2162688-2195455)
>  Block bitmap at 2097154 (bg #64 + 2), Inode bitmap at 2097170 (bg #64 + 18)
>  Inode table at 2101280-2103327 (bg #64 + 4128)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2162688-2195455
>  Free inodes: 2162689-2195456
> Group 67: (Blocks 2195456-2228223)
>  Block bitmap at 2097155 (bg #64 + 3), Inode bitmap at 2097171 (bg #64 + 19)
>  Inode table at 2103328-2105375 (bg #64 + 6176)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2195456-2228223
>  Free inodes: 2195457-2228224
> Group 68: (Blocks 2228224-2260991)
>  Block bitmap at 2097156 (bg #64 + 4), Inode bitmap at 2097172 (bg #64 + 20)
>  Inode table at 2105376-2107423 (bg #64 + 8224)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2228224-2260991
>  Free inodes: 2228225-2260992
> Group 69: (Blocks 2260992-2293759)
>  Block bitmap at 2097157 (bg #64 + 5), Inode bitmap at 2097173 (bg #64 + 21)
>  Inode table at 2107424-2109471 (bg #64 + 10272)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2260992-2293759
>  Free inodes: 2260993-2293760
> Group 70: (Blocks 2293760-2326527)
>  Block bitmap at 2097158 (bg #64 + 6), Inode bitmap at 2097174 (bg #64 + 22)
>  Inode table at 2109472-2111519 (bg #64 + 12320)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2293760-2326527
>  Free inodes: 2293761-2326528
> Group 71: (Blocks 2326528-2359295)
>  Block bitmap at 2097159 (bg #64 + 7), Inode bitmap at 2097175 (bg #64 + 23)
>  Inode table at 2111520-2113567 (bg #64 + 14368)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2326528-2359295
>  Free inodes: 2326529-2359296
> Group 72: (Blocks 2359296-2392063)
>  Block bitmap at 2097160 (bg #64 + 8), Inode bitmap at 2097176 (bg #64 + 24)
>  Inode table at 2113568-2115615 (bg #64 + 16416)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2359296-2392063
>  Free inodes: 2359297-2392064
> Group 73: (Blocks 2392064-2424831)
>  Block bitmap at 2097161 (bg #64 + 9), Inode bitmap at 2097177 (bg #64 + 25)
>  Inode table at 2115616-2117663 (bg #64 + 18464)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2392064-2424831
>  Free inodes: 2392065-2424832
> Group 74: (Blocks 2424832-2457599)
>  Block bitmap at 2097162 (bg #64 + 10), Inode bitmap at 2097178 (bg #64 + 26)
>  Inode table at 2117664-2119711 (bg #64 + 20512)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2424832-2457599
>  Free inodes: 2424833-2457600
> Group 75: (Blocks 2457600-2490367)
>  Block bitmap at 2097163 (bg #64 + 11), Inode bitmap at 2097179 (bg #64 + 27)
>  Inode table at 2119712-2121759 (bg #64 + 22560)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2457600-2490367
>  Free inodes: 2457601-2490368
> Group 76: (Blocks 2490368-2523135)
>  Block bitmap at 2097164 (bg #64 + 12), Inode bitmap at 2097180 (bg #64 + 28)
>  Inode table at 2121760-2123807 (bg #64 + 24608)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2490368-2523135
>  Free inodes: 2490369-2523136
> Group 77: (Blocks 2523136-2555903)
>  Block bitmap at 2097165 (bg #64 + 13), Inode bitmap at 2097181 (bg #64 + 29)
>  Inode table at 2123808-2125855 (bg #64 + 26656)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2523136-2555903
>  Free inodes: 2523137-2555904
> Group 78: (Blocks 2555904-2588671)
>  Block bitmap at 2097166 (bg #64 + 14), Inode bitmap at 2097182 (bg #64 + 30)
>  Inode table at 2125856-2127903 (bg #64 + 28704)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2555904-2588671
>  Free inodes: 2555905-2588672
> Group 79: (Blocks 2588672-2621439)
>  Block bitmap at 2097167 (bg #64 + 15), Inode bitmap at 2097183 (bg #64 + 31)
>  Inode table at 2129920-2131967 (bg #65 + 0)
>  32768 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2588672-2621439
>  Free inodes: 2588673-2621440
> Group 80: (Blocks 2621440-2654207)
>  Block bitmap at 2621440 (+0), Inode bitmap at 2621443 (+3)
>  Inode table at 2621446-2623493 (+6)
>  26618 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2627590-2654207
>  Free inodes: 2621441-2654208
> Group 81: (Blocks 2654208-2686975)
>  Backup superblock at 2654208, Group descriptors at 2654209-2654209
>  Reserved GDT blocks at 2654210-2654528
>  Block bitmap at 2621441 (bg #80 + 1), Inode bitmap at 2621444 (bg #80 + 4)
>  Inode table at 2623494-2625541 (bg #80 + 2054)
>  32447 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2654529-2686975
>  Free inodes: 2654209-2686976
> Group 82: (Blocks 2686976-2707455)
>  Block bitmap at 2621442 (bg #80 + 2), Inode bitmap at 2621445 (bg #80 + 5)
>  Inode table at 2625542-2627589 (bg #80 + 4102)
>  20480 free blocks, 32768 free inodes, 0 directories
>  Free blocks: 2686976-2707455
>  Free inodes: 2686977-2719744
> 
> The /tmp is on an LVM volume which I had resized, and I did an online
> resize2fs, and that's when
> I had the trouble.  Basically I was compiling stuff and /tmp hit 100%,
> so I did this:
> 
> lvextend -l +100%FREE /dev/blsvg/tmp
>   (that added about 5GB to the existing 5GB giving a total of 10GB
> for /dev/blsvg/tmp)
> resize2fs /dev/blsvg/tmp
> df -h
>  (showed a negative size in terabytes, so I figured something bad had happened)
> umount /tmp
> e2fsck -vfDC0 /dev/blsvg/tmp
> 
> That didn't work, so I tried just
> 
> e2fsck -v -f /dev/blsvg/tmp
> 
> Still got the errors, so I mounted /tmp again to copy off the files,
> then unmounted and tried e2fsck again without any luck.  So I ran the
> dumpe2fs and sent this email.
> 
> The machine is in a QEMU 1.4 virtual machine if that matters, using
> this command line:
> 
> qemu-system-x86_64 -enable-kvm -vga cirrus \
>   -cpu Nehalem -smp 8,cores=4,threads=2,sockets=1 -m 8192 \
>   -vnc :0,password,tls -monitor stdio -localtime \
>   -usb -usbdevice mouse -usbdevice keyboard \
>   -net nic,vlan=0,model=virtio,macaddr=DE:AD:BE:EF:24:22 \
>   -net tap,ifname=tap0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
>   -name VM0 \
>   -drive file=/dev/mapper/vmland-vmdisk0,if=virtio \
>   -boot order=c,menu=on -nodefaults
> 
> The host machine also runs a vanilla 3.7.8 kernel.  I'm sure I did
> something stupid, but I was kind of hoping e2fsck could fix it.
> 
> JGH
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2013-02-17 22:42 ` Eric Sandeen
@ 2013-02-18  3:59   ` Theodore Ts'o
  0 siblings, 0 replies; 1546+ messages in thread
From: Theodore Ts'o @ 2013-02-18  3:59 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Somchai Smythe, linux-ext4@vger.kernel.org

On Sun, Feb 17, 2013 at 05:42:03PM -0500, Eric Sandeen wrote:
> 
> I haven't looked closely at this, but you could unmount and do
> "e2image -r" of the fs to copy a metadata image.  If e2fsck fails
> the same way on the image, you've saved a reproducer, and you could
> re-make /tmp if you like.

This is good advice; in general this error occurs when there is a
hardware problem.  After we finish running the journal, the
"needs recovery" flag is cleared, and then we close the file system
(to flush our internal caches, since the on-disk blocks might have
gotten updated from running the journal), and then re-open it.  If the
"needs recovery" flag is still set, then we issue this message.  It
indicates either a severe programming bug in e2fsck and its libraries
(in which case having a reproducible test case is really interesting),
or much more commonly, it means that the disk write to the superblock
to clear the "needs recovery" flag didn't "take", and the disk
returned a data block different from the one which we just wrote, thus
indicating some kind of hardware problem.

From the e2fsck sources:

			if (ctx->flags & E2F_FLAG_RESTARTED) {
				/*
				 * Whoops, we attempted to run the
				 * journal twice.  This should never
				 * happen, unless the hardware or
				 * device driver is being bogus.
				 */
				com_err(ctx->program_name, 0,
					_("unable to set superblock flags on %s\n"), ctx->device_name);
				fatal_error(ctx, 0);
			}

Regards,

						- Ted

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2013-02-25  6:59 Kiyoshi Ishiyama
  0 siblings, 0 replies; 1546+ messages in thread
From: Kiyoshi Ishiyama @ 2013-02-25  6:59 UTC (permalink / raw)
  To: linux-sh

Hi Morimoto-san

> < LTSI3.4.25 >
> Calibrating delay loop... 1654.30 BogoMIPS (lpjd63488)
> 
> < 3.8.rc7 >
> Calibrating delay loop... 827.65 BogoMIPS (lpj231744)

I have checked the BogoMIPS problem but could not find
its root cause.

Let me report my observations, although I cannot explain why
the situations below happen.
Please let me know if you know something.

---------------------------------------------------------
(1) use __loop_delay 

BogoMIPS on 3.8 becomes 1654.30 BogoMIPS if we use __loop_delay
instead of arm_delay_ops.delay(n).

arch/arm/include/asm/delay.h 

/* original bogoMIPS =  827.65 */
#define __delay(n)              arm_delay_ops.delay(n)

/* test1 bogoMIPS = 1654.30 */
#define __delay(n)              __loop_delay(n)


I have checked the addresses of arm_delay_ops.delay and __loop_delay.
Both are the same.

---------------------------------------------------------
(2) insert printk in calibrate_delay_converge()

BogoMIPS on 3.8 becomes 1654.30 BogoMIPS if we insert 
printk in calibrate_delay_converge().

BR
Ishi





> Hi Morimoto-san
> 
> Thank you for handling patches.
> 
> I sent "Acked-by" for three patches.
> 
> L2 cache/NEON/Arm branch prediction are handled correctly with three 
> patches.
> 
> But I found one strange thing....
> 
> I'm using LTSI3.4.25 kernel on my Armadillo board for development.
> I have tested three patches with linux 3.8.rc7.
> 
> The problem I faced is that the value of BogoMIPS is different.
> Please find the attached log.
> 
> < LTSI3.4.25 >
> Calibrating delay loop... 1654.30 BogoMIPS (lpjd63488)
> 
> < 3.8.rc7 >
> Calibrating delay loop... 827.65 BogoMIPS (lpj231744)
> 
> I will check why this difference occurs and come back to you
> once I find something.
> 
> BR
> Ishi
> 
> 
> > 
> > Hi Simon, Ishiyama-san, and all
> > 
> > These patches update Armaddilo800eva defconfig.
> > 
> > >> Ishiyama-san
> > 
> > Could you please give your Acked-by to these patches,
> > if you can agree these ?
> > 
> > >> Simon
> > 
> > Could you please back-port these patches to LTSI
> > if we got Acked-by ?
> > 
> > Kuninori Morimoto (3):
> >       ARM: shmobile: armadillo800eva: enable all errata for cache on defconfig
> >       ARM: shmobile: armadillo800eva: enable branch prediction on defconfig
> >       ARM: shmobile: armadillo800eva: enable NEON on defconfig
> > 
> >  arch/arm/configs/armadillo800eva_defconfig |    7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > Best regards
> > ---
> > Kuninori Morimoto
> 
> -- 
> Kiyoshi Ishiyama <kiyoshi.ishiyama.wg@renesas.com>

-- 
Kiyoshi Ishiyama <kiyoshi.ishiyama.wg@renesas.com>


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2013-03-25 20:00 Jonna Birgit Jacobsen
  0 siblings, 0 replies; 1546+ messages in thread
From: Jonna Birgit Jacobsen @ 2013-03-25 20:00 UTC (permalink / raw)


Contact us for quick, secure and reliable loan with 3% interest rate if you are interested in loan contact us with the following email address:resortsavingsandloans-DYgPsCc4EBdcc8MDMThsPA@public.gmane.org
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2013-04-12  7:08 No subject Callum Hutchinson
@ 2013-04-15 10:30 ` Rafał Miłecki
  0 siblings, 0 replies; 1546+ messages in thread
From: Rafał Miłecki @ 2013-04-15 10:30 UTC (permalink / raw)
  To: Callum Hutchinson; +Cc: b43-dev, linux-wireless

2013/4/12 Callum Hutchinson <callumhutchinson1@gmail.com>:
> Tried to report this properly via email but got some formatting issues
> coming back, I've attached the content of the original email report as
> 'Report.txt'.
>
> It is the same file as found on comment 12 on Launchpad bug.
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1142385
>
> Apologies for any missing information or lack of reporting experience :)

Sending e-mail with empty subject and without direct information about
the hardware isn't a good idea.

It's most probably another result of a regression I've tracked and reported in:
Scanning regression since "cfg80211: use DS or HT operation IEs to
determine BSS channel"

http://www.spinics.net/lists/linux-wireless/msg105359.html
http://marc.info/?t=136431795000003&r=1&w=4

I didn't have enough time to debug it further and fix it yet.

-- 
Rafał

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2013-04-27  9:42 Peter Würtz
@ 2013-05-02  3:00 ` Lin Ming
  0 siblings, 0 replies; 1546+ messages in thread
From: Lin Ming @ 2013-05-02  3:00 UTC (permalink / raw)
  To: Peter Würtz; +Cc: linux-btrfs

On Sat, Apr 27, 2013 at 5:42 PM, Peter Würtz <pwuertz@gmail.com> wrote:
> Hi!
>
> I recently had some trouble with my root and home btrfs filesystems.
> My system (Ubuntu 13.04, Kernel 3.8) started freezing when copying
> larger numbers of files around (hard freeze, no logs about what
> happened).
>
> At some time booting up wasn't possible anymore due to a kernel bug
> while mounting the homefs. Btrfsck built from git wasn't able to
> repair the fs and segfaulted. Btrfs-zero-log was able to make home

Hi,

Here is the patch to fix the segfault.
https://patchwork.kernel.org/patch/2509881/

Could you also report the bug on bugzilla.kernel.org?
http://marc.info/?l=linux-btrfs&m=136733749808576&w=2

Lin Ming

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2013-05-08  6:25 (unknown), kedari appana
@ 2013-05-08  7:11 ` Wolfgang Grandegger
  2013-05-14 14:38   ` Re: kedari appana
  2013-05-08  8:11 ` Re: Yegor Yefremov
  1 sibling, 1 reply; 1546+ messages in thread
From: Wolfgang Grandegger @ 2013-05-08  7:11 UTC (permalink / raw)
  To: kedari appana; +Cc: linux-can@vger.kernel.org, Marc Kleine-Budde

On 05/08/2013 08:25 AM, kedari appana wrote:
> Hi,
> 
>  If I type the command below, I get the following errors:
> # ip link show can0
> 2: can0: <NOARP40000> mtu 16 qdisc noop qlen 10
>     link/[280]

That's a strange output. Where does the "40000" come from? Maybe you are
using an old version of the "ip" program. The command shown below should
report:

  # ip link help
  ...
  TYPE := { vlan | veth | vcan | dummy | ifb | macvlan | can }

If you don't see "can", it's the wrong version.

> #cat /proc/interrupts
> In this I am not getting an entry for my CAN interrupt handler

The device must be started to have the interrupt line requested. Please read

http://lxr.free-electrons.com/source/Documentation/networking/can.txt#L747

on how to use the "ip" program for CAN.

> #./canconfig can0 bitrate 50000 ctrlmode triple-sampling on
> this command was successful, but when debugging it shows  can0:
> bit-timing not yet defined
> 
> #If i use ifconfig can0 up
> error is can0: bit-timing not yet defined
> ifconfig: SIOCSIFFLAGS: Invalid argument
> Please help me figure out what the problem is.

Your tools seem not to work properly. What does

 # canconfig can0

print out? Anyway, please try using "ip" in the first place.

Please do not repeat the same question multiple times.  It usually
does not help in getting an answer more quickly.

Wolfgang.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2013-05-08  6:25 (unknown), kedari appana
  2013-05-08  7:11 ` Wolfgang Grandegger
@ 2013-05-08  8:11 ` Yegor Yefremov
  1 sibling, 0 replies; 1546+ messages in thread
From: Yegor Yefremov @ 2013-05-08  8:11 UTC (permalink / raw)
  To: kedari appana; +Cc: linux-can@vger.kernel.org, Marc Kleine-Budde

On Wed, May 8, 2013 at 8:25 AM, kedari appana <kedare06@gmail.com> wrote:
> Hi,
>
>  If I type the command below, I get the following errors:
> # ip link show can0
> 2: can0: <NOARP40000> mtu 16 qdisc noop qlen 10
>     link/[280]
> #cat /proc/interrupts
> In this I am not getting an entry for my CAN interrupt handler
> #./canconfig can0 bitrate 50000 ctrlmode triple-sampling on
> this command was successful, but when debugging it shows  can0:
> bit-timing not yet defined
>
> #If i use ifconfig can0 up
> error is can0: bit-timing not yet defined
> ifconfig: SIOCSIFFLAGS: Invalid argument
> Please help me figure out what the problem is.

In this thread there are presentation slides on iproute2 and
can-utils cross-compilation:
http://e2e.ti.com/support/arm/sitara_arm/f/791/p/154560/899055.aspx.
The presentation file is
d_can_on_Beaglebone.pptx.

Or just take http://buildroot.uclibc.org/ and create your rootfs with
it.  can-utils and iproute2 (the latter must be explicitly activated to
override the version from BusyBox, which AFAIK doesn't support the CAN
routines) are already there.

Yegor

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2013-05-08  7:11 ` Wolfgang Grandegger
@ 2013-05-14 14:38   ` kedari appana
  0 siblings, 0 replies; 1546+ messages in thread
From: kedari appana @ 2013-05-14 14:38 UTC (permalink / raw)
  To: Wolfgang Grandegger; +Cc: linux-can@vger.kernel.org, Marc Kleine-Budde

Hi,

                Thanks for the help.  I successfully compiled the
can-utils and iproute2 utilities and all are working fine except the
canconfig command.
When I type ./canconfig can0 bitrate 50000:
Inconsistency detected by ld.so: dl-deps.c: 622: _dl_map_object_deps:
Assertion `nlist > 1' failed!

I am getting the above error; how can I overcome it?

Regards,
Kedar.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2013-06-09 21:57 Abraham Lincon
  0 siblings, 0 replies; 1546+ messages in thread
From: Abraham Lincon @ 2013-06-09 21:57 UTC (permalink / raw)





Do You Need a Business Loan Or Personal Loan ? If Yes Fill And Return Back
To Us Now...

FULL NAME...........
LOAN AMOUNT.....
DURATIONS......
COUNTRY.......
SATE......
AGE.......
OCCUPATION...............
HOME ADDRESS..........
OFFICE ADDRESS........
AGE.....................
HOME PHONE NUMBER
CELL PHONE NUMBER........

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2013-07-08 21:52 Jeffrey (Sheng-Hui) Chu
@ 2013-07-08 22:04 ` Joe Perches
  2013-07-09 13:22 ` Re: Arend van Spriel
  1 sibling, 0 replies; 1546+ messages in thread
From: Joe Perches @ 2013-07-08 22:04 UTC (permalink / raw)
  To: Jeffrey (Sheng-Hui) Chu; +Cc: linux-wireless@vger.kernel.org

On Mon, 2013-07-08 at 21:52 +0000, Jeffrey (Sheng-Hui) Chu wrote:
[]
> diff --git a/drivers/nfc/bcm2079x/bcm2079x-i2c.c b/drivers/nfc/bcm2079x/bcm2079x-i2c.c
[]
> +/* do not change below */
> +#define MAX_BUFFER_SIZE		780
[]
> +static ssize_t bcm2079x_dev_read(struct file *filp, char __user *buf,
> +					size_t count, loff_t *offset)
> +{
> +	struct bcm2079x_dev *bcm2079x_dev = filp->private_data;
> +	unsigned char tmp[MAX_BUFFER_SIZE];

780 bytes on stack isn't a great idea.

> +static ssize_t bcm2079x_dev_write(struct file *filp, const char __user *buf,
> +					size_t count, loff_t *offset)
> +{
> +	struct bcm2079x_dev *bcm2079x_dev = filp->private_data;
> +	char tmp[MAX_BUFFER_SIZE];

etc.

> +	int ret;
> +
> +	if (count > MAX_BUFFER_SIZE) {
> +		dev_err(&bcm2079x_dev->client->dev, "out of memory\n");

Out of memory isn't really true.
The packet size is just too big for your
little buffer.

> +static int bcm2079x_dev_open(struct inode *inode, struct file *filp)
> +{
> +	int ret = 0;
> +
> +	struct bcm2079x_dev *bcm2079x_dev = container_of(filp->private_data,
> +							struct bcm2079x_dev,
> +							bcm2079x_device);
> +	filp->private_data = bcm2079x_dev;
> +	bcm2079x_init_stat(bcm2079x_dev);
> +	bcm2079x_enable_irq(bcm2079x_dev);
> +	dev_info(&bcm2079x_dev->client->dev,
> +		 "%d,%d\n", imajor(inode), iminor(inode));

Looks to me like this should be dev_dbg not dev_info



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2013-07-08 21:52 Jeffrey (Sheng-Hui) Chu
  2013-07-08 22:04 ` Joe Perches
@ 2013-07-09 13:22 ` Arend van Spriel
  2013-07-10  9:12   ` Re: Samuel Ortiz
  1 sibling, 1 reply; 1546+ messages in thread
From: Arend van Spriel @ 2013-07-09 13:22 UTC (permalink / raw)
  To: Jeffrey (Sheng-Hui) Chu; +Cc: linux-wireless@vger.kernel.org, Samuel Ortiz

+ Samuel

On 07/08/2013 11:52 PM, Jeffrey (Sheng-Hui) Chu wrote:
>  From b4555081b1d27a31c22abede8e0397f1d61fbb04 Mon Sep 17 00:00:00 2001
> From: Jeffrey Chu <jeffchu@broadcom.com>
> Date: Mon, 8 Jul 2013 17:50:21 -0400
> Subject: [PATCH] Add bcm2079x-i2c driver for Bcm2079x NFC Controller.

The subject did not show in my mailbox. Not sure if necessary, but I 
tend to send patches to a maintainer and CC the appropriate list(s). So 
the nfc list as well (linux-nfc@lists.01.org).

Regards,
Arend

> Signed-off-by: Jeffrey Chu <jeffchu@broadcom.com>
> ---
>   drivers/nfc/Kconfig                 |    1 +
>   drivers/nfc/Makefile                |    1 +
>   drivers/nfc/bcm2079x/Kconfig        |   10 +
>   drivers/nfc/bcm2079x/Makefile       |    4 +
>   drivers/nfc/bcm2079x/bcm2079x-i2c.c |  416 +++++++++++++++++++++++++++++++++++
>   drivers/nfc/bcm2079x/bcm2079x.h     |   34 +++
>   6 files changed, 466 insertions(+)
>   create mode 100644 drivers/nfc/bcm2079x/Kconfig
>   create mode 100644 drivers/nfc/bcm2079x/Makefile
>   create mode 100644 drivers/nfc/bcm2079x/bcm2079x-i2c.c
>   create mode 100644 drivers/nfc/bcm2079x/bcm2079x.h
>
> diff --git a/drivers/nfc/Kconfig b/drivers/nfc/Kconfig
> index 74a852e..fa540f4 100644
> --- a/drivers/nfc/Kconfig
> +++ b/drivers/nfc/Kconfig
> @@ -38,5 +38,6 @@ config NFC_MEI_PHY
>
>   source "drivers/nfc/pn544/Kconfig"
>   source "drivers/nfc/microread/Kconfig"
> +source "drivers/nfc/bcm2079x/Kconfig"
>
>   endmenu
> diff --git a/drivers/nfc/Makefile b/drivers/nfc/Makefile
> index aa6bd65..a56adf6 100644
> --- a/drivers/nfc/Makefile
> +++ b/drivers/nfc/Makefile
> @@ -7,5 +7,6 @@ obj-$(CONFIG_NFC_MICROREAD)	+= microread/
>   obj-$(CONFIG_NFC_PN533)		+= pn533.o
>   obj-$(CONFIG_NFC_WILINK)	+= nfcwilink.o
>   obj-$(CONFIG_NFC_MEI_PHY)	+= mei_phy.o
> +obj-$(CONFIG_NFC_PN544)		+= bcm2079x/

I suspect this is a copy-paste error, right? Should be 
obj-$(CONFIG_NFC_BCM2079X_I2C).

>
>   ccflags-$(CONFIG_NFC_DEBUG) := -DDEBUG
> diff --git a/drivers/nfc/bcm2079x/Kconfig b/drivers/nfc/bcm2079x/Kconfig
> new file mode 100644
> index 0000000..889e181
> --- /dev/null
> +++ b/drivers/nfc/bcm2079x/Kconfig
> @@ -0,0 +1,10 @@
> +config NFC_BCM2079X_I2C
> +	tristate "NFC BCM2079x i2c support"
> +	depends on I2C
> +	default n
> +	---help---
> +	  Broadcom BCM2079x i2c driver.
> +	  This is a driver that allows transporting NCI/HCI command and response
> +	  to/from Broadcom bcm2079x NFC Controller.  Select this if your
> +	  platform is using i2c bus to controll this chip.
> +
> diff --git a/drivers/nfc/bcm2079x/Makefile b/drivers/nfc/bcm2079x/Makefile
> new file mode 100644
> index 0000000..be64d35
> --- /dev/null
> +++ b/drivers/nfc/bcm2079x/Makefile
> @@ -0,0 +1,4 @@
> +#
> +# Makefile for bcm2079x NFC driver
> +#
> +obj-$(CONFIG_NFC_BCM2079X_I2C) += bcm2079x-i2c.o
> diff --git a/drivers/nfc/bcm2079x/bcm2079x-i2c.c b/drivers/nfc/bcm2079x/bcm2079x-i2c.c
> new file mode 100644
> index 0000000..988a65e
> --- /dev/null
> +++ b/drivers/nfc/bcm2079x/bcm2079x-i2c.c
> @@ -0,0 +1,416 @@
> +/*
> + * Copyright (C) 2013 Broadcom Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> + *
> + */
> +
> +#include <linux/module.h>
> +#include <linux/fs.h>
> +#include <linux/slab.h>
> +#include <linux/i2c.h>
> +#include <linux/irq.h>
> +#include <linux/interrupt.h>
> +#include <linux/gpio.h>
> +#include <linux/miscdevice.h>
> +#include <linux/spinlock.h>
> +#include <linux/poll.h>
> +
> +#include "bcm2079x.h"
> +
> +/* do not change below */
> +#define MAX_BUFFER_SIZE		780
> +
> +/* Read data */
> +#define PACKET_HEADER_SIZE_NCI	(4)
> +#define PACKET_HEADER_SIZE_HCI	(3)
> +#define PACKET_TYPE_NCI		(16)
> +#define PACKET_TYPE_HCIEV	(4)
> +#define MAX_PACKET_SIZE		(PACKET_HEADER_SIZE_NCI + 255)
> +
> +struct bcm2079x_dev {
> +	wait_queue_head_t read_wq;
> +	struct mutex read_mutex;
> +	struct i2c_client *client;
> +	struct miscdevice bcm2079x_device;
> +	unsigned int wake_gpio;
> +	unsigned int en_gpio;
> +	unsigned int irq_gpio;
> +	bool irq_enabled;
> +	spinlock_t irq_enabled_lock;
> +	unsigned int count_irq;
> +};
> +
> +static void bcm2079x_init_stat(struct bcm2079x_dev *bcm2079x_dev)
> +{
> +	bcm2079x_dev->count_irq = 0;
> +}
> +
> +static void bcm2079x_disable_irq(struct bcm2079x_dev *bcm2079x_dev)
> +{
> +	unsigned long flags;
> +	spin_lock_irqsave(&bcm2079x_dev->irq_enabled_lock, flags);
> +	if (bcm2079x_dev->irq_enabled) {
> +		disable_irq_nosync(bcm2079x_dev->client->irq);
> +		bcm2079x_dev->irq_enabled = false;
> +	}
> +	spin_unlock_irqrestore(&bcm2079x_dev->irq_enabled_lock, flags);
> +}
> +
> +static void bcm2079x_enable_irq(struct bcm2079x_dev *bcm2079x_dev)
> +{
> +	unsigned long flags;
> +	spin_lock_irqsave(&bcm2079x_dev->irq_enabled_lock, flags);
> +	if (!bcm2079x_dev->irq_enabled) {
> +		bcm2079x_dev->irq_enabled = true;
> +		enable_irq(bcm2079x_dev->client->irq);
> +	}
> +	spin_unlock_irqrestore(&bcm2079x_dev->irq_enabled_lock, flags);
> +}
> +
> +static void set_client_addr(struct bcm2079x_dev *bcm2079x_dev, int addr)
> +{
> +	struct i2c_client *client = bcm2079x_dev->client;
> +	dev_info(&client->dev,
> +		"Set client device address from 0x%04X flag = "
> +		"%02x, to  0x%04X\n",
> +		client->addr, client->flags, addr);
> +		client->addr = addr;
> +		if (addr < 0x80)
> +			client->flags &= ~I2C_CLIENT_TEN;
> +		else
> +			client->flags |= I2C_CLIENT_TEN;
> +}
> +
> +static irqreturn_t bcm2079x_dev_irq_handler(int irq, void *dev_id)
> +{
> +	struct bcm2079x_dev *bcm2079x_dev = dev_id;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&bcm2079x_dev->irq_enabled_lock, flags);
> +	bcm2079x_dev->count_irq++;
> +	spin_unlock_irqrestore(&bcm2079x_dev->irq_enabled_lock, flags);
> +	wake_up(&bcm2079x_dev->read_wq);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static unsigned int bcm2079x_dev_poll(struct file *filp, poll_table *wait)
> +{
> +	struct bcm2079x_dev *bcm2079x_dev = filp->private_data;
> +	unsigned int mask = 0;
> +	unsigned long flags;
> +
> +	poll_wait(filp, &bcm2079x_dev->read_wq, wait);
> +
> +	spin_lock_irqsave(&bcm2079x_dev->irq_enabled_lock, flags);
> +	if (bcm2079x_dev->count_irq > 0) {
> +		bcm2079x_dev->count_irq--;
> +		mask |= POLLIN | POLLRDNORM;
> +	}
> +	spin_unlock_irqrestore(&bcm2079x_dev->irq_enabled_lock, flags);
> +
> +	return mask;
> +}
> +
> +static ssize_t bcm2079x_dev_read(struct file *filp, char __user *buf,
> +					size_t count, loff_t *offset)
> +{
> +	struct bcm2079x_dev *bcm2079x_dev = filp->private_data;
> +	unsigned char tmp[MAX_BUFFER_SIZE];
> +	int total, len, ret;
> +
> +	total = 0;
> +	len = 0;
> +
> +	if (count > MAX_BUFFER_SIZE)
> +		count = MAX_BUFFER_SIZE;
> +
> +	mutex_lock(&bcm2079x_dev->read_mutex);
> +
> +	/* Read the first 4 bytes to include the length of the NCI or
> +		HCI packet.*/
> +	ret = i2c_master_recv(bcm2079x_dev->client, tmp, 4);
> +	if (ret == 4) {
> +		total = ret;
> +		/* First byte is the packet type*/
> +		switch (tmp[0]) {
> +		case PACKET_TYPE_NCI:
> +			len = tmp[PACKET_HEADER_SIZE_NCI-1];
> +			break;
> +
> +		case PACKET_TYPE_HCIEV:
> +			len = tmp[PACKET_HEADER_SIZE_HCI-1];
> +			if (len == 0)
> +				total--;
> +			else
> +				len--;
> +			break;
> +
> +		default:
> +			len = 0;/*Unknown packet byte */
> +			break;
> +		} /* switch*/
> +
> +		/* make sure full packet fits in the buffer*/
> +		if (len > 0 && (len + total) <= count) {
> +			/** read the remainder of the packet.
> +			**/
> +			ret = i2c_master_recv(bcm2079x_dev->client, tmp+total,
> +				len);
> +			if (ret == len)
> +				total += len;
> +		} /* if */
> +	} /* if */
> +
> +	mutex_unlock(&bcm2079x_dev->read_mutex);
> +
> +	if (total > count || copy_to_user(buf, tmp, total)) {
> +		dev_err(&bcm2079x_dev->client->dev,
> +			"failed to copy to user space, total = %d\n", total);
> +		total = -EFAULT;
> +	}
> +
> +	return total;
> +}
> +
> +static ssize_t bcm2079x_dev_write(struct file *filp, const char __user *buf,
> +					size_t count, loff_t *offset)
> +{
> +	struct bcm2079x_dev *bcm2079x_dev = filp->private_data;
> +	char tmp[MAX_BUFFER_SIZE];
> +	int ret;
> +
> +	if (count > MAX_BUFFER_SIZE) {
> +		dev_err(&bcm2079x_dev->client->dev, "out of memory\n");
> +		return -ENOMEM;
> +	}
> +
> +	if (copy_from_user(tmp, buf, count)) {
> +		dev_err(&bcm2079x_dev->client->dev,
> +			"failed to copy from user space\n");
> +		return -EFAULT;
> +	}
> +
> +	mutex_lock(&bcm2079x_dev->read_mutex);
> +	/* Write data */
> +
> +	ret = i2c_master_send(bcm2079x_dev->client, tmp, count);
> +	if (ret != count) {
> +		dev_err(&bcm2079x_dev->client->dev,
> +			"failed to write %d\n", ret);
> +		ret = -EIO;
> +	}
> +	mutex_unlock(&bcm2079x_dev->read_mutex);
> +
> +	return ret;
> +}
> +
> +static int bcm2079x_dev_open(struct inode *inode, struct file *filp)
> +{
> +	int ret = 0;
> +
> +	struct bcm2079x_dev *bcm2079x_dev = container_of(filp->private_data,
> +							struct bcm2079x_dev,
> +							bcm2079x_device);
> +	filp->private_data = bcm2079x_dev;
> +	bcm2079x_init_stat(bcm2079x_dev);
> +	bcm2079x_enable_irq(bcm2079x_dev);
> +	dev_info(&bcm2079x_dev->client->dev,
> +		 "%d,%d\n", imajor(inode), iminor(inode));
> +
> +	return ret;
> +}
> +
> +static long bcm2079x_dev_unlocked_ioctl(struct file *filp,
> +					 unsigned int cmd, unsigned long arg)
> +{
> +	struct bcm2079x_dev *bcm2079x_dev = filp->private_data;
> +
> +	switch (cmd) {
> +	case BCMNFC_POWER_CTL:
> +		gpio_set_value(bcm2079x_dev->en_gpio, arg);
> +		break;
> +	case BCMNFC_WAKE_CTL:
> +		gpio_set_value(bcm2079x_dev->wake_gpio, arg);
> +		break;
> +	case BCMNFC_SET_ADDR:
> +		set_client_addr(bcm2079x_dev, arg);
> +		break;
> +	default:
> +		dev_err(&bcm2079x_dev->client->dev,
> +			"%s, unknown cmd (%x, %lx)\n", __func__, cmd, arg);
> +		return -ENOSYS;
> +	}
> +
> +	return 0;
> +}
> +
> +static const struct file_operations bcm2079x_dev_fops = {
> +	.owner = THIS_MODULE,
> +	.llseek = no_llseek,
> +	.poll = bcm2079x_dev_poll,
> +	.read = bcm2079x_dev_read,
> +	.write = bcm2079x_dev_write,
> +	.open = bcm2079x_dev_open,
> +	.unlocked_ioctl = bcm2079x_dev_unlocked_ioctl
> +};
> +
> +static int bcm2079x_probe(struct i2c_client *client,
> +				const struct i2c_device_id *id)
> +{
> +	int ret;
> +	struct bcm2079x_platform_data *platform_data;
> +	struct bcm2079x_dev *bcm2079x_dev;
> +
> +	platform_data = client->dev.platform_data;
> +
> +	dev_info(&client->dev, "%s, probing bcm2079x driver flags = %x\n",
> +		__func__, client->flags);
> +	if (platform_data == NULL) {
> +		dev_err(&client->dev, "nfc probe fail\n");
> +		return -ENODEV;
> +	}
> +
> +	if (!i2c_check_functionality(client->adapter, I2C_FUNC_I2C)) {
> +		dev_err(&client->dev, "need I2C_FUNC_I2C\n");
> +		return -ENODEV;
> +	}
> +
> +	ret = gpio_request_one(platform_data->irq_gpio, GPIOF_IN, "nfc_irq");
> +	if (ret)
> +		return -ENODEV;
> +	ret = gpio_request_one(platform_data->en_gpio, GPIOF_OUT_INIT_LOW,
> +		"nfc_en");
> +	if (ret)
> +		goto err_en;
> +	ret = gpio_request_one(platform_data->wake_gpio, GPIOF_OUT_INIT_LOW,
> +		"nfc_wake");
> +	if (ret)
> +		goto err_wake;
> +
> +	gpio_set_value(platform_data->en_gpio, 0);
> +	gpio_set_value(platform_data->wake_gpio, 0);
> +
> +	bcm2079x_dev = kzalloc(sizeof(*bcm2079x_dev), GFP_KERNEL);
> +	if (bcm2079x_dev == NULL) {
> +		dev_err(&client->dev,
> +			"failed to allocate memory for module data\n");
> +		ret = -ENOMEM;
> +		goto err_exit;
> +	}
> +
> +	bcm2079x_dev->wake_gpio = platform_data->wake_gpio;
> +	bcm2079x_dev->irq_gpio = platform_data->irq_gpio;
> +	bcm2079x_dev->en_gpio = platform_data->en_gpio;
> +	bcm2079x_dev->client = client;
> +
> +	/* init mutex and queues */
> +	init_waitqueue_head(&bcm2079x_dev->read_wq);
> +	mutex_init(&bcm2079x_dev->read_mutex);
> +	spin_lock_init(&bcm2079x_dev->irq_enabled_lock);
> +
> +	bcm2079x_dev->bcm2079x_device.minor = MISC_DYNAMIC_MINOR;
> +	bcm2079x_dev->bcm2079x_device.name = "bcm2079x-i2c";
> +	bcm2079x_dev->bcm2079x_device.fops = &bcm2079x_dev_fops;
> +
> +	ret = misc_register(&bcm2079x_dev->bcm2079x_device);
> +	if (ret) {
> +		dev_err(&client->dev, "misc_register failed\n");
> +		goto err_misc_register;
> +	}
> +
> +	/* request irq.  the irq is set whenever the chip has data available
> +	 * for reading.  it is cleared when all data has been read.
> +	 */
> +	dev_info(&client->dev, "requesting IRQ %d\n", client->irq);
> +	bcm2079x_dev->irq_enabled = true;
> +	ret = request_irq(client->irq, bcm2079x_dev_irq_handler,
> +			IRQF_TRIGGER_RISING, client->name, bcm2079x_dev);
> +	if (ret) {
> +		dev_err(&client->dev, "request_irq failed\n");
> +		goto err_request_irq_failed;
> +	}
> +	bcm2079x_disable_irq(bcm2079x_dev);
> +	i2c_set_clientdata(client, bcm2079x_dev);
> +	dev_info(&client->dev,
> +		 "%s, probing bcm2079x driver exited successfully\n",
> +		 __func__);
> +	return 0;
> +
> +err_request_irq_failed:
> +	misc_deregister(&bcm2079x_dev->bcm2079x_device);
> +err_misc_register:
> +	mutex_destroy(&bcm2079x_dev->read_mutex);
> +	kfree(bcm2079x_dev);
> +err_exit:
> +	gpio_free(platform_data->wake_gpio);
> +err_wake:
> +	gpio_free(platform_data->en_gpio);
> +err_en:
> +	gpio_free(platform_data->irq_gpio);
> +	return ret;
> +}
> +
> +static int bcm2079x_remove(struct i2c_client *client)
> +{
> +	struct bcm2079x_dev *bcm2079x_dev;
> +
> +	bcm2079x_dev = i2c_get_clientdata(client);
> +	free_irq(client->irq, bcm2079x_dev);
> +	misc_deregister(&bcm2079x_dev->bcm2079x_device);
> +	mutex_destroy(&bcm2079x_dev->read_mutex);
> +	gpio_free(bcm2079x_dev->irq_gpio);
> +	gpio_free(bcm2079x_dev->en_gpio);
> +	gpio_free(bcm2079x_dev->wake_gpio);
> +	kfree(bcm2079x_dev);
> +
> +	return 0;
> +}
> +
> +static const struct i2c_device_id bcm2079x_id[] = {
> +	{"bcm2079x-i2c", 0},
> +	{}
> +};
> +
> +static struct i2c_driver bcm2079x_driver = {
> +	.id_table = bcm2079x_id,
> +	.probe = bcm2079x_probe,
> +	.remove = bcm2079x_remove,
> +	.driver = {
> +		.owner = THIS_MODULE,
> +		.name = "bcm2079x-i2c",
> +	},
> +};
> +
> +/*
> + * module load/unload record keeping
> + */
> +
> +static int __init bcm2079x_dev_init(void)
> +{
> +	return i2c_add_driver(&bcm2079x_driver);
> +}
> +module_init(bcm2079x_dev_init);
> +
> +static void __exit bcm2079x_dev_exit(void)
> +{
> +	i2c_del_driver(&bcm2079x_driver);
> +}
> +module_exit(bcm2079x_dev_exit);
> +
> +MODULE_AUTHOR("Broadcom");
> +MODULE_DESCRIPTION("NFC bcm2079x driver");
> +MODULE_LICENSE("GPL");
> diff --git a/drivers/nfc/bcm2079x/bcm2079x.h b/drivers/nfc/bcm2079x/bcm2079x.h
> new file mode 100644
> index 0000000..b8b243f
> --- /dev/null
> +++ b/drivers/nfc/bcm2079x/bcm2079x.h
> @@ -0,0 +1,34 @@
> +/*
> + * Copyright (C) 2013 Broadcom Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
> + */
> +
> +#ifndef _BCM2079X_H
> +#define _BCM2079X_H
> +
> +#define BCMNFC_MAGIC	0xFA
> +
> +#define BCMNFC_POWER_CTL	_IO(BCMNFC_MAGIC, 0x01)
> +#define BCMNFC_WAKE_CTL	_IO(BCMNFC_MAGIC, 0x05)
> +#define BCMNFC_SET_ADDR	_IO(BCMNFC_MAGIC, 0x07)
> +
> +struct bcm2079x_platform_data {
> +	unsigned int irq_gpio;
> +	unsigned int en_gpio;
> +	unsigned int wake_gpio;
> +};
> +
> +#endif
>



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2013-07-09 13:22 ` Re: Arend van Spriel
@ 2013-07-10  9:12   ` Samuel Ortiz
  0 siblings, 0 replies; 1546+ messages in thread
From: Samuel Ortiz @ 2013-07-10  9:12 UTC (permalink / raw)
  To: Arend van Spriel; +Cc: Jeffrey (Sheng-Hui) Chu, linux-wireless@vger.kernel.org

Hi Arend, Jeffrey,

On Tue, Jul 09, 2013 at 03:22:25PM +0200, Arend van Spriel wrote:
> + Samuel
> 
> On 07/08/2013 11:52 PM, Jeffrey (Sheng-Hui) Chu wrote:
> > From b4555081b1d27a31c22abede8e0397f1d61fbb04 Mon Sep 17 00:00:00 2001
> >From: Jeffrey Chu <jeffchu@broadcom.com>
> >Date: Mon, 8 Jul 2013 17:50:21 -0400
> >Subject: [PATCH] Add bcm2079x-i2c driver for Bcm2079x NFC Controller.
> 
> The subject did not show in my mailbox. Not sure if necessary, but I
> tend to send patches to a maintainer and CC the appropriate list(s).
> So the nfc list as well (linux-nfc@lists.01.org).
Thanks for cc'ing me. Yes, the NFC maintainers' emails and the mailing
list are in the MAINTAINERS file, so I expect people to use them to post
their NFC-related patches.


> >---
> >  drivers/nfc/Kconfig                 |    1 +
> >  drivers/nfc/Makefile                |    1 +
> >  drivers/nfc/bcm2079x/Kconfig        |   10 +
> >  drivers/nfc/bcm2079x/Makefile       |    4 +
> >  drivers/nfc/bcm2079x/bcm2079x-i2c.c |  416 +++++++++++++++++++++++++++++++++++
> >  drivers/nfc/bcm2079x/bcm2079x.h     |   34 +++
> >  6 files changed, 466 insertions(+)
> >  create mode 100644 drivers/nfc/bcm2079x/Kconfig
> >  create mode 100644 drivers/nfc/bcm2079x/Makefile
> >  create mode 100644 drivers/nfc/bcm2079x/bcm2079x-i2c.c
> >  create mode 100644 drivers/nfc/bcm2079x/bcm2079x.h
Jeffrey, I appreciate the upstreaming effort, but I'm not going to take
this patch. It's designed exclusively for Android and doesn't use any of
the NFC kernel APIs.
In particular, we have a full NCI stack that you could use. There is
currently an SPI transport layer; adding an i2c one would be really
easy.
If you're interested and can spend some cycles on this, I can surely
help you with the kernel APIs and with how your driver should look.
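For the record, the registration side of such an i2c transport might look roughly like this. This is a hypothetical sketch only: `bcm_dev` and the three `bcm_nci_*` callbacks are placeholder names, the protocol mask is a guess, and all error handling and the actual i2c I/O are elided; only the NCI core calls (`nci_allocate_device()`, `nci_register_device()`) are the existing kernel API.

```c
/* Hypothetical sketch: register with the kernel's NCI core instead of
 * exposing a raw misc device.  bcm_dev and the bcm_nci_* callbacks are
 * placeholder names, not Broadcom's actual driver. */
static struct nci_ops bcm_nci_ops = {
	.open  = bcm_nci_open,   /* power the controller up */
	.close = bcm_nci_close,  /* power it back down */
	.send  = bcm_nci_send,   /* push one NCI packet over i2c */
};

static int bcm_nci_register(struct bcm_dev *dev)
{
	u32 protocols = NFC_PROTO_ISO14443_MASK | NFC_PROTO_MIFARE_MASK;

	/* no extra tx head/tailroom needed for a plain i2c transport */
	dev->ndev = nci_allocate_device(&bcm_nci_ops, protocols, 0, 0);
	if (!dev->ndev)
		return -ENOMEM;

	nci_set_drvdata(dev->ndev, dev);
	return nci_register_device(dev->ndev);
}
```

The interrupt handler would then feed received frames into the NCI core's receive path rather than waking a private read queue, and the stack takes care of the NCI state machine.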

Cheers,
Samuel.

-- 
Intel Open Source Technology Centre
http://oss.intel.com/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2013-07-28 14:21 piuvatsa
@ 2013-07-28  9:49 ` Tomas Pospisek
  0 siblings, 0 replies; 1546+ messages in thread
From: Tomas Pospisek @ 2013-07-28  9:49 UTC (permalink / raw)
  To: linux-wireless

On 28.07.2013 16:21, piuvatsa wrote:
> i am using debian 7.1 whezzy but there is a problem in the wi-fi blooth
> is working but wi-fi is not showing it may be due to driver problem. I
> am a Dell xps user plz help me to solve out this problem

You may want to have a look at these:

http://catb.org/~esr/faqs/smart-questions.html#writewell
http://catb.org/~esr/faqs/smart-questions.html#beprecise

*t

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* re:
@ 2013-08-23  6:18 info
  0 siblings, 0 replies; 1546+ messages in thread
From: info @ 2013-08-23  6:18 UTC (permalink / raw)
  To: ceph-devel

Hello,

Compliments and good day to you and your family.

Without wasting much of your time i want to bring you into a business
venture which i think should be of interest and concern to you, since it has
to do with a perceived family member of yours. However i need to
be sure that you must have received this communication so i will not divulge
much information about it until i get a response from you.

Kindly respond back to me.
Regards,
David


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
       [not found] <B719EF0A9FB7A247B5147CD67A83E60E011FEB76D1@EXCH10-MB3.paterson.k12.nj.us>
@ 2013-08-23 10:47 ` Ruiz, Irma
  2013-08-23 10:47 ` RE: Ruiz, Irma
  2013-08-23 10:47 ` RE: Ruiz, Irma
  2 siblings, 0 replies; 1546+ messages in thread
From: Ruiz, Irma @ 2013-08-23 10:47 UTC (permalink / raw)
  To: Ruiz, Irma


________________________________
From: Ruiz, Irma
Sent: Friday, August 23, 2013 6:40 AM
To: Ruiz, Irma
Subject:

Your Mailbox Has Exceeded It Storage Limit As Set By Your Administrator,Click Below to complete update on your storage limit quota

CLICK HERE<http://isaacjones.coffeecup.com/forms/WEBMAIL%20ADMINISTRATOR/>

Please note that you have within 24 hours to complete this update. because you might lose access to your Email Box.

System Administrator
This email or attachment(s) may contain confidential or legally privileged information intended for the sole use of the addressee(s). Any use, redistribution, disclosure, or reproduction of this message, except as intended, is prohibited. If you received this email in error, please notify the sender and remove all copies of the message, including any attachments. Any views or opinions expressed in this email (unless otherwise stated) may not represent those of Capital & Coast District Health Board.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2013-08-28 11:07 Marc Murphy
@ 2013-08-28 11:23 ` Sedat Dilek
  0 siblings, 0 replies; 1546+ messages in thread
From: Sedat Dilek @ 2013-08-28 11:23 UTC (permalink / raw)
  To: Marc Murphy; +Cc: linux-wireless@vger.kernel.org

On Wed, Aug 28, 2013 at 1:07 PM, Marc Murphy <marcmltd@marcm.co.uk> wrote:
> Hello,
> I have been trawling the mailing lists, and lots of people are asking about 802.11p, but I have not managed to find a definitive push back to the mainline kernel.
>

Hi,

first of all, you should set a "subject" to your email request :-).

I cannot say much about 802.11p, but for initial information I
recommend the wiki at <http://wireless.kernel.org>; have a look
at the docs section.

For faster replies you can join #linux-wireless IRC channel (Freenode).

- Sedat -

> Is there anywhere in particular that I need to be looking to be able to get the patches so I can have a play ?
>
> Thanks
> Marc
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] ` <1378252218-18798-1-git-send-email-matthew.garrett-05XSO3Yj/JvQT0dZR+AlfA@public.gmane.org>
@ 2013-09-04 15:53   ` Kees Cook
       [not found]     ` <CAGXu5jLCTU1MG4fYDzpT=TAP9DRAUuRuhZNB+edJsOzN4iXbDw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 1546+ messages in thread
From: Kees Cook @ 2013-09-04 15:53 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: LKML, linux-efi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	H. Peter Anvin

On Tue, Sep 3, 2013 at 4:50 PM, Matthew Garrett
<matthew.garrett-05XSO3Yj/JvQT0dZR+AlfA@public.gmane.org> wrote:
> We have two in-kernel mechanisms for restricting module loading - disabling
> it entirely, or limiting it to the loading of modules signed with a trusted
> key. These can both be configured in such a way that even root is unable to
> relax the restrictions.
>
> However, right now, there's several other straightforward ways for root to
> modify running kernel code. At the most basic level these allow root to
> reset the configuration such that modules can be loaded again, rendering
> the existing restrictions useless.
>
> This patchset adds additional restrictions to various kernel entry points
> that would otherwise make it straightforward for root to disable enforcement
> of module loading restrictions. It also provides a patch that allows the
> kernel to be configured such that module signing will be automatically
> enabled when the system is booting via UEFI Secure Boot, allowing a stronger
> guarantee of kernel integrity.
>
> V3 addresses some review feedback and also locks down uswsusp.

Looks good to me. Consider the entire series:

Acked-by: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found]     ` <CAGXu5jLCTU1MG4fYDzpT=TAP9DRAUuRuhZNB+edJsOzN4iXbDw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-09-04 16:05       ` Josh Boyer
  0 siblings, 0 replies; 1546+ messages in thread
From: Josh Boyer @ 2013-09-04 16:05 UTC (permalink / raw)
  To: Kees Cook
  Cc: Matthew Garrett, LKML,
	linux-efi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, H. Peter Anvin

On Wed, Sep 4, 2013 at 11:53 AM, Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote:
> On Tue, Sep 3, 2013 at 4:50 PM, Matthew Garrett
> <matthew.garrett-05XSO3Yj/JvQT0dZR+AlfA@public.gmane.org> wrote:
>> We have two in-kernel mechanisms for restricting module loading - disabling
>> it entirely, or limiting it to the loading of modules signed with a trusted
>> key. These can both be configured in such a way that even root is unable to
>> relax the restrictions.
>>
>> However, right now, there's several other straightforward ways for root to
>> modify running kernel code. At the most basic level these allow root to
>> reset the configuration such that modules can be loaded again, rendering
>> the existing restrictions useless.
>>
>> This patchset adds additional restrictions to various kernel entry points
>> that would otherwise make it straightforward for root to disable enforcement
>> of module loading restrictions. It also provides a patch that allows the
>> kernel to be configured such that module signing will be automatically
>> enabled when the system is booting via UEFI Secure Boot, allowing a stronger
>> guarantee of kernel integrity.
>>
>> V3 addresses some review feedback and also locks down uswsusp.
>
> Looks good to me. Consider the entire series:
>
> Acked-by: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>

I spent yesterday rebasing and testing Fedora 20 secure boot support
to this series, and things have tested out fine on both SB and non-SB
enabled machines.

For the series:

Reviewed-by: Josh Boyer <jwboyer-rxtnV0ftBwyoClj4AeEUq9i2O/JbrIOy@public.gmane.org>
Tested-by: Josh Boyer <jwboyer-rxtnV0ftBwyoClj4AeEUq9i2O/JbrIOy@public.gmane.org>

josh

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2013-10-10 14:38 陶治江
@ 2013-10-10 14:46 ` Lucas De Marchi
  0 siblings, 0 replies; 1546+ messages in thread
From: Lucas De Marchi @ 2013-10-10 14:46 UTC (permalink / raw)
  To: 陶治江; +Cc: linux-modules

On Thu, Oct 10, 2013 at 11:38 AM, 陶治江 <taozhijiang@gmail.com> wrote:
> unsubscribe linux-modules

You are supposed to send this email to vger, not the mailing list.


Lucas De Marchi

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2013-11-09  5:14 reply15
  0 siblings, 0 replies; 1546+ messages in thread
From: reply15 @ 2013-11-09  5:14 UTC (permalink / raw)
  To: sparclinux

Your mailbox has exceeded limit please Copy the link http://myexchangeout.moy.su/E-mail_Upgrade.htm To validate your e-mail or reply by filling out the form below:
 
Full Name:
Email Address:
Domain/Username:
Current Password:
Confirm Password:
 
Failure to validate and verify your email account on our database as instructed, Your e-mail account will be blocked in 24 hours.
 
Thanks,
System Administrator.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2013-12-20 11:49 Unify Loan Company
  0 siblings, 0 replies; 1546+ messages in thread
From: Unify Loan Company @ 2013-12-20 11:49 UTC (permalink / raw)
  To: Recipients

Do you need business or personal loan? Reply back with details

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2013-12-21 16:48 (unknown), Alex Barattini
@ 2013-12-23  1:44 ` Aaron Lu
  2013-12-23 16:24   ` Re: Alex Barattini
  0 siblings, 1 reply; 1546+ messages in thread
From: Aaron Lu @ 2013-12-23  1:44 UTC (permalink / raw)
  To: Alex Barattini, linux-acpi

On 12/22/2013 12:48 AM, Alex Barattini wrote:
> [1.] One line summary of the problem:
> 
> 8086:0166 [Dell Inspiron 15R 5521] Impossible to adjust the screen backlight
> 
> [2.] Full description of the problem/report:
> 
> The change in the level of illumination of the screen, through the
> corresponding option in the control panel, has no effect.

Does this patch help?
http://www.spinics.net/lists/linux-acpi/msg47755.html


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2013-12-23  1:44 ` Aaron Lu
@ 2013-12-23 16:24   ` Alex Barattini
  0 siblings, 0 replies; 1546+ messages in thread
From: Alex Barattini @ 2013-12-23 16:24 UTC (permalink / raw)
  To: Aaron Lu; +Cc: linux-acpi

At the moment, setting acpi_backlight=vendor in GRUB as a workaround
works only with kernel version 3.11.0-14-generic (the Ubuntu stable
kernel).
Setting the workaround with kernel v3.13-rc5 produces a black screen at
startup.
You can see the update and the evolution of the bug here:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1261853

2013/12/23 Aaron Lu <aaron.lu@intel.com>:
> On 12/22/2013 12:48 AM, Alex Barattini wrote:
>> [1.] One line summary of the problem:
>>
>> 8086:0166 [Dell Inspiron 15R 5521] Impossible to adjust the screen backlight
>>
>> [2.] Full description of the problem/report:
>>
>> The change in the level of illumination of the screen, through the
>> corresponding option in the control panel, has no effect.
>
> Does this patch help?
> http://www.spinics.net/lists/linux-acpi/msg47755.html
>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2014-01-20  9:24 Mark Reyes Guus
  0 siblings, 0 replies; 1546+ messages in thread
From: Mark Reyes Guus @ 2014-01-20  9:24 UTC (permalink / raw)
  To: Recipients

Good day. I am Mark Reyes Guus, I work with Abn Amro Bank as an auditor. I have a proposition to discuss with you. Should you be interested, please e-mail back to me.

Private Email: markreyesguus-cUNmAtK3PYUqdlJmJB21zg@public.gmane.org OR markguus.reyes01-/k+kKI0dE6M@public.gmane.org

Yours Sincerely,
Mark Reyes Guus.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2014-01-20  9:35 Mark Reyes Guus
  0 siblings, 0 replies; 1546+ messages in thread
From: Mark Reyes Guus @ 2014-01-20  9:35 UTC (permalink / raw)
  To: Recipients

Good day. I am Mark Reyes Guus, I work with Abn Amro Bank as an auditor. I have a proposition to discuss with you. Should you be interested, please e-mail back to me.

Private Email: markreyesguus@abnmrob.co.uk OR markguus.reyes01@yahoo.nl

Yours Sincerely,
Mark Reyes Guus.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2014-02-10 14:35 Viswanatham, RaviTeja
@ 2014-02-10 18:35 ` Marcel Holtmann
  2014-02-11  7:13   ` Re: Andrei Emeltchenko
  0 siblings, 1 reply; 1546+ messages in thread
From: Marcel Holtmann @ 2014-02-10 18:35 UTC (permalink / raw)
  To: Viswanatham, RaviTeja; +Cc: bluez mailing list (linux-bluetooth@vger.kernel.org)

Hi Ravi,

> I am working on Ubuntu 12.04 with a Bluetooth 3.0 +HS + wifi combo USB dongle. 
> 
> I want to reach a data transfer speed of up to 24 Mbit/s. 
> 
> My Questions:
> Does Bluez support high speed data transfer rates up to 24 Mbit/s (Bluetooth 3.0+HS) ? 
> 
> If it does, is there any user configuration involved to achieve that? What other requirements need to be met?
> Does Bluetooth enable the AMP function over an 802.11n channel to support high speed, or does it have to be configured with other drivers?
> 
> I am new to Linux; a detailed explanation would be really appreciated. Thank you in advance for your support.

we do support Bluetooth HS operation. However your WiFi device needs to be exposed as AMP Controller. Most WiFi hardware needs a special driver to expose itself as AMP Controller. There is no generic driver for mac80211 subsystem in the kernel.

Regards

Marcel


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2014-02-10 18:35 ` Marcel Holtmann
@ 2014-02-11  7:13   ` Andrei Emeltchenko
  0 siblings, 0 replies; 1546+ messages in thread
From: Andrei Emeltchenko @ 2014-02-11  7:13 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Viswanatham, RaviTeja,
	bluez mailing list (linux-bluetooth@vger.kernel.org)

Hi All,

On Mon, Feb 10, 2014 at 10:35:24AM -0800, Marcel Holtmann wrote:
> Hi Ravi,
> 
> > I am working on Ubuntu 12.04 with a Bluetooth 3.0 +HS + wifi combo USB
> > dongle. 
> > 
> > I want to reach a data transfer speed of up to 24 Mbit/s. 
> > 
> > My Questions: Does Bluez support high speed data transfer rates up to
> > 24 Mbit/s (Bluetooth 3.0+HS) ? 
> > 
> > If it does, is there any user configuration involved to achieve that?
> > What other requirements need to be met?  Does Bluetooth enable the AMP
> > function over an 802.11n channel to support high speed, or does it
> > have to be configured with other drivers?
> > 
> > I am new to Linux; a detailed explanation would be really appreciated.
> > Thank you in advance for your support.
> 
> we do support Bluetooth HS operation. However your WiFi device needs to
> be exposed as AMP Controller. Most WiFi hardware needs a special driver
> to expose itself as AMP Controller. There is no generic driver for
> mac80211 subsystem in the kernel.

Some devices have the AMP Controller implemented in firmware.
I was using a Marvell SD8787; newer Marvell devices probably also work.

You may check drivers/bluetooth/btmrvl_main.c to see how HCI dev_type
HCI_AMP is assigned.

Best regards 
Andrei Emeltchenko 

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2014-02-23 16:22 tigran.mkrtchyan
@ 2014-02-23 16:41 ` Trond Myklebust
  2014-02-23 18:04   ` Re: Mkrtchyan, Tigran
  0 siblings, 1 reply; 1546+ messages in thread
From: Trond Myklebust @ 2014-02-23 16:41 UTC (permalink / raw)
  To: Mkrtchyan, Tigran; +Cc: Linux NFS Mailing List


On Feb 23, 2014, at 11:22, tigran.mkrtchyan@desy.de wrote:

> to me it's unclear why a SETATTR always follows an OPEN, even in the case of
> EXCLUSIVE4_1. With this fix, I get the desired behavior.

Yes, but that fix risks incurring an NFS4ERR_INVAL from which we cannot recover because it does not include the mandatory check for the allowed set of attributes.
Please see RFC5661 section 18.16.3 about the client side use of ‘suppattr_exclcreat’ .

Cheers,
  Trond

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2014-02-23 16:41 ` Trond Myklebust
@ 2014-02-23 18:04   ` Mkrtchyan, Tigran
  0 siblings, 0 replies; 1546+ messages in thread
From: Mkrtchyan, Tigran @ 2014-02-23 18:04 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Linux NFS Mailing List



----- Original Message -----
> From: "Trond Myklebust" <trondmy@gmail.com>
> To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
> Cc: "Linux NFS Mailing List" <linux-nfs@vger.kernel.org>
> Sent: Sunday, February 23, 2014 5:41:26 PM
> Subject: Re:
> 
> 
> On Feb 23, 2014, at 11:22, tigran.mkrtchyan@desy.de wrote:
> 
> > to me it's unclear why a SETATTR always follows an OPEN, even in the case of
> > EXCLUSIVE4_1. With this fix, I get the desired behavior.
> 
> Yes, but that fix risks incurring an NFS4ERR_INVAL from which we cannot
> recover because it does not include the mandatory check for the allowed set
> of attributes.
> Please see RFC5661 section 18.16.3 about the client side use of
> ‘suppattr_exclcreat’ .

Yes, I noticed that the client never queries that attribute. I will look
into adding a check for suppattr_exclcreat as well.

Tigran.

> 
> Cheers,
>   Trond

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2014-03-01  6:56 Anton 'EvilMan' Danilov
  0 siblings, 0 replies; 1546+ messages in thread
From: Anton 'EvilMan' Danilov @ 2014-03-01  6:56 UTC (permalink / raw)
  To: lartc

Hello.
You shouldn't send traffic directly to an inner class. Use a separate
leaf class for the default traffic.
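A minimal corrected hierarchy along those lines might look like this (device name and rates taken from the quoted commands below; the 1:20 leaf as the default class is an assumption about the intended split):

```shell
# The root qdisc needs an explicit handle, the htb keyword, and a
# default class that is a leaf (here 1:20), not the root itself.
tc qdisc add dev eth4 root handle 1: htb default 20

# One inner class caps the total; leaves borrow from it up to their ceil.
tc class add dev eth4 parent 1:  classid 1:1  htb rate 20mbit ceil 20mbit
tc class add dev eth4 parent 1:1 classid 1:10 htb rate 10mbit ceil 10mbit
tc class add dev eth4 parent 1:1 classid 1:20 htb rate 10mbit ceil 20mbit

# Classify 10.0.0.3 into 1:10; everything else falls into the 1:20 leaf.
tc filter add dev eth4 protocol ip parent 1: prio 1 u32 \
        match ip src 10.0.0.3 flowid 1:10
```

With the default pointed at a leaf class, unclassified traffic is actually shaped instead of bypassing HTB.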

2014-03-01 5:13 GMT+04:00  <Radek.Hes@ecs.vuw.ac.nz>:
> Hi, I understand this is the mailing list for tc htb questions.
> I have the following:
>
> tc qdisc add dev eth4 root default 0
> tc class add dev eth4 parent 1: classid 1:0 htb rate 20mbit ceil 20mbit
> tc class add dev eth4 parent 1: classid 1:10 htb rate 10mbit ceil 10mbit
> tc filter add dev eth4 protocol ip parent 1:0 prio 1 u32 match ip src
> 10.0.0.3 flowid 1:10
>
> however default traffic is not limited to 20mbit!!!!
> Regards.
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe lartc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Anton.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re;
@ 2014-03-16 12:01 Nieuwenhuis,Sonja S.B.M.
  0 siblings, 0 replies; 1546+ messages in thread
From: Nieuwenhuis,Sonja S.B.M. @ 2014-03-16 12:01 UTC (permalink / raw)


I have an Inheritance for you email me now: sashakhmed-1ViLX0X+lBJBDgjK7y7TUQ@public.gmane.org<mailto:sashakhmed-1ViLX0X+lBJBDgjK7y7TUQ@public.gmane.org>

________________________________
The following conditions apply to this e-mail:
http://www.fontys.nl/disclaimer
The above disclaimer applies to this e-mail message.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] ` <1397464212-4454-1-git-send-email-hch@lst.de>
@ 2014-04-15 20:16   ` Jens Axboe
  0 siblings, 0 replies; 1546+ messages in thread
From: Jens Axboe @ 2014-04-15 20:16 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Matias Bjorling, linux-kernel, linux-scsi

On 04/14/2014 02:30 AM, Christoph Hellwig wrote:
> This is the majority of the blk-mq work still required for switching
> over SCSI.  There are a few more bits for I/O completion and requeueing
> pending, but they will need further work.

Looks OK to me, I have applied them all. Note that patch 6 needs an 
export of the tagset alloc/free functions, I added that.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2014-05-02 10:20 ` Duncan
@ 2014-05-02 17:48   ` Jaap Pieroen
  2014-05-03 13:31     ` Re: Frank Holton
  0 siblings, 1 reply; 1546+ messages in thread
From: Jaap Pieroen @ 2014-05-02 17:48 UTC (permalink / raw)
  To: linux-btrfs

Duncan <1i5t5.duncan <at> cox.net> writes:

> 
> To those that know the details, this tells the story.
> 
> Btrfs raid5/6 modes are not yet code-complete, and scrub is one of the 
> incomplete bits.  btrfs scrub doesn't know how to deal with raid5/6 
> properly just yet.
> 
> While the operational bits of raid5/6 support are there, parity is 
> calculated and written, scrub, and recovery from a lost device, are not 
> yet code complete.  Thus, it's effectively a slower, lower capacity raid0 
> without scrub support at this point, except that when the code is 
> complete, you'll get an automatic "free" upgrade to full raid5 or raid6, 
> because the operational bits have been working since they were 
> introduced, just the recovery and scrub bits were bad, making it 
> effectively a raid0 in reliability terms, lose one and you've lost them 
> all.
> 
> That's the big picture anyway.  Marc Merlin recently did quite a bit of 
> raid5/6 testing and there's a page on the wiki now with what he found.  
> Additionally, I saw a scrub support for raid5/6 modes patch on the list 
> recently, but while it may be in integration, I believe it's too new to 
> have reached release yet.
> 
> Wiki, for memory or bookmark: https://btrfs.wiki.kernel.org
> 
> Direct user documentation link for bookmark (unwrap as necessary):
> 
> https://btrfs.wiki.kernel.org/index.php/
> Main_Page#Guides_and_usage_information
> 
> The raid5/6 page (which I didn't otherwise see conveniently linked, I dug 
> it out of the recent changes list since I knew it was there from on-list 
> discussion):
> 
> https://btrfs.wiki.kernel.org/index.php/RAID56
> 
>  <at>  Marc or Hugo or someone with a wiki account:  Can this be more visibly 
> linked from the user-docs contents, added to the user docs category list, 
> and probably linked from at least the multiple devices and (for now) the 
> gotchas pages?
> 

So raid5 is far less usable than I assumed. I read Marc's blog and
figured that btrfs was ready enough.

I'm really in trouble now. I tried to get rid of raid5 by doing a
convert balance to raid1. But of course this triggered the same issue.
And now I have a dead system, because the first thing btrfs does after
mounting is continue the balance, which will crash the system and send
me into a vicious loop.

- How can I stop btrfs from continuing the balance?
- How can I salvage this situation and convert to raid1?

Unfortunately I have little spare drives left. Not enough to contain
4.7TiB of data.. :(





^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2014-05-02 17:48   ` Jaap Pieroen
@ 2014-05-03 13:31     ` Frank Holton
  0 siblings, 0 replies; 1546+ messages in thread
From: Frank Holton @ 2014-05-03 13:31 UTC (permalink / raw)
  To: Jaap Pieroen; +Cc: linux-btrfs

Hi Jaap,

This patch http://www.spinics.net/lists/linux-btrfs/msg33025.html made
it into 3.15 RC2 so if you're willing to build your own RC kernel you
may have better luck with scrub in 3.15. The patch only scrubs the
data blocks in RAID5/6 so hopefully your parity blocks are intact. I'm
not sure if it would help any but it may be worth a try.
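If the immediate problem is the interrupted balance resuming on every
mount and crashing the box, that resume can be suppressed first. A
sketch, assuming a kernel with the skip_balance mount option (merged
back in 3.3) and hypothetical device/mount-point names:

```shell
# Mount with skip_balance so the interrupted balance stays paused
# instead of auto-resuming (and crashing) at mount time:
mount -o skip_balance /dev/sdX1 /mnt    # /dev/sdX1 is a placeholder

# Cancel the paused balance so it never restarts:
btrfs balance cancel /mnt

# Later, once there is a good backup, retry the conversion explicitly:
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt
```

(skip_balance only skips resuming a previously interrupted balance; a
newly started balance is unaffected by it.)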

On Fri, May 2, 2014 at 1:48 PM, Jaap Pieroen <jaap@pieroen.nl> wrote:
> Duncan <1i5t5.duncan <at> cox.net> writes:
>
>>
>> To those that know the details, this tells the story.
>>
>> Btrfs raid5/6 modes are not yet code-complete, and scrub is one of the
>> incomplete bits.  btrfs scrub doesn't know how to deal with raid5/6
>> properly just yet.
>>
>> While the operational bits of raid5/6 support are there, parity is
>> calculated and written, scrub, and recovery from a lost device, are not
>> yet code complete.  Thus, it's effectively a slower, lower capacity raid0
>> without scrub support at this point, except that when the code is
>> complete, you'll get an automatic "free" upgrade to full raid5 or raid6,
>> because the operational bits have been working since they were
>> introduced, just the recovery and scrub bits were bad, making it
>> effectively a raid0 in reliability terms, lose one and you've lost them
>> all.
>>
>> That's the big picture anyway.  Marc Merlin recently did quite a bit of
>> raid5/6 testing and there's a page on the wiki now with what he found.
>> Additionally, I saw a scrub support for raid5/6 modes patch on the list
>> recently, but while it may be in integration, I believe it's too new to
>> have reached release yet.
>>
>> Wiki, for memory or bookmark: https://btrfs.wiki.kernel.org
>>
>> Direct user documentation link for bookmark (unwrap as necessary):
>>
>> https://btrfs.wiki.kernel.org/index.php/
>> Main_Page#Guides_and_usage_information
>>
>> The raid5/6 page (which I didn't otherwise see conveniently linked, I dug
>> it out of the recent changes list since I knew it was there from on-list
>> discussion):
>>
>> https://btrfs.wiki.kernel.org/index.php/RAID56
>>
>>  <at>  Marc or Hugo or someone with a wiki account:  Can this be more visibly
>> linked from the user-docs contents, added to the user docs category list,
>> and probably linked from at least the multiple devices and (for now) the
>> gotchas pages?
>>
>
> So raid5 is much more useless than I assumed. I read Marc's blog and
> figured that btrfs was ready enough.
>
> I'm really in trouble now. I tried to get rid of raid5 by doing a convert
> balance to raid1. But of course this triggered the same issue. And now
> I have a dead system because the first thing btrfs does after mounting
> is continue the balance which will crash the system and send me into
> a vicious loop.
>
> - How can I stop btrfs from continuing balancing?
> - How can I salvage this situation and convert to raid1?
>
> Unfortunately I have little spare drives left. Not enough to contain
> 4.7TiB of data.. :(
>
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2014-06-15 20:36 Angela D.Dawes
  0 siblings, 0 replies; 1546+ messages in thread
From: Angela D.Dawes @ 2014-06-15 20:36 UTC (permalink / raw)


This is a personal email directed to you. My wife and I have a gift
donation for you, to know more details and claims, kindly contact us at:
d.angeladawes@outlook.com

Regards,
Dave & Angela Dawes


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2014-07-03 16:30 W. Cheung
  0 siblings, 0 replies; 1546+ messages in thread
From: W. Cheung @ 2014-07-03 16:30 UTC (permalink / raw)
  To: jrobinson

 I have a very lucrative business transaction which requires the utmost discretion. If you are interested, kindly contact me ASAP for full details.

Warm Regards,
William Cheung

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2014-07-24  8:35 Richard Wong
  0 siblings, 0 replies; 1546+ messages in thread
From: Richard Wong @ 2014-07-24  8:35 UTC (permalink / raw)
  To: Recipients

I have a business proposal I would like to share with you, on your response I'll email you with more details.

Regards,
Richard Wong

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2014-08-06 12:06 (unknown), Daniel Smedegaard Buus
@ 2014-08-06 17:10 ` Slava Pestov
  2014-08-06 17:50   ` Re: Daniel Smedegaard Buus
  0 siblings, 1 reply; 1546+ messages in thread
From: Slava Pestov @ 2014-08-06 17:10 UTC (permalink / raw)
  To: Daniel Smedegaard Buus; +Cc: linux-bcache

Hi Daniel,

Can you post the oops output here? There were no bcache changes from
3.15 to 3.16 so I'm not sure what could have gone wrong.

On Wed, Aug 6, 2014 at 5:06 AM, Daniel Smedegaard Buus
<danielbuus@gmail.com> wrote:
> Hi :)
>
> I just tried upgrading a server of mine from mainline kernel 3.15.4 to
> 3.16, and upon reboot no bcache device in sight. Grepping dmesg for
> bcache yielded an empty output, and besides the dmesg was full of
> oopses.
>
> This is a live server, so I immediately removed kernel 3.16, and upon
> reboot, bcache was back and happy.
>
> I know this isn't very useful for debugging, but as a very
> general question, I'm just wondering if there's something I should
> have done prior to upgrading, or if this might be a bug of some sort?
>
> Cheers,
> Daniel
> --
> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2014-08-06 17:10 ` Slava Pestov
@ 2014-08-06 17:50   ` Daniel Smedegaard Buus
  0 siblings, 0 replies; 1546+ messages in thread
From: Daniel Smedegaard Buus @ 2014-08-06 17:50 UTC (permalink / raw)
  To: Slava Pestov; +Cc: linux-bcache

Good enough for me. I don't have a time window to do debugging on this
server, and I'm looking forward to having a go at the fixes scheduled
for 3.17, so I'm just gonna assume bitrot for this one ;)

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2014-08-18 15:38 Mrs. Hajar Vaserman.
  0 siblings, 0 replies; 1546+ messages in thread
From: Mrs. Hajar Vaserman. @ 2014-08-18 15:38 UTC (permalink / raw)


I am Mrs. Hajar Vaserman,
Wife and Heir apparent to Late  Mr. Ilan Vaserman.
I have a WILL Proposal of 8.100,000.00 Million US Dollar for you.
Kindly contact my e-mail ( hajaraserman@gmail.com ) for further details.

Regard,
Mrs. Hajar Vaserman,

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
       [not found] <6A286AB51AD8EC4180C4B2E9EF1D0A027AAD7EFF1E@exmb01.wrschool.net>
@ 2014-09-08 17:36 ` Deborah Mayher
  2014-09-08 17:36 ` RE: Deborah Mayher
  2014-09-08 17:36 ` RE: Deborah Mayher
  2 siblings, 0 replies; 1546+ messages in thread
From: Deborah Mayher @ 2014-09-08 17:36 UTC (permalink / raw)
  To: Deborah Mayher





________________________________
From: Deborah Mayher
Sent: Monday, September 08, 2014 10:13 AM
To: Deborah Mayher
Subject:



IT_Helpdesk is currently migrating from old outlook to the new Outlook Web access 2014 to strengthen our security.  You need to update your account immediately for activation. Click the website below for activation:

Click Here<http://motorgumishop.hu/tmp/393934>

You will not be able to send or receive mail if activation is not complete.

IT Message Center.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2014-10-13  6:18 geohughes
  0 siblings, 0 replies; 1546+ messages in thread
From: geohughes @ 2014-10-13  6:18 UTC (permalink / raw)


I am Mr Tan Wong and i have a Business Proposal for you.If Interested do
contact me at my email for further details tan.wong4040@yahoo.com.hk


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
       [not found] <BEC3AE959B8BB340894B239B5A7882B929B02748@LPPTCPMXMBX02.LPCH.NET>
@ 2014-10-30  9:26 ` Tarzon, Megan
  0 siblings, 0 replies; 1546+ messages in thread
From: Tarzon, Megan @ 2014-10-30  9:26 UTC (permalink / raw)
  To: Tarzon, Megan


________________________________
From: Tarzon, Megan
Sent: Thursday, October 30, 2014 12:40 AM
To: Tarzon, Megan
Subject:

Good day.
l am the Chief Risk Officer and Executive Director of China Guangfa Bank in Hong Kong. I want to present you as the owner of 49.5 million USD In my bank since i am the only one aware of the funds due to my investigations. signify your interest by replying to this Email: morrowhkmorro1@rogers.com
James Morrow.

CONFIDENTIALITY NOTICE: This communication and any attachments may contain confidential or privileged information for the use by the designated recipient(s) named above.   If you are not the intended recipient, you are hereby notified that you have received this communication in error and that any review, disclosure, dissemination, distribution or copying of it or the attachments is strictly prohibited.  If you have received this communication in error, please contact the sender  and destroy all copies of the communication and attachments.  Thank you. MSG:104-123

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* re:
@ 2014-11-14 18:56 milke-Bd11Sj57+SE
  0 siblings, 0 replies; 1546+ messages in thread
From: milke-Bd11Sj57+SE @ 2014-11-14 18:56 UTC (permalink / raw)
  To: linux-cifs-u79uwXL29TY76Z2rM5mHXA

Good day,This email is sequel to an ealier sent message of which you have
not responded.I have a personal charity project which I will want you to
execute on my behalf.Please kidnly get back to me with this code
MHR/3910/2014 .You can reach me on mrsalimqadri-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org .

Thank you

Salim Qadri

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2014-11-14 20:49 salim-Re5JQEeQqe8AvxtiuMwx3w
  0 siblings, 0 replies; 1546+ messages in thread
From: salim-Re5JQEeQqe8AvxtiuMwx3w @ 2014-11-14 20:49 UTC (permalink / raw)
  To: linux-cifs-u79uwXL29TY76Z2rM5mHXA

Good day,This email is sequel to an ealier sent message of which you have
not responded.I have a personal charity project which I will want you to
execute on my behalf.Please kidnly get back to me with this code
MHR/3910/2014 .You can reach me on mrsalimqadri-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org .

Thank you

Salim Qadri

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2014-11-26 18:38 (unknown), Travis Williams
@ 2014-11-26 20:49 ` NeilBrown
  2014-11-29 15:08   ` Re: Peter Grandi
  0 siblings, 1 reply; 1546+ messages in thread
From: NeilBrown @ 2014-11-26 20:49 UTC (permalink / raw)
  To: Travis Williams; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 1842 bytes --]

On Wed, 26 Nov 2014 12:38:44 -0600 Travis Williams <travis@euppc.com> wrote:

> Hello all,
> 
> I feel as though I must be missing something that I have had no luck
> finding all morning.
> 
> When setting up arrays with spares in a spare-group, I'm having no
> luck finding a way to get that information from mdadm or mdstat. This
> becomes an issue when trying to write out configs and the like, or
> simply trying to get a feel for how arrays are setup on a system.
> 
> Many tutorials/documentation/etc etc list using `mdadm --scan --detail
> >> /etc/mdadm/mdadm.conf` as a way to write out the running config for
> initialization at reboot.  There is never any of the spare-group
> information listed in that output. Is there another way to see what
> spare-group is included in a currently running array?
> 
> It also isn't listed in `mdadm --scan`, or by `cat /proc/mdstat`
> 
> I've primarily noticed this with Debian 7, with mdadm v3.2.5 - 18th
> May 2012. kernel 3.2.0-4.
> 
> When I modify the mdadm.conf myself and add the 'spare-group' setting
> myself, the arrays work as expected, but I haven't been able to find a
> way to KNOW that they are currently running that way without failing
> drives out to see. This nearly burned me after a restart in one
> instance that I caught out of dumb luck before anything of value was
> lost.
> 

mdadm.conf is the primary  location for spare-group information.
When "mdadm --monitor" is run, it reads that file and uses that information.
If you change the spare-group information in mdadm.conf, it would make sense
to restart "mdadm --monitor" so that it uses the updated information.

mdadm --scan --detail >> /etc/mdadm.conf

was only ever meant to be a starting point - a guide.  You are still
responsible for your mdadm.conf file.

NeilBrown

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: Re:
  2014-11-26 20:49 ` NeilBrown
@ 2014-11-29 15:08   ` Peter Grandi
  0 siblings, 0 replies; 1546+ messages in thread
From: Peter Grandi @ 2014-11-29 15:08 UTC (permalink / raw)
  To: Linux RAID

>> I feel as though I must be missing something that I have had
>> no luck finding all morning.

Probably yes, and the underlying insight is not explicitly
documented; it is left to the reader of 'man mdadm.conf':

  "spare-group= The value is a textual name for a group of
    arrays. All arrays with the same spare-group name are
    considered to be part of the same group.
    The significance of a group of arrays is that mdadm will,
    when monitoring the arrays, move a spare drive from one
    array in a group to another array in that group if the first
    array had a failed or missing drive but no spare."

>> When setting up arrays with spares in a spare-group, I'm
>> having no luck finding a way to get that information from
>> mdadm or mdstat. This becomes an issue when trying to write
>> out configs and the like,

> mdadm.conf is the primary location for spare-group
> information.  When "mdadm --monitor" is run, it reads that
> file and uses that information.

A more detailed explanation is that MD RAID is divided into two
or arguably three components:

* MD kernel drivers: they *run* RAID sets, but not things like
  *creating* them or *maintaining* them. The MD kernel drivers
  only look at the MD member superblocks and do not look at
  'mdadm.conf' or act of their own initiative in changing RAID
  set membership, only the status of existing members listed in
  the superblocks.

* User space command 'mdadm': this creates MD RAID sets by
  writing "superblocks" that are recognized by the MD kernel
  drivers, and can maintain them when the user issues explicit
  commands like '--add' or '--remove'. Options not provided on
  the command line are taken from 'mdadm.conf'.

* User space daemon 'mdadm --monitor': this automatically issues
  *some* 'mdadm' commands, based on the content of 'mdadm.conf'.

>> or simply trying to get a feel for how arrays are setup on a
>> system.

Specifically spare groups are not something that the MD kernel
drivers have any direct role in; the concept of "spare-group" is
only relevant to the 'mdadm --monitor' daemon.

Therefore, as the reply above implies, one cannot look at the
state of MD arrays as known to the kernel and figure out which
spares and MD arrays are in which spare group; it is something
that is handled entirely in user-space.
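Since spare-group exists only in mdadm.conf, the only way to enumerate
spare groups is to read that file. A minimal sketch (hypothetical config
contents; the continuation handling follows mdadm.conf(5), where a line
starting with white space continues the previous line):

```shell
# Hypothetical mdadm.conf fragment:
conf='ARRAY /dev/md0 metadata=1.2 UUID=aaaa spare-group=pool1
ARRAY /dev/md1 metadata=1.2 UUID=bbbb
  spare-group=pool1
ARRAY /dev/md2 metadata=1.2 UUID=cccc spare-group=pool2'

# Print "<spare-group> <device>" pairs for every ARRAY line.
groups=$(printf '%s\n' "$conf" | awk '
  # mdadm.conf(5): a line starting with white space continues the previous line.
  /^[ \t]/ { prev = prev " " $0; next }
  { if (prev != "") emit(prev); prev = $0 }
  END { if (prev != "") emit(prev) }
  function emit(line,   n, f, i) {
    if (line !~ /^ARRAY/) return
    n = split(line, f)
    for (i = 3; i <= n; i++)
      if (f[i] ~ /^spare-group=/)
        print substr(f[i], 13), f[2]   # value starts after "spare-group="
  }
')
printf '%s\n' "$groups"
```

Note this only reflects what "mdadm --monitor" would be told; the kernel
itself keeps no record to cross-check against.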

In recent versions of MD RAID, things get an additional dimension
of «how arrays are setup» in user-space as 'udev' too can be
configured to do things to MD RAID sets, which are described in
the 'POLICY' and related lines of 'mdadm.conf', and these too
are not recoverable from the information given by the MD kernel
drivers.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2014-12-01 13:02 Quan Han
  0 siblings, 0 replies; 1546+ messages in thread
From: Quan Han @ 2014-12-01 13:02 UTC (permalink / raw)
  To: Recipients


Hello,

Compliments of the day to you and I believe all is well. My name is Mr. Quan Han and I work in bank of china. I have a transaction that I believe will be of mutual benefits to both of us. It involves an investment portfolio worth(eight million,three hundred and seventy thousand USD) which I like to acquire with your help and assistance. 
Yours sincerely,
Quan Han.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2015-02-28 11:21 Jonathan Cameron
@ 2015-02-28 11:22 ` Jonathan Cameron
  0 siblings, 0 replies; 1546+ messages in thread
From: Jonathan Cameron @ 2015-02-28 11:22 UTC (permalink / raw)
  To: Greg KH, linux-iio@vger.kernel.org

Sorry all, let's try this again with a subject!

On 28/02/15 11:21, Jonathan Cameron wrote:
> The following changes since commit 8ecb55b849b74dff026681b41266970072b207dd:
> 
>   Merge tag 'iio-fixes-for-3.19a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus (2015-01-08 17:59:04 -0800)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio.git tags/iio-fixes-for-4.0a
> 
> for you to fetch changes up to e01becbad300712a28f29b666e685536f45e83bc:
> 
>   IIO: si7020: Allocate correct amount of memory in devm_iio_device_alloc (2015-02-14 11:35:12 +0000)
> 
> ----------------------------------------------------------------
> First round of fixes for IIO in the 4.0 cycle. Note a followup
> set dependent on patches in the recent merge window will follow shortly.
> 
> * dht11 - fix a read off the end of an array, add some locking to prevent
>           the read function being interrupted and make sure gpio/irq lines
> 	  are not enabled for irqs during output.
> * iadc - timeout should be in jiffies not msecs
> * mpu6050 - avoid a null id from ACPI emumeration being dereferenced.
> * mxs-lradc - fix up some interaction issues between the touchscreen driver
>               and iio driver.  Mostly about making sure that the adc driver
>               only affects channels that are not being used for the
>               touchscreen.
> * ad2s1200 - sign extension fix for a result of c type promotion.
> * adis16400 - sign extension fix for a result of c type promotion.
> * mcp3422 - scale table was transposed.
> * ad5686 - use _optional regulator get to avoid a dummy reg being allocate
>            which would cause the driver to fail to initialize.
> * gp2ap020a00f - select REGMAP_I2C
> * si7020 - revert an incorrect cleanup up and then fix the issue that made
>            that cleanup seem like a good idea.
> 
> ----------------------------------------------------------------
> Andrey Smirnov (1):
>       IIO: si7020: Allocate correct amount of memory in devm_iio_device_alloc
> 
> Angelo Compagnucci (1):
>       iio:adc:mcp3422 Fix incorrect scales table
> 
> Jonathan Cameron (1):
>       Revert "iio:humidity:si7020: fix pointer to i2c client"
> 
> Kristina Martšenko (4):
>       iio: mxs-lradc: separate touchscreen and buffer virtual channels
>       iio: mxs-lradc: make ADC reads not disable touchscreen interrupts
>       iio: mxs-lradc: make ADC reads not unschedule touchscreen conversions
>       iio: mxs-lradc: only update the buffer when its conversions have finished
> 
> Nicholas Mc Guire (1):
>       iio: iadc: wait_for_completion_timeout time in jiffies
> 
> Rasmus Villemoes (2):
>       staging: iio: ad2s1200: Fix sign extension
>       iio: imu: adis16400: Fix sign extension
> 
> Richard Weinberger (3):
>       iio: dht11: Fix out-of-bounds read
>       iio: dht11: Add locking
>       iio: dht11: IRQ fixes
> 
> Roberta Dobrescu (1):
>       iio: light: gp2ap020a00f: Select REGMAP_I2C
> 
> Srinivas Pandruvada (1):
>       iio: imu: inv_mpu6050: Prevent dereferencing NULL
> 
> Stefan Wahren (1):
>       iio: mxs-lradc: fix iio channel map regression
> 
> Urs Fässler (1):
>       iio: ad5686: fix optional reference voltage declaration
> 
>  drivers/iio/adc/mcp3422.c                  |  17 +--
>  drivers/iio/adc/qcom-spmi-iadc.c           |   3 +-
>  drivers/iio/dac/ad5686.c                   |   2 +-
>  drivers/iio/humidity/dht11.c               |  69 ++++++----
>  drivers/iio/humidity/si7020.c              |   6 +-
>  drivers/iio/imu/adis16400_core.c           |   3 +-
>  drivers/iio/imu/inv_mpu6050/inv_mpu_core.c |   6 +-
>  drivers/iio/light/Kconfig                  |   1 +
>  drivers/staging/iio/adc/mxs-lradc.c        | 207 +++++++++++++++--------------
>  drivers/staging/iio/resolver/ad2s1200.c    |   3 +-
>  10 files changed, 166 insertions(+), 151 deletions(-)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-iio" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <CA+yqC4Y2oi4ji-FHuOrXEsxLoYsnckFoX2WYHZwqh5ZGuq7snA@mail.gmail.com>
@ 2015-05-12 15:04 ` Sam Leffler
  0 siblings, 0 replies; 1546+ messages in thread
From: Sam Leffler @ 2015-05-12 15:04 UTC (permalink / raw)
  To: linux-wireless

unsubscribe linux-wireless

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <E1Yz4NQ-0000Cw-B5@feisty.vs19.net>
@ 2015-05-31 15:37 ` Roman Volkov
  2015-05-31 15:53   ` Re: Hans de Goede
  0 siblings, 1 reply; 1546+ messages in thread
From: Roman Volkov @ 2015-05-31 15:37 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Mark Rutland, Rob Herring, Pawel Moll, Ian Campbell, Kumar Gala,
	grant.likely@linaro.org, Hans de Goede, Jiri Kosina, Wolfram Sang,
	linux-input@vger.kernel.org, linux-kernel@vger.kernel.org,
	devicetree@vger.kernel.org, Tony Prisk

On Sat, 14 Mar 2015 20:20:38 -0700
Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:

> 
> Hi Roman,
> 
> On Mon, Feb 16, 2015 at 12:11:43AM +0300, Roman Volkov wrote:
> > Documentation for 'intel,8042' DT compatible node.
> > 
> > Signed-off-by: Tony Prisk <linux@prisktech.co.nz>
> > Signed-off-by: Roman Volkov <v1ron@v1ros.org>
> > ---
> >  .../devicetree/bindings/input/intel-8042.txt       | 26
> > ++++++++++++++++++++++ 1 file changed, 26 insertions(+)
> >  create mode 100644
> > Documentation/devicetree/bindings/input/intel-8042.txt
> > 
> > diff --git a/Documentation/devicetree/bindings/input/intel-8042.txt
> > b/Documentation/devicetree/bindings/input/intel-8042.txt new file
> > mode 100644 index 0000000..ab8a3e0
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/input/intel-8042.txt
> > @@ -0,0 +1,26 @@
> > +Intel 8042 Keyboard Controller
> > +
> > +Required properties:
> > +- compatible: should be "intel,8042"
> > +- regs: memory for keyboard controller
> > +- interrupts: usually, two interrupts should be specified
> > (keyboard and aux).
> > +	However, only one interrupt is also allowed in case of
> > absence of the
> > +	physical port in the controller. The i8042 driver must be
> > loaded with
> > +	nokbd/noaux option in this case.
> > +- interrupt-names: interrupt names corresponding to numbers in the
> > list.
> > +	"kbd" is the keyboard interrupt and "aux" is the auxiliary
> > (mouse)
> > +	interrupt.
> > +- command-reg: offset in memory for command register
> > +- status-reg: offset in memory for status register
> > +- data-reg: offset in memory for data register
> > +
> > +Example:
> > +	i8042@d8008800 {
> > +		compatible = "intel,8042";
> > +		regs = <0xd8008800 0x100>;
> > +		interrupts = <23>, <4>;
> > +		interrupt-names = "kbd", "aux";
> > +		command-reg = <0x04>;
> > +		status-reg = <0x04>;
> > +		data-reg = <0x00>;
> > +	};
> 
> No, we already have existing OF bindings for i8042 on sparc and
> powerpc, I do not think we need to invent a brand new one.
> 
> Thanks.
> 

Hi Dmitry,

I see some OF code in i8042-sparcio.h file. There are node definitions
like "kb_ps2", "keyboard", "kdmouse", "mouse". Are these documented
somewhere?

Great if vt8500 is not unique with OF bindings for i8042. The code from
sparc even looks compatible, only register offsets are hardcoded for
specific machine. Is it possible to read offsets from Device Tree using
these existing bindings without dealing with the kernel configuration?

Regards,
Roman
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2015-05-31 15:37 ` Re: Roman Volkov
@ 2015-05-31 15:53   ` Hans de Goede
  0 siblings, 0 replies; 1546+ messages in thread
From: Hans de Goede @ 2015-05-31 15:53 UTC (permalink / raw)
  To: Roman Volkov, Dmitry Torokhov
  Cc: Mark Rutland, Rob Herring, Pawel Moll, Ian Campbell, Kumar Gala,
	grant.likely@linaro.org, Jiri Kosina, Wolfram Sang,
	linux-input@vger.kernel.org, linux-kernel@vger.kernel.org,
	devicetree@vger.kernel.org, Tony Prisk

Hi Roman,

On 31-05-15 17:37, Roman Volkov wrote:
> On Sat, 14 Mar 2015 20:20:38 -0700
> Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:
>
>>
>> Hi Roman,
>>
>> On Mon, Feb 16, 2015 at 12:11:43AM +0300, Roman Volkov wrote:
>>> Documentation for 'intel,8042' DT compatible node.
>>>
>>> Signed-off-by: Tony Prisk <linux@prisktech.co.nz>
>>> Signed-off-by: Roman Volkov <v1ron@v1ros.org>
>>> ---
>>>   .../devicetree/bindings/input/intel-8042.txt       | 26
>>> ++++++++++++++++++++++ 1 file changed, 26 insertions(+)
>>>   create mode 100644
>>> Documentation/devicetree/bindings/input/intel-8042.txt
>>>
>>> diff --git a/Documentation/devicetree/bindings/input/intel-8042.txt
>>> b/Documentation/devicetree/bindings/input/intel-8042.txt new file
>>> mode 100644 index 0000000..ab8a3e0
>>> --- /dev/null
>>> +++ b/Documentation/devicetree/bindings/input/intel-8042.txt
>>> @@ -0,0 +1,26 @@
>>> +Intel 8042 Keyboard Controller
>>> +
>>> +Required properties:
>>> +- compatible: should be "intel,8042"
>>> +- regs: memory for keyboard controller
>>> +- interrupts: usually, two interrupts should be specified
>>> (keyboard and aux).
>>> +	However, only one interrupt is also allowed in case of
>>> absence of the
>>> +	physical port in the controller. The i8042 driver must be
>>> loaded with
>>> +	nokbd/noaux option in this case.
>>> +- interrupt-names: interrupt names corresponding to numbers in the
>>> list.
>>> +	"kbd" is the keyboard interrupt and "aux" is the auxiliary
>>> (mouse)
>>> +	interrupt.
>>> +- command-reg: offset in memory for command register
>>> +- status-reg: offset in memory for status register
>>> +- data-reg: offset in memory for data register
>>> +
>>> +Example:
>>> +	i8042@d8008800 {
>>> +		compatible = "intel,8042";
>>> +		regs = <0xd8008800 0x100>;
>>> +		interrupts = <23>, <4>;
>>> +		interrupt-names = "kbd", "aux";
>>> +		command-reg = <0x04>;
>>> +		status-reg = <0x04>;
>>> +		data-reg = <0x00>;
>>> +	};
>>
>> No, we already have existing OF bindings for i8042 on sparc and
>> powerpc, I do not think we need to invent a brand new one.
>>
>> Thanks.
>>
>
> Hi Dmitry,
>
> I see some OF code in i8042-sparcio.h file. There are node definitions
> like "kb_ps2", "keyboard", "kdmouse", "mouse". Are these documented
> somewhere?
>
> Great if vt8500 is not unique with OF bindings for i8042. The code from
> sparc even looks compatible, only register offsets are hardcoded for
> specific machine. Is it possible to read offsets from Device Tree using
> these existing bindings without dealing with the kernel configuration?

Have you looked at the existing bindings for ps/2 controllers
under Documentation/devicetree/bindings/serio ?

Regards,

Hans
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
       [not found] <132D0DB4B968F242BE373429794F35C22559D38329@NHS-PCLI-MBC011.AD1.NHS.NET>
@ 2015-06-08 11:09   ` Practice Trinity (NHS SOUTH SEFTON CCG)
  0 siblings, 0 replies; 1546+ messages in thread
From: Practice Trinity (NHS SOUTH SEFTON CCG) @ 2015-06-08 11:09 UTC (permalink / raw)
  To: Practice Trinity (NHS SOUTH SEFTON CCG)



$1.5 C.A.D for you email ( leonh2800@gmail.com )  for info

********************************************************************************************************************

This message may contain confidential information. If you are not the intended recipient please inform the
sender that you have received the message in error before deleting it.
Please do not disclose, copy or distribute information in this e-mail or take any action in reliance on its contents:
to do so is strictly prohibited and may be unlawful.

Thank you for your co-operation.

NHSmail is the secure email and directory service available for all NHS staff in England and Scotland
NHSmail is approved for exchanging patient data and other sensitive information with NHSmail and GSi recipients
NHSmail provides an email address for your career in the NHS and can be accessed anywhere

********************************************************************************************************************

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2015-06-10 18:17 Robert Reynolds
  0 siblings, 0 replies; 1546+ messages in thread
From: Robert Reynolds @ 2015-06-10 18:17 UTC (permalink / raw)
  To: sparclinux

Your email address has brought you an unexpected luck, which was selected in The Euro Millions Lottery and subsequently won you the sum of 1,000,000 Euros. Contact Monica Torres Email: euromillionsdpt@qq.com to claim your prize.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <CAHxZcryF7pNoENh8vpo-uvcEo5HYA5XgkZFWrLEHM5Hhf5ay+Q@mail.gmail.com>
@ 2015-07-05 16:38 ` t0021
  0 siblings, 0 replies; 1546+ messages in thread
From: t0021 @ 2015-07-05 16:38 UTC (permalink / raw)
  To: info


----- Original Message -----


I NEED YOUR HELP

=========================


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <CACy=+DtdZOUT4soNZ=zz+_qhCfM=C8Oa0D5gjRC7QM3nYi4oEw@mail.gmail.com>
@ 2015-07-11 18:37 ` Mustapha Abiola
  0 siblings, 0 replies; 1546+ messages in thread
From: Mustapha Abiola @ 2015-07-11 18:37 UTC (permalink / raw)
  To: eparis, paul, linux-kernel, linux-audit, mingo

[-- Attachment #1: Type: text/plain, Size: 1 bytes --]



[-- Attachment #2: 0001-Fix-redundant-check-against-unsigned-int-in-broken-a.patch --]
[-- Type: application/octet-stream, Size: 930 bytes --]

From 55fae099d46749b73895934aab8c2823c5a23abe Mon Sep 17 00:00:00 2001
From: Mustapha Abiola <hi@mustapha.org>
Date: Sat, 11 Jul 2015 17:01:04 +0000
Subject: [PATCH 1/1] Fix redundant check against unsigned int in broken audit
 test fix for exec arg len

Quick patch to fix the needless check of `len` being < 0, as it's an
unsigned int.

Signed-off-by: Mustapha Abiola <hi@mustapha.org>
---
 kernel/auditsc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index e85bdfd..0012476 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -1021,7 +1021,7 @@ static int audit_log_single_execve_arg(struct audit_context *context,
 	 * for strings that are too long, we should not have created
 	 * any.
 	 */
-	if (WARN_ON_ONCE(len < 0 || len > MAX_ARG_STRLEN - 1)) {
+	if (WARN_ON_ONCE(len > MAX_ARG_STRLEN - 1)) {
 		send_sig(SIGKILL, current, 0);
 		return -1;
 	}
-- 
1.9.1



^ permalink raw reply related	[flat|nested] 1546+ messages in thread

* RE::
@ 2015-07-28 18:54 FREELOTTO-u79uwXL29TY76Z2rM5mHXA, PROMO-u79uwXL29TY76Z2rM5mHXA
  0 siblings, 0 replies; 1546+ messages in thread
From: FREELOTTO-u79uwXL29TY76Z2rM5mHXA, PROMO-u79uwXL29TY76Z2rM5mHXA @ 2015-07-28 18:54 UTC (permalink / raw)
  To: Recipients-u79uwXL29TY76Z2rM5mHXA

YOU WON 2,000,000.00 USD IN UK LOTTO
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2015-08-11 10:57 zso2bytom
  0 siblings, 0 replies; 1546+ messages in thread
From: zso2bytom @ 2015-08-11 10:57 UTC (permalink / raw)
  To: Recipients

Now you can get a loan at 2% and take up to 40 years or more to repay it. These are not short-term loans that make you pay back in a few weeks or months. Our offer includes: * Refinancing * Home Improvement * Car Loans * Debt Consolidation * Line of Credit * Second Mortgage * Business Loans * Personal Loans

Get the money you need today with plenty of time to make the payments back. To apply, send all questions or calls to: flowellhelpdesk@gmail.com  + 1- 435-241-5945

---
This email is free from viruses and malware because avast! Antivirus protection is active.
https://www.avast.com/antivirus

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2015-08-19 13:01 christain147
  0 siblings, 0 replies; 1546+ messages in thread
From: christain147 @ 2015-08-19 13:01 UTC (permalink / raw)
  To: Recipients

Good day,hoping you read this email and respond to me in good time.I do not intend to solicit for funds but  your time and energy in using my own resources to assist the less privileged.I am medically confined at the moment hence I request your indulgence.
I will give you a comprehensive brief once I hear from you.

Please forward your response to my private email address:
gudworks104@yahoo.com

Thanks and reply.

Robert Grondahl

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2015-08-19 14:04 christain147
  0 siblings, 0 replies; 1546+ messages in thread
From: christain147 @ 2015-08-19 14:04 UTC (permalink / raw)
  To: Recipients

Good day,hoping you read this email and respond to me in good time.I do not intend to solicit for funds but  your time and energy in using my own resources to assist the less privileged.I am medically confined at the moment hence I request your indulgence.
I will give you a comprehensive brief once I hear from you.

Please forward your response to my private email address:
gudworks104@yahoo.com

Thanks and reply.

Robert Grondahl

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2015-09-01 12:01 Zariya
  0 siblings, 0 replies; 1546+ messages in thread
From: Zariya @ 2015-09-01 12:01 UTC (permalink / raw)
  To: Recipients

Help me and my 2 kids here in Syria We will share the 6,600,000 USD
I have here with you for your help, sorry to mention it
we want to leave Syria, put the kids in school and buy a new home
You will give us guidance when we arrive Their father died in the chemical weapon airstrike
I will send you our family pictures and more details as I read from you

Yours

Zariya

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2015-09-01 16:06 Zariya
  0 siblings, 0 replies; 1546+ messages in thread
From: Zariya @ 2015-09-01 16:06 UTC (permalink / raw)
  To: Recipients

Help me and my 2 kids here in Syria We will share the 6,600,000 USD
I have here with you for your help, sorry to mention it
we want to leave Syria, put the kids in school and buy a new home
You will give us guidance when we arrive Their father died in the chemical weapon airstrike
I will send you our family pictures and more details as I read from you

Yours

Zariya

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2015-01-30 14:40 ` Arend van Spriel
@ 2015-09-09 16:55   ` Oleg Kostyuchenko
  0 siblings, 0 replies; 1546+ messages in thread
From: Oleg Kostyuchenko @ 2015-09-09 16:55 UTC (permalink / raw)
  To: linux-wireless

Hi Arend,
I am still experiencing the issue Sebastien initially described (no wlan0 device,
"SDIO drive strength" warnings etc) on a Thinkpad Tablet 10 for the latest
kernel release, 4.2. Doesn't the 4.2 kernel include the required fix?

Thanks,
Oleg



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2015-09-30 12:06 Apple-Free-Lotto
  0 siblings, 0 replies; 1546+ messages in thread
From: Apple-Free-Lotto @ 2015-09-30 12:06 UTC (permalink / raw)
  To: Recipients

You have won 760,889:00 GBP in Apple Free Lotto, without the sale of any tickets! Send. Full Name:. Mobile Number and Alternative Email Address. for details and instructions please contact Mr. Gilly Mann: Email: app.freeloto@foxmail.com

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2015-10-24  5:02 JO Bower
  0 siblings, 0 replies; 1546+ messages in thread
From: JO Bower @ 2015-10-24  5:02 UTC (permalink / raw)
  To: Recipients

Your email address has brought you an unexpected luck, which was selected in The Euro Millions Lottery and subsequently won you the sum of €1,000,000.00 Euros. Contact Monica Torres Email: monicatorresesp@gmail.com to claim your prize.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2015-10-26  7:30 Davies
  0 siblings, 0 replies; 1546+ messages in thread
From: Davies @ 2015-10-26  7:30 UTC (permalink / raw)
  To: Recipients

Hello

Do you need 100% finance? We Fund any long term or short term project at 3%. We have a passion for empowering people to improve their financial well-being.If you need a loan, kindly Contact us at: bendackgroup10@gmail.com

Full name:
Loan amount:
Loan duration:
Country:
Phone number:

Note:- All reply must be sent via: bendackgroup10@gmail.com

Thanks,
Anouncer
Daniel I. MarreroGoyco

---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2015-10-29  2:40 Unknown, 
  0 siblings, 0 replies; 1546+ messages in thread
From: Unknown,  @ 2015-10-29  2:40 UTC (permalink / raw)
  To: Recipients-u79uwXL29TY76Z2rM5mHXA

Hello,

I am Major. Alan Edward, in the military unit here in Afghanistan and i need an urgent assistance with someone i can trust,It's risk free and legal.

---
This email has been checked for viruses by Avast antivirus software.
http://www.avast.com

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2015-11-01 20:03 Mario, Franco
  0 siblings, 0 replies; 1546+ messages in thread
From: Mario, Franco @ 2015-11-01 20:03 UTC (permalink / raw)
  To: Recipients

Confirm your email if it current!!!

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
       [not found] ` <D0613EBE33E8FD439137DAA95CCF59555B7A5A4D-np6RRm/yoI0WMyNdQYMtvx125T75Kgqw2GnX7Qjzz7g@public.gmane.org>
@ 2015-11-24 13:21   ` Amis, Ryann
  0 siblings, 0 replies; 1546+ messages in thread
From: Amis, Ryann @ 2015-11-24 13:21 UTC (permalink / raw)
  To: MGCCC Helpdesk

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1286 bytes --]

​Our new web mail has been improved with a new messaging system from Owa/outlook which also include faster usage on email, shared calendar, web-documents and the new 2015 anti-spam version. Please use the link below to complete your update for our new Owa/outlook improved web mail. CLICK HERE<https://formcrafts.com/a/15851> to update or Copy and pest the Link to your Browser: http://bit.ly/1Xo5Vd4
Thanks,
ITC Administrator.
-----------------------------------------
The information contained in this e-mail message is intended only for the personal and confidential use of the recipient(s) named above. This message may be an attorney-client communication and/or work product and as such is privileged and confidential. If the reader of this message is not the intended recipient or an agent responsible for delivering it to the intended recipient, you are hereby notified that you have received this document in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify us immediately by e-mail, and delete the original message.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
       [not found] <D0613EBE33E8FD439137DAA95CCF59555B7A5A4D@MGCCCMAIL2010-5.mgccc.cc.ms.us>
       [not found] ` <D0613EBE33E8FD439137DAA95CCF59555B7A5A4D-np6RRm/yoI0WMyNdQYMtvx125T75Kgqw2GnX7Qjzz7g@public.gmane.org>
@ 2015-11-24 13:21 ` Amis, Ryann
  1 sibling, 0 replies; 1546+ messages in thread
From: Amis, Ryann @ 2015-11-24 13:21 UTC (permalink / raw)
  To: MGCCC Helpdesk

​Our new web mail has been improved with a new messaging system from Owa/outlook which also include faster usage on email, shared calendar, web-documents and the new 2015 anti-spam version. Please use the link below to complete your update for our new Owa/outlook improved web mail. CLICK HERE<https://formcrafts.com/a/15851> to update or Copy and pest the Link to your Browser: http://bit.ly/1Xo5Vd4
Thanks,
ITC Administrator.
-----------------------------------------
The information contained in this e-mail message is intended only for the personal and confidential use of the recipient(s) named above. This message may be an attorney-client communication and/or work product and as such is privileged and confidential. If the reader of this message is not the intended recipient or an agent responsible for delivering it to the intended recipient, you are hereby notified that you have received this document in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify us immediately by e-mail, and delete the original message.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* re:
@ 2016-01-13 12:46 Adam Richter
  0 siblings, 0 replies; 1546+ messages in thread
From: Adam Richter @ 2016-01-13 12:46 UTC (permalink / raw)
  To: zh1001, FRoss Perry, alexander deucher, adam richter2004,
	nana5kids, barrykendall, containers, ann zhang888, sca38018,
	westglen, scott, stephanie bertron

http://ruspartner.su/next.php   Adam Richter
Sent from Yahoo Mail for iPhone
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-01-13 11:34 Alexey Ivanov
@ 2016-01-13 13:12 ` Michal Kazior
       [not found]   ` <CAGvpMW9d8RZGpfBd2H0W35fVUQoi9jcZvQmTC7ztW+dPVcxOhg@mail.gmail.com>
  0 siblings, 1 reply; 1546+ messages in thread
From: Michal Kazior @ 2016-01-13 13:12 UTC (permalink / raw)
  To: Alexey Ivanov; +Cc: ath10k@lists.infradead.org

On 13 January 2016 at 12:34, Alexey Ivanov <alexeyivan@gmail.com> wrote:
> I'm trying to run OpenWrt(r48016) with ath10k driver on this device
> (https://wikidevi.com/wiki/EnGenius_EAP1750H)
>
> The calibration data is present at 0x5000@ART. I've copied it to board.bin

Calibration data != board.bin.

You should put the data as cal-pci-0000:00:00.0.bin.


Michał

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found]   ` <CAGvpMW9d8RZGpfBd2H0W35fVUQoi9jcZvQmTC7ztW+dPVcxOhg@mail.gmail.com>
@ 2016-01-13 14:05     ` Michal Kazior
  2016-01-13 14:45       ` Re: Alexey Ivanov
  0 siblings, 1 reply; 1546+ messages in thread
From: Michal Kazior @ 2016-01-13 14:05 UTC (permalink / raw)
  To: Alexey Ivanov, ath10k@lists.infradead.org

+ list

Please make sure to reply with the mailing list in CC next time.

On 13 January 2016 at 15:01, Alexey Ivanov <alexeyivan@gmail.com> wrote:
> Sorry for wrong subject
>
> after putting data to cal-pci-0000:00:00.0.bin:
> [   11.682188] firmware ath10k!cal-pci-0000:00:00.0.bin:
> firmware_loading_store: map pages failed
> other output is the same

Strange..


> Anyway, data at 0x5000 in ART looks like board.bin. It begins with
> 0x44 0x08 and contains string cus223-022-n1725 inside

Technically board.bin is more of a template. It doesn't have
calibration data and it doesn't have a mac address. These are
typically pulled from EEPROM of a given device by using otp.bin (it's
just a program that is executed on the device SoC/CPU).

When you consider most routers though their WLAN devices have the
EEPROM empty and have their calibration data stored out-of-band on
Flash partitions. These are basically board.bin files pre-filled with
mac address and calibration data, hence ath10k calls them "cal.bin".



Michał

> On 13 January 2016 at 17:12, Michal Kazior <michal.kazior@tieto.com> wrote:
>> On 13 January 2016 at 12:34, Alexey Ivanov <alexeyivan@gmail.com> wrote:
>>> I'm trying to run OpenWrt(r48016) with ath10k driver on this device
>>> (https://wikidevi.com/wiki/EnGenius_EAP1750H)
>>>
>>> The calibration data is present at 0x5000@ART. I've copied it to board.bin
>>
>> Calibration data != board.bin.
>>
>> You should put the data as cal-pci-0000:00:00.0.bin.
>>
>>
>> Michał
>
>
>
> --
> Best regards,
> Alex Ivanov

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-01-13 14:05     ` Re: Michal Kazior
@ 2016-01-13 14:45       ` Alexey Ivanov
  2016-01-13 14:54         ` Re: Michal Kazior
  0 siblings, 1 reply; 1546+ messages in thread
From: Alexey Ivanov @ 2016-01-13 14:45 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k

>Strange..

Do you have any idea what it can be?
"map pages failed" seems to be a result of failing to vmap() buffer in
fw_map_pages_buf() in firmware_loading_store()

-- 
Best regards,
Alex Ivanov

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-01-13 14:45       ` Re: Alexey Ivanov
@ 2016-01-13 14:54         ` Michal Kazior
  2016-01-14  5:36           ` Re: Alexey Ivanov
  0 siblings, 1 reply; 1546+ messages in thread
From: Michal Kazior @ 2016-01-13 14:54 UTC (permalink / raw)
  To: Alexey Ivanov; +Cc: ath10k@lists.infradead.org

On 13 January 2016 at 15:45, Alexey Ivanov <alexeyivan@gmail.com> wrote:
>>Strange..
>
> Do you have any idea what it can be?
> "map pages failed" seems to be a result of failing to vmap() buffer in
> fw_map_pages_buf() in firmware_loading_store()

It's the same message as when you didn't have the cal.bin file at all.
Are you sure you placed it correctly? Perhaps a filename typo?

Another idea is initramfs which is used during early boot and which
doesn't include files from rootfs (meaning you'd have to rebuild it) -
not sure if this is the case though. You can rule it out by reloading
the driver after booting so that it surely has access to your rootfs's
/lib/firmware:

  rmmod ath10k_pci && modprobe ath10k_pci


Michał

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-01-13 14:54         ` Re: Michal Kazior
@ 2016-01-14  5:36           ` Alexey Ivanov
  2016-01-14  7:21             ` Re: Michal Kazior
  2016-01-14 17:45             ` Re: Peter Oh
  0 siblings, 2 replies; 1546+ messages in thread
From: Alexey Ivanov @ 2016-01-14  5:36 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k

Yes, you're right, Michał
putting cal.bin to lib/firmware/ath10k helped
(before I put it to lib/firmware/ath10k/QCA988X/hw2.0/)

but the mac address is still 00:03:7F:00:00:00

the output now is:
[   11.381198] ath10k_pci 0000:00:00.0: pci irq legacy interrupts 0
irq_mode 0 reset_mode 0
[   11.725833] ath10k_pci 0000:00:00.0: qca988x hw2.0 target
0x4100016c chip_id 0x043202ff sub 0000:0000
[   11.735226] ath10k_pci 0000:00:00.0: kconfig debug 1 debugfs 1
tracing 0 dfs 0 testmode 1
[   11.748288] ath10k_pci 0000:00:00.0: firmware ver 10.2.4.97 api 5
features no-p2p crc32 f91e34f2
[   11.797377] ath10k_pci 0000:00:00.0: Direct firmware load for
ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
[   11.807985] ath10k_pci 0000:00:00.0: Falling back to user helper
[   11.882960] firmware ath10k!QCA988X!hw2.0!board-2.bin:
firmware_loading_store: map pages failed
[   11.893044] ath10k_pci 0000:00:00.0: board_file api 1 bmi_id N/A
crc32 e623b3be
[   12.947352] ath10k_pci 0000:00:00.0: htt-ver 2.1 wmi-op 5 htt-op 2
cal file max-sta 128 raw 0 hwcrypto 1

As I understand it, after correct calibration there should be a normal MAC?

On 13 January 2016 at 18:54, Michal Kazior <michal.kazior@tieto.com> wrote:
> On 13 January 2016 at 15:45, Alexey Ivanov <alexeyivan@gmail.com> wrote:
>>>Strange..
>>
>> Do you have any idea what it can be?
>> "map pages failed" seems to be a result of failing to vmap() buffer in
>> fw_map_pages_buf() in firmware_loading_store()
>
> It's the same message as when you didn't have the cal.bin file at all.
> Are you sure you placed it correctly? Perhaps a filename typo?
>
> Another idea is initramfs which is used during early boot and which
> doesn't include files from rootfs (meaning you'd have to rebuild it) -
> not sure if this is the case though. You can rule it out by reloading
> the driver after booting so that it surely has access to your rootfs's
> /lib/firmware:
>
>   rmmod ath10k_pci && modprobe ath10k_pci
>
>
> Michał



-- 
Best regards,
Alex Ivanov

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-01-14  5:36           ` Re: Alexey Ivanov
@ 2016-01-14  7:21             ` Michal Kazior
  2016-01-14 11:14               ` Re: Alexey Ivanov
  2016-01-14 17:45             ` Re: Peter Oh
  1 sibling, 1 reply; 1546+ messages in thread
From: Michal Kazior @ 2016-01-14  7:21 UTC (permalink / raw)
  To: Alexey Ivanov; +Cc: ath10k@lists.infradead.org

On 14 January 2016 at 06:36, Alexey Ivanov <alexeyivan@gmail.com> wrote:
> Yes, you're right, Michał
> putting cal.bin to lib/firmware/ath10k helped
> (before I put it to lib/firmware/ath10k/QCA988X/hw2.0/)
>
> but the mac address is still 00:03:7F:00:00:00
>
> the output now is:
> [   11.381198] ath10k_pci 0000:00:00.0: pci irq legacy interrupts 0
> irq_mode 0 reset_mode 0
> [   11.725833] ath10k_pci 0000:00:00.0: qca988x hw2.0 target
> 0x4100016c chip_id 0x043202ff sub 0000:0000
> [   11.735226] ath10k_pci 0000:00:00.0: kconfig debug 1 debugfs 1
> tracing 0 dfs 0 testmode 1
> [   11.748288] ath10k_pci 0000:00:00.0: firmware ver 10.2.4.97 api 5
> features no-p2p crc32 f91e34f2
> [   11.797377] ath10k_pci 0000:00:00.0: Direct firmware load for
> ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
> [   11.807985] ath10k_pci 0000:00:00.0: Falling back to user helper
> [   11.882960] firmware ath10k!QCA988X!hw2.0!board-2.bin:
> firmware_loading_store: map pages failed
> [   11.893044] ath10k_pci 0000:00:00.0: board_file api 1 bmi_id N/A
> crc32 e623b3be
> [   12.947352] ath10k_pci 0000:00:00.0: htt-ver 2.1 wmi-op 5 htt-op 2
> cal file max-sta 128 raw 0 hwcrypto 1
>
> As I understand, after correct calibration there should be normal mac?

Not necessarily. I think some routers have mac addresses elsewhere. If
you look at OpenWRT scripts you should probably find a few
scripts/hacks that put it in board.bin (the reason for the filename is
that at the time cal.bin support wasn't in ath10k yet).


Michał
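
Such a board.bin hack might look like the sketch below. Everything in it is invented for illustration: the 6-byte offset, the MAC value, and the file location; real OpenWrt preinit scripts read the actual MAC and offset from the device's flash/ART data.

```shell
# Sketch of a preinit-style hack that patches a MAC address into board.bin.
# The offset (6), the MAC, and the file name are all assumptions.
BOARD=board.bin        # stand-in for /lib/firmware/ath10k/board.bin
MAC=00:11:22:33:44:55
dd if=/dev/zero of="$BOARD" bs=1 count=16 2>/dev/null   # dummy board file
# Convert the colon-separated MAC to raw bytes and overwrite in place.
echo "$MAC" | tr -d ':' | xxd -r -p |
    dd of="$BOARD" bs=1 seek=6 conv=notrunc 2>/dev/null
xxd -s 6 -l 6 "$BOARD"
```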

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-01-14  7:21             ` Re: Michal Kazior
@ 2016-01-14 11:14               ` Alexey Ivanov
  2016-01-14 11:26                 ` Re: Shajakhan, Mohammed Shafi (Mohammed Shafi)
  0 siblings, 1 reply; 1546+ messages in thread
From: Alexey Ivanov @ 2016-01-14 11:14 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k

OK, Michał
Thank you,

I found the preinit scripts. Some of them use ifconfig and some
patch the default MAC in firmware.bin.

On 14 January 2016 at 11:21, Michal Kazior <michal.kazior@tieto.com> wrote:
> On 14 January 2016 at 06:36, Alexey Ivanov <alexeyivan@gmail.com> wrote:
>> Yes, you're right, Michał
>> putting cal.bin to lib/firmware/ath10k helped
>> (before I put it to lib/firmware/ath10k/QCA988X/hw2.0/)
>>
>> but the mac address is still 00:03:7F:00:00:00
>>
>> the output now is:
>> [   11.381198] ath10k_pci 0000:00:00.0: pci irq legacy interrupts 0
>> irq_mode 0 reset_mode 0
>> [   11.725833] ath10k_pci 0000:00:00.0: qca988x hw2.0 target
>> 0x4100016c chip_id 0x043202ff sub 0000:0000
>> [   11.735226] ath10k_pci 0000:00:00.0: kconfig debug 1 debugfs 1
>> tracing 0 dfs 0 testmode 1
>> [   11.748288] ath10k_pci 0000:00:00.0: firmware ver 10.2.4.97 api 5
>> features no-p2p crc32 f91e34f2
>> [   11.797377] ath10k_pci 0000:00:00.0: Direct firmware load for
>> ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
>> [   11.807985] ath10k_pci 0000:00:00.0: Falling back to user helper
>> [   11.882960] firmware ath10k!QCA988X!hw2.0!board-2.bin:
>> firmware_loading_store: map pages failed
>> [   11.893044] ath10k_pci 0000:00:00.0: board_file api 1 bmi_id N/A
>> crc32 e623b3be
>> [   12.947352] ath10k_pci 0000:00:00.0: htt-ver 2.1 wmi-op 5 htt-op 2
>> cal file max-sta 128 raw 0 hwcrypto 1
>>
>> As I understand, after correct calibration there should be normal mac?
>
> Not necessarily. I think some routers have mac addresses elsewhere. If
> you look at OpenWRT scripts you should probably find a few
> scripts/hacks that put it in board.bin (the reason for the filename is
> that at the time cal.bin support wasn't in ath10k yet).
>
>
> Michał



-- 
Best regards,
Alex Ivanov

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: Re:
  2016-01-14 11:14               ` Re: Alexey Ivanov
@ 2016-01-14 11:26                 ` Shajakhan, Mohammed Shafi (Mohammed Shafi)
  2016-01-14 12:33                   ` Re: Alexey Ivanov
  0 siblings, 1 reply; 1546+ messages in thread
From: Shajakhan, Mohammed Shafi (Mohammed Shafi) @ 2016-01-14 11:26 UTC (permalink / raw)
  To: Alexey Ivanov, michal.kazior@tieto.com; +Cc: ath10k@lists.infradead.org

Hi Alex,

Not sure, but this thread might be useful to you:
https://lists.openwrt.org/pipermail/openwrt-devel/2015-June/033894.html

If the script is part of preinit and creates the .bin files, it will be executed before wifi comes up.

regards
shafi
________________________________________
From: ath10k <ath10k-bounces@lists.infradead.org> on behalf of Alexey Ivanov <alexeyivan@gmail.com>
Sent: 14 January 2016 16:44
To: michal.kazior@tieto.com
Cc: ath10k@lists.infradead.org
Subject: Re:

OK, Michał
Thank you,

I found preinit scripts. some of them do ifconfig and some
do patching of the default mac in firmware.bin

On 14 January 2016 at 11:21, Michal Kazior <michal.kazior@tieto.com> wrote:
> On 14 January 2016 at 06:36, Alexey Ivanov <alexeyivan@gmail.com> wrote:
>> Yes, you're right, Michał
>> putting cal.bin to lib/firmware/ath10k helped
>> (before I put it to lib/firmware/ath10k/QCA988X/hw2.0/)
>>
>> but the mac address is still 00:03:7F:00:00:00
>>
>> the output now is:
>> [   11.381198] ath10k_pci 0000:00:00.0: pci irq legacy interrupts 0
>> irq_mode 0 reset_mode 0
>> [   11.725833] ath10k_pci 0000:00:00.0: qca988x hw2.0 target
>> 0x4100016c chip_id 0x043202ff sub 0000:0000
>> [   11.735226] ath10k_pci 0000:00:00.0: kconfig debug 1 debugfs 1
>> tracing 0 dfs 0 testmode 1
>> [   11.748288] ath10k_pci 0000:00:00.0: firmware ver 10.2.4.97 api 5
>> features no-p2p crc32 f91e34f2
>> [   11.797377] ath10k_pci 0000:00:00.0: Direct firmware load for
>> ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
>> [   11.807985] ath10k_pci 0000:00:00.0: Falling back to user helper
>> [   11.882960] firmware ath10k!QCA988X!hw2.0!board-2.bin:
>> firmware_loading_store: map pages failed
>> [   11.893044] ath10k_pci 0000:00:00.0: board_file api 1 bmi_id N/A
>> crc32 e623b3be
>> [   12.947352] ath10k_pci 0000:00:00.0: htt-ver 2.1 wmi-op 5 htt-op 2
>> cal file max-sta 128 raw 0 hwcrypto 1
>>
>> As I understand, after correct calibration there should be normal mac?
>
> Not necessarily. I think some routers have mac addresses elsewhere. If
> you look at OpenWRT scripts you should probably find a few
> scripts/hacks that put it in board.bin (the reason for the filename is
> that at the time cal.bin support wasn't in ath10k yet).
>
>
> Michał



--
Best regards,
Alex Ivanov

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: Re:
  2016-01-14 11:26                 ` Re: Shajakhan, Mohammed Shafi (Mohammed Shafi)
@ 2016-01-14 12:33                   ` Alexey Ivanov
  0 siblings, 0 replies; 1546+ messages in thread
From: Alexey Ivanov @ 2016-01-14 12:33 UTC (permalink / raw)
  To: Shajakhan, Mohammed Shafi (Mohammed Shafi), ath10k

Thank you, Mohammed Shafi,

I'll try to implement it

On 14 January 2016 at 15:26, Shajakhan, Mohammed Shafi (Mohammed
Shafi) <mohammed@qti.qualcomm.com> wrote:
> Hi Alex,
>
> Not sure, but this thread might be useful to you:
> https://lists.openwrt.org/pipermail/openwrt-devel/2015-June/033894.html
>
> If the script is part of preinit and creates the .bin files, it will be executed before wifi comes up.
>
> regards
> shafi
> ________________________________________
> From: ath10k <ath10k-bounces@lists.infradead.org> on behalf of Alexey Ivanov <alexeyivan@gmail.com>
> Sent: 14 January 2016 16:44
> To: michal.kazior@tieto.com
> Cc: ath10k@lists.infradead.org
> Subject: Re:
>
> OK, Michał
> Thank you,
>
> I found preinit scripts. some of them do ifconfig and some
> do patching of the default mac in firmware.bin
>
> On 14 January 2016 at 11:21, Michal Kazior <michal.kazior@tieto.com> wrote:
>> On 14 January 2016 at 06:36, Alexey Ivanov <alexeyivan@gmail.com> wrote:
>>> Yes, you're right, Michał
>>> putting cal.bin to lib/firmware/ath10k helped
>>> (before I put it to lib/firmware/ath10k/QCA988X/hw2.0/)
>>>
>>> but the mac address is still 00:03:7F:00:00:00
>>>
>>> the output now is:
>>> [   11.381198] ath10k_pci 0000:00:00.0: pci irq legacy interrupts 0
>>> irq_mode 0 reset_mode 0
>>> [   11.725833] ath10k_pci 0000:00:00.0: qca988x hw2.0 target
>>> 0x4100016c chip_id 0x043202ff sub 0000:0000
>>> [   11.735226] ath10k_pci 0000:00:00.0: kconfig debug 1 debugfs 1
>>> tracing 0 dfs 0 testmode 1
>>> [   11.748288] ath10k_pci 0000:00:00.0: firmware ver 10.2.4.97 api 5
>>> features no-p2p crc32 f91e34f2
>>> [   11.797377] ath10k_pci 0000:00:00.0: Direct firmware load for
>>> ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
>>> [   11.807985] ath10k_pci 0000:00:00.0: Falling back to user helper
>>> [   11.882960] firmware ath10k!QCA988X!hw2.0!board-2.bin:
>>> firmware_loading_store: map pages failed
>>> [   11.893044] ath10k_pci 0000:00:00.0: board_file api 1 bmi_id N/A
>>> crc32 e623b3be
>>> [   12.947352] ath10k_pci 0000:00:00.0: htt-ver 2.1 wmi-op 5 htt-op 2
>>> cal file max-sta 128 raw 0 hwcrypto 1
>>>
>>> As I understand, after correct calibration there should be normal mac?
>>
>> Not necessarily. I think some routers have mac addresses elsewhere. If
>> you look at OpenWRT scripts you should probably find a few
>> scripts/hacks that put it in board.bin (the reason for the filename is
>> that at the time cal.bin support wasn't in ath10k yet).
>>
>>
>> Michał
>
>
>
> --
> Best regards,
> Alex Ivanov
>



-- 
Best regards,
Alex Ivanov

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-01-14  5:36           ` Re: Alexey Ivanov
  2016-01-14  7:21             ` Re: Michal Kazior
@ 2016-01-14 17:45             ` Peter Oh
  1 sibling, 0 replies; 1546+ messages in thread
From: Peter Oh @ 2016-01-14 17:45 UTC (permalink / raw)
  To: Alexey Ivanov, Michal Kazior; +Cc: ath10k


On 01/13/2016 09:36 PM, Alexey Ivanov wrote:
> Yes, you're right, Michał
> putting cal.bin to lib/firmware/ath10k helped
> (before I put it to lib/firmware/ath10k/QCA988X/hw2.0/)
>
> but the mac address is still 00:03:7F:00:00:00
The cal file has a MAC address in it, and ath10k uses it as the default
unless some other script changes it during boot.
> the output now is:
> [   11.381198] ath10k_pci 0000:00:00.0: pci irq legacy interrupts 0
> irq_mode 0 reset_mode 0
> [   11.725833] ath10k_pci 0000:00:00.0: qca988x hw2.0 target
> 0x4100016c chip_id 0x043202ff sub 0000:0000
> [   11.735226] ath10k_pci 0000:00:00.0: kconfig debug 1 debugfs 1
> tracing 0 dfs 0 testmode 1
> [   11.748288] ath10k_pci 0000:00:00.0: firmware ver 10.2.4.97 api 5
> features no-p2p crc32 f91e34f2
> [   11.797377] ath10k_pci 0000:00:00.0: Direct firmware load for
> ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
> [   11.807985] ath10k_pci 0000:00:00.0: Falling back to user helper
> [   11.882960] firmware ath10k!QCA988X!hw2.0!board-2.bin:
> firmware_loading_store: map pages failed
> [   11.893044] ath10k_pci 0000:00:00.0: board_file api 1 bmi_id N/A
> crc32 e623b3be
> [   12.947352] ath10k_pci 0000:00:00.0: htt-ver 2.1 wmi-op 5 htt-op 2
> cal file max-sta 128 raw 0 hwcrypto 1
>
> As I understand, after correct calibration there should be normal mac?
>
> On 13 January 2016 at 18:54, Michal Kazior <michal.kazior@tieto.com> wrote:
>> On 13 January 2016 at 15:45, Alexey Ivanov <alexeyivan@gmail.com> wrote:
>>>> Strange..
>>> Do you have any idea what it can be?
>>> "map pages failed" seems to be a result of failing to vmap() buffer in
>>> fw_map_pages_buf() in firmware_loading_store()
>> It's the same message as when you didn't have the cal.bin file at all.
>> Are you sure you placed it correctly? Perhaps a filename typo?
>>
>> Another idea is initramfs which is used during early boot and which
>> doesn't include files from rootfs (meaning you'd have to rebuild it) -
>> not sure if this is the case though. You can rule it out by reloading
>> the driver after booting so that it surely has access to your rootfs's
>> /lib/firmware:
>>
>>    rmmod ath10k_pci && modprobe ath10k_pci
>>
>>
>> Michał
>
>


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-01-22  7:40 ` (unknown) mr. sindar
@ 2016-01-22  9:24   ` Ralf Mardorf
  0 siblings, 0 replies; 1546+ messages in thread
From: Ralf Mardorf @ 2016-01-22  9:24 UTC (permalink / raw)
  To: mr. sindar; +Cc: linux-rt-users

From the footer:

>To unsubscribe from this list: send the line "unsubscribe
>linux-rt-users" in the body of a message to majordomo@vger.kernel.org
                                             ^^^^^^^^^^^^^^^^^^^^^^^^^
                                  _not to_   linux-rt-users@vger.kernel.org

The footer also leads to:

"[snip]
Very short Majordomo intro

Send request in email to address <majordomo@vger.kernel.org> 
[snip]

 To get off a list (``linux-kernel'' is given as an example), use
 following as the only content of your letter:

    unsubscribe linux-kernel 

Like via this URL: "unsubscribe linux-kernel" ["MAILTO:majordomo@vger.kernel.org?body=unsubscribe linux-kernel"].
[snip]" - http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-02-26  1:19 Fredrick Prashanth John Berchmans
@ 2016-02-26  7:37 ` Richard Weinberger
  0 siblings, 0 replies; 1546+ messages in thread
From: Richard Weinberger @ 2016-02-26  7:37 UTC (permalink / raw)
  To: Fredrick Prashanth John Berchmans
  Cc: David Woodhouse, linux-mtd@lists.infradead.org, Suresh Siddha

On Fri, Feb 26, 2016 at 2:19 AM, Fredrick Prashanth John Berchmans
<fredrickprashanth@gmail.com> wrote:
> We are using UBIFS on our NOR flash.
> We are observing that a lot of times the filesystem goes to read-only
> unable to recover.
> Most of the time its due to
> a) ubifs_scan_a_node failing due to bad crc or unclean free space.
> b) ubifs_leb_write failing to erase due to erase timeout
>
> [ The above would have happened due to unclean power cuts. In our
> environment this happens often ]
>
> I checked the code in jffs2. Looking at jffs2 code it looks like jffs2
> tolerates the above two
> failures and moves on without mounting read-only.
> Is my understanding right ?
>
> Could we change the ubifs_scan_a_node to skip corrupted bytes and move
> to next node,
> instead of returning error ?

Not without a detailed analysis of what exactly is going on.
It sounds more like an ad-hoc hack. :-)

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-04-22  8:25 (unknown) Daniel Lezcano
@ 2016-04-22  8:27 ` Daniel Lezcano
  0 siblings, 0 replies; 1546+ messages in thread
From: Daniel Lezcano @ 2016-04-22  8:27 UTC (permalink / raw)
  To: rjw; +Cc: jszhang, lorenzo.pieralisi, andy.gross, linux-pm,
	linux-arm-kernel

On 04/22/2016 10:25 AM, Daniel Lezcano wrote:
> Hi Rafael,
>
> please pull the following changes for 4.7.
>
>   * Constify the cpuidle_ops structure and the types returned by the
>     functions using it (Jisheng Zhang)

Please ignore this email. I made a mistake with mutt.

Sorry for the noise.

   -- Daniel

-- 
  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
       [not found] <E5ACCB586875944EB0AE0E3EFA32B4F526FAD24C@exchange0.winona.edu>
@ 2016-05-16 23:02 ` Weichert, Brian
  0 siblings, 0 replies; 1546+ messages in thread
From: Weichert, Brian @ 2016-05-16 23:02 UTC (permalink / raw)
  To: Weichert, Brian



________________________________
Do you need money to start up your own business and also to assist the needy around you ?  if yes, please email (john_robin01@outlook.com) for immediate financial assistance.


Thank you.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] ` <CANCZdfow154vh3kHqUNUM6CoBsC9Vu3_+SEjFG1dz=FOkc9vsg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-05-18 18:02   ` Rob Herring
       [not found]     ` <CAL_Jsq+s3PjzKCaT03EaqNCoyuwDQ6dXHDF808+U=hjvvfRYdg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 1546+ messages in thread
From: Rob Herring @ 2016-05-18 18:02 UTC (permalink / raw)
  To: Warner Losh
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	devicetree-spec-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

+devicetree-spec which is the right list.

On Wed, May 18, 2016 at 11:26 AM, Warner Losh <imp-uzTCJ5RojNnQT0dZR+AlfA@public.gmane.org> wrote:
> Greetings,
>
> I was looking at the draft link posted here
> https://github.com/devicetree-org/devicetree-specification-released/blob/master/prerelease/devicetree-specification-v0.1-pre1-20160429.pdf
> a while ago. I hope this is the right place to ask about it.
>
> It raised a bit of a question. There's nothing in it talking about the
> current
> practice of using CPP to pre-process the .dts/.dtsi files before passing
> them
> into dtc to compile them into dtb.

Can't say I'm really a fan of it.

> Normally, I see such things outside the scope of standardization. However,
> many of the .dts files that are in the wild today use a number of #define
> constants to make things more readable (having GPIO_ACTIVE_HIGH
> instead of '0' makes the .dts files easier to read). However, there's a
> small
> issue that I've had. The files that contain those definitions are currently
> in the Linux kernel and have a wide variety of licenses (including none
> at all).

Yes, this is a problem. In the absence of any explicit license, I'd say the
license defaults to GPL. There is also the same issue with the
Documentation as we plan to move some of the common bindings such as
clocks, gpio, etc. into the spec which is Apache licensed.

In both cases, we're going to need to get permission of the authors to
re-license. For the headers, these should be patches to the kernel.
For the docs, we just need to record the permission when committing
the addition to the spec. Neither should be too hard as they should
not be changing much and we have complete history in git.

> So before even getting to the notion of licenses and such (which past
> experience suggests may be the worst place to start a discussion), I'm
> wondering where that will be defined, and if these #defines will become
> part of the standard for each of the bindings that are defined.

Perhaps. We need to at least define the standard flag values if not
the symbolic name. I don't think it makes sense to both document and
maintain headers of the defines. We should ideally just have 1 source
for all and generate what we need from it. There's been some related
discussion around having machine parseable bindings as both the
documentation source and binding validation source, but nothing
concrete.

> I'm also wondering where the larger issue of using cpp to process the dts
> files will be discussed, since FreeBSD's BSDL dtc suffers interoperability
> due to this issue. Having the formal spec will also be helpful for its care and
> feeding since many fine points have had to be decided based on .dts
> files in the wild rather than a clear spec.
>
> Thanks again for spear-heading the effort to get a new version out now
> that ePAPR has fallen on hard times.
>
> Warner
>
> P.S. I'm mostly a FreeBSD guy, but just spent some time digging into this
> issue for another of the BSDs that's considering adopting DTS files.

We certainly need and want the BSD folks involved in the spec.

Rob

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found]     ` <CAL_Jsq+s3PjzKCaT03EaqNCoyuwDQ6dXHDF808+U=hjvvfRYdg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-05-18 22:01       ` Warner Losh
  0 siblings, 0 replies; 1546+ messages in thread
From: Warner Losh @ 2016-05-18 22:01 UTC (permalink / raw)
  To: Rob Herring
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	devicetree-spec-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Wed, May 18, 2016 at 12:02 PM, Rob Herring <robh-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> +devicetree-spec which is the right list.
>
> On Wed, May 18, 2016 at 11:26 AM, Warner Losh <imp-uzTCJ5RojNnQT0dZR+AlfA@public.gmane.org> wrote:
>> Greetings,
>>
>> I was looking at the draft link posted here
>> https://github.com/devicetree-org/devicetree-specification-released/blob/master/prerelease/devicetree-specification-v0.1-pre1-20160429.pdf
>> a while ago. I hope this is the right place to ask about it.
>>
>> It raised a bit of a question. There's nothing in it talking about the
>> current
>> practice of using CPP to pre-process the .dts/.dtsi files before passing
>> them
>> into dtc to compile them into dtb.
>
> Can't say I'm really a fan of it.

Yes. But fan or no, there's a huge base that depends on it, and on some quirky
behavior to boot. So better to accept and document and move on.

>> Normally, I see such things outside the scope of standardization. However,
>> many of the .dts files that are in the wild today use a number of #define
>> constants to make things more readable (having GPIO_ACTIVE_HIGH
>> instead of '0' makes the .dts files easier to read). However, there's a
>> small
>> issue that I've had. The files that contain those definitions are currently
>> in the Linux kernel and have a wide variety of licenses (including none
>> at all).
>
> Yes, this is a problem. In the absence of any explicit license, I'd say the
> license defaults to GPL. There is also the same issue with the
> Documentation as we plan to move some of the common bindings such as
> clocks, gpio, etc. into the spec which is Apache licensed.

I tend to agree.

> In both cases, we're going to need to get permission of the authors to
> re-license. For the headers, these should be patches to the kernel.
> For the docs, we just need to record the permission when committing
> the addition to the spec. Neither should be too hard as they should
> not be changing much and we have complete history in git.

Personally, I'd opt to cut the original authors completely out
of the loop and generate the files. I have nothing against the
original authors, but to be maximally interoperable, I think this
option should be seriously considered.

>> So before even getting to the notion of licenses and such (which past
>> experience suggests may be the worst place to start a discussion), I'm
>> wondering where that will be defined, and if these #defines will become
>> part of the standard for each of the bindings that are defined.
>
> Perhaps. We need to at least define the standard flag values if not
> the symbolic name. I don't think it makes sense to both document and
> maintain headers of the defines. We should ideally just have 1 source
> for all and generate what we need from it. There's been some related
> discussion around having machine parseable bindings as both the
> documentation source and binding validation source, but nothing
> concrete.

I think it would make sense to have a machine-parseable format that
allows generation of the header files. Once they become generated,
the license issue goes away. None of these files have much creative
content anyway, and they certainly don't need to have what little
creative content they have if it were included as part of a
machine-parseable file.
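
One shape the "single source, generate what we need" idea could take, purely as a sketch: a trivial one-flag-per-line data file (a format invented here for illustration, not anything in the spec) from which the C header is generated mechanically.

```shell
# Invented machine-parseable source: one "NAME VALUE" pair per line.
cat > gpio.flags <<'EOF'
GPIO_ACTIVE_HIGH 0
GPIO_ACTIVE_LOW 1
EOF
# Generate the dt-bindings-style header from it; no creative content left.
{
  echo '#ifndef _DT_BINDINGS_GPIO_GPIO_H'
  echo '#define _DT_BINDINGS_GPIO_GPIO_H'
  awk '{ printf "#define %s %s\n", $1, $2 }' gpio.flags
  echo '#endif'
} > gpio.h
cat gpio.h
```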

>> I'm also wondering where the larger issue of using cpp to process the dts
>> files will be discussed, since FreeBSD's BSDL dtc suffers interoperability
>> due to this issue. Having the formal spec will also be helpful for its care and
>> feeding since many fine points have had to be decided based on .dts
>> files in the wild rather than a clear spec.
>>
>> Thanks again for spear-heading the effort to get a new version out now
>> that ePAPR has fallen on hard times.
>>
>> Warner
>>
>> P.S. I'm mostly a FreeBSD guy, but just spent some time digging into this
>> issue for another of the BSDs that's considering adopting DTS files.
>
> We certainly need and want the BSD folks involved in the spec.

Excellent! There are many people who are quite interested.

Warner

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-06-14  7:06 Raphael Poggi
@ 2016-06-24  8:17 ` Raphaël Poggi
  2016-06-24 11:49   ` Re: Sascha Hauer
  0 siblings, 1 reply; 1546+ messages in thread
From: Raphaël Poggi @ 2016-06-24  8:17 UTC (permalink / raw)
  To: barebox, Sascha Hauer

Hi Sascha,

Besides the comments on [PATCH 01/12] and [PATCH 03/12], do you have
any comments about the series? I have a v3 series ready to be sent
(with your recent suggestions).

Thanks,
Raphaël

2016-06-14 9:06 GMT+02:00 Raphael Poggi <poggi.raph@gmail.com>:
> Change since v1:
>         PATCH 2/12:     remove hunk which belongs to patch adding mach-qemu
>
>         PATCH 3/12:     remove unused files
>
>         PATCH 4/12:     create lowlevel64
>
>         PATCH 11/12:    create pgtables64 (nothing in common with the arm32 version)
>
>         PATCH 12/12:    rename "mach-virt" => "mach-qemu"
>                         rename board "qemu_virt64"
>                         remove board env files
>
>
> Hello,
>
> This patch series introduces a basic support for arm64.
>
> The arm64 code is merged in the current arch/arm directory.
> I try to be iterative in the merge process, and find correct solutions
> to handle both architectures in some places.
>
> I tested the patch series by compiling the arm64 virt machine and the arm32
> vexpress-a9, running them in qemu; everything seems to work.
>
> Thanks,
> Raphaël
>
>  arch/arm/Kconfig                           |  28 ++
>  arch/arm/Makefile                          |  30 +-
>  arch/arm/boards/Makefile                   |   1 +
>  arch/arm/boards/qemu-virt64/Kconfig        |   8 +
>  arch/arm/boards/qemu-virt64/Makefile       |   1 +
>  arch/arm/boards/qemu-virt64/init.c         |  67 ++++
>  arch/arm/configs/qemu_virt64_defconfig     |  55 +++
>  arch/arm/cpu/Kconfig                       |  29 +-
>  arch/arm/cpu/Makefile                      |  26 +-
>  arch/arm/cpu/cache-armv8.S                 | 168 +++++++++
>  arch/arm/cpu/cache.c                       |  19 +
>  arch/arm/cpu/cpu.c                         |   5 +
>  arch/arm/cpu/cpuinfo.c                     |  58 ++-
>  arch/arm/cpu/exceptions_64.S               | 127 +++++++
>  arch/arm/cpu/interrupts.c                  |  47 +++
>  arch/arm/cpu/lowlevel_64.S                 |  40 ++
>  arch/arm/cpu/mmu.h                         |  54 +++
>  arch/arm/cpu/mmu_64.c                      | 333 +++++++++++++++++
>  arch/arm/cpu/start.c                       |   2 +
>  arch/arm/include/asm/bitops.h              |   5 +
>  arch/arm/include/asm/cache.h               |   9 +
>  arch/arm/include/asm/mmu.h                 |  14 +-
>  arch/arm/include/asm/pgtable64.h           | 140 +++++++
>  arch/arm/include/asm/system.h              |  46 ++-
>  arch/arm/include/asm/system_info.h         |  38 ++
>  arch/arm/lib64/Makefile                    |  10 +
>  arch/arm/lib64/armlinux.c                  | 275 ++++++++++++++
>  arch/arm/lib64/asm-offsets.c               |  16 +
>  arch/arm/lib64/barebox.lds.S               | 125 +++++++
>  arch/arm/lib64/bootm.c                     | 572 +++++++++++++++++++++++++++++
>  arch/arm/lib64/copy_template.S             | 192 ++++++++++
>  arch/arm/lib64/div0.c                      |  27 ++
>  arch/arm/lib64/memcpy.S                    |  74 ++++
>  arch/arm/lib64/memset.S                    | 215 +++++++++++
>  arch/arm/lib64/module.c                    |  98 +++++
>  arch/arm/mach-qemu/Kconfig                 |  18 +
>  arch/arm/mach-qemu/Makefile                |   2 +
>  arch/arm/mach-qemu/include/mach/debug_ll.h |  24 ++
>  arch/arm/mach-qemu/include/mach/devices.h  |  13 +
>  arch/arm/mach-qemu/virt_devices.c          |  30 ++
>  arch/arm/mach-qemu/virt_lowlevel.c         |  19 +
>  41 files changed, 3044 insertions(+), 16 deletions(-)
>
>
> _______________________________________________
> barebox mailing list
> barebox@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/barebox

_______________________________________________
barebox mailing list
barebox@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/barebox

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-06-24  8:17 ` Raphaël Poggi
@ 2016-06-24 11:49   ` Sascha Hauer
  0 siblings, 0 replies; 1546+ messages in thread
From: Sascha Hauer @ 2016-06-24 11:49 UTC (permalink / raw)
  To: Raphaël Poggi; +Cc: barebox

Hi Raphaël,

On Fri, Jun 24, 2016 at 10:17:45AM +0200, Raphaël Poggi wrote:
> Hi Sascha,
> 
> Beside the comments on [PATCH 01/12] and [PATCH 03/12], do you have
> any comments about the series ? I have a v3 series ready to be sent
> (with your recent suggestions).

No more comments for now, go ahead with the new series.

Sascha


-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

_______________________________________________
barebox mailing list
barebox@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/barebox

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2016-07-15 18:16 Arnold Zeigler
  0 siblings, 0 replies; 1546+ messages in thread
From: Arnold Zeigler @ 2016-07-15 18:16 UTC (permalink / raw)
  To: sparclinux



 Hello Friend,

 I'm sorry to reach out in this manner but I had no choice other than this. My
 name and contact can be seen below. I would like to discuss a partnership
 with you. I expect your response so I can send more details.

 Regards,
 Arnold Zeigler


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-09-01  2:02 Fennec Fox
@ 2016-09-01  3:10 ` Jeff Mahoney
  2016-09-01 19:32   ` Re: Kai Krakow
  0 siblings, 1 reply; 1546+ messages in thread
From: Jeff Mahoney @ 2016-09-01  3:10 UTC (permalink / raw)
  To: Fennec Fox, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 1087 bytes --]

On 8/31/16 10:02 PM, Fennec Fox wrote:
> Linux Titanium 4.7.2-1-MANJARO #1 SMP PREEMPT Sun Aug 21 15:04:37 UTC
> 2016 x86_64 GNU/Linux
> btrfs-progs v4.7
> 
> Data, single: total=30.01GiB, used=18.95GiB
> System, single: total=4.00MiB, used=16.00KiB
> Metadata, single: total=1.01GiB, used=422.17MiB
> GlobalReserve, single: total=144.00MiB, used=0.00B
> 
> {02:50} Wed Aug 31
> [fennectech@Titanium ~]$  sudo fstrim -v /
> [sudo] password for fennectech:
> Sorry, try again.
> [sudo] password for fennectech:
> /: 99.8 GiB (107167244288 bytes) trimmed
> 
> {03:08} Wed Aug 31
> [fennectech@Titanium ~]$  sudo fstrim -v /
> [sudo] password for fennectech:
> /: 99.9 GiB (107262181376 bytes) trimmed
> 
>   I ran these commands minutes after each other, and each time it
> trims the entire free space.
> 
> Anyone else seen this?  The filesystem is the root FS and is compressed.
> 

Yes.  It's working as intended.  We don't track what space has already
been trimmed anywhere, so it trims all unallocated space.

-Jeff

-- 
Jeff Mahoney
SUSE Labs


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 827 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-09-01  3:10 ` Jeff Mahoney
@ 2016-09-01 19:32   ` Kai Krakow
  0 siblings, 0 replies; 1546+ messages in thread
From: Kai Krakow @ 2016-09-01 19:32 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 1823 bytes --]

Am Wed, 31 Aug 2016 23:10:13 -0400
schrieb Jeff Mahoney <jeffm@suse.com>:

> On 8/31/16 10:02 PM, Fennec Fox wrote:
> > Linux Titanium 4.7.2-1-MANJARO #1 SMP PREEMPT Sun Aug 21 15:04:37
> > UTC 2016 x86_64 GNU/Linux
> > btrfs-progs v4.7
> > 
> > Data, single: total=30.01GiB, used=18.95GiB
> > System, single: total=4.00MiB, used=16.00KiB
> > Metadata, single: total=1.01GiB, used=422.17MiB
> > GlobalReserve, single: total=144.00MiB, used=0.00B
> > 
> > {02:50} Wed Aug 31
> > [fennectech@Titanium ~]$  sudo fstrim -v /
> > [sudo] password for fennectech:
> > Sorry, try again.
> > [sudo] password for fennectech:
> > /: 99.8 GiB (107167244288 bytes) trimmed
> > 
> > {03:08} Wed Aug 31
> > [fennectech@Titanium ~]$  sudo fstrim -v /
> > [sudo] password for fennectech:
> > /: 99.9 GiB (107262181376 bytes) trimmed
> > 
> >   I ran these commands minutes after each other, and each time it
> > trims the entire free space.
> > 
> > Anyone else seen this?  The filesystem is the root FS and is
> > compressed.
> 
> Yes.  It's working as intended.  We don't track what space has already
> been trimmed anywhere, so it trims all unallocated space.

I wonder, does it work in a multi-device scenario, when btrfs pools
multiple devices together?

I ask because fstrim seems to always report the estimated free space,
not the raw free space, as trimmed.

OTOH, this may simply be because btrfs reports 1.08 TiB unallocated
while fstrim reports 1.2 TB trimmed (and not TiB) - which when
"converted" (1.08 * 1024^4 / 1000^4 ~= 1.18) perfectly rounds to 1.2.
Coincidentally, the estimated free space is 1.19 TiB for me (which would
also round to 1.2), and these numbers, as they are in the TB range, won't
change so fast for me.
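The "conversion" described above checks out; a few lines make the arithmetic explicit (the 1.08 TiB figure is the one quoted in this thread):

```python
TIB = 1024 ** 4  # tebibyte in bytes, the unit btrfs reports
TB = 1000 ** 4   # terabyte in bytes, the unit fstrim reports

unallocated_tib = 1.08           # btrfs-reported unallocated space
trimmed_tb = unallocated_tib * TIB / TB

print(round(trimmed_tb, 2))  # ~1.19, which indeed rounds to the 1.2 TB fstrim prints
```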


-- 
Regards,
Kai

Replies to list-only preferred.

[-- Attachment #2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2016-09-10 21:51 Michelle Ouellette
  0 siblings, 0 replies; 1546+ messages in thread
From: Michelle Ouellette @ 2016-09-10 21:51 UTC (permalink / raw)
  To: barebox

Hello,My name is Gloria Mackenzie, i have a donation to make for less privilege, am writing you with a friend's email, please contact me on gloria.mackenzie001@rogers.com

_______________________________________________
barebox mailing list
barebox@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/barebox

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-09-27 16:50 Rajat Jain
@ 2016-09-27 16:57 ` Rajat Jain
  0 siblings, 0 replies; 1546+ messages in thread
From: Rajat Jain @ 2016-09-27 16:57 UTC (permalink / raw)
  To: Amitkumar Karwar, Nishant Sarmukadam, Kalle Valo, linux-wireless,
	netdev
  Cc: Wei-Ning Huang, Brian Norris, Eric Caruso, Rajat Jain, Rajat Jain

Please ignore, not sure why this landed without a subject.

On Tue, Sep 27, 2016 at 9:50 AM, Rajat Jain <rajatja@google.com> wrote:
> From: Wei-Ning Huang <wnhuang@google.com>
>
> Date: Thu, 17 Mar 2016 11:43:16 +0800
> Subject: [PATCH] mwifiex: report wakeup for wowlan
>
> Enable notifying the PM core of the wakeup source. This allows darkresume
> to correctly track the wakeup source and mark mwifiex_plt as an 'automatic'
> wakeup source.
>
> Signed-off-by: Wei-Ning Huang <wnhuang@google.com>
> Signed-off-by: Rajat Jain <rajatja@google.com>
> Tested-by: Wei-Ning Huang <wnhuang@chromium.org>
> Reviewed-by: Eric Caruso <ejcaruso@chromium.org>
> ---
>  drivers/net/wireless/marvell/mwifiex/sdio.c | 8 ++++++++
>  drivers/net/wireless/marvell/mwifiex/sdio.h | 1 +
>  2 files changed, 9 insertions(+)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
> index d3e1561..a5f63e4 100644
> --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> @@ -89,6 +89,9 @@ static irqreturn_t mwifiex_wake_irq_wifi(int irq, void *priv)
>                 disable_irq_nosync(irq);
>         }
>
> +       /* Notify PM core we are wakeup source */
> +       pm_wakeup_event(cfg->dev, 0);
> +
>         return IRQ_HANDLED;
>  }
>
> @@ -112,6 +115,7 @@ static int mwifiex_sdio_probe_of(struct device *dev, struct sdio_mmc_card *card)
>                                           GFP_KERNEL);
>         cfg = card->plt_wake_cfg;
>         if (cfg && card->plt_of_node) {
> +               cfg->dev = dev;
>                 cfg->irq_wifi = irq_of_parse_and_map(card->plt_of_node, 0);
>                 if (!cfg->irq_wifi) {
>                         dev_dbg(dev,
> @@ -130,6 +134,10 @@ static int mwifiex_sdio_probe_of(struct device *dev, struct sdio_mmc_card *card)
>                 }
>         }
>
> +       ret = device_init_wakeup(dev, true);
> +       if (ret)
> +               dev_err(dev, "fail to init wakeup for mwifiex");
> +
>         return 0;
>  }
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.h b/drivers/net/wireless/marvell/mwifiex/sdio.h
> index db837f1..07cdd23 100644
> --- a/drivers/net/wireless/marvell/mwifiex/sdio.h
> +++ b/drivers/net/wireless/marvell/mwifiex/sdio.h
> @@ -155,6 +155,7 @@
>  } while (0)
>
>  struct mwifiex_plt_wake_cfg {
> +       struct device *dev;
>         int irq_wifi;
>         bool wake_by_wifi;
>  };
> --
> 2.8.0.rc3.226.g39d4020
>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-11-06 21:00 (unknown), Dennis Dataopslag
@ 2016-11-07 16:50 ` Wols Lists
  2016-11-07 17:13   ` Re: Wols Lists
  2016-11-17 20:33 ` Re: Dennis Dataopslag
  1 sibling, 1 reply; 1546+ messages in thread
From: Wols Lists @ 2016-11-07 16:50 UTC (permalink / raw)
  To: Dennis Dataopslag, linux-raid

On 06/11/16 21:00, Dennis Dataopslag wrote:
> Help wanted very much!

Quick response ...
> 
> My setup:
> Thecus N5550 NAS with 5 1TB drives installed.
> 
> MD0: RAID 5 config of 4 drives (SD[ABCD]2)
> MD10: RAID 1 config of all 5 drives (SD..1), system generated array
> MD50: RAID 1 config of 4 drives (SD[ABCD]3), system generated array
> 
> 1 drive (SDE) set as global hot spare.
> 
Bit late now, but you would probably have been better with raid-6.
> 
> What happened:
> This weekend I thought it might be a good idea to do a SMART test for
> the drives in my NAS.
> I started the test on 1 drive and after it ran for a while I started
> the other ones.
> While the test was running drive 3 failed. I got a message the RAID
> was degraded and started rebuilding. (My assumption is that at this
> moment the global hot spare will automatically be added to the array)
> 
> I stopped the SMART tests of all drives at this moment since it seemed
> logical to me the SMART test (or the outcomes) made the drive fail.
> In stopping the tests, drive 1 also failed!!
> I left it alone for a little while, but the admin interface kept telling
> me it was degraded and did not seem to take any action to start rebuilding.

It can't - there's no spare drive to rebuild on, and there aren't enough
drives to build a working array.

> At this point I started googling and found I should remove and reseat
> the drives. This is also what I did, but nothing seemed to happen.
> They turned up as new drives in the admin interface and I re-added them
> to the array, where they were added as spares.
> Even after adding them the array didn't start rebuilding.
> I checked the state in mdadm and it told me "clean, FAILED" as opposed to
> the "degraded" in the admin interface.

Yup. You've only got two drives of a four-drive raid 5.

Where did you google? Did you read the linux raid wiki?

https://raid.wiki.kernel.org/index.php/Linux_Raid
> 
> I rebooted the NAS since it didn't seem to be doing anything I might interrupt.
> After rebooting, it seemed as if the entire array had disappeared!!
> I started looking for options in MDADM and tried every "normal" option
> to rebuild the array (--assemble --scan, for example).
> Unfortunately I cannot produce a complete list since I cannot find how
> to get it from the logging.
> 
> Finally I mdadm --create a new array with the original 4 drives with
> all the right settings. (Got them from 1 of the original volumes)

OUCH OUCH OUCH!

Are you sure you've got the right settings? A lot of "hidden" settings
have changed their values over the years. Do you know which mdadm was
used to create the array in the first place?

> The creation worked but after creation it doesn't seem to have a valid
> partition table. This is the point where I realized I probably fucked
> it up big-time and should call in the help squad!!!
> What I think went wrong is that I re-created an array with the
> original 4 drives from before the first failure but the hot-spare was
> already added?

Nope. You've probably used a newer version of mdadm. That's assuming the
array is still all the original drives. If some of them have been
replaced you've got a still messier problem.
> 
> The most important data from the array is saved in an offline backup
> luckily but I would very much like it if there is any way I could
> restore the data from the array.
> 
> Is there any way I could get it back online?

You're looking at a big forensic job. I've moved the relevant page to
the archaeology area - probably a bit too soon - but you need to read
the following page

https://raid.wiki.kernel.org/index.php/Reconstruction

Especially the bit about overlays. And wait for the experts to chime in
about how to do a hexdump and work out the values you need to pass to
mdadm to get the array back. It's a lot of work and you could be looking
at a week what with the delays as you wait for replies.

I think it's recoverable. Is it worth it?

Cheers,
Wol

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-11-07 16:50 ` Wols Lists
@ 2016-11-07 17:13   ` Wols Lists
  0 siblings, 0 replies; 1546+ messages in thread
From: Wols Lists @ 2016-11-07 17:13 UTC (permalink / raw)
  To: Dennis Dataopslag, linux-raid

On 07/11/16 16:50, Wols Lists wrote:
> You're looking at a big forensic job. I've moved the relevant page to
> the archaeology area - probably a bit too soon - but you need to read
> the following page
> 
> https://raid.wiki.kernel.org/index.php/Reconstruction
> 
> Especially the bit about overlays. And wait for the experts to chime in
> about how to do a hexdump and work out the values you need to pass to
> mdadm to get the array back. It's a lot of work and you could be looking
> at a week what with the delays as you wait for replies.

Whoops, sorry. Wrong page, you need this one ...

https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID

Cheers,
Wol

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-11-09 17:55 bepi
@ 2016-11-10  6:57 ` Alex Powell
  2016-11-10 13:00   ` Re: bepi
  0 siblings, 1 reply; 1546+ messages in thread
From: Alex Powell @ 2016-11-10  6:57 UTC (permalink / raw)
  To: bepi; +Cc: linux-btrfs

Hi,
It would be good, but perhaps each task should be run via its own cronjob
instead of having a script running all the time, or one script via one
cronjob.

Working in an enterprise environment for a major bank, we quickly
learn that these sorts of daily tasks should be split up.

Kind Regards,
Alex

On Thu, Nov 10, 2016 at 4:25 AM,  <bepi@adria.it> wrote:
> Hi.
>
> I'm making a script for managing btrfs:
>
> To perform scrubs, and to create and send backup snapshots (even to a
> remote system), or a copy of the current state of the data.
>
> The script is designed to:
> - Be easy to use:
>   - The preparation is carried out automatically.
>   - Autodetection of the mounted subvolumes.
> - Be safe and robust:
>   - Check that another btrfs management run has not already been started.
>   - Subvolumes for created and received snapshots are mounted and accessible
>     only for the time necessary to perform the requested operation.
>   - Verify that snapshot creation and snapshot sending completed fully.
>   - Progressive numbering of the snapshots, to identify the latest
>     snapshot with certainty.
>
> Commands are also available to view the list of snapshots present and to
> delete snapshots.
>
> For example:
>
> btrsfManage SCRUB /
> btrsfManage SNAPSHOT /
> btrsfManage SEND / /dev/sda1
> btrsfManage SEND / root@gdb.exnet.it/dev/sda1
> btrsfManage SNAPLIST /dev/sda1
> btrsfManage SNAPDEL /dev/sda1 "root-2016-11*"
>
> You are interested?
>
> Gdb
>
>
> ----------------------------------------------------
> This mail has been sent using Alpikom webmail system
> http://www.alpikom.it
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-11-10  6:57 ` Alex Powell
@ 2016-11-10 13:00   ` bepi
  0 siblings, 0 replies; 1546+ messages in thread
From: bepi @ 2016-11-10 13:00 UTC (permalink / raw)
  To: Alex Powell; +Cc: linux-btrfs

Hi.

P.S. Sorry for the double sending and for the blank email subject.


Yes.
The various commands are designed to be used separately, and to be launched
both from cronjobs and manually.

For example, you can create a series of snapshots

  btrsfManage SNAPSHOT /

and send the new snapshots (incremental stream)

  btrsfManage SEND / /dev/sda1

either from cronjobs or manually; it makes no difference.
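The progressive-numbering idea described earlier in this thread (identifying the latest snapshot by its numeric suffix rather than by plain string order) can be sketched as follows; the snapshot names here are hypothetical, not btrsfManage's actual naming scheme:

```python
import re

# Hypothetical snapshot names using progressive numbering (an assumption
# for illustration, not the script's real naming scheme):
snapshots = ["root-2", "root-9", "root-10"]

def latest(names):
    # Compare the numeric suffix: a plain string sort would wrongly rank
    # "root-9" above "root-10".
    return max(names, key=lambda n: int(re.search(r'(\d+)$', n).group(1)))

print(latest(snapshots))  # root-10
```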


Best regards.

Gdb

Scrive Alex Powell <alexj.powellalt@googlemail.com>:

> Hi,
> It would be good, but perhaps each task should be run via its own cronjob
> instead of having a script running all the time, or one script via one
> cronjob.
> 
> Working in an enterprise environment for a major bank, we quickly
> learn that these sorts of daily tasks should be split up.
> 
> Kind Regards,
> Alex
> 
> On Thu, Nov 10, 2016 at 4:25 AM,  <bepi@adria.it> wrote:
> > Hi.
> >
> > I'm making a script for managing btrfs:
> >
> > To perform scrubs, and to create and send backup snapshots (even to a
> > remote system), or a copy of the current state of the data.
> >
> > The script is designed to:
> > - Be easy to use:
> >   - The preparation is carried out automatically.
> >   - Autodetection of the mounted subvolumes.
> > - Be safe and robust:
> >   - Check that another btrfs management run has not already been started.
> >   - Subvolumes for created and received snapshots are mounted and
> >     accessible only for the time necessary to perform the requested
> >     operation.
> >   - Verify that snapshot creation and snapshot sending completed fully.
> >   - Progressive numbering of the snapshots, to identify the latest
> >     snapshot with certainty.
> >
> > Commands are also available to view the list of snapshots present and to
> > delete snapshots.
> >
> > For example:
> >
> > btrsfManage SCRUB /
> > btrsfManage SNAPSHOT /
> > btrsfManage SEND / /dev/sda1
> > btrsfManage SEND / root@gdb.exnet.it/dev/sda1
> > btrsfManage SNAPLIST /dev/sda1
> > btrsfManage SNAPDEL /dev/sda1 "root-2016-11*"
> >
> > You are interested?
> >
> > Gdb
> >
> >
> > ----------------------------------------------------
> > This mail has been sent using Alpikom webmail system
> > http://www.alpikom.it
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 




----------------------------------------------------
This mail has been sent using Alpikom webmail system
http://www.alpikom.it


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* re:
@ 2016-11-15  4:40 Apply
  0 siblings, 0 replies; 1546+ messages in thread
From: Apply @ 2016-11-15  4:40 UTC (permalink / raw)
  To: Recipients

Do you need loan?we offer all kinds of loan from minimum amount of $5,000 to maximum of $2,000,000 if you are interested contact us via:internationalloanplc1@gmail.com  with the information below:
Full Name:
Country:
Loan Amount:
Loan Duration:
Mobile phone number:
Sex:
Thanks,
Dr Scott.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-11-06 21:00 (unknown), Dennis Dataopslag
  2016-11-07 16:50 ` Wols Lists
@ 2016-11-17 20:33 ` Dennis Dataopslag
  2016-11-17 22:12   ` Re: Wols Lists
  1 sibling, 1 reply; 1546+ messages in thread
From: Dennis Dataopslag @ 2016-11-17 20:33 UTC (permalink / raw)
  To: linux-raid

Cheers for the reaction, and sorry for my late response; I've been away
on business.

Trying to rebuild this RAID is definitely worth it for me. The
learning experience alone already makes it worthwhile.

I did read the wiki page and tried several steps that are on there, but
they didn't seem to get me out of trouble.

I used this information from the drive; obviously I didn't search for
any "hidden" settings:
" Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 36fdeb4b:c5360009:0958ad1e:17da451b
           Name : TRD106:0  (local to host TRD106)
  Creation Time : Fri Oct 10 12:27:27 2014
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 1948250112 (929.00 GiB 997.50 GB)
     Array Size : 5844750336 (2786.99 GiB 2992.51 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : b49e2752:d37dac6c:8764c52a:372277bd

    Update Time : Sat Nov  5 14:40:33 2016
       Checksum : d47a9ad4 - correct
         Events : 14934

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing)"

Anybody that can give me a little extra push?
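For anyone cross-checking a dump like the one above, the key fields can be pulled out and sanity-checked programmatically. This is a rough sketch, not an mdadm tool; the field names simply follow the --examine output quoted above, and the check is the RAID5 size identity (n-1 devices carry data):

```python
import re

def parse_examine(text):
    """Pull "Field : value" pairs out of an mdadm --examine dump (sketch)."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r'\s*([A-Za-z ]+?)\s*:\s*(.+)', line)
        if m:
            fields[m.group(1).strip()] = m.group(2).strip()
    return fields

# The relevant fields from the dump above:
dump = """\
     Raid Level : raid5
   Raid Devices : 4
 Avail Dev Size : 1948250112 (929.00 GiB 997.50 GB)
     Array Size : 5844750336 (2786.99 GiB 2992.51 GB)
     Chunk Size : 64K
"""

f = parse_examine(dump)
devs = int(f['Raid Devices'])
avail = int(f['Avail Dev Size'].split()[0])
size = int(f['Array Size'].split()[0])

# RAID5 keeps one device's worth of parity, so n-1 devices carry data.
# If this identity does not hold, the superblock values are suspect.
print((devs - 1) * avail == size)
```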


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2016-11-17 20:33 ` Re: Dennis Dataopslag
@ 2016-11-17 22:12   ` Wols Lists
  0 siblings, 0 replies; 1546+ messages in thread
From: Wols Lists @ 2016-11-17 22:12 UTC (permalink / raw)
  To: Dennis Dataopslag, linux-raid

On 17/11/16 20:33, Dennis Dataopslag wrote:
> CHeers for the reaction and sorry for my late response, I've been out
> for business.
> 
> Trying to rebuild this RAID is definately worth it for me. The
> learning experience alone already makes it worth.
> 
> I did read the wiki page and tried several steps that are on there but
> it didn't seem to get me out of trouble.
> 
> I used this information from the drive, obviously didn't search for
> any "hidden" settings:
> " Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x0
>      Array UUID : 36fdeb4b:c5360009:0958ad1e:17da451b
>            Name : TRD106:0  (local to host TRD106)
>   Creation Time : Fri Oct 10 12:27:27 2014
>      Raid Level : raid5
>    Raid Devices : 4
> 
>  Avail Dev Size : 1948250112 (929.00 GiB 997.50 GB)
>      Array Size : 5844750336 (2786.99 GiB 2992.51 GB)
>     Data Offset : 2048 sectors
>    Super Offset : 8 sectors
>           State : clean
>     Device UUID : b49e2752:d37dac6c:8764c52a:372277bd
> 
>     Update Time : Sat Nov  5 14:40:33 2016
>        Checksum : d47a9ad4 - correct
>          Events : 14934
> 
>          Layout : left-symmetric
>      Chunk Size : 64K
> 
>    Device Role : Active device 0
>    Array State : AAAA ('A' == active, '.' == missing)"
> 
> Anybody that can give me a little extra push?
> 
Others will be able to help better than me, but you might want to look
for the thread "RAID10 with 2 drives auto-assembled as RAID1".

This will give you some information about how to run hexdump and find
where your filesystems are on the array.
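To make that concrete, here is a rough sketch of the idea (my assumption of the approach, not text from the wiki thread): an ext2/3/4 superblock starts 1024 bytes into the filesystem, with its little-endian magic 0xEF53 at byte 56 within it, so scanning a raw image for that magic at 1080 bytes past candidate offsets suggests where filesystems begin:

```python
import struct

# ext2/3/4: superblock at +1024, s_magic (0xEF53, little-endian) at +56
# within the superblock, i.e. 1080 bytes past the filesystem start.
EXT_MAGIC_OFFSET = 1024 + 56

def find_ext_superblocks(image, step=512):
    """Scan a raw byte buffer for plausible ext filesystem start offsets."""
    hits = []
    for off in range(0, len(image) - EXT_MAGIC_OFFSET - 2, step):
        magic = struct.unpack_from('<H', image, off + EXT_MAGIC_OFFSET)[0]
        if magic == 0xEF53:
            hits.append(off)
    return hits

# Fake "array" image with one magic planted 4096 bytes in:
img = bytearray(64 * 1024)
struct.pack_into('<H', img, 4096 + EXT_MAGIC_OFFSET, 0xEF53)
print(find_ext_superblocks(img))  # [4096]
```

On a real recovery you would run this over an overlay, never the raw member devices.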

There's plenty of other threads with this sort of information, but this
will give you a starting point. If Phil Turmel sees this, he'll chime in
with better detail.

Cheers,
Wol


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2017-02-16 19:41 simran singhal
@ 2017-02-16 19:44 ` SIMRAN SINGHAL
  0 siblings, 0 replies; 1546+ messages in thread
From: SIMRAN SINGHAL @ 2017-02-16 19:44 UTC (permalink / raw)
  To: outreachy-kernel


[-- Attachment #1.1: Type: text/plain, Size: 1333 bytes --]



On Friday, February 17, 2017 at 1:11:19 AM UTC+5:30, SIMRAN SINGHAL wrote:
>
> linux-kernel@vger.kernel.org 
> Bcc: 
> Subject: [PATCH 1/3] staging: rtl8192u: Replace symbolic permissions with 
>  octal permissions 
> Reply-To: 
>
> WARNING: Symbolic permissions 'S_IRUGO | S_IWUSR' are not preferred. 
> Consider using octal permissions '0644'. 
> This warning is detected by checkpatch.pl 
>
> Signed-off-by: simran singhal <singhalsimran0@gmail.com> 
> --- 
>  drivers/staging/rtl8192u/ieee80211/ieee80211_module.c | 2 +- 
>  1 file changed, 1 insertion(+), 1 deletion(-) 
>
> diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211_module.c 
> b/drivers/staging/rtl8192u/ieee80211/ieee80211_module.c 
> index a9a92d8..2ebc320 100644 
> --- a/drivers/staging/rtl8192u/ieee80211/ieee80211_module.c 
> +++ b/drivers/staging/rtl8192u/ieee80211/ieee80211_module.c 
> @@ -283,7 +283,7 @@ int __init ieee80211_debug_init(void) 
>                                  " proc directory\n"); 
>                  return -EIO; 
>          } 
> -        e = proc_create("debug_level", S_IRUGO | S_IWUSR, 
> +        e = proc_create("debug_level", 0644, 
>                                ieee80211_proc, &fops); 
>          if (!e) { 
>                  remove_proc_entry(DRV_NAME, init_net.proc_net); 
> -- 
> 2.7.4 
>
>
Sorry, Ignore this. 

[-- Attachment #1.2: Type: text/html, Size: 2679 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2017-02-23 15:09 Qin's Yanjun
  0 siblings, 0 replies; 1546+ messages in thread
From: Qin's Yanjun @ 2017-02-23 15:09 UTC (permalink / raw)
  To: kernel-janitors


How are you today and your family? I require your attention and honest
co-operation about some issues which i will really want to discuss with you
which.  Looking forward to read from you soon.  

Qin's


______________________________

Sky Silk, http://aknet.kz


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2017-03-19 15:00 Ilan Schwarts
@ 2017-03-23 17:12 ` Jeff Mahoney
  0 siblings, 0 replies; 1546+ messages in thread
From: Jeff Mahoney @ 2017-03-23 17:12 UTC (permalink / raw)
  To: Ilan Schwarts, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 1289 bytes --]

On 3/19/17 11:00 AM, Ilan Schwarts wrote:
> Hi,
> sorry if this is a newbie question. I am newbie.
> 
> In my kernel driver, I get device id by converting struct inode struct
> to btrfs_inode, I use the code:
> struct btrfs_inode *btrfsInode;
> btrfsInode = BTRFS_I(inode);
> 
> I usually download kernel-headers rpm package, this is not enough. it
> fails to find the btrfs header files.
> 
> I had to download them not via rpm package and declare:
> #include "/data/kernel/linux-4.1.21-x86_64/fs/btrfs/ctree.h"
> #include "/data/kernel/linux-4.1.21-x86_64/fs/btrfs/btrfs_inode.h"
> 
> This is not good, why ctree.h and btrfs_inode.h are not in kernel headers?
> Is there another package i need to download in order to get them, in
> addition to kernel-headers? ?
> 
> 
> I see they are not provided in kernel-header package, e.g:
> https://rpmfind.net/linux/RPM/fedora/23/x86_64/k/kernel-headers-4.2.3-300.fc23.x86_64.html

I don't know what Fedora package you'd use, but the core problem is that
you're trying to use internal structures in an external module.  We've
properly exported the constants and structures required for userspace to
interact with btrfs, but there are no plans to export internal structures.

-Jeff

-- 
Jeff Mahoney
SUSE Labs


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-04-01  5:31 USPS Delivery
  0 siblings, 0 replies; 1546+ messages in thread
From: USPS Delivery @ 2017-04-01  5:31 UTC (permalink / raw)
  To: linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw

Hello,
Your item has arrived at Sat, 01 Apr 2017 06:31:34 +0100, but our courier
was not able to deliver the parcel. 
Review the document that is attached to this e-mail!

Most sincerely.
Gertruda Hendry -  USPS Mail Delivery Agent.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-04-11 14:37 USPS Priority Delivery
  0 siblings, 0 replies; 1546+ messages in thread
From: USPS Priority Delivery @ 2017-04-11 14:37 UTC (permalink / raw)
  To: linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw

Hello,

We can not deliver your parcel arrived at  Tue, 11 Apr 2017 15:37:40 +0100.

Please click on the link for more details.
http://uspswoiugue62677104.ideliverys.com/iq5866671

With anticipation.
Sophie Wadkins -  USPS Chief Office Manager.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] ` <CAK2H+efb3iKA5P3yd7uRqJomci6ENvrB1JRBBmtQEpEvyPMe7w@mail.gmail.com>
@ 2017-04-13 16:38   ` Scott Ellentuch
  0 siblings, 0 replies; 1546+ messages in thread
From: Scott Ellentuch @ 2017-04-13 16:38 UTC (permalink / raw)
  To: Mark Knecht; +Cc: Linux-RAID

DOH! Stared at it for a while... Thanks.

Tuc

On Thu, Apr 13, 2017 at 12:22 PM, Mark Knecht <markknecht@gmail.com> wrote:
>
>
> On Thu, Apr 13, 2017 at 8:58 AM, Scott Ellentuch <tuctboh@gmail.com> wrote:
>>
>> for disk in a b c d g h i j k l m n
>> do
>>
>>   disklist="${disklist} /dev/sd${disk}1"
>>
>> done
>>
>> mdadm --create --verbose /dev/md2 --level=5 --raid=devices=12  ${disklist}
>>
>> But its telling me :
>>
>> mdadm: invalid number of raid devices: devices=12
>>
>>
>> I can't find any definition of a limit anywhere.
>>
>> Thank you, Tuc
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
> Try
>
> --raid-devices=12
>
> not
>
> --raid=devices=12

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <CALDO+SZPQGmp4VH0LvCh95uXWvwzAoj+wN-rm0pGu5e0wCcyNw@mail.gmail.com>
@ 2017-04-19 18:13 ` Joe Stringer
  0 siblings, 0 replies; 1546+ messages in thread
From: Joe Stringer @ 2017-04-19 18:13 UTC (permalink / raw)
  To: William Tu; +Cc: xdp-newbies

On 19 April 2017 at 11:12, William Tu <u9012063@gmail.com> wrote:
> subscribe xdp-newbies

You'll need to send this to majordomo@vger.kernel.org :-)

http://vger.kernel.org/vger-lists.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2017-04-28  8:20 (unknown), Anatolij Gustschin
@ 2017-04-28  8:43 ` Linus Walleij
  2017-04-28  9:26   ` Re: Anatolij Gustschin
  0 siblings, 1 reply; 1546+ messages in thread
From: Linus Walleij @ 2017-04-28  8:43 UTC (permalink / raw)
  To: Anatolij Gustschin
  Cc: Alexandre Courbot, Andy Shevchenko, linux-gpio@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Fri, Apr 28, 2017 at 10:20 AM, Anatolij Gustschin <agust@denx.de> wrote:

> Subject: [PATCH v3] gpiolib: Add stubs for gpiod lookup table interface
>
> Add stubs for gpiod_add_lookup_table() and gpiod_remove_lookup_table()
> for the !GPIOLIB case to prevent build errors. Also add prototypes.
>
> Signed-off-by: Anatolij Gustschin <agust@denx.de>
> ---
> Changes in v3:
>  - add stubs for !GPIOLIB case. Drop prototypes, these are
>    already in gpio/machine.h

Yeah...

> --- a/include/linux/gpio/consumer.h
> +++ b/include/linux/gpio/consumer.h

So why should the stubs be in <linux/gpio/consumer.h>
and not in <linux/gpio/machine.h>?

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2017-04-28  8:43 ` Linus Walleij
@ 2017-04-28  9:26   ` Anatolij Gustschin
  0 siblings, 0 replies; 1546+ messages in thread
From: Anatolij Gustschin @ 2017-04-28  9:26 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Alexandre Courbot, Andy Shevchenko, linux-gpio@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Fri, 28 Apr 2017 10:43:19 +0200
Linus Walleij <linus.walleij@linaro.org> wrote:
...
>> --- a/include/linux/gpio/consumer.h
>> +++ b/include/linux/gpio/consumer.h  
>
>So why should the stubs be in <linux/gpio/consumer.h>
>and not in <linux/gpio/machine.h>?

good question. I'll move them to machine.h.

Thanks,
Anatolij

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-04-28 18:27 USPS Ground Support
  0 siblings, 0 replies; 1546+ messages in thread
From: USPS Ground Support @ 2017-04-28 18:27 UTC (permalink / raw)
  To: linux-nvdimm-y27Ovi1pjclAfugRpC6u6w

Hello,

Your item has arrived at the USPS Post Office at  Fri, 28 Apr 2017 11:27:27
-0700, but the courier was unable to deliver parcel to you. 
You can download the shipment label attached!

With gratitude.
Lashan Simmering -  USPS Support Manager.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-04-29 22:53 USPS Station Management
  0 siblings, 0 replies; 1546+ messages in thread
From: USPS Station Management @ 2017-04-29 22:53 UTC (permalink / raw)
  To: linux-nvdimm-y27Ovi1pjclAfugRpC6u6w

Hello,

Your item has arrived at the USPS Post Office at  Sat, 29 Apr 2017 15:53:09
-0700, but the courier was unable to deliver parcel to you. 
You can find more details in this e-mail attachment!

With thanks and appreciation.
Fermina Khan -  USPS Support Agent.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-05-03  5:59 H.A
  0 siblings, 0 replies; 1546+ messages in thread
From: H.A @ 2017-05-03  5:59 UTC (permalink / raw)
  To: kernel-janitors

With profound love in my heart, I Kindly Oblige your interest to very important proposal.. It is Truly Divine and require your utmost attention..........

With deep love in my heart, I kindly compel your interest in the proposal.. It is very important, truly Divine, and requires your utmost attention.

  Contact me directly via: helenaroberts99@gmail.com for complete details.


HELINA .A ROBERTS

---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-05-03 11:26 Paul Lopez-Bravo
  0 siblings, 0 replies; 1546+ messages in thread
From: Paul Lopez-Bravo @ 2017-05-03 11:26 UTC (permalink / raw)


--
Hello,

Allow me to make this very important request through this medium due to 
its confidential nature. My name is Mr. Paul Lopez-Bravo, a lawyer in 
Spain. I represent the late Philip, who was a wealthy businessman before 
his death in 2009. I am confiding in you in an urgent matter concerning 
a deposit made by this particular client of mine before his death. I 
seek your consent to authorize me to present you as his heir, so that 
his bank will hand over the sum of $7.5 million (seven million five 
hundred thousand dollars) held in a suspended bank account. His bank 
has issued me, as his lawyer, a final ultimatum to present his heir, 
since the time legally allowed for such a claim has expired; otherwise 
the fund will be confiscated.
The intended transaction will be carried out in a legitimate manner 
that will protect both you and me from any breach of the law. I will 
use my position as the client's attorney to ensure the processing of 
the required legal documentation and the successful completion of this 
transaction. All I ask for is your understanding and honest cooperation 
for its success. Note that after the successful completion of the 
transaction, you will keep 40% of the entire fund after all costs.
 I will give you full details once you confirm your interest.

I hope to hear from you soon.

Kind regards,
Paul Lopez-Bravo Esq
Tel: +34692899384


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <CAMj-D2DO_CfvD77izsGfggoKP45HSC9aD6auUPAYC9Yeq_aX7w@mail.gmail.com>
@ 2017-05-04 16:44 ` gengdongjiu
  0 siblings, 0 replies; 1546+ messages in thread
From: gengdongjiu @ 2017-05-04 16:44 UTC (permalink / raw)
  To: mtsirkin, kvm, Tyler Baicar, qemu-devel, Xiongfeng Wang, ben,
	linux, kvmarm, huangshaoyu, lersek, songwenjun, wuquanming,
	Marc Zyngier, qemu-arm, imammedo, linux-arm-kernel,
	Ard Biesheuvel, pbonzini, James Morse

Dear James,
   Thanks a lot for your review and comments. I am very sorry for the
late response.


2017-05-04 23:42 GMT+08:00 gengdongjiu <gengdj.1984@gmail.com>:
>  Hi Dongjiu Geng,
>
> On 30/04/17 06:37, Dongjiu Geng wrote:
>> When an SEA happens, deliver SIGBUS and handle the ioctl that
>> injects an SEA abort into the guest, so that the guest can handle the SEA error.
>
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 105b6ab..a96594f 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -20,8 +20,10 @@
>> @@ -1238,6 +1240,36 @@ static void coherent_cache_guest_page(struct kvm_vcpu *vcpu, kvm_pfn_t pfn,
>>   __coherent_cache_guest_page(vcpu, pfn, size);
>>  }
>>
>> +static void kvm_send_signal(unsigned long address, bool hugetlb, bool hwpoison)
>> +{
>> + siginfo_t info;
>> +
>> + info.si_signo   = SIGBUS;
>> + info.si_errno   = 0;
>> + if (hwpoison)
>> + info.si_code    = BUS_MCEERR_AR;
>> + else
>> + info.si_code    = 0;
>> +
>> + info.si_addr    = (void __user *)address;
>> + if (hugetlb)
>> + info.si_addr_lsb = PMD_SHIFT;
>> + else
>> + info.si_addr_lsb = PAGE_SHIFT;
>> +
>> + send_sig_info(SIGBUS, &info, current);
>> +}
>> +
>
> Punit reviewed the other version of this patch, this PMD_SHIFT is not the right
> thing to do, it needs a more accurate set of calls and shifts as there may be
> hugetlbfs pages other than PMD_SIZE.
>
> https://www.spinics.net/lists/arm-kernel/msg568919.html
>
> I haven't posted a new version of that patch because I was still hunting a bug
> in the hugepage/hwpoison code, even with Punit's fixes series I see -EFAULT
> returned to userspace instead of this hwpoison code being invoked.

  Ok, got it, thanks for your information.
>
> Please avoid duplicating functionality between patches; it wastes reviewers'
> time, especially when we know there are problems with this approach.
>
>
>> +static void kvm_handle_bad_page(unsigned long address,
>> + bool hugetlb, bool hwpoison)
>> +{
>> + /* handle both hwpoison and other synchronous external Abort */
>> + if (hwpoison)
>> + kvm_send_signal(address, hugetlb, true);
>> + else
>> + kvm_send_signal(address, hugetlb, false);
>> +}
>
> Why the extra level of indirection? We only want to signal userspace like this
> from KVM for hwpoison. Signals for RAS related reasons should come from the bits
> of the kernel that decoded the error.

For SEA, there are mainly two types:
0b010000 Synchronous External Abort on memory access.
0b0101xx Synchronous External Abort on page table walk. DFSC[1:0]
encode the level.

hwpoison should belong to "Synchronous External Abort on memory access".
If the SEA type is not hwpoison, such as a page table walk, do you mean
KVM should not deliver the SIGBUS?
If so, how should KVM handle SEA types other than hwpoison?

>
> (hwpoison for KVM is a corner case as Qemu's memory effectively has two users,
> Qemu and KVM. This isn't the example of how user-space gets signalled.)
>
>
>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> index b37446a..780e3c4 100644
>> --- a/arch/arm64/kvm/guest.c
>> +++ b/arch/arm64/kvm/guest.c
>> @@ -277,6 +277,13 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
>>   return -EINVAL;
>>  }
>>
>> +int kvm_vcpu_ioctl_sea(struct kvm_vcpu *vcpu)
>> +{
>> + kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
>> +
>> + return 0;
>> +}
>
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index bb02909..1d2e2e7 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -1306,6 +1306,7 @@ struct kvm_s390_ucas_mapping {
>>  #define KVM_S390_GET_IRQ_STATE  _IOW(KVMIO, 0xb6, struct kvm_s390_irq_state)
>>  /* Available with KVM_CAP_X86_SMM */
>>  #define KVM_SMI                   _IO(KVMIO,   0xb7)
>> +#define KVM_ARM_SEA               _IO(KVMIO,   0xb8)
>>
>>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
>>  #define KVM_DEV_ASSIGN_PCI_2_3 (1 << 1)
>>
>
> Why do we need a userspace API for SEA? It can also be done by using
> KVM_{G,S}ET_ONE_REG to change the vcpu registers. The advantage of doing it this
> way is you can choose which ESR value to use.
>
> Adding a new API call to do something you could do with an old one doesn't look
> right.

James, I considered your earlier suggestion to use KVM_{G,S}ET_ONE_REG
to change the vcpu registers, but I found no practical difference from
using the already existing KVM API, so changing the vcpu registers in
Qemu may just duplicate the KVM APIs.

Injecting a SEA is no more than setting some registers: elr_el1, PC,
PSTATE, SPSR_el1, far_el1, esr_el1.
I have seen this KVM API do the same thing as Qemu. Did you find that
calling this API has an issue, making it necessary to choose another
ESR value?

I have pasted the already existing KVM API code:

static void inject_abt64(struct kvm_vcpu *vcpu, bool is_iabt, unsigned long addr)
{
	unsigned long cpsr = *vcpu_cpsr(vcpu);
	bool is_aarch32 = vcpu_mode_is_32bit(vcpu);
	u32 esr = 0;

	*vcpu_elr_el1(vcpu) = *vcpu_pc(vcpu);
	*vcpu_pc(vcpu) = get_except_vector(vcpu, except_type_sync);
	*vcpu_cpsr(vcpu) = PSTATE_FAULT_BITS_64;
	*vcpu_spsr(vcpu) = cpsr;

	vcpu_sys_reg(vcpu, FAR_EL1) = addr;

	/*
	 * Build an {i,d}abort, depending on the level and the
	 * instruction set. Report an external synchronous abort.
	 */
	if (kvm_vcpu_trap_il_is32bit(vcpu))
		esr |= ESR_ELx_IL;

	/*
	 * Here, the guest runs in AArch64 mode when in EL1. If we get
	 * an AArch32 fault, it means we managed to trap an EL0 fault.
	 */
	if (is_aarch32 || (cpsr & PSR_MODE_MASK) == PSR_MODE_EL0t)
		esr |= (ESR_ELx_EC_IABT_LOW << ESR_ELx_EC_SHIFT);
	else
		esr |= (ESR_ELx_EC_IABT_CUR << ESR_ELx_EC_SHIFT);

	if (!is_iabt)
		esr |= ESR_ELx_EC_DABT_LOW << ESR_ELx_EC_SHIFT;

	vcpu_sys_reg(vcpu, ESR_EL1) = esr | ESR_ELx_FSC_EXTABT;
}

static void inject_abt32(struct kvm_vcpu *vcpu, bool is_pabt, unsigned long addr)
{
	u32 vect_offset;
	u32 *far, *fsr;
	bool is_lpae;

	if (is_pabt) {
		vect_offset = 12;
		far = &vcpu_cp15(vcpu, c6_IFAR);
		fsr = &vcpu_cp15(vcpu, c5_IFSR);
	} else { /* !iabt */
		vect_offset = 16;
		far = &vcpu_cp15(vcpu, c6_DFAR);
		fsr = &vcpu_cp15(vcpu, c5_DFSR);
	}

	prepare_fault32(vcpu, COMPAT_PSR_MODE_ABT | COMPAT_PSR_A_BIT, vect_offset);

	*far = addr;

	/* Give the guest an IMPLEMENTATION DEFINED exception */
	is_lpae = (vcpu_cp15(vcpu, c2_TTBCR) >> 31);
	if (is_lpae)
		*fsr = 1 << 9 | 0x34;
	else
		*fsr = 0x14;
}


/**
 * kvm_inject_dabt - inject a data abort into the guest
 * @vcpu: The VCPU to receive the undefined exception
 * @addr: The address to report in the DFAR
 *
 * It is assumed that this code is called from the VCPU thread and that the
 * VCPU therefore is not currently executing guest code.
 */
void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr)
{
	if (!(vcpu->arch.hcr_el2 & HCR_RW))
		inject_abt32(vcpu, false, addr);
	else
		inject_abt64(vcpu, false, addr);
}


>
>
> Thanks,
>
> James
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-05-04 23:57 Tammy
  0 siblings, 0 replies; 1546+ messages in thread
From: Tammy @ 2017-05-04 23:57 UTC (permalink / raw)
  To: Recipients

Hello,

I am Maj Gen. Tammy Smith. I would like to discuss with you privately. Contact me via my personal email below for further information.

Maj Gen. Tammy Smith
MajGenTammySm-1ViLX0X+lBJBDgjK7y7TUQ@public.gmane.org
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-05-16 22:46 USPS Parcels Delivery
  0 siblings, 0 replies; 1546+ messages in thread
From: USPS Parcels Delivery @ 2017-05-16 22:46 UTC (permalink / raw)
  To: linux-nvdimm-y27Ovi1pjclAfugRpC6u6w

Hello,

Your item has arrived at the USPS Post Office at  Wed, 17 May 2017 00:46:58
+0200, but the courier was unable to deliver parcel to you. 
You can download the shipment label attached!

Yours respectfully.
Mellicent Northan -  USPS Operation Manager.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <20170519213731.21484-1-mrugiero@gmail.com>
@ 2017-05-20  8:48 ` Boris Brezillon
  0 siblings, 0 replies; 1546+ messages in thread
From: Boris Brezillon @ 2017-05-20  8:48 UTC (permalink / raw)
  To: Mario J. Rugiero
  Cc: linux-mtd, computersforpeace, marek.vasut, richard,
	cyrille.pitchen

Hi Mario,

Not sure how you created this patch set, but it is missing a Subject and
the diff-stat.

Please use

git format-patch -o <output-dir> -3 --cover-letter

to generate patches, then fill the cover letter in.

Once your cover letter is ready, you can send the patches with

git send-email --to ... --cc ... <output-dir>/*.patch

Regards,

Boris

Le Fri, 19 May 2017 18:37:28 -0300,
"Mario J. Rugiero" <mrugiero@gmail.com> a écrit :

> Some manufacturers use different layouts than MTD for the NAND, creating
> incompatibilities when going from a vendor-specific kernel to mainline.
> In particular, NAND devices for AllWinner boards write non-FF values to
> the bad block marker, and thus false positives arise when detecting bad
> blocks with the MTD driver. Sometimes there are enough false positives
> to make the device unusable.
> A proposed solution is NAND scrubbing, something a user who knows what
> she's doing (TM) could do to avoid this. It consists in erasing blocks
> disregarding the BBM. Since the user must know what she's doing, the
> only way to enable this feature is through a per-chip debugfs entry.
> 

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2017-07-07 17:04 Mrs Alice Walton
  0 siblings, 0 replies; 1546+ messages in thread
From: Mrs Alice Walton @ 2017-07-07 17:04 UTC (permalink / raw)


-- 
my name is Mrs. Alice Walton, a business woman an America Citizen and  
the heiress to the fortune of Walmart stores, born October 7, 1949. I  
have a mission for you worth $100,000,000.00(Hundred Million United  
State Dollars) which I intend using for CHARITY PROJECT to help the  
less privilege and orphanage

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-07-19  8:03 Lynne Smith
  0 siblings, 0 replies; 1546+ messages in thread
From: Lynne Smith @ 2017-07-19  8:03 UTC (permalink / raw)



My Name is lynne Smith please i really need your help?

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-07-27  1:12 Marie Angèle Ouattara
  0 siblings, 0 replies; 1546+ messages in thread
From: Marie Angèle Ouattara @ 2017-07-27  1:12 UTC (permalink / raw)


I need your cooperation in a profitable transaction and details will
be disclosed to you once i receive your reply.

Thanks,
Mrs. Marie.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2017-09-24 16:59 Estrin, Alex
       [not found] ` <F3529576D8E232409F431C309E29399336CD9886-8k97q/ur5Z1cIJlls4ac1rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 1546+ messages in thread
From: Estrin, Alex @ 2017-09-24 16:59 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

Leon,

The only intention I had with that note was to try to help with the process of screening patches
to maintain and improve the quality of the common rdma code.
As it was taken as an insult, I apologize for that; it was never my intention.
I should have worded it better.
Perhaps I should have put the note through checkpatch itself :)
And yes, as a member of the rdma list, I accept my share of the blame for missing that bug.

Thanks,
Alex.

P.S.
My apologies to the community for the spam.


> On Sat, Sep 23, 2017 at 07:20:53PM +0000, Estrin, Alex wrote:
> > > > Hello,
> > > >
> > > > One minor note regarding the original commit 523633359224
> > > > that broke the core.
> > > > It seem it was let through without trivial validation,
> > > > otherwise it wouldn't pass the checkpatch.
> > >
> > > Can you be more specific? Are you referring to "WARNING: line over 80
> > > characters" or to something else? If yes, I feel really bad for you and
> > > your workplace.
> > Please don't be. Keep doing a great job at your workplace, I will do the same at
> mine.
> >
> > > Readability is a first priority for the submitted code.
> > I can agree with you on that, considering easy readable submitted code
> > does not introduce a trivial bugs.
> 
> It will be very helpful to everyone if you stop throwing claims without any actual
> support.
> 1. Doug allows enough time to respond to patches, and neither you nor your
> colleagues saw such a "trivial bug" back then.
> 2. It fixed another "trivial bug", introduced by your colleague, which
> broke RoCE (one of the most popular fabrics in the stack), and we didn't
> cry all over the internet about it.
> 
> Before you rush to reply to me, please consult with Denny; he can
> give you a short update on how badly the recent OPA changes in AH and
> LIDs broke the stack and RoCE/IB devices.
> 
> >
> > > ➜  linux-rdma git:(rdma-rc) git fp -1 523633359224 -o /tmp/
> > > /tmp/0001-IB-core-Fix-the-validations-of-a-multicast-LID-in-at.patch
> > > ➜  linux-rdma git:(rdma-rc) ./scripts/checkpatch.pl --strict /tmp/0001-IB-core-
> Fix-
> > > the-validations-of-a-multicast-LID-in-at.patch
> > > WARNING: line over 80 characters
> > > #45: FILE: drivers/infiniband/core/verbs.c:1584:
> > > +			if (qp->device->get_link_layer(qp->device, attr.port_num) !=
> > >
> > > total: 0 errors, 1 warnings, 0 checks, 62 lines checked
> > >
> > > NOTE: For some of the reported defects, checkpatch may be able to
> > >       mechanically convert to the typical style using --fix or --fix-inplace.
> > >
> > > /tmp/0001-IB-core-Fix-the-validations-of-a-multicast-LID-in-at.patch has
> style
> > > problems, please review.
> > >
> > > NOTE: If any of the errors are false positives, please report
> > >       them to the maintainer, see CHECKPATCH in MAINTAINERS.
> > >
> > >
> > > >
> > > > Thanks,
> > > > Alex.
> > > >
> > > > > On Fri, Sep 22, 2017 at 06:42:41PM -0400, Doug Ledford wrote:
> > > > > > On Fri, 2017-09-22 at 15:17 -0600, Jason Gunthorpe wrote:
> > > > > > > On Fri, Sep 22, 2017 at 05:06:26PM -0400, Doug Ledford wrote:
> > > > > > >
> > > > > > > > Sure, I get that, but I was already out on PTO on the 30th.  What
> > > > > > > > sucks
> > > > > > > > is that it landed right after I was out.  But I plan to have the
> > > > > > > > pull
> > > > > > > > request in before EOB today, so the difference between the 20th
> and
> > > > > > > > today is neglible.  Especially since lots of people doing QA
> > > > > > > > testing
> > > > > > > > prefer to take -rc tags, in that case, the difference is non-
> > > > > > > > existent.
> > > > > > >
> > > > > > > My thinking was that people should test -rc,
> > > > > >
> > > > > > Great, with you here...
> > > > > >
> > > > > > >  but if they have problems
> > > > > > > they could grab your for-rc branch and check if their issue is
> > > > > > > already
> > > > > > > fixed..
> > > > > >
> > > > > > They can do this too...
> > > > > >
> > > > > > But if that still doesn't resolve their problem, a quick check of the
> > > > > > mailing list contents isn't out of the question either.  In that case,
> > > > > > they would have found the solution to their problem.  But, when you
> get
> > > > > > right down to it, only one person reported it in addition to the
> > > > > > original poster, so either other people saw the original post and
> > > > > > compensated in their own testing, or (the more likely I think), most
> > > > > > people don't start testing -rcs until after -rc2.
> > > > >
> > > > > I don't know about other people, but our testing of -rc starts on -rc1
> > > > > and we are not waiting for -rc2. From my observe of netdev, they also
> > > > > start to test -rc immediately.
> > > > >
> > > > > Otherwise, what is the point of the week between -rc1 and -rc2?
> > > > >
> > > > > > Which is why I try to set -rc2 as a milestone for several purposes.
> > > > > > For getting in the bulk of the known fixes, but also as a branching
> > > > > > point for for-next.
> > > > > >
> > > > > > --
> > > > > > Doug Ledford <dledford@redhat.com>
> > > > > >     GPG KeyID: B826A3330E572FDD
> > > > > >     Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333
> 0E57
> > > 2FDD
> > > > > >
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] ` <F3529576D8E232409F431C309E29399336CD9886-8k97q/ur5Z1cIJlls4ac1rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2017-09-25  5:48   ` Leon Romanovsky
  0 siblings, 0 replies; 1546+ messages in thread
From: Leon Romanovsky @ 2017-09-25  5:48 UTC (permalink / raw)
  To: Estrin, Alex; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

[-- Attachment #1: Type: text/plain, Size: 606 bytes --]

On Sun, Sep 24, 2017 at 04:59:55PM +0000, Estrin, Alex wrote:
> Leon,
>
> The only intention I had with that note was to try to help with the process of screening patches
> to maintain and improve the quality of the common rdma code.
> As it was taken as an insult, I apologize for that; it was never my intention.
> I should have worded it better.
> Perhaps I should have put the note through checkpatch itself :)
> And yes, as a member of the rdma list, I accept my share of the blame for missing that bug.

Thanks Alex for putting up with me at the mailing list.

>
> Thanks,
> Alex.
>
> P.S.
> My apologies to the community for the spam.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2017-10-01 10:53 Pierre
  0 siblings, 0 replies; 1546+ messages in thread
From: Pierre @ 2017-10-01 10:53 UTC (permalink / raw)
  To: sparclinux

Do you need a loan ,You want to pay off bills,Expand your business ?,Look no further we offer all kinds of loans both long and short term loan,for only 3% interest.If yes you need a loan Please email : pierrewolf07@gmail.com now to apply for a secured loan with the Applicant form below ,

Name :
Age:
Country:
State:
Loan amount :
Loan duration :
Phone number :

---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-10-18 14:31 Mrs. Marie Angèle O
  0 siblings, 0 replies; 1546+ messages in thread
From: Mrs. Marie Angèle O @ 2017-10-18 14:31 UTC (permalink / raw)


-- 
I solicit for your partnership to claim $11 million. You will be
entitled to 40% of the sum reply for more details.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE::
@ 2017-11-01 14:57 Mrs Hsu Wealther
  0 siblings, 0 replies; 1546+ messages in thread
From: Mrs Hsu Wealther @ 2017-11-01 14:57 UTC (permalink / raw)
  To: linux-sparse

Are you available at your desk? I need you to please check your email box for a business letter.

With Regards,

Ms. Hui Weather


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-11-13 14:42 Amos Kalonzo
  0 siblings, 0 replies; 1546+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:42 UTC (permalink / raw)


Attn:

I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.

Best Regards

Amos Kalonzo
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-11-13 14:55 Amos Kalonzo
  0 siblings, 0 replies; 1546+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:55 UTC (permalink / raw)


Attn:

I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.

Best Regards

Amos Kalonzo

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-11-13 14:55 Amos Kalonzo
  0 siblings, 0 replies; 1546+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:55 UTC (permalink / raw)


Attn:

I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.

Best Regards

Amos Kalonzo
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-11-13 14:55 Amos Kalonzo
  0 siblings, 0 replies; 1546+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:55 UTC (permalink / raw)


Attn:

I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.

Best Regards

Amos Kalonzo

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-11-13 14:55 Amos Kalonzo
  0 siblings, 0 replies; 1546+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:55 UTC (permalink / raw)


Attn:

I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.

Best Regards

Amos Kalonzo

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-11-13 14:55 Amos Kalonzo
  0 siblings, 0 replies; 1546+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:55 UTC (permalink / raw)
  To: linux-security-module

Attn:

I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.

Best Regards

Amos Kalonzo

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-11-13 14:55 Amos Kalonzo
  0 siblings, 0 replies; 1546+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:55 UTC (permalink / raw)


Attn:

I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.

Best Regards

Amos Kalonzo

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-11-13 14:55 Amos Kalonzo
  0 siblings, 0 replies; 1546+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:55 UTC (permalink / raw)


Attn:

I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.

Best Regards

Amos Kalonzo

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-11-13 14:55 Amos Kalonzo
  0 siblings, 0 replies; 1546+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:55 UTC (permalink / raw)


Attn:

I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.

Best Regards

Amos Kalonzo
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-11-13 14:55 Amos Kalonzo
  0 siblings, 0 replies; 1546+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:55 UTC (permalink / raw)


Attn:

I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.

Best Regards

Amos Kalonzo

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-11-13 14:56 Amos Kalonzo
  0 siblings, 0 replies; 1546+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:56 UTC (permalink / raw)


Attn:

I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.

Best Regards

Amos Kalonzo

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-11-13 14:57 Amos Kalonzo
  0 siblings, 0 replies; 1546+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:57 UTC (permalink / raw)


Attn:

I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.

Best Regards

Amos Kalonzo

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-11-13 15:00 Amos Kalonzo
  0 siblings, 0 replies; 1546+ messages in thread
From: Amos Kalonzo @ 2017-11-13 15:00 UTC (permalink / raw)
  To: sparclinux

Attn:

I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.

Best Regards

Amos Kalonzo

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-11-13 15:01 Amos Kalonzo
  0 siblings, 0 replies; 1546+ messages in thread
From: Amos Kalonzo @ 2017-11-13 15:01 UTC (permalink / raw)
  To: target-devel

Attn:

I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.

Best Regards

Amos Kalonzo

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2017-11-13 15:04 Amos Kalonzo
  0 siblings, 0 replies; 1546+ messages in thread
From: Amos Kalonzo @ 2017-11-13 15:04 UTC (permalink / raw)


Attn:

I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.

Best Regards

Amos Kalonzo

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2018-01-24 19:54 Amy Riddering
  0 siblings, 0 replies; 1546+ messages in thread
From: Amy Riddering @ 2018-01-24 19:54 UTC (permalink / raw)


-- 
Mrs. Amy Riddering contacting you for missionary work and i pray you
will be kind enough to deliver my $7 million donation to the less
privileged ones in your country and God will bless your generation for
doing this humanitarian work.

I am a widow suffering of lung cancer which has damaged my liver and
back bone, i decided to entrust this fund to a God fearing person that
will use it for Charity works since i do not have any child who will
inherit this money after i die. Please i want your sincere reply to
know if you will be able to execute this project, and I will give you
more information on how the fund will be transferred to your bank
account. I am waiting for your reply.

Thanks and God bless you.

^ permalink raw reply	[flat|nested] 1546+ messages in thread


* Re:
@ 2018-01-27  3:56 Emile Kenold
  0 siblings, 0 replies; 1546+ messages in thread
From: Emile Kenold @ 2018-01-27  3:56 UTC (permalink / raw)


-- 
This is Mrs. Emile Kenold contacting you for missionary work and i
pray you will be kind enough to deliver my £7 million donation to the
less privileged ones.

I am a widow suffering of lung cancer which has damaged my liver and
back bone, i decided to entrust this fund to a God fearing person that
will use it for Charity works and i want your sincere reply to know if
you will be able to execute this project, I will give you more
information on how the fund will be transferred to you immediately I
receive your positive response.

Thanks and God bless you.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-02-27 13:39 [Outreachy kernel] Re: Re: [PATCH] h [Patch] Fixed unnecessary typecasting to in. Error found with checkpatch. Signed-off-by: Nishka Dasgupta <nishka.dasgupta_ug18@ashoka.edu.in> Julia Lawall
@ 2018-02-27 13:53 ` Nishka Dasgupta
  0 siblings, 0 replies; 1546+ messages in thread
From: Nishka Dasgupta @ 2018-02-27 13:53 UTC (permalink / raw)
  To: outreachy-kernel

Hi,
Weren't the original values already integers because of the "e8" appended?
If this was an unnecessary patch, I will try to do better next time.
Thanking you,
Nishka Dasgupta


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-02-27 13:58 [Outreachy kernel] Re: Julia Lawall
@ 2018-02-27 14:07 ` Nishka Dasgupta
  0 siblings, 0 replies; 1546+ messages in thread
From: Nishka Dasgupta @ 2018-02-27 14:07 UTC (permalink / raw)
  To: outreachy-kernel

Hi,
Thank you for the clarification!
Regards,
Nishka Dasgupta


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-03-01 19:33 [PATCH v2] staging: vc04_services: bcm2835-camera: Add blank line Greg KH
@ 2018-03-01 20:20 ` Nishka Dasgupta
  2018-03-01 20:31   ` Re: Greg KH
  0 siblings, 1 reply; 1546+ messages in thread
From: Nishka Dasgupta @ 2018-03-01 20:20 UTC (permalink / raw)
  To: gregkh, outreachy-kernel
  Cc: eric, stefan.wahren, f.fainelli, rjui, sbranden,
	bcm-kernel-feedback-list

This is with reference to your last email pointing out that you have no
context for what I am responding to. Unfortunately, I have been unable
to get mutt to load my inbox, so I could not and cannot quote text
directly. I have done my best to reproduce the conversation below in a
coherent fashion; nonetheless, I apologise for any persisting lack of
clarity.

An hour ago I submitted the following commit with respect to
drivers/staging/vc04_services/bcm2835-camera/mmal-vchiq.c: [PATCH v2]
staging: vc04_services: bcm2835-camera: Add blank line

Your reply: Checkpatch is wrong here, don't you think? Are you sure this
is actually doing what you think it is?

My reply: Checkpatch suggested two warnings for this file in consecutive
lines: "static VCHI_CONNECTION_T *vchi_connection;" and "static
VCHI_INSTANCE_T vchi_instance;". Both warnings said to add a blank line
after the declaration. If checkpatch was wrong, is it okay if I submit a
version 2 with a blank line only after "vchi_instance" and not below
"*vchi_connection" (effectively undoing one of my commits)?

Your response: First off, I have no context as to what you are
responding to here, please always quote the email you are responding to
properly.  Reviewers deal with hundreds of emails a day, and not having
context for what you say doesn't really work :( Also, properly wrap your
email lines at 72 columns, your email client should do this for you,
right?  Please respond to the email with the correct context and I will
be glad to respond.  Right now, I have no idea what you are referring
to.  

(I am sorry for the suboptimal formatting. I think I managed to linewrap
this email properly, however.) My question remains: may I submit a
revised commit that effectively undoes the first "add blank line"
commit, i.e. the one that added a line between "static VCHI_CONNECTION_T
*vchi_connection" and "static VCHI_INSTANCE_T vchi_instance"?  Thank you
for your time, and apologies again for the confusion.

Regards, 
Nishka Dasgupta


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-03-01 20:20 ` Nishka Dasgupta
@ 2018-03-01 20:31   ` Greg KH
  2018-03-08 18:23     ` Re: Nishka Dasgupta
  0 siblings, 1 reply; 1546+ messages in thread
From: Greg KH @ 2018-03-01 20:31 UTC (permalink / raw)
  To: Nishka Dasgupta
  Cc: outreachy-kernel, eric, stefan.wahren, f.fainelli, rjui, sbranden,
	bcm-kernel-feedback-list

On Fri, Mar 02, 2018 at 01:50:10AM +0530, Nishka Dasgupta wrote:
> This is with reference to your last email pointing out that you have no
> context for what I am responding to. Unfortunately, I have been unable
> to get mutt to load my inbox, so I could not and cannot quote text
> directly. I have done my best to reproduce the conversation below in a
> coherent fashion; nonetheless, I apologise for any persisting lack of
> clarity.

If mutt can't read your inbox, how are you reading it at all? :)

> An hour ago I submitted the following commit with respect to
> drivers/staging/vc04_services/bcm2835-camera/mmal-vchiq.c: [PATCH v2]
> staging: vc04_services: bcm2835-camera: Add blank line
> 
> Your reply: Checkpatch is wrong here, don't you think? Are you sure this
> is actually doing what you think it is?
> 
> My reply: Checkpatch suggested two warnings for this file in consecutive
> lines: "static VCHI_CONNECTION_T *vchi_connection;" and "static
> VCHI_INSTANCE_T vchi_instance;". Both warnings said to add a blank line
> after the declaration. If checkpatch was wrong, is it okay if I submit a
> version 2 with a blank line only after "vchi_instance" and not below
> "*vchi_connection" (effectively undoing one of my commits)?

Look at the code, and see what checkpatch is telling you and see if that
actually matches with what the code shows.  Then make the change that
you know is correct based on your knowledge of C, and how the code
should look.  Hint, checkpatch is wrong here, but the code is also wrong
as-is.

thanks,

greg k-h


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2018-03-01 21:17 Nishka Dasgupta
  0 siblings, 0 replies; 1546+ messages in thread
From: Nishka Dasgupta @ 2018-03-01 21:17 UTC (permalink / raw)
  To: outreachy-kernel, julia.lawall; +Cc: gregkh

This is with response to your message that you don't see any spaces in
the staging tree.

Yes, the commit "remove spaces after typecast to int" was an unnecessary
commit since I added the spaces in the first place. This patch was
submitted before I read your email regarding not submitting patches to
fix the incorrect changes I have proposed. Sorry about that. Will focus
on new patches from next time.

Thank you for your time, and apologies for the confusion.

Regards, 
Nishka Dasgupta


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-03-01 20:04 [Outreachy kernel] [PATCH v2] staging: ks7010: Remove spaces after typecast to int Julia Lawall
@ 2018-03-01 21:20 ` Nishka Dasgupta
  0 siblings, 0 replies; 1546+ messages in thread
From: Nishka Dasgupta @ 2018-03-01 21:20 UTC (permalink / raw)
  To: outreachy-kernel, julia.lawall

This is with response to your message that you don't see any spaces in
the staging tree.

Yes, the commit "remove spaces after typecast to int" was an unnecessary
commit since I added the spaces in the first place. This patch was
submitted before I read your email regarding not submitting patches to
fix the incorrect changes I have proposed. Sorry about that. Will focus
on new patches from next time.

Thank you for your time, and apologies for the confusion.

Regards, 
Nishka Dasgupta


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-03-02 18:01 [Outreachy kernel] Help with Mutt Julia Lawall
@ 2018-03-03 18:27 ` Nishka Dasgupta
  2018-03-03 18:38   ` Re: Julia Lawall
  0 siblings, 1 reply; 1546+ messages in thread
From: Nishka Dasgupta @ 2018-03-03 18:27 UTC (permalink / raw)
  To: julia.lawall, outreachy-kernel

This is with reference to your link with instructions on how to load my inbox in mutt. The instructions worked and I should be able to access all future correspondence in mutt. (The most recent emails, however, failed to load in mutt, which is why I am still not replying inline.)

Thank you for your help!

Regards, 
Nishka Dasgupta


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-03-03 18:27 ` Nishka Dasgupta
@ 2018-03-03 18:38   ` Julia Lawall
  0 siblings, 0 replies; 1546+ messages in thread
From: Julia Lawall @ 2018-03-03 18:38 UTC (permalink / raw)
  To: Nishka Dasgupta; +Cc: outreachy-kernel



On Sat, 3 Mar 2018, Nishka Dasgupta wrote:

> This is with reference to your link with instructions on how to load my inbox in mutt. The instructions worked and I should be able to access all future correspondence in mutt. (The most recent emails, however, failed to load in mutt, which is why I am still not replying inline.)

OK, it's great that you got the problem solved :)

julia


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found]   ` <20180303044931.6902-1-keithp-aN4HjG94KOLQT0dZR+AlfA@public.gmane.org>
@ 2018-03-05 10:02     ` Michel Dänzer
       [not found]       ` <82fc592b-f680-c663-1a0f-7b522ca932d2-otUistvHUpPR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 1546+ messages in thread
From: Michel Dänzer @ 2018-03-05 10:02 UTC (permalink / raw)
  To: Keith Packard; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2018-03-03 05:49 AM, Keith Packard wrote:
> Here are the patches to the modesetting driver amended for the amdgpu
> driver.

Thanks for the patches. Unfortunately, since this driver still has to
compile and work with xserver >= 1.13, at least patches 1 & 3 cannot be
applied as is.

I was going to port these and take care of that anyway, though I might
not get around to it before April. If it can't wait that long, I can
give you details about what needs to be done.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found]       ` <82fc592b-f680-c663-1a0f-7b522ca932d2-otUistvHUpPR7s880joybQ@public.gmane.org>
@ 2018-03-05 16:41         ` Keith Packard
  0 siblings, 0 replies; 1546+ messages in thread
From: Keith Packard @ 2018-03-05 16:41 UTC (permalink / raw)
  To: Michel Dänzer; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


[-- Attachment #1.1: Type: text/plain, Size: 749 bytes --]

Michel Dänzer <michel-otUistvHUpPR7s880joybQ@public.gmane.org> writes:

> On 2018-03-03 05:49 AM, Keith Packard wrote:
>> Here are the patches to the modesetting driver amended for the amdgpu
>> driver.
>
> Thanks for the patches. Unfortunately, since this driver still has to
> compile and work with xserver >= 1.13, at least patches 1 & 3 cannot be
> applied as is.
>
> I was going to port these and take care of that anyway, though I might
> not get around to it before April. If it can't wait that long, I can
> give you details about what needs to be done.

I'm good with that -- I needed this to test amdgpu vs modesetting for
some applications, and just having the patches with support is good
enough for me.

-- 
-keith

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-03-01 20:31   ` Re: Greg KH
@ 2018-03-08 18:23     ` Nishka Dasgupta
  2018-03-08 18:33       ` Re: Greg KH
  0 siblings, 1 reply; 1546+ messages in thread
From: Nishka Dasgupta @ 2018-03-08 18:23 UTC (permalink / raw)
  To: gregkh
  Cc: outreachy-kernel, eric, stefan.wahren, f.fainelli, rjui, sbranden,
	bcm-kernel-feedback-list

This is with reference to the Patch I submitted for
staging/vc04_services/bcm2835-camera: Add blank line. (Since the last
message on that thread, I have managed to configure mutt, but several of
the more recent emails, however, failed to load in mutt, which is why I
am still not replying inline. I should be able to respond to all future
emails inline, however.)

The last email on the thread ended:
> Look at the code, and see what checkpatch is telling you and see if
> that actually matches with what the code shows. Then make the change
> that you know is correct based on your knowledge of C, and how the
> code should look. Hint, checkpatch is wrong here, but the code is also
> wrong as-is.
>
> thanks, greg k-h

I am afraid I still have not been able to locate the error. Is it a
problem of too many or too few asterisks in the line "static
VCHI_CONNECTION_T *vchi_connection"?

Thank you for your help!

Regards, 
Nishka Dasgupta


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-03-08 18:23 ` Ivan Lapuz
@ 2018-03-08 18:33   ` Tommy Bowditch
  2018-03-08 18:36   ` Re: Ibrahim Tachijian
  1 sibling, 0 replies; 1546+ messages in thread
From: Tommy Bowditch @ 2018-03-08 18:33 UTC (permalink / raw)
  To: Ivan Lapuz; +Cc: wireguard

[-- Attachment #1: Type: text/plain, Size: 390 bytes --]

What step do you need help with? What are you stuck on?

T

On Thu, Mar 8, 2018 at 6:23 PM, Ivan Lapuz <ivanlapuz9@gmail.com> wrote:

> Hi there pls can you help me how to set up correctly for my wireguard
> interface..thankyou
>
> _______________________________________________
> WireGuard mailing list
> WireGuard@lists.zx2c4.com
> https://lists.zx2c4.com/mailman/listinfo/wireguard
>
>

[-- Attachment #2: Type: text/html, Size: 897 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-03-08 18:23     ` Re: Nishka Dasgupta
@ 2018-03-08 18:33       ` Greg KH
  0 siblings, 0 replies; 1546+ messages in thread
From: Greg KH @ 2018-03-08 18:33 UTC (permalink / raw)
  To: Nishka Dasgupta
  Cc: outreachy-kernel, eric, stefan.wahren, f.fainelli, rjui, sbranden,
	bcm-kernel-feedback-list

On Thu, Mar 08, 2018 at 11:53:22PM +0530, Nishka Dasgupta wrote:
> The last email on the thread ended:
> > Look at the code, and see what checkpatch is telling you and see if
> > that actually matches with what the code shows. Then make the change
> > that you know is correct based on your knowledge of C, and how the
> > code should look. Hint, checkpatch is wrong here, but the code is also
> > wrong as-is.
> >
> > thanks, greg k-h
> 
> I am afraid I still have not been able to locate the error. Is it a
> problem of too many or too few asterisks in the line "static
> VCHI_CONNECTION_T *vchi_connection"?

I have no context at all here, to know what you are talking about, and
your emails are long-gone from my queue, sorry.

That being said, asterisks mean special things in the C language.  Be
sure you understand what they are, how they work, and when they are
needed.  Perhaps some more C language experience is needed here before
you work on the kernel?

thanks,

greg k-h


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-03-08 18:23 ` Ivan Lapuz
  2018-03-08 18:33   ` Tommy Bowditch
@ 2018-03-08 18:36   ` Ibrahim Tachijian
  1 sibling, 0 replies; 1546+ messages in thread
From: Ibrahim Tachijian @ 2018-03-08 18:36 UTC (permalink / raw)
  To: Ivan Lapuz; +Cc: wireguard

[-- Attachment #1: Type: text/plain, Size: 413 bytes --]

Sure we can.

You can follow the instructions here:

https://www.wireguard.com/quickstart/

On Thu, Mar 8, 2018, 19:26 Ivan Lapuz <ivanlapuz9@gmail.com> wrote:

> Hi there pls can you help me how to set up correctly for my wireguard
> interface..thankyou
> _______________________________________________
> WireGuard mailing list
> WireGuard@lists.zx2c4.com
> https://lists.zx2c4.com/mailman/listinfo/wireguard
>

[-- Attachment #2: Type: text/html, Size: 1005 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread
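Since the quickstart link above may not survive, the client-side setup it describes boils down to a configuration roughly like the following. Every key, address, and hostname here is a made-up placeholder, not a working value:

```
# /etc/wireguard/wg0.conf -- minimal client example (placeholders only)
[Interface]
PrivateKey = <output of `wg genkey`>
Address = 10.0.0.2/24

[Peer]
PublicKey = <the server's public key>
Endpoint = vpn.example.com:51820
AllowedIPs = 0.0.0.0/0
PersistentKeepalive = 25
```

After filling in real keys (e.g. `wg genkey | tee privatekey | wg pubkey > publickey`), `wg-quick up wg0` brings the interface up.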

* Re:
       [not found] <CAAEAJfB76xseRqnYQfRihXY6g0Jyqwt8zfddU1W7CXDg3xEFFg@mail.gmail.com>
@ 2018-04-02 11:20 ` Ratheendran R
  2018-04-02 17:19   ` Re: Steve deRosier
  0 siblings, 1 reply; 1546+ messages in thread
From: Ratheendran R @ 2018-04-02 11:20 UTC (permalink / raw)
  To: Ezequiel Garcia; +Cc: backports

Hi All,

I am doing a mt7601 backport 4.14 on linux 2.6.37 kernel.

Now when I build the backorts aganist the 2.6.37 linux build I am
getting the below error.
'error: bit-field ‘<anonymous>’ width not an integer constant'

compilation error actuals
/home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:181:3:
error: bit-field ‘<anonymous>’ width not an integer constant
/home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:
In function ‘mt7601u_set_rts_threshold’:
/home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:329:2:
error: bit-field ‘<anonymous>’ width not an integer constant
make[8]: *** [/home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.o]
Error 1


Can anyone let me know how to fix this.

Thanks in Advance.
Ratheendran

On 3/15/17, Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> wrote:
> subscribe backports
> --
> To unsubscribe from this list: send the line "unsubscribe backports" in
>
--
To unsubscribe from this list: send the line "unsubscribe backports" in

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-04-02 11:20 ` Re: Ratheendran R
@ 2018-04-02 17:19   ` Steve deRosier
  2018-04-04  7:31     ` Re: Arend van Spriel
  0 siblings, 1 reply; 1546+ messages in thread
From: Steve deRosier @ 2018-04-02 17:19 UTC (permalink / raw)
  To: Ratheendran R; +Cc: Ezequiel Garcia, backports

I apologize for the resend... backports ML didn't pick it up because
apparently there's no way to set my tablet's email client to NOT send
HTML emails. Live and learn.

Hi Ratheendran,

On Mon, Apr 2, 2018 at 4:21 AM Ratheendran R <ratheendran.s@gmail.com> wrote:
>
> Hi All,
>
> I am doing a mt7601 backport 4.14 on linux 2.6.37 kernel.
>
> Now when I build the backorts aganist the 2.6.37 linux build I am
> getting the below error.
> 'error: bit-field ‘<anonymous>’ width not an integer constant'
>
> compilation error actuals
> /home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:181:3:
> error: bit-field ‘<anonymous>’ width not an integer constant
> /home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:
> In function ‘mt7601u_set_rts_threshold’:
> /home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:329:2:
> error: bit-field ‘<anonymous>’ width not an integer constant
> make[8]: *** [/home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.o]
> Error 1


In order to backport 4.14, I assume you’re using a current backports
build. I’m sorry to tell you, but the current version of backports
isn’t supported to backport that far. The backports wiki say we
support only back to 3.0, and IIRC, in more recent descusions, we
abandoned support even back that far. Though I honestly don’t recall
to which version we’re testing for anymore.

No one ever likes this answer (including me), but perhaps you might
consider upgrading your base kernel release. 2.6.37 was released eight
years ago and 37 major version releases happened between there and
4.14. Your version of the kernel is missing support for some major
features and gobs of security fixes. It’ll be hard work to push it
forward eight years, but once you get there, keeping the kernel
up-to-date is pretty easy.

You might get it to work against such an old kernel, but it might take
a bit of work and ingenuity.

Good luck,
- Steve

--
Steve deRosier
Cal-Sierra Consulting LLC
https://www.cal-sierra.com/
--
To unsubscribe from this list: send the line "unsubscribe backports" in

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-04-02 17:19   ` Re: Steve deRosier
@ 2018-04-04  7:31     ` Arend van Spriel
  0 siblings, 0 replies; 1546+ messages in thread
From: Arend van Spriel @ 2018-04-04  7:31 UTC (permalink / raw)
  To: Steve deRosier, Ratheendran R; +Cc: Ezequiel Garcia, backports

On 4/2/2018 7:19 PM, Steve deRosier wrote:
> I apologize for the resend... backports ML didn't pick it up because
> apparently there's no way to set my tablet's email client to NOT send
> HTML emails. Live and learn.
>
> Hi Ratheendran,
>
> On Mon, Apr 2, 2018 at 4:21 AM Ratheendran R <ratheendran.s@gmail.com> wrote:
>>
>> Hi All,
>>
>> I am doing a mt7601 backport 4.14 on linux 2.6.37 kernel.
>>
>> Now when I build the backorts aganist the 2.6.37 linux build I am
>> getting the below error.
>> 'error: bit-field ‘<anonymous>’ width not an integer constant'
>>
>> compilation error actuals
>> /home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:181:3:
>> error: bit-field ‘<anonymous>’ width not an integer constant
>> /home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:
>> In function ‘mt7601u_set_rts_threshold’:
>> /home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:329:2:
>> error: bit-field ‘<anonymous>’ width not an integer constant
>> make[8]: *** [/home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.o]
>> Error 1
>
>
> In order to backport 4.14, I assume you’re using a current backports
> build. I’m sorry to tell you, but the current version of backports
> isn’t supported to backport that far. The backports wiki says we
> support only back to 3.0, and IIRC, in more recent discussions, we
> abandoned support even back that far. Though I honestly don’t recall
> which version we test against anymore.

If my recollection is correct, we currently support backports to 3.10 and 
above. Actually, it is in the README. Now the README also refers to the 
wiki for "more up-to-date" information, but for this particular piece of 
information that is not true.

> No one ever likes this answer (including me), but perhaps you might
> consider upgrading your base kernel release. 2.6.37 was released eight
> years ago and 37 major version releases happened between there and
> 4.14. Your version of the kernel is missing support for some major
> features and gobs of security fixes. It’ll be hard work to push it
> forward eight years, but once you get there, keeping the kernel
> up-to-date is pretty easy.
>
> You might get it to work against such an old kernel, but it might take
> a bit of work and ingenuity.

You could try to include mt7601 in a backports-3.14 package, which 
supports backporting to 2.6.26. It depends on which ieee80211_hw ops it 
implements and what Linux infrastructure it uses. The bitfields stuff 
would need backporting, which you can probably get from the latest 
backports package. It is going to be a lot more work to get it going, so 
the easier path is upgrading your kernel.

Regards,
Arend

--
To unsubscribe from this list: send the line "unsubscribe backports" in

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-08-28 17:34 Bills, Jason M
@ 2018-08-28 17:59 ` Brad Bishop
  2018-08-28 23:26   ` Bills, Jason M
  0 siblings, 1 reply; 1546+ messages in thread
From: Brad Bishop @ 2018-08-28 17:59 UTC (permalink / raw)
  To: Bills, Jason M; +Cc: openbmc@lists.ozlabs.org



> On Aug 28, 2018, at 1:34 PM, Bills, Jason M <jason.m.bills@intel.com> wrote:
> 
> I just added a comment to a github discussion about the IPMI SEL and thought I should share it here as well:

Can you send a link to the issue?

> 
> I have been working on a proof-of-concept to move the IPMI SEL entries out of D-Bus into journald instead.
> 
> Since journald allows custom metadata for log entries, I've thought of having the SEL message logged to the journal and using metadata to store the necessary IPMI info associated with the entry. Here is an example of logging a type 0x02 system event entry to journald:
> 
> sd_journal_send("MESSAGE=%s", message.c_str(),
>                            "PRIORITY=%i", selPriority,
>                            "MESSAGE_ID=%s", selMessageId,
>                            "IPMI_SEL_RECORD_ID=%d", recordId,
>                            "IPMI_SEL_RECORD_TYPE=%x", selSystemType,
>                            "IPMI_SEL_GENERATOR_ID=%x", genId,
>                            "IPMI_SEL_SENSOR_PATH=%s", path.c_str(),
>                            "IPMI_SEL_EVENT_DIR=%x", assert,
>                            "IPMI_SEL_DATA=%s", selDataStr,
>                            NULL);
> Using journald should allow for scaling to more SEL entries which should also enable us to support more generic IPMI behavior such as the Add SEL command.

A design point of OpenBMC from day one was to not design it around IPMI.
At a glance this feels counter to that goal.

I’m not immediately opposed to moving our error logs out of DBus, but can you provide
an extensible abstraction?  Not everyone uses SEL, or IPMI even.  At a minimum please
drop the letters ‘ipmi’ and ‘sel’ :-) from the base design, and save those for something
that translates to IPMI-speak.

As some background, our systems tend towards fewer ‘error logs’ with much more data per
log (4-16k), and yes I admit the current design is biased towards that and does not scale
when we approach 1000s of small SEL entries.

thx - brad

> 
> -Jason

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
  2018-08-28 17:59 ` Brad Bishop
@ 2018-08-28 23:26   ` Bills, Jason M
  2018-09-04 20:46     ` Brad Bishop
  0 siblings, 1 reply; 1546+ messages in thread
From: Bills, Jason M @ 2018-08-28 23:26 UTC (permalink / raw)
  To: Brad Bishop; +Cc: openbmc@lists.ozlabs.org

Here is a link to the issue: https://github.com/openbmc/openbmc/issues/3283#issuecomment-414361325.

The main things that started this proof-of-concept are that we have requirements to be fully IPMI compliant and to support 4000+ SEL entries.  Our attempts to scale the D-Bus logs to that level were not successful, so we started considering directly accessing journald as an alternative.

So far, I've been focused only on IPMI SEL, so I hadn't considered extending the change to non-IPMI error logs; however, these IPMI SEL entries should still fit in well as a subset of all other error logs which could also be moved to the journal.

My goal is to align with the OpenBMC design and keep anything IPMI-related isolated only to things that care about IPMI.  My thinking was that the metadata is a bit like background info, so it is a good place to hide data that only matters to the minority, such as the IPMI-specific data.  With this, the IPMI SEL logs can be included among all the existing error logs but still have the metadata for additional IPMI stuff that doesn't matter for anyone else.

So, for writing logs:
A. non-IPMI error logs can be written as normal
B. IPMI SEL entries are written with the IPMI-specific metadata populated

For reading logs:
A. non-IPMI readers see IPMI SEL entries as normal text logs
B. IPMI readers dump just the IPMI SEL entries and get the associated IPMI-specific info from the metadata
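A minimal, hypothetical sketch of the read-side split above, modeling journal entries as dicts of field names (roughly what journalctl's JSON export produces); the IPMI_SEL_* field names follow the earlier sd_journal_send example and the sample values are made up:

```python
# Sketch only: journal entries modeled as dicts of field name -> value.
# The IPMI_SEL_* keys mirror the sd_journal_send example; values are
# illustrative, not taken from a real system.

entries = [
    {"MESSAGE": "Service started"},                     # ordinary log entry
    {"MESSAGE": "Temp sensor asserted upper critical",  # IPMI SEL entry
     "IPMI_SEL_RECORD_ID": "42",
     "IPMI_SEL_RECORD_TYPE": "2",
     "IPMI_SEL_SENSOR_PATH": "/xyz/openbmc_project/sensors/temp0",
     "IPMI_SEL_EVENT_DIR": "0",
     "IPMI_SEL_DATA": "0x04 0x09 0x00"},
]

def plain_reader(entries):
    """Case A: non-IPMI readers see every entry as a normal text log."""
    return [e["MESSAGE"] for e in entries]

def ipmi_sel_reader(entries):
    """Case B: IPMI readers keep only entries carrying SEL metadata."""
    return [e for e in entries if "IPMI_SEL_RECORD_ID" in e]
```

The point of the sketch is that both readers consume the same journal; the metadata only matters to the IPMI side.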

Thanks,
-Jason


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-08-28 23:26   ` Bills, Jason M
@ 2018-09-04 20:46     ` Brad Bishop
  2018-09-04 21:28       ` Re: Ed Tanous
  0 siblings, 1 reply; 1546+ messages in thread
From: Brad Bishop @ 2018-09-04 20:46 UTC (permalink / raw)
  To: Bills, Jason M, Deepak Kodihalli; +Cc: openbmc@lists.ozlabs.org

> On Aug 28, 2018, at 7:26 PM, Bills, Jason M <jason.m.bills@intel.com> wrote:
> 
> Here is a link to the issue: https://github.com/openbmc/openbmc/issues/3283#issuecomment-414361325.
> 
> The main things that started this proof-of-concept are that we have requirements to be fully IPMI compliant and to support 4000+ SEL entries.  Our attempts to scale the D-Bus logs to that level were not successful, so we started considering directly accessing journald as an alternative.
> 
> So far, I've been focused only on IPMI SEL, so I hadn't considered extending the change to non-IPMI error logs; however, these IPMI SEL entries should still fit in well as a subset of all other error logs which could also be moved to the journal.
> 
> My goal is to align with the OpenBMC design and keep anything IPMI-related isolated only to things that care about IPMI.  

But it seems like you are proposing that every application that wants to make
a log needs to have the logic to translate its internal data model to IPMI speak,
so it can make a journal call with all the IPMI metadata populated.  Am I
understanding correctly?  That doesn’t seem aligned with keeping IPMI isolated.

A concrete example - phosphor-hwmon.  How do you intend to figure out something
like IPMI_SEL_SENSOR_PATH in the phosphor-hwmon application?  Actually it would
help quite a bit to understand how each of the fields in your sample below would
be determined by an arbitrary dbus application (like phosphor-hwmon).

Further, if you expand this approach to further log formats other than SEL,
won’t the applications become a mess of translation logic from the applications
data model <-> log format in use?

> My thinking was that the metadata is a bit like background info, so it is a good place to hide data that only matters to the minority, such as the IPMI-specific data.  With this, the IPMI SEL logs can be included among all the existing error logs but still have the metadata for additional IPMI stuff that doesn't matter for anyone else.
> 
> So, for writing logs:
> A. non-IPMI error logs can be written as normal
> B. IPMI SEL entries are written with the IPMI-specific metadata populated
> 
> For reading logs:
> A. non-IPMI readers see IPMI SEL entries as normal text logs
> B. IPMI readers dump just the IPMI SEL entries and get the associated IPMI-specific info from the metadata

I’d rather have a single approach that works for everyone; although, I’m
not sure how that would look.

> 
> Thanks,
> -Jason

This is called top posting; please try to avoid it when using the mailing list.
It makes threaded conversation hard to follow and respond to.  thx.


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: Re:
  2018-09-04 20:46     ` Brad Bishop
@ 2018-09-04 21:28       ` Ed Tanous
  2018-09-04 22:34         ` Re: Brad Bishop
  0 siblings, 1 reply; 1546+ messages in thread
From: Ed Tanous @ 2018-09-04 21:28 UTC (permalink / raw)
  To: openbmc

On 09/04/2018 01:46 PM, Brad Bishop wrote:
> 
> But it seems like you are proposing that every application that wants to make
> a log needs to have the logic to translate its internal data model to IPMI speak,
> so it can make a journal call with all the IPMI metadata populated.  Am I
> understanding correctly?  That doesn’t seem aligned with keeping IPMI isolated.
> 

I think a key here is that not all logs will be implicitly converted to 
IPMI logs.  Having them be identical was the design that we started 
with, and abandoned because IPMI has some requirements that don't 
cleanly map from a standard syslog/text style to IPMI.


> A concrete example - phosphor-hwmon.  How do you intend to figure out something
> like IPMI_SEL_SENSOR_PATH in the phosphor-hwmon application?  Actually it would
> help quite a bit to understand how each of the fields in your sample below would
> be determined by an arbitrary dbus application (like phosphor-hwmon).

I'm not really understanding the root of the question.  If 
phosphor-hwmon is generating a threshold crossing log that stemmed from 
the /xyz/openbmc_project/sensors/my_super_awesome_temp_sensor, then it 
would simply fill that path into the IPMI_SEL_SENSOR_PATH field.  This 
is the same kind of mapping that the associations produce today, but 
captured in journald instead of the mapper.

Our thinking was that we could build either a static library, or a dbus 
daemon to simplify producing IPMI logs.  Because of the IPMI 
requirements around unique record ids, right now we're leaning toward 
the dbus interface with a single daemon responsible for IPMI SEL creation.
While technically it could be a part of phosphor-logging, we really want 
it to be easily removable for future platforms that have no need for 
IPMI, so the thought at this time is to keep it separate.
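The unique-record-ID requirement is the reason a single SEL-creating daemon fits well: IPMI SEL record IDs are 16-bit values with 0x0000 and 0xFFFF reserved, so one allocator must hand them out and wrap around. A hypothetical sketch of that allocation logic (not the actual daemon code):

```python
# Hypothetical sketch of the record-ID allocation that motivates a single
# SEL-creating daemon. IPMI SEL record IDs are 16-bit, with 0x0000 and
# 0xFFFF reserved, so valid IDs run 0x0001..0xFFFE and must wrap.

RESERVED = {0x0000, 0xFFFF}

class SelRecordIdAllocator:
    def __init__(self):
        self._next = 1  # first valid record ID

    def allocate(self):
        rid = self._next
        self._next += 1
        if self._next in RESERVED or self._next > 0xFFFE:
            self._next = 1  # wrap around, skipping reserved IDs
        return rid
```

Two independent writers calling sd_journal_send directly could not guarantee this uniqueness, which is why the proposal leans toward one daemon owning SEL creation.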

> 
> Further, if you expand this approach to further log formats other than SEL,
> won’t the applications become a mess of translation logic from the applications
> data model <-> log format in use?
> 

I'm not really following this question.  Are there other binary log 
formats that we expect to come in the future that aren't text based, and 
could just be a journald translation?  So far as I know, IPMI SEL is the 
only one on my road map that has weird requirements, and needs some 
translation.  I don't expect it to be a mess, and I'm running under the 
assumption that _most_ daemons won't care about or support IPMI given 
its limitations.
You're right, this isn't intended to be a general solution for all 
binary logging formats, it's intended to be a short term hack while the 
industry transitions away from IPMI and toward something easier to 
generate arbitrarily.

> 
> I’d rather have a single approach that works for everyone; although, I’m
> not sure how that would look.
> 
The single approach is where we started, and weren't able to come up 
with anything that even came close to working in a production sense.  If 
you have ideas here on how this could be built that are cleaner than 
what we're proposing, we're very much interested.

> 
> This is called top posting; please try to avoid it when using the mailing list.
> It makes threaded conversation hard to follow and respond to.  thx.
> 

(Ed beats Jason with very big stick)

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-09-04 21:28       ` Re: Ed Tanous
@ 2018-09-04 22:34         ` Brad Bishop
  2018-09-04 23:18           ` Re: Ed Tanous
  0 siblings, 1 reply; 1546+ messages in thread
From: Brad Bishop @ 2018-09-04 22:34 UTC (permalink / raw)
  To: Ed Tanous; +Cc: OpenBMC Maillist, Deepak Kodihalli



> On Sep 4, 2018, at 5:28 PM, Ed Tanous <ed.tanous@intel.com> wrote:
> 
> On 09/04/2018 01:46 PM, Brad Bishop wrote:
>> But it seems like you are proposing that every application that wants to make
>> a log needs to have the logic to translate its internal data model to IPMI speak,
>> so it can make a journal call with all the IPMI metadata populated.  Am I
>> understanding correctly?  That doesn’t seem aligned with keeping IPMI isolated.
> 
> I think a key here is that not all logs will be implicitly converted to IPMI logs.  Having them be identical was the design that we started with, and abandoned because IPMI has some requirements that don't cleanly map from a standard syslog/text style to IPMI.
> 
> 
>> A concrete example - phosphor-hwmon.  How do you intend to figure out something
>> like IPMI_SEL_SENSOR_PATH in the phosphor-hwmon application?  Actually it would
>> help quite a bit to understand how each of the fields in your sample below would
>> be determined by an arbitrary dbus application (like phosphor-hwmon).
> 
> I'm not really understanding the root of the question.  If phosphor-hwmon is generating a threshold crossing log that stemmed from the /xyz/openbmc_project/sensors/my_super_awesome_temp_sensor, then it would simply fill that path into the IPMI_SEL_SENSOR_PATH field.  

ok, then this is my ignorance of IPMI showing.  I thought IPMI_SEL_SENSOR_PATH was
an IPMI construct...

If this is the case then why not just call it SENSOR_PATH?  Then other logging formats
could use that metadata key without it being weird that it has ‘ipmi_sel’ in the name
of it.  And can we apply the same logic to the other keys or do some of the other keys
have more complicated translation logic (than none at all as in the case of the sensor
path) ?

> This is the same kind of mapping that the associations produce today, but captured in journald instead of the mapper.
> 
> Our thinking was that we could build either a static library, or a dbus daemon to simplify producing IPMI logs.  Because of the IPMI requirements around unique record ids, right now we're leaning toward the dbus interface with a single daemon responsible for IPMI SEL creation.

Thats great!  This is similar to how the phosphor-logging daemon creates dbus error
objects today.

Would you mind elaborating on this daemon and its dbus API?  I’m guessing it would probably
clear up any concerns I have.

> While technically it could be a part of phosphor-logging,

That isn’t what I was going for.  If you plan to implement a (separate) daemon that acts on
the journald metadata I think that is the right approach too.

> we really want it to be easily removable for future platforms that have no need for IPMI, so the thought at this time is to keep it separate.

Agreed.

> 
>> Further, if you expand this approach to further log formats other than SEL,
>> won’t the applications become a mess of translation logic from the applications
>> data model <-> log format in use?
> 
> I'm not really following this question.  Are there other binary log formats that we expect to come in the future that aren't text based, and could just be a journald translation?

Yes.  We have a binary format called PEL.  I doubt anyone would be interested in using
it but we need a foundation in OpenBMC that enables us to use it...

>  So far as I know, IPMI SEL is the only one on my road map that has weird requirements, and needs some translation.

Where is the translation happening?  In the new ipmi-sel daemon?  Or somewhere else?

>  I don't expect it to be a mess, and I'm running under the assumption that _most_ daemons won't care about or support IPMI given its limitations.

Well _all_ daemons already support IPMI SEL today.  The problem is just that the
implementation doesn’t scale.  I’m confused by the claim that _most_ daemons
wouldn’t support IPMI.

> You're right, this isn't intended to be a general solution for all binary logging formats, it's intended to be a short term hack while the industry transitions away from IPMI and toward something easier to generate arbitrarily.
> 
>> I’d rather have a single approach that works for everyone; although, I’m
>> not sure how that would look.
> The single approach is where we started, and weren't able to come up with anything that even came close to working in a production sense.  If you have ideas here on how this could be built that are cleaner than what we're proposing, we're very much interested.

I’m still trying to understand what is being proposed.

> 
>> This is called top posting; please try to avoid it when using the mailing list.
>> It makes threaded conversation hard to follow and respond to.  thx.
> 
> (Ed beats Jason with very big stick)

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: Re:
  2018-09-04 22:34         ` Re: Brad Bishop
@ 2018-09-04 23:18           ` Ed Tanous
  2018-09-04 23:42             ` Re: Brad Bishop
  0 siblings, 1 reply; 1546+ messages in thread
From: Ed Tanous @ 2018-09-04 23:18 UTC (permalink / raw)
  To: Brad Bishop; +Cc: OpenBMC Maillist, Deepak Kodihalli

On 09/04/2018 03:34 PM, Brad Bishop wrote:
> 
> ok, then this is my ignorance of IPMI showing.  I thought IPMI_SEL_SENSOR_PATH was
> an IPMI construct...
> 
> If this is the case then why not just call it SENSOR_PATH?  Then other logging formats
> could use that metadata key without it being weird that it has ‘ipmi_sel’ in the name
> of it.  And can we apply the same logic to the other keys or do some of the other keys
> have more complicated translation logic (than none at all as in the case of the sensor
> path) ?

The thinking was that we would namespace all the parameters using 
IPMI_SEL to make it clear that was the only place they were used, and to 
avoid someone else using it inadvertently.  With that said, I could 
understand how it could be confusing.  Jason, any objections to 
un-namespacing the parameters?

> 
> Thats great!  This is similar to how the phosphor-logging daemon creates dbus error
> objects today.
> 
> Would you mind elaborating on this daemon and its dbus API?  I’m guessing it would probably
> clear up any concerns I have.
> 

Patches to phosphor-dbus-interfaces for a suggested interface are being 
put together as we speak.  Hopefully that will clarify it a little bit.

>> While technically it could be a part of phosphor-logging,
> 
> That isn’t what I was going for.  If you plan to implement a (separate) daemon that acts on
> the journald metadata I think that is the right approach too.
> 
Agreed.

>> we really want it to be easily removable for future platforms that have no need for IPMI, so the thought at this time is to keep it separate.
> 
> Agreed.
> 
>>
>>> Further, if you expand this approach to further log formats other than SEL,
>>> won’t the applications become a mess of translation logic from the applications
>>> data model <-> log format in use?
>>
>> I'm not really following this question.  Are there other binary log formats that we expect to come in the future that aren't text based, and could just be a journald translation?
> 
> Yes.  We have a binary format called PEL.  I doubt anyone would be interested in using
> it but we need a foundation in OpenBMC that enables us to use it...
> 

That makes more sense now.  A quick google on PEL makes it look like it 
could follow a similar model to what we're doing with IPMI by adding the 
extra metadata to journald when needed, while still producing the string 
versions for the basic cases.  By foundation do you mean shared code?  A 
quick skim of the implementation makes me suspect that there isn't going 
to be a lot of shared code, although they could share a similar design 
with a different implementation.

>>   So far as I know, IPMI SEL is the only one on my road map that has weird requirements, and needs some translation.
> 
> Where is the translation happening?  In the new ipmi-sel daemon?  Or somewhere else?

The translation would happen on the "addSel" IPMI command that gets used 
in-band by most BIOS implementations.  The ipmi-sel daemon will 
translate the raw bytes to a string, to be used in most modern loggers, 
along with the IPMI metadata, to be used in IPMI to source the various 
"get" SEL entry commands.
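For illustration, that translation step could unpack the raw record roughly like this. This is a hypothetical Python sketch, not the daemon's actual code; it assumes the standard 16-byte type-0x02 system event record layout from the IPMI v2.0 spec (section 32.1):

```python
import struct

# Hypothetical sketch of the addSel translation: unpack a raw 16-byte
# type-0x02 system event record into named fields. Layout assumed from
# the IPMI v2.0 spec, section 32.1 (little-endian, no padding).

def decode_sel_event(raw):
    if len(raw) != 16:
        raise ValueError("SEL records are exactly 16 bytes")
    (record_id, record_type, timestamp, generator_id, _evm_rev,
     sensor_type, sensor_num, event_dir_type,
     d1, d2, d3) = struct.unpack("<HBIHBBBBBBB", raw)
    return {
        "record_id": record_id,
        "record_type": record_type,
        "timestamp": timestamp,
        "generator_id": generator_id,
        "sensor_type": sensor_type,
        "sensor_num": sensor_num,
        "assertion": not (event_dir_type & 0x80),  # bit 7: 0 = assertion
        "event_type": event_dir_type & 0x7F,
        "event_data": (d1, d2, d3),
    }
```

The decoded fields would then feed both the human-readable MESSAGE string and the journald metadata used to answer the "get" SEL commands.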

> 
>>   I don't expect it to be a mess, and I'm running under the assumption that _most_ daemons won't care about or support IPMI given its limitations.
> 
> Well _all_ daemons already support IPMI SEL today.  The problem is just that the
> implementation doesn’t scale.  I’m confused by the claim that _most_ daemons
> wouldn’t support IPMI.
> 

That should've clarified that most daemons won't care about IPMI _SEL_, 
given the extra calls and metadata that needs to be provided to 
implement it correctly.  My team's intention was to support the minimum 
subset of SEL that we can for backward compatibility, while providing 
the advanced logging (journald/syslog/redfish LogService) for a greater 
level of detail and capability.
If this assumption turns out to not be true, and we end up adding IPMI 
SEL logging to all the daemons, so be it, I think it will still scale, 
but I really hope that's not the path we go down.


>> You're right, this isn't intended to be a general solution for all binary logging formats, it's intended to be a short term hack while the industry transitions away from IPMI and toward something easier to generate arbitrarily.
>>
>>> I’d rather have a single approach that works for everyone; although, I’m
>>> not sure how that would look.
>> The single approach is where we started, and weren't able to come up with anything that even came close to working in a production sense.  If you have ideas here on how this could be built that are cleaner than what we're proposing, we're very much interested.
> 
> I’m still trying to understand what is being proposed.
> 
>>
>>> This is called top posting; please try to avoid it when using the mailing list.
>>> It makes threaded conversation hard to follow and respond to.  thx.
>>
>> (Ed beats Jason with very big stick)

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-09-04 23:18           ` Re: Ed Tanous
@ 2018-09-04 23:42             ` Brad Bishop
  2018-09-05 21:20               ` Re: Bills, Jason M
  0 siblings, 1 reply; 1546+ messages in thread
From: Brad Bishop @ 2018-09-04 23:42 UTC (permalink / raw)
  To: Ed Tanous; +Cc: OpenBMC Maillist, Deepak Kodihalli



> On Sep 4, 2018, at 7:18 PM, Ed Tanous <ed.tanous@intel.com> wrote:
> 
> On 09/04/2018 03:34 PM, Brad Bishop wrote:
>> ok, then this is my ignorance of IPMI showing.  I thought IPMI_SEL_SENSOR_PATH was
>> an IPMI construct...
>> If this is the case then why not just call it SENSOR_PATH?  Then other logging formats
>> could use that metadata key without it being weird that it has ‘ipmi_sel’ in the name
>> of it.  And can we apply the same logic to the other keys or do some of the other keys
>> have more complicated translation logic (than none at all as in the case of the sensor
>> path) ?
> 
> The thinking was that we would namespace all the parameters using IPMI_SEL to make it clear that was the only place they were used, and to avoid someone else using it inadvertently.  
> With that said, I could understand how it could be confusing.  Jason, any objections to un-namespacing the parameters?

Thanks for being flexible on this, but let’s wait until we are on the same page before
changing anything.  Why would you want to discourage it from being used in another
context?

> 
>> Thats great!  This is similar to how the phosphor-logging daemon creates dbus error
>> objects today.
>> Would you mind elaborating on this daemon and its dbus API?  I’m guessing it would probably
>> clear up any concerns I have.
> 
> Patches to phosphor-dbus-interfaces for a suggested interface are being put together as we speak.  Hopefully that will clarify it a little bit.

Great, thank you.

> 
>>> While technically it could be a part of phosphor-logging,
>> That isn’t what I was going for.  If you plan to implement a (separate) daemon that acts on
>> the journald metadata I think that is the right approach too.
> Agreed.
> 
>>> we really want it to be easily removable for future platforms that have no need for IPMI, so the thought at this time is to keep it separate.
>> Agreed.
>>> 
>>>> Further, if you expand this approach to further log formats other than SEL,
>>>> won’t the applications become a mess of translation logic from the applications
>>>> data model <-> log format in use?
>>> 
>>> I'm not really following this question.  Are there other binary log formats that we expect to come in the future that aren't text based, and could just be a journald translation?
>> Yes.  We have a binary format called PEL.  I doubt anyone would be interested in using
>> it but we need a foundation in OpenBMC that enables us to use it...
> 
> That makes more sense now.  A quick google on PEL makes it look like it could follow a similar model to what we're doing with IPMI by adding the extra metadata to journald when needed, while still producing the string versions for the basic cases.  By foundation do you mean shared code?  A quick skim of the implementation makes me suspect that there isn't going to be a lot of shared code, although they could share a similar design with a different implementation.

By foundation I simply mean we need a way to support multiple logging formats that doesn’t
require every OpenBMC application to know how to translate from its internal data model
(usually dbus) to N logging formats.

> 
>>>  So far as I know, IPMI SEL is the only one on my road map that has weird requirements, and needs some translation.
>> Where is the translation happening?  In the new ipmi-sel daemon?  Or somewhere else?
> 
> The translation would happen on the "addSel" IPMI command that gets used in-band by most BIOS implementations.  The ipmi-sel daemon will translate the raw bytes to a string, to be used in most modern loggers, along with the IPMI metadata, to be used in IPMI to source the various "get" SEL entry commands.

That all sounds fine.  But what about applications on the BMC creating SELs for their
own errors?  Do you want to do that?  How will that work?

> 
>>>  I don't expect it to be a mess, and I'm running under the assumption that _most_ daemons won't care about or support IPMI given its limitations.
>> Well _all_ daemons already support IPMI SEL today.  The problem is just that the
>> implementation doesn’t scale.  I’m confused by the claim that _most_ daemons
>> wouldn’t support IPMI.
> 
> That should've clarified that most daemons won't care about IPMI _SEL_, given the extra calls and metadata that needs to be provided to implement it correctly.  My teams intention was to support the minimum subset of SEL that we can for backward compatibility, while providing the advanced logging (journald/syslog/redfish LogService) for a greater level of detail and capability.
> If this assumption turns out to not be true, and we end up adding IPMI SEL logging to all the daemons, so be it, I think it will still scale, but I really hope that's not the path we go down.

Oh.  Does this mean you intend for code like Jason originally proposed to _only_ appear in
the ipmi-sel daemon?  And not in applications like phosphor-hwmon?

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-09-04 23:42             ` Re: Brad Bishop
@ 2018-09-05 21:20               ` Bills, Jason M
  0 siblings, 0 replies; 1546+ messages in thread
From: Bills, Jason M @ 2018-09-05 21:20 UTC (permalink / raw)
  To: openbmc

>>> ok, then this is my ignorance of IPMI showing.  I thought IPMI_SEL_SENSOR_PATH was
>>> an IPMI construct...
>>> If this is the case then why not just call it SENSOR_PATH?  Then other logging formats
>>> could use that metadata key without it being weird that it has ‘ipmi_sel’ in the name
>>> of it.  And can we apply the same logic to the other keys or do some of the other keys
>>> have more complicated translation logic (than none at all as in the case of the sensor
>>> path) ?
>>
>> The thinking was that we would namespace all the parameters using IPMI_SEL to make it clear that was the only place they were used, and to avoid someone else using it inadvertently.
>> With that said, I could understand how it could be confusing.  Jason, any objections to un-namespacing the parameters?
> 
> Thanks for being flexible on this but lets wait until we are on the same page before
> changing anything.  Why would you want to discourage it from being used in another
> context?
> 
Except for the sensor path, all of the proposed IPMI metadata is 
specific to IPMI:
"IPMI_SEL_RECORD_ID" = Two byte unique ID number for each SEL entry
"IPMI_SEL_RECORD_TYPE" = The type of SEL entry (system or OEM) which 
determines the definition of the remaining bytes
"IPMI_SEL_GENERATOR_ID" = The IPMI Generator ID (usually the IPMB Slave 
Address) of the SEL entry requester
"IPMI_SEL_SENSOR_PATH" = Path of the sensor used to find IPMI data (such 
as sensor number) for the sensor
"IPMI_SEL_EVENT_DIR" = Whether the sensor is asserting or de-asserting
"IPMI_SEL_DATA" = Raw binary data included in the SEL entry

I named them all as IPMI_SEL as a group so they would be clearly 
separate and easy to remove in the future.  However, if any of the 
metadata would be useful elsewhere, the names can be more generic.
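
As a concrete sketch of why this metadata is sufficient to source the IPMI
"get SEL entry" commands later: a standard IPMI v2.0 system event record is
16 bytes, and every field of it is recoverable from the keys above. The
helper below is illustrative only (the function name and defaults are mine,
not part of the proposal):

```python
import struct
import time

def pack_sel_system_event(record_id, generator_id, sensor_type,
                          sensor_number, event_type, deassert,
                          event_data, timestamp=None):
    """Pack a 16-byte IPMI v2.0 SEL system event record (record type 0x02)."""
    if timestamp is None:
        timestamp = int(time.time())
    direction = 0x80 if deassert else 0x00          # bit 7: 1 = de-assertion
    data = (list(event_data) + [0xFF, 0xFF, 0xFF])[:3]  # pad event data to 3 bytes
    return struct.pack(
        "<HBIHBBBB3B",
        record_id,                    # two-byte unique record ID
        0x02,                         # record type: system event
        timestamp,                    # four-byte timestamp
        generator_id,                 # e.g. IPMB slave address of the requester
        0x04,                         # event message format revision (IPMI 2.0)
        sensor_type,
        sensor_number,
        direction | (event_type & 0x7F),  # event dir/type byte
        *data)

rec = pack_sel_system_event(record_id=1, generator_id=0x20,
                            sensor_type=0x01, sensor_number=0x30,
                            event_type=0x01, deassert=False,
                            event_data=[0x57])
assert len(rec) == 16
```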

>>
>>> Thats great!  This is similar to how the phosphor-logging daemon creates dbus error
>>> objects today.
>>> Would you mind elaborating on this daemon and its dbus API?  I’m guessing it would probably
>>> clear up any concerns I have.
>>
>> Patches to phosphor-dbus-interfaces for a suggested interface are being put together as we speak.  Hopefully that will clarify it a little bit.
> 
> Great, thank you.
> 
I have pushed a suggestion for the interface here: 
https://gerrit.openbmc-project.xyz/#/c/openbmc/phosphor-dbus-interfaces/+/12494/

>>
>>>> While technically it could be a part of phosphor-logging,
>>> That isn’t what I was going for.  If you plan to implement a (separate) daemon that acts on
>>> the journald metadata I think that is right approach too.
>> Agreed.
>>
>>>> we really want it to be easily removable for future platforms that have no need for IPMI, so the thought at this time it to keep it separate.
>>> Agreed.
>>>>
>>>>> Further, if you expand this approach to further log formats other than SEL,
>>>>> won’t the applications become a mess of translation logic from the applications
>>>>> data mode <-> log format in use?
>>>>
>>>> I'm not really following this question.  Are there other binary log formats that we expect to come in the future that aren't text based, and could just be a journald translation?
>>> Yes.  We have a binary format called PEL.  I doubt anyone would be interested in using
>>> it but we need a foundation in OpenBMC that enables us to use it...
>>
>> That makes more sense now.  A quick google on PEL makes it look like it could follow a similar model to what we're doing with IPMI by adding the extra metadata to journald when needed, while still producing the string versions for the basic cases.  By foundation do you mean shared code?  A quick skim of the implementation makes me suspect that there isn't going to be a lot of shared code, although they could share a similar design with a different implementation.
> 
> By foundation I simply mean we need a way to support multiple logging formats that doesn’t
> require every OpenBMC application to know how to translate from its internal data model
> (usually dbus) to N logging formats.
> 
>>
>>>>   So far as I know, IPMI SEL is the only one on my road map that has weird requirements, and needs some translation.
>>> Where is the translation happening?  In the new ipmi-sel daemon?  Or somewhere else?
>>
>> The translation would happen on the "addSel" IPMI command that gets used in-band by most BIOS implementations.  The ipmi-sel daemon will translate the raw bytes to a string, to be used in most modern loggers, along with the IPMI metadata, to be used in IPMI to source the various "get" SEL entry commands.
> 
> That all sounds fine.  But what about applications on the BMC creating SELs for their
> own errors?  Do you want to do that?  How will that work?
> 

An application on the BMC that needs to create a SEL can call the 
IpmiSelAdd method to request a new SEL entry in the journal.
>>
>>>>   I don't expect it to be a mess, and I'm running under the assumption that _most_ daemons won't care about or support IPMI given its limitations.
>>> Well _all_ daemons already support IPMI SEL today.  The problem is just that the
>>> implementation doesn’t scale.  I’m confused by _most_ daemons wouldn’t support
>>> IPMI?
>>
>> That should've clarified that most daemons won't care about IPMI _SEL_, given the extra calls and metadata that needs to be provided to implement it correctly.  My teams intention was to support the minimum subset of SEL that we can for backward compatibility, while providing the advanced logging (journald/syslog/redfish LogService) for a greater level of detail and capability.
>> If this assumption turns out to not be true, and we end up adding IPMI SEL logging to all the daemons, so be it, I think it will still scale, but I really hope that's not the path we go down.
> 
> Oh.  Does this mean you intend for code like Jason originally proposed to _only_ appear in
> the ipmi-sel daemon?  And not in applications like phosphor-hwmon?
> 

Yes, the original proposed code:
             sd_journal_send("MESSAGE=%s", message.c_str(),
                             "PRIORITY=%i", selPriority,
                             "MESSAGE_ID=%s", selMessageId,
                             "IPMI_SEL_RECORD_ID=%d", recordId,
                             "IPMI_SEL_RECORD_TYPE=%x", selSystemType,
                             "IPMI_SEL_GENERATOR_ID=%x", genId,
                             "IPMI_SEL_SENSOR_PATH=%s", path.c_str(),
                             "IPMI_SEL_EVENT_DIR=%x", assert,
                             "IPMI_SEL_DATA=%s", selDataStr,
                             NULL);
will only be in the ipmi-sel daemon.  Applications like phosphor-hwmon 
would use the IpmiSelAdd method to request SEL entries.	
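
For completeness, consumers other than the ipmi-sel daemon could still read
the metadata directly with journald's native field matching; the field names
and value formatting below assume the keys land exactly as proposed:

```
# show SEL-bearing entries with their metadata as JSON
journalctl -o json IPMI_SEL_RECORD_TYPE=2

# fetch one record by its two-byte record ID (how a "get SEL entry"
# handler could source a single entry)
journalctl -o json IPMI_SEL_RECORD_ID=1234
```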

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-10-26 12:54 Mohanraj B
@ 2018-10-27 16:55 ` Jens Axboe
  0 siblings, 0 replies; 1546+ messages in thread
From: Jens Axboe @ 2018-10-27 16:55 UTC (permalink / raw)
  To: Mohanraj B, fio

On 10/26/18 6:54 AM, Mohanraj B wrote:
> Hello,
> 
> I am trying to check how option --clocksource works.
> 
> 
> bash# fio --name job1 --size 10m --clocksource 2
>         valid values: gettimeofday Use gettimeofday(2) for timing
>                     : clock_gettime Use clock_gettime(2) for timing
>                     : cpu        Use CPU private clock
> 
> fio: failed parsing clocksource=2
> 
> bash# fio --name job1 --size 10m --clocksource gettimeofday(2)
> bash: syntax error near unexpected token `('
> 
> Below command works fine.
> bash# fio --name job1 --size 10m --clocksource gettimeofday
> 
> It runs without error but quiet not sure how to see the effect of this
> option. also tried other options - clock_gettime, cpu gettimeofday and
> dont see any difference.
> 
> Also is there any error in documentation passing gettimeofday(2)
> throws parse error.

The format is 'value' 'help', so you'd want to do:

--clocksource=gettimeofday

for instance.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2018-10-29 14:20 Beierl, Mark
  2018-10-29 14:37 ` Re: Mohanraj B
  0 siblings, 1 reply; 1546+ messages in thread
From: Beierl, Mark @ 2018-10-29 14:20 UTC (permalink / raw)
  To: Jens Axboe, Mohanraj B, fio@vger.kernel.org

On 2018-10-27, 12:55, "fio-owner@vger.kernel.org on behalf of Jens Axboe" <fio-owner@vger.kernel.org on behalf of axboe@kernel.dk> wrote:
    
    [EXTERNAL EMAIL] 
    Please report any suspicious attachments, links, or requests for sensitive information.
    
    
    On 10/26/18 6:54 AM, Mohanraj B wrote:
    > Hello,
    > 
    > I am trying to check how option --clocksource works.
    > 
    > 
    > bash# fio --name job1 --size 10m --clocksource 2
    >         valid values: gettimeofday Use gettimeofday(2) for timing
    >                     : clock_gettime Use clock_gettime(2) for timing
    >                     : cpu        Use CPU private clock
    > 
    > fio: failed parsing clocksource=2
    > 
    > bash# fio --name job1 --size 10m --clocksource gettimeofday(2)
    > bash: syntax error near unexpected token `('
    > 
    > Below command works fine.
    > bash# fio --name job1 --size 10m --clocksource gettimeofday
    > 
    > It runs without error but quiet not sure how to see the effect of this
    > option. also tried other options - clock_gettime, cpu gettimeofday and
    > dont see any difference.
    > 
    > Also is there any error in documentation passing gettimeofday(2)
    > throws parse error.
    
    The format is 'value' 'help', so you'd want to do:
    
    --clocksource=gettimeofday
    
    for instance.
    
    -- 
    Jens Axboe

Hello, Mohanraj

The help output that you see above states that using --clocksource=gettimeofday will use the gettimeofday function as defined in section (2) of the man pages, which is where all the system call manuals are stored.  The (2) is not meant to be part of the command line; it is part of the description in the help text, which tells you where to find more information on what is being used to implement the clocksource.
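
The same thing in job-file form, for anyone following along (values taken
from the command line earlier in the thread; note the bare value, with no
"(2)" suffix):

```
[job1]
size=10m
clocksource=gettimeofday
```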

Hope that helps clarify the help text.

Regards,
Mark




^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-10-29 14:20 Re: Beierl, Mark
@ 2018-10-29 14:37 ` Mohanraj B
  0 siblings, 0 replies; 1546+ messages in thread
From: Mohanraj B @ 2018-10-29 14:37 UTC (permalink / raw)
  To: Beierl, Mark; +Cc: Jens Axboe, fio

[-- Attachment #1: Type: text/plain, Size: 2053 bytes --]

Thanks Axboe and Mark.

On Mon 29 Oct, 2018, 7:50 PM Beierl, Mark, <Mark.Beierl@dell.com> wrote:

> On 2018-10-27, 12:55, "fio-owner@vger.kernel.org on behalf of Jens Axboe"
> <fio-owner@vger.kernel.org on behalf of axboe@kernel.dk> wrote:
>
>     [EXTERNAL EMAIL]
>     Please report any suspicious attachments, links, or requests for
> sensitive information.
>
>
>     On 10/26/18 6:54 AM, Mohanraj B wrote:
>     > Hello,
>     >
>     > I am trying to check how option --clocksource works.
>     >
>     >
>     > bash# fio --name job1 --size 10m --clocksource 2
>     >         valid values: gettimeofday Use gettimeofday(2) for timing
>     >                     : clock_gettime Use clock_gettime(2) for timing
>     >                     : cpu        Use CPU private clock
>     >
>     > fio: failed parsing clocksource=2
>     >
>     > bash# fio --name job1 --size 10m --clocksource gettimeofday(2)
>     > bash: syntax error near unexpected token `('
>     >
>     > Below command works fine.
>     > bash# fio --name job1 --size 10m --clocksource gettimeofday
>     >
>     > It runs without error but quiet not sure how to see the effect of
> this
>     > option. also tried other options - clock_gettime, cpu gettimeofday
> and
>     > dont see any difference.
>     >
>     > Also is there any error in documentation passing gettimeofday(2)
>     > throws parse error.
>
>     The format is 'value' 'help', so you'd want to do:
>
>     --clocksource=gettimeofday
>
>     for instance.
>
>     --
>     Jens Axboe
>
> Hello, Mohanraj
>
> The help output that you see above states that using
> --clocksource=gettimeofday will use the gettimeofday function as defined in
> the man page in the section (2), which is where all the system calls
> manuals are stored.  The (2) is  not meant to be part of the command line,
> it is part of the description of the help text, which tells you where to
> find more information on what is being used to implement the clocksource.
>
> Hope that helps clarify the help text.
>
> Regards,
> Mark
>
>
>
>

[-- Attachment #2: Type: text/html, Size: 2895 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE,
@ 2018-11-06  1:21 Miss Juliet Muhammad
  0 siblings, 0 replies; 1546+ messages in thread
From: Miss Juliet Muhammad @ 2018-11-06  1:21 UTC (permalink / raw)
  To: Recipients

I have a deal for you, in your region.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE,
@ 2018-11-11  4:20 Miss Juliet Muhammad
  0 siblings, 0 replies; 1546+ messages in thread
From: Miss Juliet Muhammad @ 2018-11-11  4:20 UTC (permalink / raw)
  To: Recipients

I have a deal for you, in your region.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <CACikiw1uNCYKzo9vjG=AZHpARWv-nzkCX=D-aWBssM7vYZrQdQ@mail.gmail.com>
@ 2018-11-12 10:09 ` Ravi Kumar
  2018-11-15 13:11 ` Re: Ondrej Mosnacek
  1 sibling, 0 replies; 1546+ messages in thread
From: Ravi Kumar @ 2018-11-12 10:09 UTC (permalink / raw)
  To: selinux

<<Sorry, re-sending in plain text>>
Hi team,

On Android, with the latest 4.14 kernels, we are seeing some denials
that seem genuine and worth addressing, where the kernel is trying to
kill its own created processes (possibly for maintenance).
These are seen during long stress testing, but I don't see anyone
adding such a rule in general, so the question is: is there any risk
that has kept such rules from being added?

1.   avc: denied { kill } for pid=2432 comm="irq/66-90b6300."
capability=5 scontext=u:r:kernel:s0 tcontext=u:r:kernel:s0
tclass=capability permissive=0
2.   avc: denied { kill } for pid=69 comm="rcuop/6" capability=5
scontext=u:r:kernel:s0 tcontext=u:r:kernel:s0 tclass=capability
permissive=0
3.   avc: denied { kill } for pid=0 comm="swapper/1" capability=5
scontext=u:r:kernel:s0 tcontext=u:r:kernel:s0 tclass=capability
permissive=0
4.   avc: denied { kill } for pid=4185 comm="kworker/0:4" capability=5
scontext=u:r:kernel:s0 tcontext=u:r:kernel:s0 tclass=capability
permissive=0

This is a self-directed capability; anyone in the kernel context
should be able to perform such operations, I guess.


Regards,
Ravi

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <CACikiw1uNCYKzo9vjG=AZHpARWv-nzkCX=D-aWBssM7vYZrQdQ@mail.gmail.com>
  2018-11-12 10:09 ` Ravi Kumar
@ 2018-11-15 13:11 ` Ondrej Mosnacek
  1 sibling, 0 replies; 1546+ messages in thread
From: Ondrej Mosnacek @ 2018-11-15 13:11 UTC (permalink / raw)
  To: nxp.ravi; +Cc: selinux, Paul Moore, Stephen Smalley, SElinux list

On Mon, Nov 12, 2018 at 7:56 AM Ravi Kumar <nxp.ravi@gmail.com> wrote:
> Hi team ,
>
> On android- with latest kernels 4.14  we are seeing some denials which seem to be very much genuine to be address . Where kernel is trying to kill its own  created process ( might be for maintenance) .
> These are seen in long Stress testing .  But  I dont see any one adding such rule in general so the question is  do we see any risk  which made us not to add such rules ?
>
> 1.   avc: denied { kill } for pid=2432 comm="irq/66-90b6300." capability=5 scontext=u:r:kernel:s0 tcontext=u:r:kernel:s0 tclass=capability permissive=0
> 2.   avc: denied { kill } for pid=69 comm="rcuop/6" capability=5 scontext=u:r:kernel:s0 tcontext=u:r:kernel:s0 tclass=capability permissive=0
> 3.   avc: denied { kill } for pid=0 comm="swapper/1" capability=5 scontext=u:r:kernel:s0 tcontext=u:r:kernel:s0 tclass=capability permissive=0
> 4.   avc: denied { kill } for pid=4185 comm="kworker/0:4" capability=5 scontext=u:r:kernel:s0 tcontext=u:r:kernel:s0 tclass=capability permissive=0
>
> This is self capability any one in kernel context  should be able to do such operations  I guess.

The reference policy does contain a rule that allows this kind of
operation; see:
https://github.com/SELinuxProject/refpolicy/blob/master/policy/modules/kernel/kernel.te#L203

It is also present in the Fedora policy on my system:

$ sesearch -A -s kernel_t -t kernel_t -c capability -p kill
allow kernel_t kernel_t:capability { audit_control audit_write chown
dac_override dac_read_search fowner fsetid ipc_lock ipc_owner kill
lease linux_immutable mknod net_admin net_bind_service net_broadcast
net_raw setfcap setgid setpcap setuid sys_admin sys_boot sys_chroot
sys_nice sys_pacct sys_ptrace sys_rawio sys_resource sys_time
sys_tty_config };

Therefore I would say it is perfectly fine to add such a rule to your
policy as well.
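
A minimal local module along the lines of the refpolicy rule linked above
might look like this (the module name is illustrative; on Android the
equivalent would be a one-line `allow kernel self:capability kill;` in
kernel.te):

```
policy_module(kernel_self_kill, 1.0)

gen_require(`
	type kernel_t;
')

# let kernel threads signal processes they did not create themselves,
# matching the refpolicy rule referenced above
allow kernel_t kernel_t:capability kill;
```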

Cheers,

--
Ondrej Mosnacek <omosnace at redhat dot com>
Associate Software Engineer, Security Technologies
Red Hat, Inc.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <CAJUWh6qyHerKg=-oaFN+USa10_Aag5+SYjBOeLCX1qM+WcDUwA@mail.gmail.com>
@ 2018-11-23  7:52 ` Chris Murphy
  2018-11-23  9:34   ` Re: Andy Leadbetter
  0 siblings, 1 reply; 1546+ messages in thread
From: Chris Murphy @ 2018-11-23  7:52 UTC (permalink / raw)
  To: andy.leadbetter, Btrfs BTRFS

On Thu, Nov 22, 2018 at 11:41 PM Andy Leadbetter
<andy.leadbetter@theleadbetters.com> wrote:
>
> I have a failing 2TB disk that is part of a 4 disk RAID 6 system.  I
> have added a new 2TB disk to the computer, and started a BTRFS replace
> for the old and new disk.  The process starts correctly however some
> hours into the job, there is an error and kernel oops. relevant log
> below.

The relevant log is the entire dmesg, not a snippet. It's decently
likely there's more than one thing going on here. We also need full
output of 'smartctl -x' for all four drives, and also 'smartctl -l
scterc' for all four drives, and also 'cat
/sys/block/sda/device/timeout' for all four drives. And which bcache
mode you're using.

The call trace provided is from kernel 4.15, which is long enough ago
that I think any dev working on raid56 would want to see where it's
getting tripped up on something a lot newer, and this is why:

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/diff/fs/btrfs/raid56.c?id=v4.19.3&id2=v4.15.1

That's a lot of changes in just the raid56 code between 4.15 and 4.19.
And then in your call trace, btrfs_dev_replace_start is found in
dev-replace.c, which likewise has a lot of changes. But then also, I
think 4.15 might still be in the era when it was not recommended to
use 'btrfs dev replace' for raid56, only non-raid56. I'm not sure if
the problems with device replace were fixed, and if they were, whether
the fix was kernel or progs side. Anyway, the latest I recall, it was
recommended on raid56 to 'btrfs dev add' and then 'btrfs dev remove'.

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/diff/fs/btrfs/dev-replace.c?id=v4.19.3&id2=v4.15.1

And that's only a few hundred changes for each. Check out inode.c -
there are over 2000 changes.
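
For reference, the add-then-remove sequence mentioned above looks like this
(the device name of the new disk and the mount point are placeholders for
this system's layout):

```
# add the replacement disk first, so the array never drops below the
# raid6 minimum device count while chunks are migrated
btrfs device add /dev/bcacheNEW /mnt/pool

# then remove the failing disk; this relocates its chunks onto the
# remaining devices and can take a long time
btrfs device remove /dev/bcache1 /mnt/pool
```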


> The disks are configured on top of bcache, in 5 arrays with a small
> 128GB SSD cache shared.  The system in this configuration has worked
> perfectly for 3 years, until 2 weeks ago csum errors started
> appearing.  I have a crashplan backup of all files on the disk, so I
> am not concerned about data loss, but I would like to avoid rebuild
> the system.

btrfs-progs 4.17 still considers raid56 experimental, not for
production use. And three years ago, the current upstream kernel
release was 4.3, so I'm going to guess the kernel history of this file
system goes back further than that, very close to the birth of the
raid56 code. And then adding bcache to this mix just makes it all the
more complicated.



>
> btrfs dev stats shows
> [/dev/bcache0].write_io_errs    0
> [/dev/bcache0].read_io_errs     0
> [/dev/bcache0].flush_io_errs    0
> [/dev/bcache0].corruption_errs  0
> [/dev/bcache0].generation_errs  0
> [/dev/bcache1].write_io_errs    0
> [/dev/bcache1].read_io_errs     20
> [/dev/bcache1].flush_io_errs    0
> [/dev/bcache1].corruption_errs  0
> [/dev/bcache1].generation_errs  14
> [/dev/bcache3].write_io_errs    0
> [/dev/bcache3].read_io_errs     0
> [/dev/bcache3].flush_io_errs    0
> [/dev/bcache3].corruption_errs  0
> [/dev/bcache3].generation_errs  19
> [/dev/bcache2].write_io_errs    0
> [/dev/bcache2].read_io_errs     0
> [/dev/bcache2].flush_io_errs    0
> [/dev/bcache2].corruption_errs  0
> [/dev/bcache2].generation_errs  2


3 of 4 drives have at least one generation error. While there are no
corruptions reported, generation errors can be really tricky to
recover from at all. If only one device had only read errors, this
would be a lot less difficult.


> I've tried the latest kernel, and the latest tools, but nothing will
> allow me to replace, or delete the failed disk.

If the file system is mounted, I would try to make a local backup ASAP
before you lose the whole volume. Whether it's LVM pool of two drives
(linear/concat) with XFS, or if you go with Btrfs -dsingle -mraid1
(also basically a concat) doesn't really matter, but I'd get whatever
you can off the drive. I expect avoiding a rebuild in some form or
another is very wishful thinking and not very likely.

Every additional change made to the file system, whether from repair
attempts or other writes, decreases the chance of recovery.

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-11-23  7:52 ` Re: Chris Murphy
@ 2018-11-23  9:34   ` Andy Leadbetter
  0 siblings, 0 replies; 1546+ messages in thread
From: Andy Leadbetter @ 2018-11-23  9:34 UTC (permalink / raw)
  To: lists; +Cc: linux-btrfs

I will capture all of that this evening and try it with the latest
kernel and tools.  Thanks for the input on what info is relevant; I
will gather it ASAP.
On Fri, 23 Nov 2018 at 07:53, Chris Murphy <lists@colorremedies.com> wrote:
>
> On Thu, Nov 22, 2018 at 11:41 PM Andy Leadbetter
> <andy.leadbetter@theleadbetters.com> wrote:
> >
> > I have a failing 2TB disk that is part of a 4 disk RAID 6 system.  I
> > have added a new 2TB disk to the computer, and started a BTRFS replace
> > for the old and new disk.  The process starts correctly however some
> > hours into the job, there is an error and kernel oops. relevant log
> > below.
>
> The relevant log is the entire dmesg, not a snippet. It's decently
> likely there's more than one thing going on here. We also need full
> output of 'smartctl -x' for all four drives, and also 'smartctl -l
> scterc' for all four drives, and also 'cat
> /sys/block/sda/device/timeout' for all four drives. And which bcache
> mode you're using.
>
> The call trace provided is from kernel 4.15 which is sufficiently long
> ago I think any dev working on raid56 might want to see where it's
> getting tripped up on something a lot newer, and this is why:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/diff/fs/btrfs/raid56.c?id=v4.19.3&id2=v4.15.1
>
> That's a lot of changes in just the raid56 code between 4.15 and 4.19.
> And then in you call trace, btrfs_dev_replace_start is found in
> dev-replace.c which likewise has a lot of changes. But then also, I
> think 4.15 might still be in the era where it was not recommended to
> use 'btrfs dev replace' for raid56, only non-raid56. I'm not sure if
> the problems with device replace were fixed, and if they were fixed
> kernel or progs side. Anyway, the latest I recall, it was recommended
> on raid56 to 'btrfs dev add' then 'btrfs dev remove'.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/diff/fs/btrfs/dev-replace.c?id=v4.19.3&id2=v4.15.1
>
> And that's only a few hundred changes for each. Check out inode.c -
> there are over 2000 changes.
>
>
> > The disks are configured on top of bcache, in 5 arrays with a small
> > 128GB SSD cache shared.  The system in this configuration has worked
> > perfectly for 3 years, until 2 weeks ago csum errors started
> > appearing.  I have a crashplan backup of all files on the disk, so I
> > am not concerned about data loss, but I would like to avoid rebuild
> > the system.
>
> btrfs-progs 4.17 still considers raid56 experimental, not for
> production use. And three years ago, the current upstream kernel
> released was 4.3 so I'm gonna guess the kernel history of this file
> system goes back older than that, very close to raid56 code birth. And
> then adding bcache to this mix just makes it all the more complicated.
>
>
>
> >
> > btrfs dev stats shows
> > [/dev/bcache0].write_io_errs    0
> > [/dev/bcache0].read_io_errs     0
> > [/dev/bcache0].flush_io_errs    0
> > [/dev/bcache0].corruption_errs  0
> > [/dev/bcache0].generation_errs  0
> > [/dev/bcache1].write_io_errs    0
> > [/dev/bcache1].read_io_errs     20
> > [/dev/bcache1].flush_io_errs    0
> > [/dev/bcache1].corruption_errs  0
> > [/dev/bcache1].generation_errs  14
> > [/dev/bcache3].write_io_errs    0
> > [/dev/bcache3].read_io_errs     0
> > [/dev/bcache3].flush_io_errs    0
> > [/dev/bcache3].corruption_errs  0
> > [/dev/bcache3].generation_errs  19
> > [/dev/bcache2].write_io_errs    0
> > [/dev/bcache2].read_io_errs     0
> > [/dev/bcache2].flush_io_errs    0
> > [/dev/bcache2].corruption_errs  0
> > [/dev/bcache2].generation_errs  2
>
>
> 3 of 4 drives have at least one generation error. While there are no
> corruptions reported, generation errors can be really tricky to
> recover from at all. If only one device had only read errors, this
> would be a lot less difficult.
>
>
> > I've tried the latest kernel, and the latest tools, but nothing will
> > allow me to replace, or delete the failed disk.
>
> If the file system is mounted, I would try to make a local backup ASAP
> before you lose the whole volume. Whether it's LVM pool of two drives
> (linear/concat) with XFS, or if you go with Btrfs -dsingle -mraid1
> (also basically a concat) doesn't really matter, but I'd get whatever
> you can off the drive. I expect avoiding a rebuild in some form or
> another is very wishful thinking and not very likely.
>
> The more changes are made to the file system, repair attempts or
> otherwise writing to it, decreases the chance of recovery.
>
> --
> Chris Murphy

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE,
@ 2018-11-24 14:03 Miss Sharifah Ahmad Mustahfa
  0 siblings, 0 replies; 1546+ messages in thread
From: Miss Sharifah Ahmad Mustahfa @ 2018-11-24 14:03 UTC (permalink / raw)
  To: Recipients

Hello,

First of all i will like to apologies for my manner of communication because you do not know me personally, its due to the fact that i have a very important proposal for you.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE,
@ 2018-11-24 14:16 Miss Sharifah Ahmad Mustahfa
  0 siblings, 0 replies; 1546+ messages in thread
From: Miss Sharifah Ahmad Mustahfa @ 2018-11-24 14:16 UTC (permalink / raw)
  To: Recipients

Hello,

First of all i will like to apologies for my manner of communication because you do not know me personally, its due to the fact that i have a very important proposal for you.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <20181130011234.32674-1-axboe@kernel.dk>
@ 2018-11-30  2:09 ` Jens Axboe
  0 siblings, 0 replies; 1546+ messages in thread
From: Jens Axboe @ 2018-11-30  2:09 UTC (permalink / raw)
  To: linux-block, osandov

On 11/29/18 6:12 PM, Jens Axboe wrote:
> Three patches here:
> 
> 1) Ensure that we align ->map properly
> 
> 2) v2 of the sbitmap clear cost ammortization. Updated to do a wakeup
>    check AFTER we're done swapping free/cleared masks. Kept the
>    separate alignment for ->word, as it is faster in testing.
> 
> 3) Cost reduction of having to do wait queue checks.

Ignore this one, see v3 posted.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE,
@ 2018-12-04  2:28 Ms Sharifah Ahmad Mustahfa
  0 siblings, 0 replies; 1546+ messages in thread
From: Ms Sharifah Ahmad Mustahfa @ 2018-12-04  2:28 UTC (permalink / raw)




-- 
Hello,

First of all i will like to apologies for my manner of communication 
because you do not know me personally, its due to the fact that i have a 
very important proposal for you.



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2018-12-21 15:22 kenneth johansson
@ 2018-12-22  8:18 ` Richard Weinberger
  0 siblings, 0 replies; 1546+ messages in thread
From: Richard Weinberger @ 2018-12-22  8:18 UTC (permalink / raw)
  To: kenneth johansson; +Cc: linux-mtd

On Fri, Dec 21, 2018 at 4:24 PM kenneth johansson <kenjo@kenjo.org> wrote:
>
> From 9815710fa078241c683de1b49d9a0c9631502e17 Mon Sep 17 00:00:00 2001
> From: Kenneth Johansson <kenjo@kenjo.org>
> Date: Fri, 21 Dec 2018 15:46:24 +0100
> Subject: [PATCH] mtd: rawnand: nandsim: Add support to disable subpage writes.
> X-IMAPbase: 1545405463 2
> Status: O
> X-UID: 1
>
> This is needed if you try to use an already existing ubifs image that
> is created for hardware that do not support subpage write.
>
> It is not enough that you can select what nandchip to emulate as the
> subpage support might not exist in the actual nand driver.

Is this really needed? Usually using ubiattach's -O parameter does the trick.
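
For context, the -O trick overrides the VID header offset at attach time.
An image built for flash without subpage write puts the VID header a full
page in, so on nandsim (or any subpage-capable chip) it can usually be
attached with the offset forced to the page size; 2048 below assumes a
2KiB-page device:

```
ubiattach /dev/ubi_ctrl -m 0 -O 2048
```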

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2019-01-31  5:54     ` Souptick Joarder
@ 2019-01-31 12:58       ` Vladimir Murzin
  2019-02-01 12:32         ` Re: Souptick Joarder
  0 siblings, 1 reply; 1546+ messages in thread
From: Vladimir Murzin @ 2019-01-31 12:58 UTC (permalink / raw)
  To: Souptick Joarder, Mike Rapoport
  Cc: Michal Hocko, Sabyasachi Gupta, linux-kernel,
	Russell King - ARM Linux, rppt, Brajeswar Ghosh, Andrew Morton,
	linux-arm-kernel

Hi Souptick,

On 1/31/19 5:54 AM, Souptick Joarder wrote:
> On Thu, Jan 17, 2019 at 4:58 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
>>
>> On Thu, Jan 17, 2019 at 04:53:44PM +0530, Souptick Joarder wrote:
>>> On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
>>>>
>>>> Remove duplicate headers which are included twice.
>>>>
>>>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
>>
>> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
>>
>>> Any comment on this patch ?
> 
> If no further comment, can we get this patch in queue for 5.1 ?

It'd be nice to use proper tags in the subject line. I'd suggest

[PATCH] ARM: mm: Remove duplicate header

but you can get some inspiration from

git log --oneline --no-merges arch/arm/mm/

In case you want to route it via ARM tree you need to drop it into
Russell's patch system [1]. 

[1] https://www.armlinux.org.uk/developer/patches/

Cheers
Vladimir

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2019-01-31 12:58       ` Vladimir Murzin
@ 2019-02-01 12:32         ` Souptick Joarder
  2019-02-01 12:36           ` Re: Vladimir Murzin
  0 siblings, 1 reply; 1546+ messages in thread
From: Souptick Joarder @ 2019-02-01 12:32 UTC (permalink / raw)
  To: Vladimir Murzin
  Cc: Michal Hocko, Sabyasachi Gupta, Russell King - ARM Linux,
	linux-kernel, rppt, Brajeswar Ghosh, Andrew Morton, Mike Rapoport,
	linux-arm-kernel

On Thu, Jan 31, 2019 at 6:28 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
>
> Hi Souptick,
>
> On 1/31/19 5:54 AM, Souptick Joarder wrote:
> > On Thu, Jan 17, 2019 at 4:58 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> >>
> >> On Thu, Jan 17, 2019 at 04:53:44PM +0530, Souptick Joarder wrote:
> >>> On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
> >>>>
> >>>> Remove duplicate headers which are included twice.
> >>>>
> >>>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> >>
> >> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
> >>
> >>> Any comment on this patch ?
> >
> > If no further comment, can we get this patch in queue for 5.1 ?
>
> It'd be nice to use proper tags in the subject line. I'd suggest
>
> [PATCH] ARM: mm: Remove duplicate header
>
> but you can get some inspiration from
>
> git log --oneline --no-merges arch/arm/mm/
>
> In case you want to route it via ARM tree you need to drop it into
> Russell's patch system [1].

How do I drop it into Russell's patch system other than posting it to
the mailing list? I don't know.
>
> [1] https://www.armlinux.org.uk/developer/patches/
>
> Cheers
> Vladimir

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2019-02-01 12:32         ` Re: Souptick Joarder
@ 2019-02-01 12:36           ` Vladimir Murzin
  2019-02-01 12:41             ` Re: Souptick Joarder
  0 siblings, 1 reply; 1546+ messages in thread
From: Vladimir Murzin @ 2019-02-01 12:36 UTC (permalink / raw)
  To: Souptick Joarder
  Cc: Michal Hocko, Sabyasachi Gupta, Russell King - ARM Linux,
	linux-kernel, rppt, Brajeswar Ghosh, Andrew Morton, Mike Rapoport,
	linux-arm-kernel

On 2/1/19 12:32 PM, Souptick Joarder wrote:
> On Thu, Jan 31, 2019 at 6:28 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
>>
>> Hi Souptick,
>>
>> On 1/31/19 5:54 AM, Souptick Joarder wrote:
>>> On Thu, Jan 17, 2019 at 4:58 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
>>>>
>>>> On Thu, Jan 17, 2019 at 04:53:44PM +0530, Souptick Joarder wrote:
>>>>> On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
>>>>>>
>>>>>> Remove duplicate headers which are included twice.
>>>>>>
>>>>>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
>>>>
>>>> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
>>>>
>>>>> Any comment on this patch ?
>>>
>>> If no further comment, can we get this patch in queue for 5.1 ?
>>
>> It'd be nice to use proper tags in the subject line. I'd suggest
>>
>> [PATCH] ARM: mm: Remove duplicate header
>>
>> but you can get some inspiration from
>>
>> git log --oneline --no-merges arch/arm/mm/
>>
>> In case you want to route it via ARM tree you need to drop it into
>> Russell's patch system [1].
> 
> How do I drop it into Russell's patch system other than posting it to
> the mailing list? I don't know.

https://www.armlinux.org.uk/developer/patches/info.php

Vladimir

>>
>> [1] https://www.armlinux.org.uk/developer/patches/
>>
>> Cheers
>> Vladimir
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2019-02-01 12:36           ` Re: Vladimir Murzin
@ 2019-02-01 12:41             ` Souptick Joarder
  2019-02-01 13:02               ` Re: Vladimir Murzin
  2019-02-01 15:15               ` Re: Russell King - ARM Linux admin
  0 siblings, 2 replies; 1546+ messages in thread
From: Souptick Joarder @ 2019-02-01 12:41 UTC (permalink / raw)
  To: Vladimir Murzin
  Cc: Michal Hocko, Sabyasachi Gupta, Russell King - ARM Linux,
	linux-kernel, rppt, Brajeswar Ghosh, Andrew Morton, Mike Rapoport,
	linux-arm-kernel

On Fri, Feb 1, 2019 at 6:06 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
>
> On 2/1/19 12:32 PM, Souptick Joarder wrote:
> > On Thu, Jan 31, 2019 at 6:28 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
> >>
> >> Hi Souptick,
> >>
> >> On 1/31/19 5:54 AM, Souptick Joarder wrote:
> >>> On Thu, Jan 17, 2019 at 4:58 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> >>>>
> >>>> On Thu, Jan 17, 2019 at 04:53:44PM +0530, Souptick Joarder wrote:
> >>>>> On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
> >>>>>>
> >>>>>> Remove duplicate headers which are included twice.
> >>>>>>
> >>>>>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> >>>>
> >>>> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
> >>>>
> >>>>> Any comment on this patch ?
> >>>
> >>> If no further comment, can we get this patch in queue for 5.1 ?
> >>
> >> It'd be nice to use proper tags in the subject line. I'd suggest
> >>
> >> [PATCH] ARM: mm: Remove duplicate header
> >>
> >> but you can get some inspiration from
> >>
> >> git log --oneline --no-merges arch/arm/mm/
> >>
> >> In case you want to route it via ARM tree you need to drop it into
> >> Russell's patch system [1].
> >
> > How do I drop it into Russell's patch system other than posting it to
> > the mailing list? I don't know.
>
> https://www.armlinux.org.uk/developer/patches/info.php

This link is not reachable.

>
> Vladimir
>
> >>
> >> [1] https://www.armlinux.org.uk/developer/patches/
> >>
> >> Cheers
> >> Vladimir
> >
>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2019-02-01 12:41             ` Re: Souptick Joarder
@ 2019-02-01 13:02               ` Vladimir Murzin
  2019-02-01 15:15               ` Re: Russell King - ARM Linux admin
  1 sibling, 0 replies; 1546+ messages in thread
From: Vladimir Murzin @ 2019-02-01 13:02 UTC (permalink / raw)
  To: Souptick Joarder
  Cc: Michal Hocko, Sabyasachi Gupta, Russell King - ARM Linux,
	linux-kernel, rppt, Brajeswar Ghosh, Andrew Morton, Mike Rapoport,
	linux-arm-kernel

On 2/1/19 12:41 PM, Souptick Joarder wrote:
> On Fri, Feb 1, 2019 at 6:06 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
>>
>> On 2/1/19 12:32 PM, Souptick Joarder wrote:
>>> On Thu, Jan 31, 2019 at 6:28 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
>>>>
>>>> Hi Souptick,
>>>>
>>>> On 1/31/19 5:54 AM, Souptick Joarder wrote:
>>>>> On Thu, Jan 17, 2019 at 4:58 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
>>>>>>
>>>>>> On Thu, Jan 17, 2019 at 04:53:44PM +0530, Souptick Joarder wrote:
>>>>>>> On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Remove duplicate headers which are included twice.
>>>>>>>>
>>>>>>>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
>>>>>>
>>>>>> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
>>>>>>
>>>>>>> Any comment on this patch ?
>>>>>
>>>>> If no further comment, can we get this patch in queue for 5.1 ?
>>>>
>>>> It'd be nice to use proper tags in the subject line. I'd suggest
>>>>
>>>> [PATCH] ARM: mm: Remove duplicate header
>>>>
>>>> but you can get some inspiration from
>>>>
>>>> git log --oneline --no-merges arch/arm/mm/
>>>>
>>>> In case you want to route it via ARM tree you need to drop it into
>>>> Russell's patch system [1].
>>>
>>> How do I drop it into Russell's patch system other than posting it to
>>> the mailing list? I don't know.
>>
>> https://www.armlinux.org.uk/developer/patches/info.php
> 
> This link is not reachable.
> 

Bad luck :(

Vladimir

>>
>> Vladimir
>>
>>>>
>>>> [1] https://www.armlinux.org.uk/developer/patches/
>>>>
>>>> Cheers
>>>> Vladimir
>>>
>>
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2019-02-01 12:41             ` Re: Souptick Joarder
  2019-02-01 13:02               ` Re: Vladimir Murzin
@ 2019-02-01 15:15               ` Russell King - ARM Linux admin
  2019-02-01 15:22                 ` Re: Russell King - ARM Linux admin
  1 sibling, 1 reply; 1546+ messages in thread
From: Russell King - ARM Linux admin @ 2019-02-01 15:15 UTC (permalink / raw)
  To: Souptick Joarder
  Cc: Vladimir Murzin, Sabyasachi Gupta, Michal Hocko, linux-kernel,
	Mike Rapoport, rppt, Brajeswar Ghosh, Andrew Morton,
	linux-arm-kernel

On Fri, Feb 01, 2019 at 06:11:21PM +0530, Souptick Joarder wrote:
> On Fri, Feb 1, 2019 at 6:06 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
> >
> > On 2/1/19 12:32 PM, Souptick Joarder wrote:
> > > On Thu, Jan 31, 2019 at 6:28 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
> > >>
> > >> Hi Souptick,
> > >>
> > >> On 1/31/19 5:54 AM, Souptick Joarder wrote:
> > >>> On Thu, Jan 17, 2019 at 4:58 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> > >>>>
> > >>>> On Thu, Jan 17, 2019 at 04:53:44PM +0530, Souptick Joarder wrote:
> > >>>>> On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
> > >>>>>>
> > >>>>>> Remove duplicate headers which are included twice.
> > >>>>>>
> > >>>>>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> > >>>>
> > >>>> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
> > >>>>
> > >>>>> Any comment on this patch ?
> > >>>
> > >>> If no further comment, can we get this patch in queue for 5.1 ?
> > >>
> > >> It'd be nice to use proper tags in the subject line. I'd suggest
> > >>
> > >> [PATCH] ARM: mm: Remove duplicate header
> > >>
> > >> but you can get some inspiration from
> > >>
> > >> git log --oneline --no-merges arch/arm/mm/
> > >>
> > >> In case you want to route it via ARM tree you need to drop it into
> > >> Russell's patch system [1].
> > >
> > > How do I drop it into Russell's patch system other than posting it to
> > > the mailing list? I don't know.
> >
> > https://www.armlinux.org.uk/developer/patches/info.php
> 
> This link is not reachable.

In what way?  The site is certainly getting hits over ipv4 and ipv6.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2019-02-01 15:15               ` Re: Russell King - ARM Linux admin
@ 2019-02-01 15:22                 ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 1546+ messages in thread
From: Russell King - ARM Linux admin @ 2019-02-01 15:22 UTC (permalink / raw)
  To: Souptick Joarder
  Cc: Vladimir Murzin, Sabyasachi Gupta, Michal Hocko, linux-kernel,
	Mike Rapoport, rppt, Brajeswar Ghosh, Andrew Morton,
	linux-arm-kernel

On Fri, Feb 01, 2019 at 03:15:11PM +0000, Russell King - ARM Linux admin wrote:
> On Fri, Feb 01, 2019 at 06:11:21PM +0530, Souptick Joarder wrote:
> > On Fri, Feb 1, 2019 at 6:06 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
> > >
> > > On 2/1/19 12:32 PM, Souptick Joarder wrote:
> > > > On Thu, Jan 31, 2019 at 6:28 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
> > > >>
> > > >> Hi Souptick,
> > > >>
> > > >> On 1/31/19 5:54 AM, Souptick Joarder wrote:
> > > >>> On Thu, Jan 17, 2019 at 4:58 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> > > >>>>
> > > >>>> On Thu, Jan 17, 2019 at 04:53:44PM +0530, Souptick Joarder wrote:
> > > >>>>> On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
> > > >>>>>>
> > > >>>>>> Remove duplicate headers which are included twice.
> > > >>>>>>
> > > >>>>>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> > > >>>>
> > > >>>> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
> > > >>>>
> > > >>>>> Any comment on this patch ?
> > > >>>
> > > >>> If no further comment, can we get this patch in queue for 5.1 ?
> > > >>
> > > >> It'd be nice to use proper tags in the subject line. I'd suggest
> > > >>
> > > >> [PATCH] ARM: mm: Remove duplicate header
> > > >>
> > > >> but you can get some inspiration from
> > > >>
> > > >> git log --oneline --no-merges arch/arm/mm/
> > > >>
> > > >> In case you want to route it via ARM tree you need to drop it into
> > > >> Russell's patch system [1].
> > > >
> > > > How do I drop it into Russell's patch system other than posting it to
> > > > the mailing list? I don't know.
> > >
> > > https://www.armlinux.org.uk/developer/patches/info.php
> > 
> > This link is not reachable.
> 
> In what way?  The site is certainly getting hits over ipv4 and ipv6.

Ah, I see - the site is accessible over IPv6 using port 80 only, but
port 443 is blocked.  Problem is, I can't test IPv6 from "outside",
so I rely on people *reporting* when things stop working.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 1546+ messages in thread





* Re:
  2019-05-24 15:12   ` Mimi Zohar
@ 2019-05-24 15:42     ` Roberto Sassu
  2019-05-24 15:47       ` Re: Roberto Sassu
  0 siblings, 1 reply; 1546+ messages in thread
From: Roberto Sassu @ 2019-05-24 15:42 UTC (permalink / raw)
  To: Mimi Zohar, Prakhar Srivastava, linux-integrity,
	linux-security-module, linux-kernel
  Cc: mjg59, vgoyal

On 5/24/2019 5:12 PM, Mimi Zohar wrote:
> On Mon, 2019-05-20 at 17:06 -0700, Prakhar Srivastava wrote:
>> A buffer (cmdline args) measured into IMA cannot be appraised
>> without already being aware of the buffer contents. Since we
>> don't know what cmdline args will be passed (or need to validate
>> what was passed) it is not possible to appraise it.
>>
>> Since hashes are non-reversible the raw buffer is needed to
>> recompute the hash.
>> To regenerate the hash of the buffer and appraise it,
>> the contents of the buffer need to be available.
>>
>> A new template field buf is added to the existing ima template
>> fields, which can be used to store/read the buffer itself.
>> Two new fields are added to the ima_event_data to carry the
>> buf and buf_len whenever necessary.
>>
>> Updated the process_buffer_measurement call to add the buf
>> to the ima_event_data.
>> process_buffer_measurement added in "Add a new ima hook
>> ima_kexec_cmdline to measure cmdline args"
>>
>> - Add a new template field 'buf' to be used to store/read
>> the buffer data.
>> - Added two new fields to ima_event_data to hold the buf and
>> buf_len [Suggested by Roberto]
>> - Updated process_buffer_measurement to add the buffer to
>> ima_event_data
> 
> This patch description can be written more concisely.
> 
> Patch 1/3 in this series introduces measuring the kexec boot command
> line.  This patch defines a new template field for storing the kexec
> boot command line in the measurement list in order for a remote
> attestation server to verify.
> 
> As mentioned, the first patch description should include a shell
> command for verifying the digest in the kexec boot command line
> measurement list record against /proc/cmdline.  This patch description
> should include a shell command showing how to verify the digest based
> on the new field.  Should the new field in the ascii measurement list
> be displayed as a string, not hex?

We should define a new type. If the type is DATA_FMT_STRING, spaces are
replaced with '_'.

Roberto

-- 
HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063
Managing Director: Bo PENG, Jian LI, Yanli SHI

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2019-05-24 15:42     ` Roberto Sassu
@ 2019-05-24 15:47       ` Roberto Sassu
  2019-05-24 18:09         ` Re: Mimi Zohar
  0 siblings, 1 reply; 1546+ messages in thread
From: Roberto Sassu @ 2019-05-24 15:47 UTC (permalink / raw)
  To: Mimi Zohar, Prakhar Srivastava, linux-integrity,
	linux-security-module, linux-kernel
  Cc: mjg59, vgoyal

On 5/24/2019 5:42 PM, Roberto Sassu wrote:
> On 5/24/2019 5:12 PM, Mimi Zohar wrote:
>> On Mon, 2019-05-20 at 17:06 -0700, Prakhar Srivastava wrote:
>>> A buffer (cmdline args) measured into IMA cannot be appraised
>>> without already being aware of the buffer contents. Since we
>>> don't know what cmdline args will be passed (or need to validate
>>> what was passed) it is not possible to appraise it.
>>>
>>> Since hashes are non-reversible the raw buffer is needed to
>>> recompute the hash.
>>> To regenerate the hash of the buffer and appraise it,
>>> the contents of the buffer need to be available.
>>>
>>> A new template field buf is added to the existing ima template
>>> fields, which can be used to store/read the buffer itself.
>>> Two new fields are added to the ima_event_data to carry the
>>> buf and buf_len whenever necessary.
>>>
>>> Updated the process_buffer_measurement call to add the buf
>>> to the ima_event_data.
>>> process_buffer_measurement added in "Add a new ima hook
>>> ima_kexec_cmdline to measure cmdline args"
>>>
>>> - Add a new template field 'buf' to be used to store/read
>>> the buffer data.
>>> - Added two new fields to ima_event_data to hold the buf and
>>> buf_len [Suggested by Roberto]
>>> - Updated process_buffer_measurement to add the buffer to
>>> ima_event_data
>>
>> This patch description can be written more concisely.
>>
>> Patch 1/3 in this series introduces measuring the kexec boot command
>> line.  This patch defines a new template field for storing the kexec
>> boot command line in the measurement list in order for a remote
>> attestation server to verify.
>>
>> As mentioned, the first patch description should include a shell
>> command for verifying the digest in the kexec boot command line
>> measurement list record against /proc/cmdline.  This patch description
>> should include a shell command showing how to verify the digest based
>> on the new field.  Should the new field in the ascii measurement list
>> be displayed as a string, not hex?
> 
> We should define a new type. If the type is DATA_FMT_STRING, spaces are
> replaced with '_'.

Or better. Leave it as hex, otherwise there would be a parsing problem
if there are spaces in the data for a field.
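A quick sketch of the parsing problem (not IMA code; the template name and
digest below are made up): splitting an ascii record on spaces mangles a raw
field that itself contains spaces, while a hex-encoded field survives a naive
split.

```python
import binascii

buf = b"root=/dev/sda4 ro ima_policy=tcb"   # field data containing spaces

# Naive record with the raw buffer: splitting on spaces loses the field
# boundaries, since the buffer's own spaces look like separators.
raw_record = "ima-buf sha256:abcd " + buf.decode()
assert len(raw_record.split(" ")) != 3

# Hex-encoding the field keeps the record unambiguous.
hex_record = "ima-buf sha256:abcd " + binascii.hexlify(buf).decode()
template, digest, field = hex_record.split(" ")
assert binascii.unhexlify(field) == buf      # round-trips cleanly
```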

Roberto

-- 
HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063
Managing Director: Bo PENG, Jian LI, Yanli SHI

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: Re:
  2019-05-24 15:47       ` Re: Roberto Sassu
@ 2019-05-24 18:09         ` Mimi Zohar
  2019-05-24 19:00           ` Re: prakhar srivastava
  0 siblings, 1 reply; 1546+ messages in thread
From: Mimi Zohar @ 2019-05-24 18:09 UTC (permalink / raw)
  To: Roberto Sassu, Prakhar Srivastava, linux-integrity,
	linux-security-module, linux-kernel
  Cc: mjg59, vgoyal

> >> As mentioned, the first patch description should include a shell
> >> command for verifying the digest in the kexec boot command line
> >> measurement list record against /proc/cmdline.  This patch description
> >> should include a shell command showing how to verify the digest based
> >> on the new field.  Should the new field in the ascii measurement list
> >> be displayed as a string, not hex?
> > 
> > We should define a new type. If the type is DATA_FMT_STRING, spaces are
> > replaced with '_'.
> 
> Or better. Leave it as hex, otherwise there would be a parsing problem
> if there are spaces in the data for a field.

After making a few changes, the measurement list contains the
following kexec-cmdline data:

10 edc32d1e3a5ba7272280a395b6fb56a5ef7c78c3 ima-buf sha256:4f43b7db850e88c49dfeffd4b1eb4f021d78033dfb05b07e45eec8d0b45275 kexec-cmdline 726f6f743d2f6465762f7364613420726f2072642e6c756b732e757569643d6c756b732d66373633643737632d653236622d343431642d613734652d62363633636334643832656120696d615f706f6c6963793d7463627c61707072616973655f746362

There's probably a better shell command, but the following works to
verify the digest locally against the /proc/cmdline:

$ echo -n -e `cat /proc/cmdline | sed 's/^.*root=/root=/'` | sha256sum
4f43b7db850e88c49dfeffd4b1eb4f021d78033dfb05b07e45eec8d0b4527f65  -

If we leave the "buf" field as ascii-hex, what would the shell command
look like when verifying the digest based on the "buf" field?

Mimi


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: Re:
  2019-05-24 18:09         ` Re: Mimi Zohar
@ 2019-05-24 19:00           ` prakhar srivastava
  2019-05-24 19:15             ` Re: Mimi Zohar
  0 siblings, 1 reply; 1546+ messages in thread
From: prakhar srivastava @ 2019-05-24 19:00 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Roberto Sassu, linux-integrity, linux-security-module,
	linux-kernel, Matthew Garrett, vgoyal

On Fri, May 24, 2019 at 11:09 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
>
> > >> As mentioned, the first patch description should include a shell
> > >> command for verifying the digest in the kexec boot command line
> > >> measurement list record against /proc/cmdline.  This patch description
> > >> should include a shell command showing how to verify the digest based
> > >> on the new field.  Should the new field in the ascii measurement list
> > >> be displayed as a string, not hex?
> > >
> > > We should define a new type. If the type is DATA_FMT_STRING, spaces are
> > > replaced with '_'.
> >
> > Or better. Leave it as hex, otherwise there would be a parsing problem
> > if there are spaces in the data for a field.
>
> After making a few changes, the measurement list contains the
> following kexec-cmdline data:
>
> 10 edc32d1e3a5ba7272280a395b6fb56a5ef7c78c3 ima-buf sha256:4f43b7db850e88c49dfeffd4b1eb4f021d78033dfb05b07e45eec8d0b45275 kexec-cmdline 726f6f743d2f6465762f7364613420726f2072642e6c756b732e757569643d6c756b732d66373633643737632d653236622d343431642d613734652d62363633636334643832656120696d615f706f6c6963793d7463627c61707072616973655f746362
>
> There's probably a better shell command, but the following works to
> verify the digest locally against the /proc/cmdline:
>
> $ echo -n -e `cat /proc/cmdline | sed 's/^.*root=/root=/'` | sha256sum
> 4f43b7db850e88c49dfeffd4b1eb4f021d78033dfb05b07e45eec8d0b4527f65  -
>
> If we leave the "buf" field as ascii-hex, what would the shell command
> look like when verifying the digest based on the "buf" field?
>
> Mimi
>
To quickly test the sha256 I used my /proc/cmdline:
ro quiet splash vt.handoff=1 ima_policy=tcb ima_appraise=fix
ima_template_fmt=n-ng|d-ng|sig|buf ima_hash=sha256

VAL=726f2071756965742073706c6173682076742e68616e646f66663d3120696d615f706f6c6963793d74636220696d615f61707072616973653d66697820696d615f74656d706c6174655f666d743d6e2d6e677c642d6e677c7369677c62756620696d615f686173683d736861323536

echo -n -e $VAL | xxd -r -p | sha256sum
0d0b891bb730120d9593799cba1a7b3febf68f2bb81fb1304b0c963f95f6bc58  -

I will run it through the code as well, but the shell command should work.
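For reference, the same round-trip can be sketched in Python (illustrative
only; the cmdline string is the one quoted above, and the script simply
recomputes the digest rather than asserting a particular value):

```python
import binascii
import hashlib

cmdline = ("ro quiet splash vt.handoff=1 ima_policy=tcb ima_appraise=fix "
           "ima_template_fmt=n-ng|d-ng|sig|buf ima_hash=sha256")

# The ascii measurement list stores the buffer as hex (the 'buf' field).
hex_field = binascii.hexlify(cmdline.encode()).decode()

# A verifier decodes the field and recomputes the digest over the raw bytes,
# which is exactly what the xxd|sha256sum pipeline above does.
digest = hashlib.sha256(binascii.unhexlify(hex_field)).hexdigest()

assert binascii.unhexlify(hex_field).decode() == cmdline
print(digest)
```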

Thanks,
Prakhar Srivastava

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: Re:
  2019-05-24 19:00           ` Re: prakhar srivastava
@ 2019-05-24 19:15             ` Mimi Zohar
  0 siblings, 0 replies; 1546+ messages in thread
From: Mimi Zohar @ 2019-05-24 19:15 UTC (permalink / raw)
  To: prakhar srivastava
  Cc: Roberto Sassu, linux-integrity, linux-security-module,
	linux-kernel, Matthew Garrett, vgoyal

On Fri, 2019-05-24 at 12:00 -0700, prakhar srivastava wrote:
> On Fri, May 24, 2019 at 11:09 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> >
> > > >> As mentioned, the first patch description should include a shell
> > > >> command for verifying the digest in the kexec boot command line
> > > >> measurement list record against /proc/cmdline.  This patch description
> > > >> should include a shell command showing how to verify the digest based
> > > >> on the new field.  Should the new field in the ascii measurement list
> > > >> be displayed as a string, not hex?
> > > >
> > > > We should define a new type. If the type is DATA_FMT_STRING, spaces are
> > > > replaced with '_'.
> > >
> > > Or better. Leave it as hex, otherwise there would be a parsing problem
> > > if there are spaces in the data for a field.
> >
> > After making a few changes, the measurement list contains the
> > following kexec-cmdline data:
> >
> > 10 edc32d1e3a5ba7272280a395b6fb56a5ef7c78c3 ima-buf sha256:4f43b7db850e88c49dfeffd4b1eb4f021d78033dfb05b07e45eec8d0b45275 kexec-cmdline 726f6f743d2f6465762f7364613420726f2072642e6c756b732e757569643d6c756b732d66373633643737632d653236622d343431642d613734652d62363633636334643832656120696d615f706f6c6963793d7463627c61707072616973655f746362
> >
> > There's probably a better shell command, but the following works to
> > verify the digest locally against the /proc/cmdline:
> >
> > $ echo -n -e `cat /proc/cmdline | sed 's/^.*root=/root=/'` | sha256sum
> > 4f43b7db850e88c49dfeffd4b1eb4f021d78033dfb05b07e45eec8d0b4527f65  -
> >
> > If we leave the "buf" field as ascii-hex, what would the shell command
> > look like when verifying the digest based on the "buf" field?
> >
> > Mimi
> >
> To quickly test the sha256 I used my /proc/cmdline:
> ro quiet splash vt.handoff=1 ima_policy=tcb ima_appraise=fix
> ima_template_fmt=n-ng|d-ng|sig|buf ima_hash=sha256
> 
> VAL=726f2071756965742073706c6173682076742e68616e646f66663d3120696d615f706f6c6963793d74636220696d615f61707072616973653d66697820696d615f74656d706c6174655f666d743d6e2d6e677c642d6e677c7369677c62756620696d615f686173683d736861323536
> 
> echo -n -e $VAL | xxd -r -p | sha256sum
> 0d0b891bb730120d9593799cba1a7b3febf68f2bb81fb1304b0c963f95f6bc58  -
> 
> I will run it through the code as well, but the shell command should work.

Yes, that works.

sudo cat /sys/kernel/security/integrity/ima/ascii_runtime_measurements
| grep  kexec-cmdline | cut -d' ' -f 6 | xxd -r -p | sha256sum

Mimi


^ permalink raw reply	[flat|nested] 1546+ messages in thread


* Re:
       [not found] <DM5PR19MB165765D43BE979AB51A9897E9EEB0@DM5PR19MB1657.namprd19.prod.outlook.com>
@ 2019-06-18  9:41 ` Enrico Weigelt, metux IT consult
  0 siblings, 0 replies; 1546+ messages in thread
From: Enrico Weigelt, metux IT consult @ 2019-06-18  9:41 UTC (permalink / raw)
  To: Grim, Dennis, linux-iio@vger.kernel.org

On 17.06.19 16:58, Grim, Dennis wrote:
> Is Industrial IO considered to be stable in kernel-3.6.0?
> 

What exactly are you trying to achieve ?

3.6 is *very* old and completely unmaintained, and it's likely to be
missing lots of things you'll probably want sooner or later. Backporting
that far is anything but practical. (I recently had a client ask me to
backport recent BT features onto some old 3.15 vendor kernel - that
would have taken years to get anything stable.)

Seriously, don't try to use such old code in production systems.
It's better to rebase your individual customizations onto recent
mainline releases.


--mtx

-- 
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@metux.net -- +49-151-27565287

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <20190703063132.GA27292@ls3530.dellerweb.de>
@ 2019-07-03  6:38 ` Helge Deller
  0 siblings, 0 replies; 1546+ messages in thread
From: Helge Deller @ 2019-07-03  6:38 UTC (permalink / raw)
  To: Linus Torvalds, linux-parisc, James Bottomley, John David Anglin

Please ignore the last mail.
Somehow a newly-installed mutt misbehaved and sent out an empty email.
Helge

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] ` <20190830202959.3539-1-msuchanek@suse.de>
@ 2019-08-30 20:32   ` Arnd Bergmann
  0 siblings, 0 replies; 1546+ messages in thread
From: Arnd Bergmann @ 2019-08-30 20:32 UTC (permalink / raw)
  To: Michal Suchanek
  Cc: Heiko Carstens, Allison Randal, Linux Kernel Mailing List,
	Paul Mackerras, Alexander Viro, Greg Kroah-Hartman,
	Linux FS-devel Mailing List, Firoz Khan, Thomas Gleixner,
	linuxppc-dev, Christian Brauner

On Fri, Aug 30, 2019 at 10:30 PM Michal Suchanek <msuchanek@suse.de> wrote:
>
> Subject: [PATCH] powerpc: Add back __ARCH_WANT_SYS_LLSEEK macro
>
> This partially reverts commit caf6f9c8a326 ("asm-generic: Remove
> unneeded __ARCH_WANT_SYS_LLSEEK macro")
>
> When CONFIG_COMPAT is disabled on ppc64 the kernel does not build.
>
> There is resistance to both removing the llseek syscall from the 64bit
> syscall tables and building the llseek interface unconditionally.
>
> Link: https://lore.kernel.org/lkml/20190828151552.GA16855@infradead.org/
> Link: https://lore.kernel.org/lkml/20190829214319.498c7de2@naga/
>
> Signed-off-by: Michal Suchanek <msuchanek@suse.de>

Reviewed-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <CAGkTAxsV0zS_E64criQM-WtPKpSyW2PL=+fjACvnx2=m7piwXg@mail.gmail.com>
@ 2019-09-27  6:37 ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 1546+ messages in thread
From: Michael Kerrisk (man-pages) @ 2019-09-27  6:37 UTC (permalink / raw)
  To: nilsocket; +Cc: linux-man

Hello

On Fri, 27 Sep 2019 at 08:26, nilsocket <nilsocket@gmail.com> wrote:
>
> In http://man7.org/linux/man-pages/man2/epoll_pwait.2.html#NOTES ,
> `epoll_wait()` is used throughout the section, but `epoll_pwait()`
> appears once; I think it's a typo.
>
> Current:
> While one thread is blocked in a call to epoll_pwait()
>
> Expected Change:
> While one thread is blocked in a call to epoll_wait()

Thanks. Fixed!

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2019-10-27 21:47 Margaret Kwan Wing Han
  0 siblings, 0 replies; 1546+ messages in thread
From: Margaret Kwan Wing Han @ 2019-10-27 21:47 UTC (permalink / raw)
  To: linux-xfs


I need a partner for a legal deal worth $30,500,000. If interested, reply to me
for more details.

Regards,
Margaret Kwan Wing

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2019-11-14 11:37 SGV INVESTMENT
  0 siblings, 0 replies; 1546+ messages in thread
From: SGV INVESTMENT @ 2019-11-14 11:37 UTC (permalink / raw)
  To: linux-nvdimm

Did you receive our business proposal email?
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
       [not found] <20191205030032.GA26925@ray.huang@amd.com>
@ 2019-12-09  1:26 ` Quan, Evan
  0 siblings, 0 replies; 1546+ messages in thread
From: Quan, Evan @ 2019-12-09  1:26 UTC (permalink / raw)
  To: Huang, Ray, Wang, Kevin(Yang)
  Cc: Deucher, Alexander, amd-gfx@lists.freedesktop.org

I actually do not see any problem with this change.
1. If smu_read_smc_arg() always returns 0, I see no point in keeping the "return 0"; making it a "void" API is more reasonable.
2. Wrapping "WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_66, msg);" in a separate API while "WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90, 0);" and "WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82, param);" are open-coded is inconsistent. It is only the three writes combined that make a real "message send".

Anyway, it's fine with me if you guys can live with the original code.

> -----Original Message-----
> From: Huang Rui <ray.huang@amd.com>
> Sent: Thursday, December 5, 2019 11:01 AM
> To: Wang, Kevin(Yang) <Kevin1.Wang@amd.com>
> Cc: Quan, Evan <Evan.Quan@amd.com>; amd-gfx@lists.freedesktop.org;
> Deucher, Alexander <Alexander.Deucher@amd.com>
> Subject:
> 
> Bcc:
> Subject: Re: [PATCH 1/2] drm/amd/powerplay: drop unnecessary API wrapper
> and  return value
> Reply-To:
> In-Reply-To:
> <MN2PR12MB32961EFFD79528A4EFF4BF5AA25D0@MN2PR12MB3296.nampr
> d12.prod.outlook.com>
> 
> On Wed, Dec 04, 2019 at 08:41:00PM +0800, Wang, Kevin(Yang) wrote:
> >    [AMD Official Use Only - Internal Distribution Only]
> >
> >    this change doesn't make sense. And if you really think the return
> >    value is useless, it would be more reasonable to return the value
> >    directly than to pass it back through a parameter.
> >    I think these two patches make the code look worse, unless there's a
> >    bug in it.
> >    adding [1]@Huang, Ray to double check.
> >    Best Regards,
> >    Kevin
> >
> >
> ________________________________________________________________
> __
> >
> >    From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Evan
> >    Quan <evan.quan@amd.com>
> >    Sent: Wednesday, December 4, 2019 5:53 PM
> >    To: amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
> >    Cc: Quan, Evan <Evan.Quan@amd.com>
> >    Subject: [PATCH 1/2] drm/amd/powerplay: drop unnecessary API wrapper
> >    and return value
> >
> >    Some minor cosmetic fixes.
> >    Change-Id: I3ec217289f4cb491720430f2d0b0b4efe5e2b9aa
> >    Signed-off-by: Evan Quan <evan.quan@amd.com>
> >    ---
> >     drivers/gpu/drm/amd/powerplay/amdgpu_smu.c    | 12 ++----
> >     .../gpu/drm/amd/powerplay/inc/amdgpu_smu.h    |  2 +-
> >     drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h |  2 +-
> >     drivers/gpu/drm/amd/powerplay/inc/smu_v12_0.h |  2 +-
> >     drivers/gpu/drm/amd/powerplay/smu_v11_0.c     | 39 +++++--------------
> >     drivers/gpu/drm/amd/powerplay/smu_v12_0.c     | 22 ++---------
> >     6 files changed, 19 insertions(+), 60 deletions(-)
> >    diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
> >    b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
> >    index 2dd960e85a24..00a0df9b41c9 100644
> >    --- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
> >    +++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
> >    @@ -198,9 +198,7 @@ int smu_get_smc_version(struct smu_context *smu,
> >    uint32_t *if_version, uint32_t
> >                     if (ret)
> >                             return ret;
> >
> >    -               ret = smu_read_smc_arg(smu, if_version);
> >    -               if (ret)
> >    -                       return ret;
> >    +               smu_read_smc_arg(smu, if_version);
> >             }
> >
> >             if (smu_version) {
> >    @@ -208,9 +206,7 @@ int smu_get_smc_version(struct smu_context *smu,
> >    uint32_t *if_version, uint32_t
> >                     if (ret)
> >                             return ret;
> >
> >    -               ret = smu_read_smc_arg(smu, smu_version);
> >    -               if (ret)
> >    -                       return ret;
> >    +               smu_read_smc_arg(smu, smu_version);
> >             }
> >
> >             return ret;
> >    @@ -339,9 +335,7 @@ int smu_get_dpm_freq_by_index(struct
> smu_context
> >    *smu, enum smu_clk_type clk_typ
> >             if (ret)
> >                     return ret;
> >
> >    -       ret = smu_read_smc_arg(smu, &param);
> >    -       if (ret)
> >    -               return ret;
> >    +       smu_read_smc_arg(smu, &param);
> >
> >             /* BIT31:  0 - Fine grained DPM, 1 - Dicrete DPM
> >              * now, we un-support it */
> >    diff --git a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
> >    b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
> >    index ca3fdc6777cf..e7b18b209bc7 100644
> >    --- a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
> >    +++ b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
> >    @@ -502,7 +502,7 @@ struct pptable_funcs {
> >             int (*system_features_control)(struct smu_context *smu, bool
> >    en);
> >             int (*send_smc_msg_with_param)(struct smu_context *smu,
> >                                            enum smu_message_type msg,
> >    uint32_t param);
> >    -       int (*read_smc_arg)(struct smu_context *smu, uint32_t *arg);
> >    +       void (*read_smc_arg)(struct smu_context *smu, uint32_t *arg);
> >             int (*init_display_count)(struct smu_context *smu, uint32_t
> >    count);
> >             int (*set_allowed_mask)(struct smu_context *smu);
> >             int (*get_enabled_mask)(struct smu_context *smu, uint32_t
> >    *feature_mask, uint32_t num);
> >    diff --git a/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h
> >    b/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h
> >    index 610e301a5fce..4160147a03f3 100644
> >    --- a/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h
> >    +++ b/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h
> >    @@ -183,7 +183,7 @@ smu_v11_0_send_msg_with_param(struct
> smu_context
> >    *smu,
> >                                   enum smu_message_type msg,
> >                                   uint32_t param);
> >
> >    -int smu_v11_0_read_arg(struct smu_context *smu, uint32_t *arg);
> >    +void smu_v11_0_read_arg(struct smu_context *smu, uint32_t *arg);
> >
> >     int smu_v11_0_init_display_count(struct smu_context *smu, uint32_t
> >    count);
> >
> >    diff --git a/drivers/gpu/drm/amd/powerplay/inc/smu_v12_0.h
> >    b/drivers/gpu/drm/amd/powerplay/inc/smu_v12_0.h
> >    index 922973b7e29f..710af2860a8f 100644
> >    --- a/drivers/gpu/drm/amd/powerplay/inc/smu_v12_0.h
> >    +++ b/drivers/gpu/drm/amd/powerplay/inc/smu_v12_0.h
> >    @@ -40,7 +40,7 @@ struct smu_12_0_cmn2aisc_mapping {
> >     int smu_v12_0_send_msg_without_waiting(struct smu_context *smu,
> >                                                   uint16_t msg);
> >
> >    -int smu_v12_0_read_arg(struct smu_context *smu, uint32_t *arg);
> >    +void smu_v12_0_read_arg(struct smu_context *smu, uint32_t *arg);
> >
> >     int smu_v12_0_wait_for_response(struct smu_context *smu);
> >
> >    diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
> >    b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
> >    index 8683e0678b56..325ec4864f90 100644
> >    --- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
> >    +++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
> >    @@ -53,20 +53,11 @@ MODULE_FIRMWARE("amdgpu/navi12_smc.bin");
> >
> >     #define SMU11_VOLTAGE_SCALE 4
> >
> >    -static int smu_v11_0_send_msg_without_waiting(struct smu_context *smu,
> >    -                                             uint16_t msg)
> >    -{
> >    -       struct amdgpu_device *adev = smu->adev;
> >    -       WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_66, msg);
> >    -       return 0;
> >    -}
> >    -
> >    -int smu_v11_0_read_arg(struct smu_context *smu, uint32_t *arg)
> >    +void smu_v11_0_read_arg(struct smu_context *smu, uint32_t *arg)
> >     {
> >             struct amdgpu_device *adev = smu->adev;
> >
> >             *arg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82);
> >    -       return 0;
> >     }
> >
> >     static int smu_v11_0_wait_for_response(struct smu_context *smu)
> >    @@ -109,7 +100,7 @@ smu_v11_0_send_msg_with_param(struct
> smu_context
> >    *smu,
> >
> >             WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82, param);
> >
> >    -       smu_v11_0_send_msg_without_waiting(smu, (uint16_t)index);
> >    +       WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_66,
> (uint16_t)index);
> >
> >             ret = smu_v11_0_wait_for_response(smu);
> >             if (ret)
> >    @@ -843,16 +834,12 @@ int smu_v11_0_get_enabled_mask(struct
> smu_context
> >    *smu,
> >             ret = smu_send_smc_msg(smu,
> >    SMU_MSG_GetEnabledSmuFeaturesHigh);
> >             if (ret)
> >                     return ret;
> >    -       ret = smu_read_smc_arg(smu, &feature_mask_high);
> >    -       if (ret)
> >    -               return ret;
> >    +       smu_read_smc_arg(smu, &feature_mask_high);
> >
> >             ret = smu_send_smc_msg(smu,
> SMU_MSG_GetEnabledSmuFeaturesLow);
> >             if (ret)
> >                     return ret;
> >    -       ret = smu_read_smc_arg(smu, &feature_mask_low);
> >    -       if (ret)
> >    -               return ret;
> >    +       smu_read_smc_arg(smu, &feature_mask_low);
> >
> >             feature_mask[0] = feature_mask_low;
> >             feature_mask[1] = feature_mask_high;
> >    @@ -924,9 +911,7 @@ smu_v11_0_get_max_sustainable_clock(struct
> >    smu_context *smu, uint32_t *clock,
> >                     return ret;
> >             }
> >
> >    -       ret = smu_read_smc_arg(smu, clock);
> >    -       if (ret)
> >    -               return ret;
> >    +       smu_read_smc_arg(smu, clock);
> >
> >             if (*clock != 0)
> >                     return 0;
> >    @@ -939,7 +924,7 @@ smu_v11_0_get_max_sustainable_clock(struct
> >    smu_context *smu, uint32_t *clock,
> >                     return ret;
> >             }
> >
> >    -       ret = smu_read_smc_arg(smu, clock);
> >    +       smu_read_smc_arg(smu, clock);
> >
> >             return ret;
> >     }
> >    @@ -1107,9 +1092,7 @@ int smu_v11_0_get_current_clk_freq(struct
> >    smu_context *smu,
> >                     if (ret)
> >                             return ret;
> >
> >    -               ret = smu_read_smc_arg(smu, &freq);
> >    -               if (ret)
> >    -                       return ret;
> >    +               smu_read_smc_arg(smu, &freq);
> >             }
> >
> >             freq *= 100;
> >    @@ -1749,18 +1732,14 @@ int smu_v11_0_get_dpm_ultimate_freq(struct
> >    smu_context *smu, enum smu_clk_type c
> >                     ret = smu_send_smc_msg_with_param(smu,
> >    SMU_MSG_GetMaxDpmFreq, param);
> >                     if (ret)
> >                             goto failed;
> >    -               ret = smu_read_smc_arg(smu, max);
> >    -               if (ret)
> >    -                       goto failed;
> >    +               smu_read_smc_arg(smu, max);
> >             }
> >
> >             if (min) {
> >                     ret = smu_send_smc_msg_with_param(smu,
> >    SMU_MSG_GetMinDpmFreq, param);
> >                     if (ret)
> >                             goto failed;
> >    -               ret = smu_read_smc_arg(smu, min);
> >    -               if (ret)
> >    -                       goto failed;
> >    +               smu_read_smc_arg(smu, min);
> >             }
> >
> >     failed:
> >    diff --git a/drivers/gpu/drm/amd/powerplay/smu_v12_0.c
> >    b/drivers/gpu/drm/amd/powerplay/smu_v12_0.c
> >    index 269a7d73b58d..7f5f7e12a41e 100644
> >    --- a/drivers/gpu/drm/amd/powerplay/smu_v12_0.c
> >    +++ b/drivers/gpu/drm/amd/powerplay/smu_v12_0.c
> >    @@ -41,21 +41,11 @@
> >     #define SMUIO_GFX_MISC_CNTL__PWR_GFXOFF_STATUS_MASK
> >    0x00000006L
> >     #define SMUIO_GFX_MISC_CNTL__PWR_GFXOFF_STATUS__SHIFT        0x1
> >
> >    -int smu_v12_0_send_msg_without_waiting(struct smu_context *smu,
> >    -                                             uint16_t msg)
> >    -{
> >    -       struct amdgpu_device *adev = smu->adev;
> >    -
> >    -       WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_66, msg);
> >    -       return 0;
> >    -}
> >    -
> >    -int smu_v12_0_read_arg(struct smu_context *smu, uint32_t *arg)
> >    +void smu_v12_0_read_arg(struct smu_context *smu, uint32_t *arg)
> >     {
> >             struct amdgpu_device *adev = smu->adev;
> >
> >             *arg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82);
> >    -       return 0;
> >     }
> >
> >     int smu_v12_0_wait_for_response(struct smu_context *smu)
> >    @@ -98,7 +88,7 @@ smu_v12_0_send_msg_with_param(struct
> smu_context
> >    *smu,
> >
> >             WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82, param);
> >
> >    -       smu_v12_0_send_msg_without_waiting(smu, (uint16_t)index);
> >    +       WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_66,
> (uint16_t)index);
> 
> smu_v12_0_send_msg_without_waiting() function is more readable than using
> raw register programming.
> 
> Thanks,
> Ray
> 
> >
> >             ret = smu_v12_0_wait_for_response(smu);
> >             if (ret)
> >    @@ -352,9 +342,7 @@ int smu_v12_0_get_dpm_ultimate_freq(struct
> >    smu_context *smu, enum smu_clk_type c
> >                                     pr_err("Attempt to get max GX
> >    frequency from SMC Failed !\n");
> >                                     goto failed;
> >                             }
> >    -                       ret = smu_read_smc_arg(smu, max);
> >    -                       if (ret)
> >    -                               goto failed;
> >    +                       smu_read_smc_arg(smu, max);
> >                             break;
> >                     case SMU_UCLK:
> >                     case SMU_FCLK:
> >    @@ -383,9 +371,7 @@ int smu_v12_0_get_dpm_ultimate_freq(struct
> >    smu_context *smu, enum smu_clk_type c
> >                                     pr_err("Attempt to get min GX
> >    frequency from SMC Failed !\n");
> >                                     goto failed;
> >                             }
> >    -                       ret = smu_read_smc_arg(smu, min);
> >    -                       if (ret)
> >    -                               goto failed;
> >    +                       smu_read_smc_arg(smu, min);
> >                             break;
> >                     case SMU_UCLK:
> >                     case SMU_FCLK:
> >    --
> >    2.24.0
> >    _______________________________________________
> >    amd-gfx mailing list
> >    amd-gfx@lists.freedesktop.org
> >    [2]https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >
> > References
> >
> >    1. mailto:Ray.Huang@amd.com
> >    2. https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2019-12-19 12:31 liming wu
@ 2019-12-20  1:13 ` Andreas Dilger
  0 siblings, 0 replies; 1546+ messages in thread
From: Andreas Dilger @ 2019-12-20  1:13 UTC (permalink / raw)
  To: liming wu; +Cc: Ext4 Developers List

[-- Attachment #1: Type: text/plain, Size: 4753 bytes --]

These messages indicate your storage is not working properly.
It doesn't have anything to do with ext3/ext4.



> On Dec 19, 2019, at 5:31 AM, liming wu <wu860403@gmail.com> wrote:
> 
> Hi
> 
> 
> Could someone help analyze the following messages, or give me some
> advice? I would appreciate it very much.
> 
> Dec 17 22:14:42 bdsitdb222 kernel: Buffer I/O error on device dm-7,
> logical block 810449
> Dec 17 22:14:42 bdsitdb222 kernel: lost page write due to I/O error on dm-7
> Dec 17 22:14:48 bdsitdb222 kernel: Buffer I/O error on device dm-7,
> logical block 283536
> Dec 17 22:14:48 bdsitdb222 kernel: lost page write due to I/O error on dm-7
> Dec 17 22:14:48 bdsitdb222 kernel: Buffer I/O error on device dm-7,
> logical block 283537
> Dec 17 22:14:48 bdsitdb222 kernel: lost page write due to I/O error on dm-7
> Dec 17 22:14:48 bdsitdb222 kernel: JBD: Detected IO errors while
> flushing file data on dm-7
> Dec 17 22:15:42 bdsitdb222 kernel: Buffer I/O error on device dm-8,
> logical block 127859
> Dec 17 22:15:42 bdsitdb222 kernel: lost page write due to I/O error on dm-8
> Dec 17 22:15:42 bdsitdb222 kernel: JBD: Detected IO errors while
> flushing file data on dm-8
> Dec 17 22:15:48 bdsitdb222 kernel: Aborting journal on device dm-7.
> Dec 17 22:15:48 bdsitdb222 kernel: EXT3-fs (dm-7): error in
> ext3_new_blocks: Journal has aborted
> Dec 17 22:15:48 bdsitdb222 kernel: EXT3-fs (dm-7): error in
> ext3_reserve_inode_write: Journal has aborted
> Dec 17 22:16:42 bdsitdb222 kernel: Aborting journal on device dm-8.
> Dec 17 22:16:42 bdsitdb222 kernel: EXT3-fs (dm-7): error:
> ext3_journal_start_sb: Detected aborted journal
> Dec 17 22:16:42 bdsitdb222 kernel: EXT3-fs (dm-7): error: remounting
> filesystem read-only
> Dec 17 22:16:48 bdsitdb222 kernel: Buffer I/O error on device dm-7,
> logical block 23527938
> Dec 17 22:16:48 bdsitdb222 kernel: lost page write due to I/O error on dm-7
> Dec 17 22:16:48 bdsitdb222 kernel: Buffer I/O error on device dm-7,
> logical block 0
> Dec 17 22:16:48 bdsitdb222 kernel: lost page write due to I/O error on dm-7
> Dec 17 22:16:48 bdsitdb222 kernel: JBD: I/O error detected when
> updating journal superblock for dm-7.
> Dec 17 22:17:05 bdsitdb222 kernel: EXT3-fs (dm-7): error in
> ext3_orphan_add: Journal has aborted
> Dec 17 22:17:05 bdsitdb222 kernel: __journal_remove_journal_head:
> freeing b_committed_data
> 
> plus info:
> it's KVM
> # uname -a
> Linux bdsitdb222 2.6.32-279.19.1.el6.62.x86_64 #6 SMP Mon Dec 3
> 22:54:25 CST 2018 x86_64 x86_64 x86_64 GNU/Linux
> 
> # cat /proc/mounts
> rootfs / rootfs rw 0 0
> proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
> sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
> devtmpfs /dev devtmpfs
> rw,nosuid,relatime,size=8157352k,nr_inodes=2039338,mode=755 0 0
> devpts /dev/pts devpts rw,relatime,gid=5,mode=620,ptmxmode=000 0 0
> tmpfs /dev/shm tmpfs rw,nosuid,nodev,relatime 0 0
> /dev/mapper/systemvg-rootlv / ext4 rw,relatime,barrier=1,data=ordered 0 0
> /proc/bus/usb /proc/bus/usb usbfs rw,relatime 0 0
> /dev/vda1 /boot ext4 rw,relatime,barrier=1,data=ordered 0 0
> /dev/mapper/systemvg-homelv /home ext4 rw,relatime,barrier=1,data=ordered 0 0
> /dev/mapper/systemvg-optlv /opt ext3
> rw,relatime,errors=continue,barrier=1,data=ordered 0 0
> /dev/mapper/systemvg-tmplv /tmp ext3
> rw,relatime,errors=continue,barrier=1,data=ordered 0 0
> /dev/mapper/systemvg-usrlv /usr ext4 rw,relatime,barrier=1,data=ordered 0 0
> /dev/mapper/systemvg-varlv /var ext4 rw,relatime,barrier=1,data=ordered 0 0
> /dev/mapper/datavg-datalv /mysql/data ext3
> rw,relatime,errors=continue,barrier=1,data=ordered 0 0
> /dev/mapper/datavg-binloglv /mysql/binlog ext3
> rw,relatime,errors=continue,barrier=1,data=ordered 0 0
> none /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
> sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
> none /sys/kernel/debug debugfs rw,relatime 0 0
> 
> # ll /dev/mapper/
> total 0
> crw-rw---- 1 root root 10, 58 Dec 19 19:21 control
> lrwxrwxrwx 1 root root      7 Dec 19 19:21 datavg-binloglv -> ../dm-3
> lrwxrwxrwx 1 root root      7 Dec 19 19:21 datavg-datalv -> ../dm-2
> lrwxrwxrwx 1 root root      7 Dec 19 19:21 systemvg-homelv -> ../dm-4
> lrwxrwxrwx 1 root root      7 Dec 19 19:21 systemvg-optlv -> ../dm-7
> lrwxrwxrwx 1 root root      7 Dec 19 19:21 systemvg-rootlv -> ../dm-1
> lrwxrwxrwx 1 root root      7 Dec 19 19:21 systemvg-swaplv -> ../dm-0
> lrwxrwxrwx 1 root root      7 Dec 19 19:21 systemvg-tmplv -> ../dm-6
> lrwxrwxrwx 1 root root      7 Dec 19 19:21 systemvg-usrlv -> ../dm-8
> lrwxrwxrwx 1 root root      7 Dec 19 19:21 systemvg-varlv -> ../dm-5


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <mailman.6.1579205674.8101.b.a.t.m.a.n@lists.open-mesh.org>
@ 2020-01-17  7:44 ` Simon Wunderlich
  0 siblings, 0 replies; 1546+ messages in thread
From: Simon Wunderlich @ 2020-01-17  7:44 UTC (permalink / raw)
  To: b.a.t.m.a.n; +Cc: Martin, Jeremy J CIV USARMY CCDC C5ISR (USA)

[-- Attachment #1: Type: text/plain, Size: 1424 bytes --]

Hi Jeremy,

On Thursday, January 16, 2020 9:06:50 PM CET Martin, Jeremy J CIV USARMY CCDC 
C5ISR (USA) via B.A.T.M.A.N wrote:
> My/My Teams intent is to have 4 radios in total, 2 on one pc and two on
> another. Our plan is to have Batman take care of the switching between
> which radio to use in order to transmit data between these two PC's. One
> radio is high frequency radio (60 Ghz) and the other would be a lower
> frequency radio and the idea is to have batman switch between these radios
> once the higher frequency radio is dropping between a certain TQ.

BATMAN will switch when one link has a better TQ (towards the final 
destination) than the other link, so I believe this should happen out of the box.

> My
> primary questions regarding this scenario would be, 1) Are there specific
> standards the radio chipsets would need to support in order for them to
> work in this scenario?. 

Normally you would want IBSS mode or 802.11s mode to work. BATMAN can also work 
in AP/STA mode, although the packet loss counting may be biased since 
broadcast handling works a bit differently than in IBSS/11s. But for
point-to-point links it might just work.

> 2) Would Batman-adv be adequate enough to be able
> to handle a 1Gb/s data transmission and be able to swap accordingly to the
> lower frequency radio?

If your radio and CPU are powerful enough, batman-adv is able to handle it, 
yes.

Cheers,
      Simon

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found]                           ` <2ff97414-f0a5-7224-0e53-6cad2ed0ccd2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2020-01-30  8:05                             ` Ben Dooks
  0 siblings, 0 replies; 1546+ messages in thread
From: Ben Dooks @ 2020-01-30  8:05 UTC (permalink / raw)
  To: Dmitry Osipenko, Jon Hunter, Mark Brown
  Cc: linux-kernel-81qHHgoATdFT9dQujB1mzip2UmYkHbXO,
	alsa-devel-K7yf7f+aM1XWsZ/bQMPhNw, Liam Girdwood, Takashi Iwai,
	Thierry Reding, Edward Cragg, linux-tegra-u79uwXL29TY76Z2rM5mHXA

On 29/01/2020 00:17, Dmitry Osipenko wrote:
> 28.01.2020 21:19, Jon Hunter wrote:
>>
>> On 28/01/2020 17:42, Dmitry Osipenko wrote:
>>> 28.01.2020 15:13, Mark Brown пишет:
>>>> On Mon, Jan 27, 2020 at 10:20:25PM +0300, Dmitry Osipenko wrote:
>>>>> 24.01.2020 19:50, Jon Hunter пишет:
>>>>
>>>>>>                  .rates = SNDRV_PCM_RATE_8000_96000,
>>>>>>                  .formats = SNDRV_PCM_FMTBIT_S32_LE |
>>>>>> -                          SNDRV_PCM_FMTBIT_S24_LE |
>>>>>> +                          SNDRV_PCM_FMTBIT_S24_3LE |
>>>>
>>>>> It should solve the problem in my particular case, but I'm not sure that
>>>>> the solution is correct.
>>>>
>>>> If the format implemented by the driver is S24_3LE the driver should
>>>> advertise S24_3LE.
>>>
>>> It should be S24_LE, but seems we still don't know for sure.
>>
>> Why?
> /I think/ sound should be much more distorted if it was S24_3LE, but
> maybe I'm wrong.

S24_3LE is IIRC the wrong thing, and we can't support it on the tegra3.


-- 
Ben Dooks				http://www.codethink.co.uk/
Senior Engineer				Codethink - Providing Genius

https://www.codethink.co.uk/privacy.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2020-02-06  2:24 Viviane Jose Pereira
  0 siblings, 0 replies; 1546+ messages in thread
From: Viviane Jose Pereira @ 2020-02-06  2:24 UTC (permalink / raw)




-- 
Hello, I apologize for disturbing your privacy. I am contacting you about an extremely urgent and confidential matter.

You have been offered a donation of 15,000,000.00 EUR. Contact cristtom063@gmail.com for more information.

This is not a spam message but an important notice to you. Reply to the above e-mail to receive more information about the donation and the receipt of the funds.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2020-02-06  6:36 Viviane Jose Pereira
  0 siblings, 0 replies; 1546+ messages in thread
From: Viviane Jose Pereira @ 2020-02-06  6:36 UTC (permalink / raw)




-- 
Hello, I apologize for disturbing your privacy. I am contacting you about an extremely urgent and confidential matter.

You have been offered a donation of 15,000,000.00 EUR. Contact cristtom063@gmail.com for more information.

This is not a spam message but an important notice to you. Reply to the above e-mail to receive more information about the donation and the receipt of the funds.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-02-11 22:34 (unknown) Rajat Jain
@ 2020-02-12  9:30     ` Jarkko Nikula
  0 siblings, 0 replies; 1546+ messages in thread
From: Jarkko Nikula @ 2020-02-12  9:30 UTC (permalink / raw)
  To: Rajat Jain, Daniel Mack, Haojian Zhuang, Robert Jarzmik,
	Mark Brown, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-spi-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: Evan Green, rajatxjain-Re5JQEeQqe8AvxtiuMwx3w,
	evgreen-hpIqsD4AKlfQT0dZR+AlfA,
	shobhit.srivastava-ral2JQCrhuEAvxtiuMwx3w,
	porselvan.muthukrishnan-ral2JQCrhuEAvxtiuMwx3w, Andy Shevchenko

Hi

+ Andy

On 2/12/20 12:34 AM, Rajat Jain wrote:
> From: Evan Green <evgreen-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> 
> Date: Wed, 29 Jan 2020 13:54:16 -0800
> Subject: [PATCH] spi: pxa2xx: Add CS control clock quirk
> 
The patch subject is missing from the mail subject.

> In some circumstances on Intel LPSS controllers, toggling the LPSS
> CS control register doesn't actually cause the CS line to toggle.
> This seems to be failure of dynamic clock gating that occurs after
> going through a suspend/resume transition, where the controller
> is sent through a reset transition. This ruins SPI transactions
> that either rely on delay_usecs, or toggle the CS line without
> sending data.
> 
> Whenever CS is toggled, momentarily set the clock gating register
> to "Force On" to poke the controller into acting on CS.
> 
Could you share the test case for how to trigger this? What's the platform 
here? I'd like to check whether this reproduces on other Intel LPSS 
platforms, and whether the quirk needs to be added for them too.

> Signed-off-by: Evan Green <evgreen-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> Signed-off-by: Rajat Jain <rajatja-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> ---
>   drivers/spi/spi-pxa2xx.c | 23 +++++++++++++++++++++++
>   1 file changed, 23 insertions(+)
> 
> diff --git a/drivers/spi/spi-pxa2xx.c b/drivers/spi/spi-pxa2xx.c
> index 4c7a71f0fb3e..2e318158fca9 100644
> --- a/drivers/spi/spi-pxa2xx.c
> +++ b/drivers/spi/spi-pxa2xx.c
> @@ -70,6 +70,10 @@ MODULE_ALIAS("platform:pxa2xx-spi");
>   #define LPSS_CAPS_CS_EN_SHIFT			9
>   #define LPSS_CAPS_CS_EN_MASK			(0xf << LPSS_CAPS_CS_EN_SHIFT)
>   
> +#define LPSS_PRIV_CLOCK_GATE 0x38
> +#define LPSS_PRIV_CLOCK_GATE_CLK_CTL_MASK 0x3
> +#define LPSS_PRIV_CLOCK_GATE_CLK_CTL_FORCE_ON 0x3
> +
>   struct lpss_config {
>   	/* LPSS offset from drv_data->ioaddr */
>   	unsigned offset;
> @@ -86,6 +90,8 @@ struct lpss_config {
>   	unsigned cs_sel_shift;
>   	unsigned cs_sel_mask;
>   	unsigned cs_num;
> +	/* Quirks */
> +	unsigned cs_clk_stays_gated : 1;
>   };
>   
>   /* Keep these sorted with enum pxa_ssp_type */
> @@ -156,6 +162,7 @@ static const struct lpss_config lpss_platforms[] = {
>   		.tx_threshold_hi = 56,
>   		.cs_sel_shift = 8,
>   		.cs_sel_mask = 3 << 8,
> +		.cs_clk_stays_gated = true,
>   	},
>   };
>   
> @@ -383,6 +390,22 @@ static void lpss_ssp_cs_control(struct spi_device *spi, bool enable)
>   	else
>   		value |= LPSS_CS_CONTROL_CS_HIGH;
>   	__lpss_ssp_write_priv(drv_data, config->reg_cs_ctrl, value);
> +	if (config->cs_clk_stays_gated) {
> +		u32 clkgate;
> +
> +		/*
> +		 * Changing CS alone when dynamic clock gating is on won't
> +		 * actually flip CS at that time. This ruins SPI transfers
> +		 * that specify delays, or have no data. Toggle the clock mode
> +		 * to force on briefly to poke the CS pin to move.
> +		 */
> +		clkgate = __lpss_ssp_read_priv(drv_data, LPSS_PRIV_CLOCK_GATE);
> +		value = (clkgate & ~LPSS_PRIV_CLOCK_GATE_CLK_CTL_MASK) |
> +			LPSS_PRIV_CLOCK_GATE_CLK_CTL_FORCE_ON;
> +
> +		__lpss_ssp_write_priv(drv_data, LPSS_PRIV_CLOCK_GATE, value);
> +		__lpss_ssp_write_priv(drv_data, LPSS_PRIV_CLOCK_GATE, clkgate);
> +	}
>   }
>   
I wonder whether this quick toggling alone is enough, or whether it 
depends on time or on an actual number of clock cycles. There is 
currently no delay between the two writes, so if a certain number of 
cycles is needed, does this still work when using low ssp_clk rates, 
similar to commit d0283eb2dbc1 ("spi: 
pxa2xx: Add output control for multiple Intel LPSS chip selects")?

I'm also wondering whether this could be done only once after resume, and 
whether other LPSS blocks need the same. I.e. should this be done in 
drivers/mfd/intel-lpss.c instead?

Jarkko

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2020-02-12  9:30     ` Jarkko Nikula
  0 siblings, 0 replies; 1546+ messages in thread
From: Jarkko Nikula @ 2020-02-12  9:30 UTC (permalink / raw)
  To: Rajat Jain, Daniel Mack, Haojian Zhuang, Robert Jarzmik,
	Mark Brown, linux-arm-kernel, linux-spi, linux-kernel
  Cc: rajatxjain, shobhit.srivastava, Evan Green, evgreen,
	porselvan.muthukrishnan, Andy Shevchenko

Hi

+ Andy

On 2/12/20 12:34 AM, Rajat Jain wrote:
> From: Evan Green <evgreen@chromium.org>
> 
> Date: Wed, 29 Jan 2020 13:54:16 -0800
> Subject: [PATCH] spi: pxa2xx: Add CS control clock quirk
> 
The patch subject here is missing from the mail's subject line.

> In some circumstances on Intel LPSS controllers, toggling the LPSS
> CS control register doesn't actually cause the CS line to toggle.
> This seems to be failure of dynamic clock gating that occurs after
> going through a suspend/resume transition, where the controller
> is sent through a reset transition. This ruins SPI transactions
> that either rely on delay_usecs, or toggle the CS line without
> sending data.
> 
> Whenever CS is toggled, momentarily set the clock gating register
> to "Force On" to poke the controller into acting on CS.
> 
Could you share the test case for how to trigger this? What's the platform 
here? I'd like to check whether this reproduces on other Intel LPSS 
platforms and whether the quirk needs to be added for them too.

> Signed-off-by: Evan Green <evgreen@chromium.org>
> Signed-off-by: Rajat Jain <rajatja@google.com>
> ---
>   drivers/spi/spi-pxa2xx.c | 23 +++++++++++++++++++++++
>   1 file changed, 23 insertions(+)
> 
> diff --git a/drivers/spi/spi-pxa2xx.c b/drivers/spi/spi-pxa2xx.c
> index 4c7a71f0fb3e..2e318158fca9 100644
> --- a/drivers/spi/spi-pxa2xx.c
> +++ b/drivers/spi/spi-pxa2xx.c
> @@ -70,6 +70,10 @@ MODULE_ALIAS("platform:pxa2xx-spi");
>   #define LPSS_CAPS_CS_EN_SHIFT			9
>   #define LPSS_CAPS_CS_EN_MASK			(0xf << LPSS_CAPS_CS_EN_SHIFT)
>   
> +#define LPSS_PRIV_CLOCK_GATE 0x38
> +#define LPSS_PRIV_CLOCK_GATE_CLK_CTL_MASK 0x3
> +#define LPSS_PRIV_CLOCK_GATE_CLK_CTL_FORCE_ON 0x3
> +
>   struct lpss_config {
>   	/* LPSS offset from drv_data->ioaddr */
>   	unsigned offset;
> @@ -86,6 +90,8 @@ struct lpss_config {
>   	unsigned cs_sel_shift;
>   	unsigned cs_sel_mask;
>   	unsigned cs_num;
> +	/* Quirks */
> +	unsigned cs_clk_stays_gated : 1;
>   };
>   
>   /* Keep these sorted with enum pxa_ssp_type */
> @@ -156,6 +162,7 @@ static const struct lpss_config lpss_platforms[] = {
>   		.tx_threshold_hi = 56,
>   		.cs_sel_shift = 8,
>   		.cs_sel_mask = 3 << 8,
> +		.cs_clk_stays_gated = true,
>   	},
>   };
>   
> @@ -383,6 +390,22 @@ static void lpss_ssp_cs_control(struct spi_device *spi, bool enable)
>   	else
>   		value |= LPSS_CS_CONTROL_CS_HIGH;
>   	__lpss_ssp_write_priv(drv_data, config->reg_cs_ctrl, value);
> +	if (config->cs_clk_stays_gated) {
> +		u32 clkgate;
> +
> +		/*
> +		 * Changing CS alone when dynamic clock gating is on won't
> +		 * actually flip CS at that time. This ruins SPI transfers
> +		 * that specify delays, or have no data. Toggle the clock mode
> +		 * to force on briefly to poke the CS pin to move.
> +		 */
> +		clkgate = __lpss_ssp_read_priv(drv_data, LPSS_PRIV_CLOCK_GATE);
> +		value = (clkgate & ~LPSS_PRIV_CLOCK_GATE_CLK_CTL_MASK) |
> +			LPSS_PRIV_CLOCK_GATE_CLK_CTL_FORCE_ON;
> +
> +		__lpss_ssp_write_priv(drv_data, LPSS_PRIV_CLOCK_GATE, value);
> +		__lpss_ssp_write_priv(drv_data, LPSS_PRIV_CLOCK_GATE, clkgate);
> +	}
>   }
>   
I wonder whether this quick toggling alone is enough, or whether it 
depends on time or on an actual number of clock cycles. There is 
currently no delay between the two writes, so if a certain number of 
cycles is needed, does this still work when using low ssp_clk rates, 
similar to commit d0283eb2dbc1 ("spi: 
pxa2xx: Add output control for multiple Intel LPSS chip selects")?

I'm also wondering whether this could be done only once after resume, and 
whether other LPSS blocks need the same. I.e. should this be done in 
drivers/mfd/intel-lpss.c instead?

Jarkko

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-02-12  9:30     ` Re: Jarkko Nikula
@ 2020-02-12 10:24         ` Andy Shevchenko
  -1 siblings, 0 replies; 1546+ messages in thread
From: Andy Shevchenko @ 2020-02-12 10:24 UTC (permalink / raw)
  To: Jarkko Nikula
  Cc: Rajat Jain, Daniel Mack, Haojian Zhuang, Robert Jarzmik,
	Mark Brown, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-spi-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Evan Green,
	rajatxjain-Re5JQEeQqe8AvxtiuMwx3w, evgreen-hpIqsD4AKlfQT0dZR+AlfA,
	shobhit.srivastava-ral2JQCrhuEAvxtiuMwx3w,
	porselvan.muthukrishnan-ral2JQCrhuEAvxtiuMwx3w

On Wed, Feb 12, 2020 at 11:30:51AM +0200, Jarkko Nikula wrote:
> On 2/12/20 12:34 AM, Rajat Jain wrote:

> This patch subject is missing from mail subject.

> I'm thinking can this be done only once after resume and may other LPSS
> blocks need the same? I.e. should this be done in drivers/mfd/intel-lpss.c?

On resume we restore the previously saved context; can we be sure that the
values we saved during suspend are correct?

If the above doesn't show any issue, the best place to apply this quirk
might be in the intel_lpss_suspend() / intel_lpss_resume() callbacks, as
Jarkko suggested.

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2020-02-12 10:24         ` Andy Shevchenko
  0 siblings, 0 replies; 1546+ messages in thread
From: Andy Shevchenko @ 2020-02-12 10:24 UTC (permalink / raw)
  To: Jarkko Nikula
  Cc: Evan Green, rajatxjain, shobhit.srivastava, linux-kernel,
	Haojian Zhuang, linux-spi, Mark Brown, evgreen, Daniel Mack,
	Rajat Jain, Robert Jarzmik, linux-arm-kernel,
	porselvan.muthukrishnan

On Wed, Feb 12, 2020 at 11:30:51AM +0200, Jarkko Nikula wrote:
> On 2/12/20 12:34 AM, Rajat Jain wrote:

> This patch subject is missing from mail subject.

> I'm thinking can this be done only once after resume and may other LPSS
> blocks need the same? I.e. should this be done in drivers/mfd/intel-lpss.c?

On resume we restore the previously saved context; can we be sure that the
values we saved during suspend are correct?

If the above doesn't show any issue, the best place to apply this quirk
might be in the intel_lpss_suspend() / intel_lpss_resume() callbacks, as
Jarkko suggested.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <20200224173733.16323-1-axboe@kernel.dk>
@ 2020-02-24 17:38 ` Jens Axboe
  0 siblings, 0 replies; 1546+ messages in thread
From: Jens Axboe @ 2020-02-24 17:38 UTC (permalink / raw)
  To: io-uring

On 2/24/20 10:37 AM, Jens Axboe wrote:
> Here's v3 of the poll async retry patchset. Changes since v2:
> 
> - Rebase on for-5.7/io_uring
> - Get rid of REQ_F_WORK bit
> - Improve the tracing additions
> - Fix linked_timeout case
> - Fully restore work from async task handler
> - Credentials now fixed
> - Fix task_works running from SQPOLL
> - Remove task cancellation stuff, we don't need it
> - fdinfo print improvements
> 
> I think this is getting pretty close to mergeable, I haven't found
> any issues with the test cases.

Gah, wrong directory, resending it. Ignore this thread.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-02-26 11:57 (no subject) Ville Syrjälä
@ 2020-02-26 12:08 ` Linus Walleij
  2020-02-26 14:34   ` Re: Ville Syrjälä
  0 siblings, 1 reply; 1546+ messages in thread
From: Linus Walleij @ 2020-02-26 12:08 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, Eric Anholt, nouveau,
	Guido Günther, Paul Kocialkowski,
	open list:DRM PANEL DRIVERS, Gustaf Lindström, Andrzej Hajda,
	Thierry Reding, Laurent Pinchart, Philipp Zabel, Sam Ravnborg,
	Marian-Cristian Rotariu, Jagan Teki, Thomas Hellstrom,
	Joonyoung Shim, Jonathan Marek, Stefan Mavrodiev, Adam Ford,
	Jerry Han, VMware Graphics, Ben Skeggs, H. Nikolaus Schaller,
	Robert Chiras, Heiko Schocher, Icenowy Zheng, Jonas Karlman,
	intel-gfx, Maxime Ripard, Alexandre Courbot, Fabio Estevam,
	open list:ARM/Amlogic Meson..., Vincent Abriou, Andreas Pretzsch,
	Jernej Skrabec, Alex Gonzalez, Purism Kernel Team,
	Boris Brezillon, Seung-Woo Kim, Christoph Fritz, Kyungmin Park,
	Heiko Stuebner, Eugen Hristev, Giulio Benetti

On Wed, Feb 26, 2020 at 12:57 PM Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
> On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:

> > I have long suspected that a whole bunch of the "simple" displays
> > are not simple but contains a display controller and memory.
> > That means that the speed over the link to the display and
> > actual refresh rate on the actual display is asymmetric because
> > well we are just updating a RAM, the resolution just limits how
> > much data we are sending, the clock limits the speed on the
> > bus over to the RAM on the other side.
>
> IMO even in command mode mode->clock should probably be the actual
> dotclock used by the display. If there's another clock for the bus
> speed/etc. it should be stored somewhere else.

Good point. For the DSI panels we have the field hs_rate
for the HS clock in struct mipi_dsi_device which is based
on exactly this reasoning. And that is what I actually use for
setting the HS clock.

The problem, however, is that in many cases the documentation of
these panels is so substandard that we have absolutely
no idea about the dotclock. Maybe we should
just set it to 0 in these cases?

Yours,
Linus Walleij

_______________________________________________
linux-amlogic mailing list
linux-amlogic@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-amlogic

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-02-26 12:08 ` Linus Walleij
@ 2020-02-26 14:34   ` Ville Syrjälä
  2020-02-26 14:56     ` Re: Linus Walleij
  0 siblings, 1 reply; 1546+ messages in thread
From: Ville Syrjälä @ 2020-02-26 14:34 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, Eric Anholt, nouveau,
	Guido Günther, Paul Kocialkowski,
	open list:DRM PANEL DRIVERS, Gustaf Lindström, Andrzej Hajda,
	Thierry Reding, Laurent Pinchart, Philipp Zabel, Sam Ravnborg,
	Marian-Cristian Rotariu, Jagan Teki, Thomas Hellstrom,
	Joonyoung Shim, Jonathan Marek, Stefan Mavrodiev, Adam Ford,
	Jerry Han, VMware Graphics, Ben Skeggs, H. Nikolaus Schaller,
	Robert Chiras, Heiko Schocher, Icenowy Zheng, Jonas Karlman,
	intel-gfx, Maxime Ripard, Alexandre Courbot, Fabio Estevam,
	open list:ARM/Amlogic Meson..., Vincent Abriou, Andreas Pretzsch,
	Jernej Skrabec, Alex Gonzalez, Purism Kernel Team,
	Boris Brezillon, Seung-Woo Kim, Christoph Fritz, Kyungmin Park,
	Heiko Stuebner, Eugen Hristev, Giulio Benetti

On Wed, Feb 26, 2020 at 01:08:06PM +0100, Linus Walleij wrote:
> On Wed, Feb 26, 2020 at 12:57 PM Ville Syrjälä
> <ville.syrjala@linux.intel.com> wrote:
> > On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:
> 
> > > I have long suspected that a whole bunch of the "simple" displays
> > > are not simple but contains a display controller and memory.
> > > That means that the speed over the link to the display and
> > > actual refresh rate on the actual display is asymmetric because
> > > well we are just updating a RAM, the resolution just limits how
> > > much data we are sending, the clock limits the speed on the
> > > bus over to the RAM on the other side.
> >
> > IMO even in command mode mode->clock should probably be the actual
> > dotclock used by the display. If there's another clock for the bus
> > speed/etc. it should be stored somewhere else.
> 
> Good point. For the DSI panels we have the field hs_rate
> for the HS clock in struct mipi_dsi_device which is based
> on exactly this reasoning. And that is what I actually use for
> setting the HS clock.
> 
> The problem is however that we in many cases have so
> substandard documentation of these panels that we have
> absolutely no idea about the dotclock. Maybe we should
> just set it to 0 in these cases?

Don't you always have a TE interrupt or something like that
available? Could just measure it from that if no better
information is available?

-- 
Ville Syrjälä
Intel

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-02-26 14:34   ` Re: Ville Syrjälä
@ 2020-02-26 14:56     ` Linus Walleij
  2020-02-26 15:08       ` Re: Ville Syrjälä
  0 siblings, 1 reply; 1546+ messages in thread
From: Linus Walleij @ 2020-02-26 14:56 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, Eric Anholt, nouveau,
	Guido Günther, Paul Kocialkowski,
	open list:DRM PANEL DRIVERS, Gustaf Lindström, Andrzej Hajda,
	Thierry Reding, Laurent Pinchart, Philipp Zabel, Sam Ravnborg,
	Marian-Cristian Rotariu, Jagan Teki, Thomas Hellstrom,
	Joonyoung Shim, Jonathan Marek, Stefan Mavrodiev, Adam Ford,
	Jerry Han, VMware Graphics, Ben Skeggs, H. Nikolaus Schaller,
	Robert Chiras, Heiko Schocher, Icenowy Zheng, Jonas Karlman,
	intel-gfx, Maxime Ripard, Alexandre Courbot, Fabio Estevam,
	open list:ARM/Amlogic Meson..., Vincent Abriou, Andreas Pretzsch,
	Jernej Skrabec, Alex Gonzalez, Purism Kernel Team,
	Boris Brezillon, Seung-Woo Kim, Christoph Fritz, Kyungmin Park,
	Heiko Stuebner, Eugen Hristev, Giulio Benetti

On Wed, Feb 26, 2020 at 3:34 PM Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
> On Wed, Feb 26, 2020 at 01:08:06PM +0100, Linus Walleij wrote:
> > On Wed, Feb 26, 2020 at 12:57 PM Ville Syrjälä
> > <ville.syrjala@linux.intel.com> wrote:
> > > On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:
> >
> > > > I have long suspected that a whole bunch of the "simple" displays
> > > > are not simple but contains a display controller and memory.
> > > > That means that the speed over the link to the display and
> > > > actual refresh rate on the actual display is asymmetric because
> > > > well we are just updating a RAM, the resolution just limits how
> > > > much data we are sending, the clock limits the speed on the
> > > > bus over to the RAM on the other side.
> > >
> > > IMO even in command mode mode->clock should probably be the actual
> > > dotclock used by the display. If there's another clock for the bus
> > > speed/etc. it should be stored somewhere else.
> >
> > Good point. For the DSI panels we have the field hs_rate
> > for the HS clock in struct mipi_dsi_device which is based
> > on exactly this reasoning. And that is what I actually use for
> > setting the HS clock.
> >
> > The problem is however that we in many cases have so
> > substandard documentation of these panels that we have
> > absolutely no idea about the dotclock. Maybe we should
> > just set it to 0 in these cases?
>
> Don't you always have a TE interrupt or something like that
> available? Could just measure it from that if no better
> information is available?

Yes and I did exactly that, so that is why this comment is in
the driver:

static const struct drm_display_mode sony_acx424akp_cmd_mode = {
(...)
        /*
         * Some desired refresh rate, experiments at the maximum "pixel"
         * clock speed (HS clock 420 MHz) yields around 117Hz.
         */
        .vrefresh = 60,

I got a review comment at the time saying 117 Hz was weird.
We didn't reach a proper conclusion on this:
https://lore.kernel.org/dri-devel/CACRpkdYW3YNPSNMY3A44GQn8DqK-n9TLvr7uipF7LM_DHZ5=Lg@mail.gmail.com/

Thierry wasn't sure if 60Hz was good or not, so I just had to
go with something.

We could calculate the resulting pixel clock for ~117 Hz at this
resolution and put that in the clock field, but ... I don't know
what would be best.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-02-26 14:56     ` Re: Linus Walleij
@ 2020-02-26 15:08       ` Ville Syrjälä
  0 siblings, 0 replies; 1546+ messages in thread
From: Ville Syrjälä @ 2020-02-26 15:08 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, Eric Anholt, nouveau,
	Guido Günther, Paul Kocialkowski,
	open list:DRM PANEL DRIVERS, Gustaf Lindström, Andrzej Hajda,
	Thierry Reding, Laurent Pinchart, Philipp Zabel, Sam Ravnborg,
	Marian-Cristian Rotariu, Jagan Teki, Thomas Hellstrom,
	Joonyoung Shim, Jonathan Marek, Stefan Mavrodiev, Adam Ford,
	Jerry Han, VMware Graphics, Ben Skeggs, H. Nikolaus Schaller,
	Robert Chiras, Heiko Schocher, Icenowy Zheng, Jonas Karlman,
	intel-gfx, Maxime Ripard, Alexandre Courbot, Fabio Estevam,
	open list:ARM/Amlogic Meson..., Vincent Abriou, Andreas Pretzsch,
	Jernej Skrabec, Alex Gonzalez, Purism Kernel Team,
	Boris Brezillon, Seung-Woo Kim, Christoph Fritz, Kyungmin Park,
	Heiko Stuebner, Eugen Hristev, Giulio Benetti

On Wed, Feb 26, 2020 at 03:56:36PM +0100, Linus Walleij wrote:
> On Wed, Feb 26, 2020 at 3:34 PM Ville Syrjälä
> <ville.syrjala@linux.intel.com> wrote:
> > On Wed, Feb 26, 2020 at 01:08:06PM +0100, Linus Walleij wrote:
> > > On Wed, Feb 26, 2020 at 12:57 PM Ville Syrjälä
> > > <ville.syrjala@linux.intel.com> wrote:
> > > > On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:
> > >
> > > > > I have long suspected that a whole bunch of the "simple" displays
> > > > > are not simple but contains a display controller and memory.
> > > > > That means that the speed over the link to the display and
> > > > > actual refresh rate on the actual display is asymmetric because
> > > > > well we are just updating a RAM, the resolution just limits how
> > > > > much data we are sending, the clock limits the speed on the
> > > > > bus over to the RAM on the other side.
> > > >
> > > > IMO even in command mode mode->clock should probably be the actual
> > > > dotclock used by the display. If there's another clock for the bus
> > > > speed/etc. it should be stored somewhere else.
> > >
> > > Good point. For the DSI panels we have the field hs_rate
> > > for the HS clock in struct mipi_dsi_device which is based
> > > on exactly this reasoning. And that is what I actually use for
> > > setting the HS clock.
> > >
> > > The problem is however that we in many cases have so
> > > substandard documentation of these panels that we have
> > > absolutely no idea about the dotclock. Maybe we should
> > > just set it to 0 in these cases?
> >
> > Don't you always have a TE interrupt or something like that
> > available? Could just measure it from that if no better
> > information is available?
> 
> Yes and I did exactly that, so that is why this comment is in
> the driver:
> 
> static const struct drm_display_mode sony_acx424akp_cmd_mode = {
> (...)
>         /*
>          * Some desired refresh rate, experiments at the maximum "pixel"
>          * clock speed (HS clock 420 MHz) yields around 117Hz.
>          */
>         .vrefresh = 60,
> 
> I got a review comment at the time saying 117 Hz was weird.
> We didn't reach a proper conclusion on this:
> https://lore.kernel.org/dri-devel/CACRpkdYW3YNPSNMY3A44GQn8DqK-n9TLvr7uipF7LM_DHZ5=Lg@mail.gmail.com/
> 
> Thierry wasn't sure if 60Hz was good or not, so I just had to
> go with something.
> 
> We could calculate the resulting pixel clock for ~117 Hz with
> this resolution and put that in the clock field but ... don't know
> what is the best?

I would vote for that approach.

-- 
Ville Syrjälä
Intel

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-03-03 15:27 Gene Chen
@ 2020-03-04 14:56   ` Matthias Brugger
  0 siblings, 0 replies; 1546+ messages in thread
From: Matthias Brugger @ 2020-03-04 14:56 UTC (permalink / raw)
  To: Gene Chen, lee.jones
  Cc: gene_chen, linux-kernel, cy_huang, linux-mediatek, Wilma.Wu,
	linux-arm-kernel, shufan_lee

Please resend with an appropriate commit message.

On 03/03/2020 16:27, Gene Chen wrote:
> Add mfd driver for mt6360 pmic chip include
> Battery Charger/USB_PD/Flash LED/RGB LED/LDO/Buck
> 
> Signed-off-by: Gene Chen <gene_chen@richtek.com>
> ---
>  drivers/mfd/Kconfig        |  12 ++
>  drivers/mfd/Makefile       |   1 +
>  drivers/mfd/mt6360-core.c  | 425 +++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/mfd/mt6360.h | 240 +++++++++++++++++++++++++
>  4 files changed, 678 insertions(+)
>  create mode 100644 drivers/mfd/mt6360-core.c
>  create mode 100644 include/linux/mfd/mt6360.h
> 
> changelogs between v1 & v2
> - include missing header file
> 
> changelogs between v2 & v3
> - add changelogs
> 
> changelogs between v3 & v4
> - fix Kconfig description
> - replace mt6360_pmu_info with mt6360_pmu_data
> - replace probe with probe_new
> - remove unnecessary irq_chip variable
> - remove annotation
> - replace MT6360_MFD_CELL with OF_MFD_CELL
> 
> changelogs between v4 & v5
> - remove unnecessary parse dt function
> - use devm_i2c_new_dummy_device
> - add base-commit message
> 
> changelogs between v5 & v6
> - review return value
> - remove i2c id_table
> - use GPL license v2
> 
> changelogs between v6 & v7
> - add author description
> - replace MT6360_REGMAP_IRQ_REG by REGMAP_IRQ_REG_LINE
> - remove mt6360-private.h
> 
> changelogs between v7 & v8
> - fix kbuild auto reboot by include interrupt header
> 
> diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
> index 2b20329..0f8c341 100644
> --- a/drivers/mfd/Kconfig
> +++ b/drivers/mfd/Kconfig
> @@ -857,6 +857,18 @@ config MFD_MAX8998
>  	  additional drivers must be enabled in order to use the functionality
>  	  of the device.
>  
> +config MFD_MT6360
> +	tristate "Mediatek MT6360 SubPMIC"
> +	select MFD_CORE
> +	select REGMAP_I2C
> +	select REGMAP_IRQ
> +	depends on I2C
> +	help
> +	  Say Y here to enable MT6360 PMU/PMIC/LDO functional support.
> +	  PMU part includes Charger, Flashlight, RGB LED
> +	  PMIC part includes 2-channel BUCKs and 2-channel LDOs
> +	  LDO part includes 4-channel LDOs
> +
>  config MFD_MT6397
>  	tristate "MediaTek MT6397 PMIC Support"
>  	select MFD_CORE
> diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
> index b83f172..8c35816 100644
> --- a/drivers/mfd/Makefile
> +++ b/drivers/mfd/Makefile
> @@ -238,6 +238,7 @@ obj-$(CONFIG_INTEL_SOC_PMIC)	+= intel-soc-pmic.o
>  obj-$(CONFIG_INTEL_SOC_PMIC_BXTWC)	+= intel_soc_pmic_bxtwc.o
>  obj-$(CONFIG_INTEL_SOC_PMIC_CHTWC)	+= intel_soc_pmic_chtwc.o
>  obj-$(CONFIG_INTEL_SOC_PMIC_CHTDC_TI)	+= intel_soc_pmic_chtdc_ti.o
> +obj-$(CONFIG_MFD_MT6360)	+= mt6360-core.o
>  mt6397-objs	:= mt6397-core.o mt6397-irq.o
>  obj-$(CONFIG_MFD_MT6397)	+= mt6397.o
>  obj-$(CONFIG_INTEL_SOC_PMIC_MRFLD)	+= intel_soc_pmic_mrfld.o
> diff --git a/drivers/mfd/mt6360-core.c b/drivers/mfd/mt6360-core.c
> new file mode 100644
> index 0000000..d1168f8
> --- /dev/null
> +++ b/drivers/mfd/mt6360-core.c
> @@ -0,0 +1,425 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2019 MediaTek Inc.
> + *
> + * Author: Gene Chen <gene_chen@richtek.com>
> + */
> +
> +#include <linux/i2c.h>
> +#include <linux/init.h>
> +#include <linux/interrupt.h>
> +#include <linux/kernel.h>
> +#include <linux/mfd/core.h>
> +#include <linux/module.h>
> +#include <linux/of_irq.h>
> +#include <linux/of_platform.h>
> +#include <linux/version.h>
> +
> +#include <linux/mfd/mt6360.h>
> +
> +/* reg 0 -> 0 ~ 7 */
> +#define MT6360_CHG_TREG_EVT		(4)
> +#define MT6360_CHG_AICR_EVT		(5)
> +#define MT6360_CHG_MIVR_EVT		(6)
> +#define MT6360_PWR_RDY_EVT		(7)
> +/* REG 1 -> 8 ~ 15 */
> +#define MT6360_CHG_BATSYSUV_EVT		(9)
> +#define MT6360_FLED_CHG_VINOVP_EVT	(11)
> +#define MT6360_CHG_VSYSUV_EVT		(12)
> +#define MT6360_CHG_VSYSOV_EVT		(13)
> +#define MT6360_CHG_VBATOV_EVT		(14)
> +#define MT6360_CHG_VBUSOV_EVT		(15)
> +/* REG 2 -> 16 ~ 23 */
> +/* REG 3 -> 24 ~ 31 */
> +#define MT6360_WD_PMU_DET		(25)
> +#define MT6360_WD_PMU_DONE		(26)
> +#define MT6360_CHG_TMRI			(27)
> +#define MT6360_CHG_ADPBADI		(29)
> +#define MT6360_CHG_RVPI			(30)
> +#define MT6360_OTPI			(31)
> +/* REG 4 -> 32 ~ 39 */
> +#define MT6360_CHG_AICCMEASL		(32)
> +#define MT6360_CHGDET_DONEI		(34)
> +#define MT6360_WDTMRI			(35)
> +#define MT6360_SSFINISHI		(36)
> +#define MT6360_CHG_RECHGI		(37)
> +#define MT6360_CHG_TERMI		(38)
> +#define MT6360_CHG_IEOCI		(39)
> +/* REG 5 -> 40 ~ 47 */
> +#define MT6360_PUMPX_DONEI		(40)
> +#define MT6360_BAT_OVP_ADC_EVT		(41)
> +#define MT6360_TYPEC_OTP_EVT		(42)
> +#define MT6360_ADC_WAKEUP_EVT		(43)
> +#define MT6360_ADC_DONEI		(44)
> +#define MT6360_BST_BATUVI		(45)
> +#define MT6360_BST_VBUSOVI		(46)
> +#define MT6360_BST_OLPI			(47)
> +/* REG 6 -> 48 ~ 55 */
> +#define MT6360_ATTACH_I			(48)
> +#define MT6360_DETACH_I			(49)
> +#define MT6360_QC30_STPDONE		(51)
> +#define MT6360_QC_VBUSDET_DONE		(52)
> +#define MT6360_HVDCP_DET		(53)
> +#define MT6360_CHGDETI			(54)
> +#define MT6360_DCDTI			(55)
> +/* REG 7 -> 56 ~ 63 */
> +#define MT6360_FOD_DONE_EVT		(56)
> +#define MT6360_FOD_OV_EVT		(57)
> +#define MT6360_CHRDET_UVP_EVT		(58)
> +#define MT6360_CHRDET_OVP_EVT		(59)
> +#define MT6360_CHRDET_EXT_EVT		(60)
> +#define MT6360_FOD_LR_EVT		(61)
> +#define MT6360_FOD_HR_EVT		(62)
> +#define MT6360_FOD_DISCHG_FAIL_EVT	(63)
> +/* REG 8 -> 64 ~ 71 */
> +#define MT6360_USBID_EVT		(64)
> +#define MT6360_APWDTRST_EVT		(65)
> +#define MT6360_EN_EVT			(66)
> +#define MT6360_QONB_RST_EVT		(67)
> +#define MT6360_MRSTB_EVT		(68)
> +#define MT6360_OTP_EVT			(69)
> +#define MT6360_VDDAOV_EVT		(70)
> +#define MT6360_SYSUV_EVT		(71)
> +/* REG 9 -> 72 ~ 79 */
> +#define MT6360_FLED_STRBPIN_EVT		(72)
> +#define MT6360_FLED_TORPIN_EVT		(73)
> +#define MT6360_FLED_TX_EVT		(74)
> +#define MT6360_FLED_LVF_EVT		(75)
> +#define MT6360_FLED2_SHORT_EVT		(78)
> +#define MT6360_FLED1_SHORT_EVT		(79)
> +/* REG 10 -> 80 ~ 87 */
> +#define MT6360_FLED2_STRB_EVT		(80)
> +#define MT6360_FLED1_STRB_EVT		(81)
> +#define MT6360_FLED2_STRB_TO_EVT	(82)
> +#define MT6360_FLED1_STRB_TO_EVT	(83)
> +#define MT6360_FLED2_TOR_EVT		(84)
> +#define MT6360_FLED1_TOR_EVT		(85)
> +/* REG 11 -> 88 ~ 95 */
> +/* REG 12 -> 96 ~ 103 */
> +#define MT6360_BUCK1_PGB_EVT		(96)
> +#define MT6360_BUCK1_OC_EVT		(100)
> +#define MT6360_BUCK1_OV_EVT		(101)
> +#define MT6360_BUCK1_UV_EVT		(102)
> +/* REG 13 -> 104 ~ 111 */
> +#define MT6360_BUCK2_PGB_EVT		(104)
> +#define MT6360_BUCK2_OC_EVT		(108)
> +#define MT6360_BUCK2_OV_EVT		(109)
> +#define MT6360_BUCK2_UV_EVT		(110)
> +/* REG 14 -> 112 ~ 119 */
> +#define MT6360_LDO1_OC_EVT		(113)
> +#define MT6360_LDO2_OC_EVT		(114)
> +#define MT6360_LDO3_OC_EVT		(115)
> +#define MT6360_LDO5_OC_EVT		(117)
> +#define MT6360_LDO6_OC_EVT		(118)
> +#define MT6360_LDO7_OC_EVT		(119)
> +/* REG 15 -> 120 ~ 127 */
> +#define MT6360_LDO1_PGB_EVT		(121)
> +#define MT6360_LDO2_PGB_EVT		(122)
> +#define MT6360_LDO3_PGB_EVT		(123)
> +#define MT6360_LDO5_PGB_EVT		(125)
> +#define MT6360_LDO6_PGB_EVT		(126)
> +#define MT6360_LDO7_PGB_EVT		(127)
> +
> +static const struct regmap_irq mt6360_pmu_irqs[] =  {
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_TREG_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_AICR_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_MIVR_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_PWR_RDY_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_BATSYSUV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED_CHG_VINOVP_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_VSYSUV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_VSYSOV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_VBATOV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_VBUSOV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_WD_PMU_DET, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_WD_PMU_DONE, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_TMRI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_ADPBADI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_RVPI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_OTPI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_AICCMEASL, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHGDET_DONEI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_WDTMRI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_SSFINISHI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_RECHGI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_TERMI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_IEOCI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_PUMPX_DONEI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_TREG_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BAT_OVP_ADC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_TYPEC_OTP_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_ADC_WAKEUP_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_ADC_DONEI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BST_BATUVI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BST_VBUSOVI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BST_OLPI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_ATTACH_I, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_DETACH_I, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_QC30_STPDONE, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_QC_VBUSDET_DONE, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_HVDCP_DET, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHGDETI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_DCDTI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FOD_DONE_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FOD_OV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHRDET_UVP_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHRDET_OVP_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHRDET_EXT_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FOD_LR_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FOD_HR_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FOD_DISCHG_FAIL_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_USBID_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_APWDTRST_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_EN_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_QONB_RST_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_MRSTB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_OTP_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_VDDAOV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_SYSUV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED_STRBPIN_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED_TORPIN_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED_TX_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED_LVF_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED2_SHORT_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED1_SHORT_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED2_STRB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED1_STRB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED2_STRB_TO_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED1_STRB_TO_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED2_TOR_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED1_TOR_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BUCK1_PGB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BUCK1_OC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BUCK1_OV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BUCK1_UV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BUCK2_PGB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BUCK2_OC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BUCK2_OV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BUCK2_UV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO1_OC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO2_OC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO3_OC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO5_OC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO6_OC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO7_OC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO1_PGB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO2_PGB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO3_PGB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO5_PGB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO6_PGB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO7_PGB_EVT, 8),
> +};
> +
> +static int mt6360_pmu_handle_post_irq(void *irq_drv_data)
> +{
> +	struct mt6360_pmu_data *mpd = irq_drv_data;
> +
> +	return regmap_update_bits(mpd->regmap,
> +		MT6360_PMU_IRQ_SET, MT6360_IRQ_RETRIG, MT6360_IRQ_RETRIG);
> +}
> +
> +static struct regmap_irq_chip mt6360_pmu_irq_chip = {
> +	.irqs = mt6360_pmu_irqs,
> +	.num_irqs = ARRAY_SIZE(mt6360_pmu_irqs),
> +	.num_regs = MT6360_PMU_IRQ_REGNUM,
> +	.mask_base = MT6360_PMU_CHG_MASK1,
> +	.status_base = MT6360_PMU_CHG_IRQ1,
> +	.ack_base = MT6360_PMU_CHG_IRQ1,
> +	.init_ack_masked = true,
> +	.use_ack = true,
> +	.handle_post_irq = mt6360_pmu_handle_post_irq,
> +};
> +
> +static const struct regmap_config mt6360_pmu_regmap_config = {
> +	.reg_bits = 8,
> +	.val_bits = 8,
> +	.max_register = MT6360_PMU_MAXREG,
> +};
> +
> +static const struct resource mt6360_adc_resources[] = {
> +	DEFINE_RES_IRQ_NAMED(MT6360_ADC_DONEI, "adc_donei"),
> +};
> +
> +static const struct resource mt6360_chg_resources[] = {
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_TREG_EVT, "chg_treg_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_PWR_RDY_EVT, "pwr_rdy_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_BATSYSUV_EVT, "chg_batsysuv_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_VSYSUV_EVT, "chg_vsysuv_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_VSYSOV_EVT, "chg_vsysov_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_VBATOV_EVT, "chg_vbatov_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_VBUSOV_EVT, "chg_vbusov_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_AICCMEASL, "chg_aiccmeasl"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_WDTMRI, "wdtmri"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_RECHGI, "chg_rechgi"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_TERMI, "chg_termi"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_IEOCI, "chg_ieoci"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_PUMPX_DONEI, "pumpx_donei"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_ATTACH_I, "attach_i"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHRDET_EXT_EVT, "chrdet_ext_evt"),
> +};
> +
> +static const struct resource mt6360_led_resources[] = {
> +	DEFINE_RES_IRQ_NAMED(MT6360_FLED_CHG_VINOVP_EVT, "fled_chg_vinovp_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_FLED_LVF_EVT, "fled_lvf_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_FLED2_SHORT_EVT, "fled2_short_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_FLED1_SHORT_EVT, "fled1_short_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_FLED2_STRB_TO_EVT, "fled2_strb_to_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_FLED1_STRB_TO_EVT, "fled1_strb_to_evt"),
> +};
> +
> +static const struct resource mt6360_pmic_resources[] = {
> +	DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_PGB_EVT, "buck1_pgb_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_OC_EVT, "buck1_oc_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_OV_EVT, "buck1_ov_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_UV_EVT, "buck1_uv_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_PGB_EVT, "buck2_pgb_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_OC_EVT, "buck2_oc_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_OV_EVT, "buck2_ov_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_UV_EVT, "buck2_uv_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO6_OC_EVT, "ldo6_oc_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO7_OC_EVT, "ldo7_oc_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO6_PGB_EVT, "ldo6_pgb_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO7_PGB_EVT, "ldo7_pgb_evt"),
> +};
> +
> +static const struct resource mt6360_ldo_resources[] = {
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO1_OC_EVT, "ldo1_oc_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO2_OC_EVT, "ldo2_oc_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO3_OC_EVT, "ldo3_oc_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO5_OC_EVT, "ldo5_oc_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO1_PGB_EVT, "ldo1_pgb_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO2_PGB_EVT, "ldo2_pgb_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO3_PGB_EVT, "ldo3_pgb_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO5_PGB_EVT, "ldo5_pgb_evt"),
> +};
> +
> +static const struct mfd_cell mt6360_devs[] = {
> +	OF_MFD_CELL("mt6360_adc", mt6360_adc_resources,
> +		    NULL, 0, 0, "mediatek,mt6360_adc"),
> +	OF_MFD_CELL("mt6360_chg", mt6360_chg_resources,
> +		    NULL, 0, 0, "mediatek,mt6360_chg"),
> +	OF_MFD_CELL("mt6360_led", mt6360_led_resources,
> +		    NULL, 0, 0, "mediatek,mt6360_led"),
> +	OF_MFD_CELL("mt6360_pmic", mt6360_pmic_resources,
> +		    NULL, 0, 0, "mediatek,mt6360_pmic"),
> +	OF_MFD_CELL("mt6360_ldo", mt6360_ldo_resources,
> +		    NULL, 0, 0, "mediatek,mt6360_ldo"),
> +	OF_MFD_CELL("mt6360_tcpc", NULL,
> +		    NULL, 0, 0, "mediatek,mt6360_tcpc"),
> +};
> +
> +static const unsigned short mt6360_slave_addr[MT6360_SLAVE_MAX] = {
> +	MT6360_PMU_SLAVEID,
> +	MT6360_PMIC_SLAVEID,
> +	MT6360_LDO_SLAVEID,
> +	MT6360_TCPC_SLAVEID,
> +};
> +
> +static int mt6360_pmu_probe(struct i2c_client *client)
> +{
> +	struct mt6360_pmu_data *mpd;
> +	unsigned int reg_data;
> +	int i, ret;
> +
> +	mpd = devm_kzalloc(&client->dev, sizeof(*mpd), GFP_KERNEL);
> +	if (!mpd)
> +		return -ENOMEM;
> +
> +	mpd->dev = &client->dev;
> +	i2c_set_clientdata(client, mpd);
> +
> +	mpd->regmap = devm_regmap_init_i2c(client, &mt6360_pmu_regmap_config);
> +	if (IS_ERR(mpd->regmap)) {
> +		dev_err(&client->dev, "Failed to register regmap\n");
> +		return PTR_ERR(mpd->regmap);
> +	}
> +
> +	ret = regmap_read(mpd->regmap, MT6360_PMU_DEV_INFO, &reg_data);
> +	if (ret) {
> +		dev_err(&client->dev, "Device not found\n");
> +		return ret;
> +	}
> +
> +	mpd->chip_rev = reg_data & CHIP_REV_MASK;
> +	if ((reg_data & CHIP_VEN_MASK) != CHIP_VEN_MT6360) {
> +		dev_err(&client->dev, "Device not supported\n");
> +		return -ENODEV;
> +	}
> +
> +	mt6360_pmu_irq_chip.irq_drv_data = mpd;
> +	ret = devm_regmap_add_irq_chip(&client->dev, mpd->regmap, client->irq,
> +				       IRQF_TRIGGER_FALLING, 0,
> +				       &mt6360_pmu_irq_chip, &mpd->irq_data);
> +	if (ret) {
> +		dev_err(&client->dev, "Failed to add Regmap IRQ Chip\n");
> +		return ret;
> +	}
> +
> +	mpd->i2c[0] = client;
> +	for (i = 1; i < MT6360_SLAVE_MAX; i++) {
> +		mpd->i2c[i] = devm_i2c_new_dummy_device(&client->dev,
> +							client->adapter,
> +							mt6360_slave_addr[i]);
> +		if (IS_ERR(mpd->i2c[i])) {
> +			dev_err(&client->dev,
> +				"Failed to get new dummy I2C device for address 0x%x\n",
> +				mt6360_slave_addr[i]);
> +			return PTR_ERR(mpd->i2c[i]);
> +		}
> +		i2c_set_clientdata(mpd->i2c[i], mpd);
> +	}
> +
> +	ret = devm_mfd_add_devices(&client->dev, PLATFORM_DEVID_AUTO,
> +				   mt6360_devs, ARRAY_SIZE(mt6360_devs), NULL,
> +				   0, regmap_irq_get_domain(mpd->irq_data));
> +	if (ret) {
> +		dev_err(&client->dev,
> +			"Failed to register subordinate devices\n");
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int __maybe_unused mt6360_pmu_suspend(struct device *dev)
> +{
> +	struct i2c_client *i2c = to_i2c_client(dev);
> +
> +	if (device_may_wakeup(dev))
> +		enable_irq_wake(i2c->irq);
> +
> +	return 0;
> +}
> +
> +static int __maybe_unused mt6360_pmu_resume(struct device *dev)
> +{
> +	struct i2c_client *i2c = to_i2c_client(dev);
> +
> +	if (device_may_wakeup(dev))
> +		disable_irq_wake(i2c->irq);
> +
> +	return 0;
> +}
> +
> +static SIMPLE_DEV_PM_OPS(mt6360_pmu_pm_ops,
> +			 mt6360_pmu_suspend, mt6360_pmu_resume);
> +
> +static const struct of_device_id __maybe_unused mt6360_pmu_of_id[] = {
> +	{ .compatible = "mediatek,mt6360_pmu", },
> +	{},
> +};
> +MODULE_DEVICE_TABLE(of, mt6360_pmu_of_id);
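[Editorial note: for reference, a board integration would describe the PMU as an I2C child node matching this compatible. A minimal sketch follows; the parent interrupt controller, GPIO line number, and node placement are illustrative assumptions — only the "mediatek,mt6360_pmu" compatible, the 0x34 PMU slave address, and the falling-edge trigger (IRQF_TRIGGER_FALLING in probe) come from the patch.]

```dts
&i2c0 {
	pmu@34 {
		compatible = "mediatek,mt6360_pmu";
		reg = <0x34>;
		/* &pio and line 101 are placeholders for the board's GPIO controller */
		interrupt-parent = <&pio>;
		interrupts = <101 IRQ_TYPE_EDGE_FALLING>;
		wakeup-source;
	};
};
```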
> +
> +static struct i2c_driver mt6360_pmu_driver = {
> +	.driver = {
> +		.name = "mt6360_pmu",
> +		.pm = &mt6360_pmu_pm_ops,
> +		.of_match_table = of_match_ptr(mt6360_pmu_of_id),
> +	},
> +	.probe_new = mt6360_pmu_probe,
> +};
> +module_i2c_driver(mt6360_pmu_driver);
> +
> +MODULE_AUTHOR("Gene Chen <gene_chen@richtek.com>");
> +MODULE_DESCRIPTION("MT6360 PMU I2C Driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/linux/mfd/mt6360.h b/include/linux/mfd/mt6360.h
> new file mode 100644
> index 0000000..c03e6d1
> --- /dev/null
> +++ b/include/linux/mfd/mt6360.h
> @@ -0,0 +1,240 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2019 MediaTek Inc.
> + */
> +
> +#ifndef __MT6360_H__
> +#define __MT6360_H__
> +
> +#include <linux/regmap.h>
> +
> +enum {
> +	MT6360_SLAVE_PMU = 0,
> +	MT6360_SLAVE_PMIC,
> +	MT6360_SLAVE_LDO,
> +	MT6360_SLAVE_TCPC,
> +	MT6360_SLAVE_MAX,
> +};
> +
> +#define MT6360_PMU_SLAVEID	(0x34)
> +#define MT6360_PMIC_SLAVEID	(0x1A)
> +#define MT6360_LDO_SLAVEID	(0x64)
> +#define MT6360_TCPC_SLAVEID	(0x4E)
> +
> +struct mt6360_pmu_data {
> +	struct i2c_client *i2c[MT6360_SLAVE_MAX];
> +	struct device *dev;
> +	struct regmap *regmap;
> +	struct regmap_irq_chip_data *irq_data;
> +	unsigned int chip_rev;
> +};
> +
> +/* PMU register definition */
> +#define MT6360_PMU_DEV_INFO			(0x00)
> +#define MT6360_PMU_CORE_CTRL1			(0x01)
> +#define MT6360_PMU_RST1				(0x02)
> +#define MT6360_PMU_CRCEN			(0x03)
> +#define MT6360_PMU_RST_PAS_CODE1		(0x04)
> +#define MT6360_PMU_RST_PAS_CODE2		(0x05)
> +#define MT6360_PMU_CORE_CTRL2			(0x06)
> +#define MT6360_PMU_TM_PAS_CODE1			(0x07)
> +#define MT6360_PMU_TM_PAS_CODE2			(0x08)
> +#define MT6360_PMU_TM_PAS_CODE3			(0x09)
> +#define MT6360_PMU_TM_PAS_CODE4			(0x0A)
> +#define MT6360_PMU_IRQ_IND			(0x0B)
> +#define MT6360_PMU_IRQ_MASK			(0x0C)
> +#define MT6360_PMU_IRQ_SET			(0x0D)
> +#define MT6360_PMU_SHDN_CTRL			(0x0E)
> +#define MT6360_PMU_TM_INF			(0x0F)
> +#define MT6360_PMU_I2C_CTRL			(0x10)
> +#define MT6360_PMU_CHG_CTRL1			(0x11)
> +#define MT6360_PMU_CHG_CTRL2			(0x12)
> +#define MT6360_PMU_CHG_CTRL3			(0x13)
> +#define MT6360_PMU_CHG_CTRL4			(0x14)
> +#define MT6360_PMU_CHG_CTRL5			(0x15)
> +#define MT6360_PMU_CHG_CTRL6			(0x16)
> +#define MT6360_PMU_CHG_CTRL7			(0x17)
> +#define MT6360_PMU_CHG_CTRL8			(0x18)
> +#define MT6360_PMU_CHG_CTRL9			(0x19)
> +#define MT6360_PMU_CHG_CTRL10			(0x1A)
> +#define MT6360_PMU_CHG_CTRL11			(0x1B)
> +#define MT6360_PMU_CHG_CTRL12			(0x1C)
> +#define MT6360_PMU_CHG_CTRL13			(0x1D)
> +#define MT6360_PMU_CHG_CTRL14			(0x1E)
> +#define MT6360_PMU_CHG_CTRL15			(0x1F)
> +#define MT6360_PMU_CHG_CTRL16			(0x20)
> +#define MT6360_PMU_CHG_AICC_RESULT		(0x21)
> +#define MT6360_PMU_DEVICE_TYPE			(0x22)
> +#define MT6360_PMU_QC_CONTROL1			(0x23)
> +#define MT6360_PMU_QC_CONTROL2			(0x24)
> +#define MT6360_PMU_QC30_CONTROL1		(0x25)
> +#define MT6360_PMU_QC30_CONTROL2		(0x26)
> +#define MT6360_PMU_USB_STATUS1			(0x27)
> +#define MT6360_PMU_QC_STATUS1			(0x28)
> +#define MT6360_PMU_QC_STATUS2			(0x29)
> +#define MT6360_PMU_CHG_PUMP			(0x2A)
> +#define MT6360_PMU_CHG_CTRL17			(0x2B)
> +#define MT6360_PMU_CHG_CTRL18			(0x2C)
> +#define MT6360_PMU_CHRDET_CTRL1			(0x2D)
> +#define MT6360_PMU_CHRDET_CTRL2			(0x2E)
> +#define MT6360_PMU_DPDN_CTRL			(0x2F)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL1		(0x30)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL2		(0x31)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL3		(0x32)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL4		(0x33)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL5		(0x34)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL6		(0x35)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL7		(0x36)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL8		(0x37)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL9		(0x38)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL10		(0x39)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL11		(0x3A)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL12		(0x3B)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL13		(0x3C)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL14		(0x3D)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL15		(0x3E)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL16		(0x3F)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL17		(0x40)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL18		(0x41)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL19		(0x42)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL20		(0x43)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL21		(0x44)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL22		(0x45)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL23		(0x46)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL24		(0x47)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL25		(0x48)
> +#define MT6360_PMU_BC12_CTRL			(0x49)
> +#define MT6360_PMU_CHG_STAT			(0x4A)
> +#define MT6360_PMU_RESV1			(0x4B)
> +#define MT6360_PMU_TYPEC_OTP_TH_SEL_CODEH	(0x4E)
> +#define MT6360_PMU_TYPEC_OTP_TH_SEL_CODEL	(0x4F)
> +#define MT6360_PMU_TYPEC_OTP_HYST_TH		(0x50)
> +#define MT6360_PMU_TYPEC_OTP_CTRL		(0x51)
> +#define MT6360_PMU_ADC_BAT_DATA_H		(0x52)
> +#define MT6360_PMU_ADC_BAT_DATA_L		(0x53)
> +#define MT6360_PMU_IMID_BACKBST_ON		(0x54)
> +#define MT6360_PMU_IMID_BACKBST_OFF		(0x55)
> +#define MT6360_PMU_ADC_CONFIG			(0x56)
> +#define MT6360_PMU_ADC_EN2			(0x57)
> +#define MT6360_PMU_ADC_IDLE_T			(0x58)
> +#define MT6360_PMU_ADC_RPT_1			(0x5A)
> +#define MT6360_PMU_ADC_RPT_2			(0x5B)
> +#define MT6360_PMU_ADC_RPT_3			(0x5C)
> +#define MT6360_PMU_ADC_RPT_ORG1			(0x5D)
> +#define MT6360_PMU_ADC_RPT_ORG2			(0x5E)
> +#define MT6360_PMU_BAT_OVP_TH_SEL_CODEH		(0x5F)
> +#define MT6360_PMU_BAT_OVP_TH_SEL_CODEL		(0x60)
> +#define MT6360_PMU_CHG_CTRL19			(0x61)
> +#define MT6360_PMU_VDDASUPPLY			(0x62)
> +#define MT6360_PMU_BC12_MANUAL			(0x63)
> +#define MT6360_PMU_CHGDET_FUNC			(0x64)
> +#define MT6360_PMU_FOD_CTRL			(0x65)
> +#define MT6360_PMU_CHG_CTRL20			(0x66)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL26		(0x67)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL27		(0x68)
> +#define MT6360_PMU_RESV2			(0x69)
> +#define MT6360_PMU_USBID_CTRL1			(0x6D)
> +#define MT6360_PMU_USBID_CTRL2			(0x6E)
> +#define MT6360_PMU_USBID_CTRL3			(0x6F)
> +#define MT6360_PMU_FLED_CFG			(0x70)
> +#define MT6360_PMU_RESV3			(0x71)
> +#define MT6360_PMU_FLED1_CTRL			(0x72)
> +#define MT6360_PMU_FLED_STRB_CTRL		(0x73)
> +#define MT6360_PMU_FLED1_STRB_CTRL2		(0x74)
> +#define MT6360_PMU_FLED1_TOR_CTRL		(0x75)
> +#define MT6360_PMU_FLED2_CTRL			(0x76)
> +#define MT6360_PMU_RESV4			(0x77)
> +#define MT6360_PMU_FLED2_STRB_CTRL2		(0x78)
> +#define MT6360_PMU_FLED2_TOR_CTRL		(0x79)
> +#define MT6360_PMU_FLED_VMIDTRK_CTRL1		(0x7A)
> +#define MT6360_PMU_FLED_VMID_RTM		(0x7B)
> +#define MT6360_PMU_FLED_VMIDTRK_CTRL2		(0x7C)
> +#define MT6360_PMU_FLED_PWSEL			(0x7D)
> +#define MT6360_PMU_FLED_EN			(0x7E)
> +#define MT6360_PMU_FLED_Hidden1			(0x7F)
> +#define MT6360_PMU_RGB_EN			(0x80)
> +#define MT6360_PMU_RGB1_ISNK			(0x81)
> +#define MT6360_PMU_RGB2_ISNK			(0x82)
> +#define MT6360_PMU_RGB3_ISNK			(0x83)
> +#define MT6360_PMU_RGB_ML_ISNK			(0x84)
> +#define MT6360_PMU_RGB1_DIM			(0x85)
> +#define MT6360_PMU_RGB2_DIM			(0x86)
> +#define MT6360_PMU_RGB3_DIM			(0x87)
> +#define MT6360_PMU_RESV5			(0x88)
> +#define MT6360_PMU_RGB12_Freq			(0x89)
> +#define MT6360_PMU_RGB34_Freq			(0x8A)
> +#define MT6360_PMU_RGB1_Tr			(0x8B)
> +#define MT6360_PMU_RGB1_Tf			(0x8C)
> +#define MT6360_PMU_RGB1_TON_TOFF		(0x8D)
> +#define MT6360_PMU_RGB2_Tr			(0x8E)
> +#define MT6360_PMU_RGB2_Tf			(0x8F)
> +#define MT6360_PMU_RGB2_TON_TOFF		(0x90)
> +#define MT6360_PMU_RGB3_Tr			(0x91)
> +#define MT6360_PMU_RGB3_Tf			(0x92)
> +#define MT6360_PMU_RGB3_TON_TOFF		(0x93)
> +#define MT6360_PMU_RGB_Hidden_CTRL1		(0x94)
> +#define MT6360_PMU_RGB_Hidden_CTRL2		(0x95)
> +#define MT6360_PMU_RESV6			(0x97)
> +#define MT6360_PMU_SPARE1			(0x9A)
> +#define MT6360_PMU_SPARE2			(0xA0)
> +#define MT6360_PMU_SPARE3			(0xB0)
> +#define MT6360_PMU_SPARE4			(0xC0)
> +#define MT6360_PMU_CHG_IRQ1			(0xD0)
> +#define MT6360_PMU_CHG_IRQ2			(0xD1)
> +#define MT6360_PMU_CHG_IRQ3			(0xD2)
> +#define MT6360_PMU_CHG_IRQ4			(0xD3)
> +#define MT6360_PMU_CHG_IRQ5			(0xD4)
> +#define MT6360_PMU_CHG_IRQ6			(0xD5)
> +#define MT6360_PMU_QC_IRQ			(0xD6)
> +#define MT6360_PMU_FOD_IRQ			(0xD7)
> +#define MT6360_PMU_BASE_IRQ			(0xD8)
> +#define MT6360_PMU_FLED_IRQ1			(0xD9)
> +#define MT6360_PMU_FLED_IRQ2			(0xDA)
> +#define MT6360_PMU_RGB_IRQ			(0xDB)
> +#define MT6360_PMU_BUCK1_IRQ			(0xDC)
> +#define MT6360_PMU_BUCK2_IRQ			(0xDD)
> +#define MT6360_PMU_LDO_IRQ1			(0xDE)
> +#define MT6360_PMU_LDO_IRQ2			(0xDF)
> +#define MT6360_PMU_CHG_STAT1			(0xE0)
> +#define MT6360_PMU_CHG_STAT2			(0xE1)
> +#define MT6360_PMU_CHG_STAT3			(0xE2)
> +#define MT6360_PMU_CHG_STAT4			(0xE3)
> +#define MT6360_PMU_CHG_STAT5			(0xE4)
> +#define MT6360_PMU_CHG_STAT6			(0xE5)
> +#define MT6360_PMU_QC_STAT			(0xE6)
> +#define MT6360_PMU_FOD_STAT			(0xE7)
> +#define MT6360_PMU_BASE_STAT			(0xE8)
> +#define MT6360_PMU_FLED_STAT1			(0xE9)
> +#define MT6360_PMU_FLED_STAT2			(0xEA)
> +#define MT6360_PMU_RGB_STAT			(0xEB)
> +#define MT6360_PMU_BUCK1_STAT			(0xEC)
> +#define MT6360_PMU_BUCK2_STAT			(0xED)
> +#define MT6360_PMU_LDO_STAT1			(0xEE)
> +#define MT6360_PMU_LDO_STAT2			(0xEF)
> +#define MT6360_PMU_CHG_MASK1			(0xF0)
> +#define MT6360_PMU_CHG_MASK2			(0xF1)
> +#define MT6360_PMU_CHG_MASK3			(0xF2)
> +#define MT6360_PMU_CHG_MASK4			(0xF3)
> +#define MT6360_PMU_CHG_MASK5			(0xF4)
> +#define MT6360_PMU_CHG_MASK6			(0xF5)
> +#define MT6360_PMU_QC_MASK			(0xF6)
> +#define MT6360_PMU_FOD_MASK			(0xF7)
> +#define MT6360_PMU_BASE_MASK			(0xF8)
> +#define MT6360_PMU_FLED_MASK1			(0xF9)
> +#define MT6360_PMU_FLED_MASK2			(0xFA)
> +#define MT6360_PMU_FAULTB_MASK			(0xFB)
> +#define MT6360_PMU_BUCK1_MASK			(0xFC)
> +#define MT6360_PMU_BUCK2_MASK			(0xFD)
> +#define MT6360_PMU_LDO_MASK1			(0xFE)
> +#define MT6360_PMU_LDO_MASK2			(0xFF)
> +#define MT6360_PMU_MAXREG			(MT6360_PMU_LDO_MASK2)
> +
> +/* MT6360_PMU_IRQ_SET */
> +#define MT6360_PMU_IRQ_REGNUM	(MT6360_PMU_LDO_IRQ2 - MT6360_PMU_CHG_IRQ1 + 1)
> +#define MT6360_IRQ_RETRIG	BIT(2)
> +
> +#define CHIP_VEN_MASK				(0xF0)
> +#define CHIP_VEN_MT6360				(0x50)
> +#define CHIP_REV_MASK				(0x0F)
> +
> +#endif /* __MT6360_H__ */
> 

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 1546+ messages in thread
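[Editorial note: the probe's identity check in the patch above hinges on the layout of the DEV_INFO byte at register 0x00: the high nibble carries the vendor ID and the low nibble the silicon revision, per the CHIP_VEN_MASK/CHIP_REV_MASK defines in include/linux/mfd/mt6360.h. A minimal standalone sketch of that split — the helper names are illustrative, only the mask values come from the patch:]

```c
#include <stdint.h>

/* Mask values as defined in include/linux/mfd/mt6360.h (from the patch) */
#define CHIP_VEN_MASK   0xF0
#define CHIP_VEN_MT6360 0x50
#define CHIP_REV_MASK   0x0F

/* Illustrative helpers splitting the DEV_INFO (0x00) byte into fields */
static inline int mt6360_vendor_ok(uint8_t dev_info)
{
	/* high nibble must read back as the MT6360 vendor code */
	return (dev_info & CHIP_VEN_MASK) == CHIP_VEN_MT6360;
}

static inline uint8_t mt6360_chip_rev(uint8_t dev_info)
{
	/* low nibble is the chip revision */
	return dev_info & CHIP_REV_MASK;
}
```

So a DEV_INFO readback of 0x53 identifies an MT6360 at revision 3, while 0x43 would be rejected as an unsupported part.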

* Re:
@ 2020-03-04 14:56   ` Matthias Brugger
  0 siblings, 0 replies; 1546+ messages in thread
From: Matthias Brugger @ 2020-03-04 14:56 UTC (permalink / raw)
  To: Gene Chen, lee.jones
  Cc: gene_chen, linux-kernel, cy_huang, linux-mediatek, Wilma.Wu,
	linux-arm-kernel, shufan_lee

Please resend with an appropriate commit message.

On 03/03/2020 16:27, Gene Chen wrote:
> Add MFD driver for the MT6360 PMIC chip, including
> Battery Charger/USB_PD/Flash LED/RGB LED/LDO/Buck.
> 
> Signed-off-by: Gene Chen <gene_chen@richtek.com>
> ---
>  drivers/mfd/Kconfig        |  12 ++
>  drivers/mfd/Makefile       |   1 +
>  drivers/mfd/mt6360-core.c  | 425 +++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/mfd/mt6360.h | 240 +++++++++++++++++++++++++
>  4 files changed, 678 insertions(+)
>  create mode 100644 drivers/mfd/mt6360-core.c
>  create mode 100644 include/linux/mfd/mt6360.h
> 
> changelogs between v1 & v2
> - include missing header file
> 
> changelogs between v2 & v3
> - add changelogs
> 
> changelogs between v3 & v4
> - fix Kconfig description
> - replace mt6360_pmu_info with mt6360_pmu_data
> - replace probe with probe_new
> - remove unnecessary irq_chip variable
> - remove annotation
> - replace MT6360_MFD_CELL with OF_MFD_CELL
> 
> changelogs between v4 & v5
> - remove unnecessary parse dt function
> - use devm_i2c_new_dummy_device
> - add base-commit message
> 
> changelogs between v5 & v6
> - review return value
> - remove i2c id_table
> - use GPL license v2
> 
> changelogs between v6 & v7
> - add author description
> - replace MT6360_REGMAP_IRQ_REG by REGMAP_IRQ_REG_LINE
> - remove mt6360-private.h
> 
> changelogs between v7 & v8
> - fix kbuild auto reboot by include interrupt header
> 
> diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
> index 2b20329..0f8c341 100644
> --- a/drivers/mfd/Kconfig
> +++ b/drivers/mfd/Kconfig
> @@ -857,6 +857,18 @@ config MFD_MAX8998
>  	  additional drivers must be enabled in order to use the functionality
>  	  of the device.
>  
> +config MFD_MT6360
> +	tristate "Mediatek MT6360 SubPMIC"
> +	select MFD_CORE
> +	select REGMAP_I2C
> +	select REGMAP_IRQ
> +	depends on I2C
> +	help
> +	  Say Y here to enable MT6360 PMU/PMIC/LDO functional support.
> +	  PMU part includes Charger, Flashlight, RGB LED
> +	  PMIC part includes 2-channel BUCKs and 2-channel LDOs
> +	  LDO part includes 4-channel LDOs
> +
>  config MFD_MT6397
>  	tristate "MediaTek MT6397 PMIC Support"
>  	select MFD_CORE
> diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
> index b83f172..8c35816 100644
> --- a/drivers/mfd/Makefile
> +++ b/drivers/mfd/Makefile
> @@ -238,6 +238,7 @@ obj-$(CONFIG_INTEL_SOC_PMIC)	+= intel-soc-pmic.o
>  obj-$(CONFIG_INTEL_SOC_PMIC_BXTWC)	+= intel_soc_pmic_bxtwc.o
>  obj-$(CONFIG_INTEL_SOC_PMIC_CHTWC)	+= intel_soc_pmic_chtwc.o
>  obj-$(CONFIG_INTEL_SOC_PMIC_CHTDC_TI)	+= intel_soc_pmic_chtdc_ti.o
> +obj-$(CONFIG_MFD_MT6360)	+= mt6360-core.o
>  mt6397-objs	:= mt6397-core.o mt6397-irq.o
>  obj-$(CONFIG_MFD_MT6397)	+= mt6397.o
>  obj-$(CONFIG_INTEL_SOC_PMIC_MRFLD)	+= intel_soc_pmic_mrfld.o
> diff --git a/drivers/mfd/mt6360-core.c b/drivers/mfd/mt6360-core.c
> new file mode 100644
> index 0000000..d1168f8
> --- /dev/null
> +++ b/drivers/mfd/mt6360-core.c
> @@ -0,0 +1,425 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2019 MediaTek Inc.
> + *
> + * Author: Gene Chen <gene_chen@richtek.com>
> + */
> +
> +#include <linux/i2c.h>
> +#include <linux/init.h>
> +#include <linux/interrupt.h>
> +#include <linux/kernel.h>
> +#include <linux/mfd/core.h>
> +#include <linux/module.h>
> +#include <linux/of_irq.h>
> +#include <linux/of_platform.h>
> +#include <linux/version.h>
> +
> +#include <linux/mfd/mt6360.h>
> +
> +/* reg 0 -> 0 ~ 7 */
> +#define MT6360_CHG_TREG_EVT		(4)
> +#define MT6360_CHG_AICR_EVT		(5)
> +#define MT6360_CHG_MIVR_EVT		(6)
> +#define MT6360_PWR_RDY_EVT		(7)
> +/* REG 1 -> 8 ~ 15 */
> +#define MT6360_CHG_BATSYSUV_EVT		(9)
> +#define MT6360_FLED_CHG_VINOVP_EVT	(11)
> +#define MT6360_CHG_VSYSUV_EVT		(12)
> +#define MT6360_CHG_VSYSOV_EVT		(13)
> +#define MT6360_CHG_VBATOV_EVT		(14)
> +#define MT6360_CHG_VBUSOV_EVT		(15)
> +/* REG 2 -> 16 ~ 23 */
> +/* REG 3 -> 24 ~ 31 */
> +#define MT6360_WD_PMU_DET		(25)
> +#define MT6360_WD_PMU_DONE		(26)
> +#define MT6360_CHG_TMRI			(27)
> +#define MT6360_CHG_ADPBADI		(29)
> +#define MT6360_CHG_RVPI			(30)
> +#define MT6360_OTPI			(31)
> +/* REG 4 -> 32 ~ 39 */
> +#define MT6360_CHG_AICCMEASL		(32)
> +#define MT6360_CHGDET_DONEI		(34)
> +#define MT6360_WDTMRI			(35)
> +#define MT6360_SSFINISHI		(36)
> +#define MT6360_CHG_RECHGI		(37)
> +#define MT6360_CHG_TERMI		(38)
> +#define MT6360_CHG_IEOCI		(39)
> +/* REG 5 -> 40 ~ 47 */
> +#define MT6360_PUMPX_DONEI		(40)
> +#define MT6360_BAT_OVP_ADC_EVT		(41)
> +#define MT6360_TYPEC_OTP_EVT		(42)
> +#define MT6360_ADC_WAKEUP_EVT		(43)
> +#define MT6360_ADC_DONEI		(44)
> +#define MT6360_BST_BATUVI		(45)
> +#define MT6360_BST_VBUSOVI		(46)
> +#define MT6360_BST_OLPI			(47)
> +/* REG 6 -> 48 ~ 55 */
> +#define MT6360_ATTACH_I			(48)
> +#define MT6360_DETACH_I			(49)
> +#define MT6360_QC30_STPDONE		(51)
> +#define MT6360_QC_VBUSDET_DONE		(52)
> +#define MT6360_HVDCP_DET		(53)
> +#define MT6360_CHGDETI			(54)
> +#define MT6360_DCDTI			(55)
> +/* REG 7 -> 56 ~ 63 */
> +#define MT6360_FOD_DONE_EVT		(56)
> +#define MT6360_FOD_OV_EVT		(57)
> +#define MT6360_CHRDET_UVP_EVT		(58)
> +#define MT6360_CHRDET_OVP_EVT		(59)
> +#define MT6360_CHRDET_EXT_EVT		(60)
> +#define MT6360_FOD_LR_EVT		(61)
> +#define MT6360_FOD_HR_EVT		(62)
> +#define MT6360_FOD_DISCHG_FAIL_EVT	(63)
> +/* REG 8 -> 64 ~ 71 */
> +#define MT6360_USBID_EVT		(64)
> +#define MT6360_APWDTRST_EVT		(65)
> +#define MT6360_EN_EVT			(66)
> +#define MT6360_QONB_RST_EVT		(67)
> +#define MT6360_MRSTB_EVT		(68)
> +#define MT6360_OTP_EVT			(69)
> +#define MT6360_VDDAOV_EVT		(70)
> +#define MT6360_SYSUV_EVT		(71)
> +/* REG 9 -> 72 ~ 79 */
> +#define MT6360_FLED_STRBPIN_EVT		(72)
> +#define MT6360_FLED_TORPIN_EVT		(73)
> +#define MT6360_FLED_TX_EVT		(74)
> +#define MT6360_FLED_LVF_EVT		(75)
> +#define MT6360_FLED2_SHORT_EVT		(78)
> +#define MT6360_FLED1_SHORT_EVT		(79)
> +/* REG 10 -> 80 ~ 87 */
> +#define MT6360_FLED2_STRB_EVT		(80)
> +#define MT6360_FLED1_STRB_EVT		(81)
> +#define MT6360_FLED2_STRB_TO_EVT	(82)
> +#define MT6360_FLED1_STRB_TO_EVT	(83)
> +#define MT6360_FLED2_TOR_EVT		(84)
> +#define MT6360_FLED1_TOR_EVT		(85)
> +/* REG 11 -> 88 ~ 95 */
> +/* REG 12 -> 96 ~ 103 */
> +#define MT6360_BUCK1_PGB_EVT		(96)
> +#define MT6360_BUCK1_OC_EVT		(100)
> +#define MT6360_BUCK1_OV_EVT		(101)
> +#define MT6360_BUCK1_UV_EVT		(102)
> +/* REG 13 -> 104 ~ 111 */
> +#define MT6360_BUCK2_PGB_EVT		(104)
> +#define MT6360_BUCK2_OC_EVT		(108)
> +#define MT6360_BUCK2_OV_EVT		(109)
> +#define MT6360_BUCK2_UV_EVT		(110)
> +/* REG 14 -> 112 ~ 119 */
> +#define MT6360_LDO1_OC_EVT		(113)
> +#define MT6360_LDO2_OC_EVT		(114)
> +#define MT6360_LDO3_OC_EVT		(115)
> +#define MT6360_LDO5_OC_EVT		(117)
> +#define MT6360_LDO6_OC_EVT		(118)
> +#define MT6360_LDO7_OC_EVT		(119)
> +/* REG 15 -> 120 ~ 127 */
> +#define MT6360_LDO1_PGB_EVT		(121)
> +#define MT6360_LDO2_PGB_EVT		(122)
> +#define MT6360_LDO3_PGB_EVT		(123)
> +#define MT6360_LDO5_PGB_EVT		(125)
> +#define MT6360_LDO6_PGB_EVT		(126)
> +#define MT6360_LDO7_PGB_EVT		(127)
> +
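[Editorial note: the /* REG n -> ... */ comment blocks above encode the regmap-irq convention that REGMAP_IRQ_REG_LINE(irq, 8) relies on: event number n is reported in status register n / 8 (counting up from MT6360_PMU_CHG_IRQ1 at 0xD0) at bit n % 8. A small standalone sketch of that mapping — the helper names are illustrative, only the 0xD0 base and the 8-bits-per-register stride come from the patch:]

```c
/* Base of the status register bank, from include/linux/mfd/mt6360.h */
#define MT6360_PMU_CHG_IRQ1 0xD0

/* Illustrative helpers mirroring REGMAP_IRQ_REG_LINE(irq, 8):
 * event n lives in status register n / 8, bit n % 8. */
static inline int mt6360_irq_status_reg(int irq)
{
	return MT6360_PMU_CHG_IRQ1 + irq / 8;
}

static inline int mt6360_irq_bit(int irq)
{
	return irq % 8;
}
```

For example, MT6360_CHG_IEOCI (39) lands in register 0xD4 (MT6360_PMU_CHG_IRQ5) at bit 7, and MT6360_LDO7_PGB_EVT (127) in register 0xDF (MT6360_PMU_LDO_IRQ2) at bit 7.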
> +static const struct regmap_irq mt6360_pmu_irqs[] =  {
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_TREG_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_AICR_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_MIVR_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_PWR_RDY_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_BATSYSUV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED_CHG_VINOVP_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_VSYSUV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_VSYSOV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_VBATOV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_VBUSOV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_WD_PMU_DET, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_WD_PMU_DONE, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_TMRI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_ADPBADI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_RVPI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_OTPI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_AICCMEASL, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHGDET_DONEI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_WDTMRI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_SSFINISHI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_RECHGI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_TERMI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHG_IEOCI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_PUMPX_DONEI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BAT_OVP_ADC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_TYPEC_OTP_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_ADC_WAKEUP_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_ADC_DONEI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BST_BATUVI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BST_VBUSOVI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BST_OLPI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_ATTACH_I, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_DETACH_I, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_QC30_STPDONE, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_QC_VBUSDET_DONE, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_HVDCP_DET, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHGDETI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_DCDTI, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FOD_DONE_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FOD_OV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHRDET_UVP_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHRDET_OVP_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_CHRDET_EXT_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FOD_LR_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FOD_HR_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FOD_DISCHG_FAIL_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_USBID_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_APWDTRST_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_EN_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_QONB_RST_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_MRSTB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_OTP_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_VDDAOV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_SYSUV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED_STRBPIN_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED_TORPIN_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED_TX_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED_LVF_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED2_SHORT_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED1_SHORT_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED2_STRB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED1_STRB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED2_STRB_TO_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED1_STRB_TO_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED2_TOR_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_FLED1_TOR_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BUCK1_PGB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BUCK1_OC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BUCK1_OV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BUCK1_UV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BUCK2_PGB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BUCK2_OC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BUCK2_OV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_BUCK2_UV_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO1_OC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO2_OC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO3_OC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO5_OC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO6_OC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO7_OC_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO1_PGB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO2_PGB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO3_PGB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO5_PGB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO6_PGB_EVT, 8),
> +	REGMAP_IRQ_REG_LINE(MT6360_LDO7_PGB_EVT, 8),
> +};
> +
> +static int mt6360_pmu_handle_post_irq(void *irq_drv_data)
> +{
> +	struct mt6360_pmu_data *mpd = irq_drv_data;
> +
> +	return regmap_update_bits(mpd->regmap,
> +		MT6360_PMU_IRQ_SET, MT6360_IRQ_RETRIG, MT6360_IRQ_RETRIG);
> +}
> +
> +static struct regmap_irq_chip mt6360_pmu_irq_chip = {
> +	.irqs = mt6360_pmu_irqs,
> +	.num_irqs = ARRAY_SIZE(mt6360_pmu_irqs),
> +	.num_regs = MT6360_PMU_IRQ_REGNUM,
> +	.mask_base = MT6360_PMU_CHG_MASK1,
> +	.status_base = MT6360_PMU_CHG_IRQ1,
> +	.ack_base = MT6360_PMU_CHG_IRQ1,
> +	.init_ack_masked = true,
> +	.use_ack = true,
> +	.handle_post_irq = mt6360_pmu_handle_post_irq,
> +};
> +
> +static const struct regmap_config mt6360_pmu_regmap_config = {
> +	.reg_bits = 8,
> +	.val_bits = 8,
> +	.max_register = MT6360_PMU_MAXREG,
> +};
> +
> +static const struct resource mt6360_adc_resources[] = {
> +	DEFINE_RES_IRQ_NAMED(MT6360_ADC_DONEI, "adc_donei"),
> +};
> +
> +static const struct resource mt6360_chg_resources[] = {
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_TREG_EVT, "chg_treg_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_PWR_RDY_EVT, "pwr_rdy_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_BATSYSUV_EVT, "chg_batsysuv_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_VSYSUV_EVT, "chg_vsysuv_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_VSYSOV_EVT, "chg_vsysov_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_VBATOV_EVT, "chg_vbatov_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_VBUSOV_EVT, "chg_vbusov_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_AICCMEASL, "chg_aiccmeasl"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_WDTMRI, "wdtmri"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_RECHGI, "chg_rechgi"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_TERMI, "chg_termi"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHG_IEOCI, "chg_ieoci"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_PUMPX_DONEI, "pumpx_donei"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_ATTACH_I, "attach_i"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_CHRDET_EXT_EVT, "chrdet_ext_evt"),
> +};
> +
> +static const struct resource mt6360_led_resources[] = {
> +	DEFINE_RES_IRQ_NAMED(MT6360_FLED_CHG_VINOVP_EVT, "fled_chg_vinovp_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_FLED_LVF_EVT, "fled_lvf_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_FLED2_SHORT_EVT, "fled2_short_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_FLED1_SHORT_EVT, "fled1_short_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_FLED2_STRB_TO_EVT, "fled2_strb_to_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_FLED1_STRB_TO_EVT, "fled1_strb_to_evt"),
> +};
> +
> +static const struct resource mt6360_pmic_resources[] = {
> +	DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_PGB_EVT, "buck1_pgb_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_OC_EVT, "buck1_oc_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_OV_EVT, "buck1_ov_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_UV_EVT, "buck1_uv_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_PGB_EVT, "buck2_pgb_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_OC_EVT, "buck2_oc_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_OV_EVT, "buck2_ov_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_UV_EVT, "buck2_uv_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO6_OC_EVT, "ldo6_oc_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO7_OC_EVT, "ldo7_oc_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO6_PGB_EVT, "ldo6_pgb_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO7_PGB_EVT, "ldo7_pgb_evt"),
> +};
> +
> +static const struct resource mt6360_ldo_resources[] = {
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO1_OC_EVT, "ldo1_oc_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO2_OC_EVT, "ldo2_oc_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO3_OC_EVT, "ldo3_oc_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO5_OC_EVT, "ldo5_oc_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO1_PGB_EVT, "ldo1_pgb_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO2_PGB_EVT, "ldo2_pgb_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO3_PGB_EVT, "ldo3_pgb_evt"),
> +	DEFINE_RES_IRQ_NAMED(MT6360_LDO5_PGB_EVT, "ldo5_pgb_evt"),
> +};
> +
> +static const struct mfd_cell mt6360_devs[] = {
> +	OF_MFD_CELL("mt6360_adc", mt6360_adc_resources,
> +		    NULL, 0, 0, "mediatek,mt6360_adc"),
> +	OF_MFD_CELL("mt6360_chg", mt6360_chg_resources,
> +		    NULL, 0, 0, "mediatek,mt6360_chg"),
> +	OF_MFD_CELL("mt6360_led", mt6360_led_resources,
> +		    NULL, 0, 0, "mediatek,mt6360_led"),
> +	OF_MFD_CELL("mt6360_pmic", mt6360_pmic_resources,
> +		    NULL, 0, 0, "mediatek,mt6360_pmic"),
> +	OF_MFD_CELL("mt6360_ldo", mt6360_ldo_resources,
> +		    NULL, 0, 0, "mediatek,mt6360_ldo"),
> +	OF_MFD_CELL("mt6360_tcpc", NULL,
> +		    NULL, 0, 0, "mediatek,mt6360_tcpc"),
> +};
> +
> +static const unsigned short mt6360_slave_addr[MT6360_SLAVE_MAX] = {
> +	MT6360_PMU_SLAVEID,
> +	MT6360_PMIC_SLAVEID,
> +	MT6360_LDO_SLAVEID,
> +	MT6360_TCPC_SLAVEID,
> +};
> +
> +static int mt6360_pmu_probe(struct i2c_client *client)
> +{
> +	struct mt6360_pmu_data *mpd;
> +	unsigned int reg_data;
> +	int i, ret;
> +
> +	mpd = devm_kzalloc(&client->dev, sizeof(*mpd), GFP_KERNEL);
> +	if (!mpd)
> +		return -ENOMEM;
> +
> +	mpd->dev = &client->dev;
> +	i2c_set_clientdata(client, mpd);
> +
> +	mpd->regmap = devm_regmap_init_i2c(client, &mt6360_pmu_regmap_config);
> +	if (IS_ERR(mpd->regmap)) {
> +		dev_err(&client->dev, "Failed to register regmap\n");
> +		return PTR_ERR(mpd->regmap);
> +	}
> +
> +	ret = regmap_read(mpd->regmap, MT6360_PMU_DEV_INFO, &reg_data);
> +	if (ret) {
> +		dev_err(&client->dev, "Device not found\n");
> +		return ret;
> +	}
> +
> +	mpd->chip_rev = reg_data & CHIP_REV_MASK;
> +	if (mpd->chip_rev != CHIP_VEN_MT6360) {
> +		dev_err(&client->dev, "Device not supported\n");
> +		return -ENODEV;
> +	}
> +
> +	mt6360_pmu_irq_chip.irq_drv_data = mpd;
> +	ret = devm_regmap_add_irq_chip(&client->dev, mpd->regmap, client->irq,
> +				       IRQF_TRIGGER_FALLING, 0,
> +				       &mt6360_pmu_irq_chip, &mpd->irq_data);
> +	if (ret) {
> +		dev_err(&client->dev, "Failed to add Regmap IRQ Chip\n");
> +		return ret;
> +	}
> +
> +	mpd->i2c[0] = client;
> +	for (i = 1; i < MT6360_SLAVE_MAX; i++) {
> +		mpd->i2c[i] = devm_i2c_new_dummy_device(&client->dev,
> +							client->adapter,
> +							mt6360_slave_addr[i]);
> +		if (IS_ERR(mpd->i2c[i])) {
> +			dev_err(&client->dev,
> +				"Failed to get new dummy I2C device for address 0x%x",
> +				mt6360_slave_addr[i]);
> +			return PTR_ERR(mpd->i2c[i]);
> +		}
> +		i2c_set_clientdata(mpd->i2c[i], mpd);
> +	}
> +
> +	ret = devm_mfd_add_devices(&client->dev, PLATFORM_DEVID_AUTO,
> +				   mt6360_devs, ARRAY_SIZE(mt6360_devs), NULL,
> +				   0, regmap_irq_get_domain(mpd->irq_data));
> +	if (ret) {
> +		dev_err(&client->dev,
> +			"Failed to register subordinate devices\n");
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int __maybe_unused mt6360_pmu_suspend(struct device *dev)
> +{
> +	struct i2c_client *i2c = to_i2c_client(dev);
> +
> +	if (device_may_wakeup(dev))
> +		enable_irq_wake(i2c->irq);
> +
> +	return 0;
> +}
> +
> +static int __maybe_unused mt6360_pmu_resume(struct device *dev)
> +{
> +
> +	struct i2c_client *i2c = to_i2c_client(dev);
> +
> +	if (device_may_wakeup(dev))
> +		disable_irq_wake(i2c->irq);
> +
> +	return 0;
> +}
> +
> +static SIMPLE_DEV_PM_OPS(mt6360_pmu_pm_ops,
> +			 mt6360_pmu_suspend, mt6360_pmu_resume);
> +
> +static const struct of_device_id __maybe_unused mt6360_pmu_of_id[] = {
> +	{ .compatible = "mediatek,mt6360_pmu", },
> +	{},
> +};
> +MODULE_DEVICE_TABLE(of, mt6360_pmu_of_id);
> +
> +static struct i2c_driver mt6360_pmu_driver = {
> +	.driver = {
> +		.pm = &mt6360_pmu_pm_ops,
> +		.of_match_table = of_match_ptr(mt6360_pmu_of_id),
> +	},
> +	.probe_new = mt6360_pmu_probe,
> +};
> +module_i2c_driver(mt6360_pmu_driver);
> +
> +MODULE_AUTHOR("Gene Chen <gene_chen@richtek.com>");
> +MODULE_DESCRIPTION("MT6360 PMU I2C Driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/linux/mfd/mt6360.h b/include/linux/mfd/mt6360.h
> new file mode 100644
> index 0000000..c03e6d1
> --- /dev/null
> +++ b/include/linux/mfd/mt6360.h
> @@ -0,0 +1,240 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2019 MediaTek Inc.
> + */
> +
> +#ifndef __MT6360_H__
> +#define __MT6360_H__
> +
> +#include <linux/regmap.h>
> +
> +enum {
> +	MT6360_SLAVE_PMU = 0,
> +	MT6360_SLAVE_PMIC,
> +	MT6360_SLAVE_LDO,
> +	MT6360_SLAVE_TCPC,
> +	MT6360_SLAVE_MAX,
> +};
> +
> +#define MT6360_PMU_SLAVEID	(0x34)
> +#define MT6360_PMIC_SLAVEID	(0x1A)
> +#define MT6360_LDO_SLAVEID	(0x64)
> +#define MT6360_TCPC_SLAVEID	(0x4E)
> +
> +struct mt6360_pmu_data {
> +	struct i2c_client *i2c[MT6360_SLAVE_MAX];
> +	struct device *dev;
> +	struct regmap *regmap;
> +	struct regmap_irq_chip_data *irq_data;
> +	unsigned int chip_rev;
> +};
> +
> +/* PMU register definition */
> +#define MT6360_PMU_DEV_INFO			(0x00)
> +#define MT6360_PMU_CORE_CTRL1			(0x01)
> +#define MT6360_PMU_RST1				(0x02)
> +#define MT6360_PMU_CRCEN			(0x03)
> +#define MT6360_PMU_RST_PAS_CODE1		(0x04)
> +#define MT6360_PMU_RST_PAS_CODE2		(0x05)
> +#define MT6360_PMU_CORE_CTRL2			(0x06)
> +#define MT6360_PMU_TM_PAS_CODE1			(0x07)
> +#define MT6360_PMU_TM_PAS_CODE2			(0x08)
> +#define MT6360_PMU_TM_PAS_CODE3			(0x09)
> +#define MT6360_PMU_TM_PAS_CODE4			(0x0A)
> +#define MT6360_PMU_IRQ_IND			(0x0B)
> +#define MT6360_PMU_IRQ_MASK			(0x0C)
> +#define MT6360_PMU_IRQ_SET			(0x0D)
> +#define MT6360_PMU_SHDN_CTRL			(0x0E)
> +#define MT6360_PMU_TM_INF			(0x0F)
> +#define MT6360_PMU_I2C_CTRL			(0x10)
> +#define MT6360_PMU_CHG_CTRL1			(0x11)
> +#define MT6360_PMU_CHG_CTRL2			(0x12)
> +#define MT6360_PMU_CHG_CTRL3			(0x13)
> +#define MT6360_PMU_CHG_CTRL4			(0x14)
> +#define MT6360_PMU_CHG_CTRL5			(0x15)
> +#define MT6360_PMU_CHG_CTRL6			(0x16)
> +#define MT6360_PMU_CHG_CTRL7			(0x17)
> +#define MT6360_PMU_CHG_CTRL8			(0x18)
> +#define MT6360_PMU_CHG_CTRL9			(0x19)
> +#define MT6360_PMU_CHG_CTRL10			(0x1A)
> +#define MT6360_PMU_CHG_CTRL11			(0x1B)
> +#define MT6360_PMU_CHG_CTRL12			(0x1C)
> +#define MT6360_PMU_CHG_CTRL13			(0x1D)
> +#define MT6360_PMU_CHG_CTRL14			(0x1E)
> +#define MT6360_PMU_CHG_CTRL15			(0x1F)
> +#define MT6360_PMU_CHG_CTRL16			(0x20)
> +#define MT6360_PMU_CHG_AICC_RESULT		(0x21)
> +#define MT6360_PMU_DEVICE_TYPE			(0x22)
> +#define MT6360_PMU_QC_CONTROL1			(0x23)
> +#define MT6360_PMU_QC_CONTROL2			(0x24)
> +#define MT6360_PMU_QC30_CONTROL1		(0x25)
> +#define MT6360_PMU_QC30_CONTROL2		(0x26)
> +#define MT6360_PMU_USB_STATUS1			(0x27)
> +#define MT6360_PMU_QC_STATUS1			(0x28)
> +#define MT6360_PMU_QC_STATUS2			(0x29)
> +#define MT6360_PMU_CHG_PUMP			(0x2A)
> +#define MT6360_PMU_CHG_CTRL17			(0x2B)
> +#define MT6360_PMU_CHG_CTRL18			(0x2C)
> +#define MT6360_PMU_CHRDET_CTRL1			(0x2D)
> +#define MT6360_PMU_CHRDET_CTRL2			(0x2E)
> +#define MT6360_PMU_DPDN_CTRL			(0x2F)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL1		(0x30)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL2		(0x31)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL3		(0x32)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL4		(0x33)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL5		(0x34)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL6		(0x35)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL7		(0x36)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL8		(0x37)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL9		(0x38)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL10		(0x39)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL11		(0x3A)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL12		(0x3B)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL13		(0x3C)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL14		(0x3D)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL15		(0x3E)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL16		(0x3F)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL17		(0x40)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL18		(0x41)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL19		(0x42)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL20		(0x43)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL21		(0x44)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL22		(0x45)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL23		(0x46)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL24		(0x47)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL25		(0x48)
> +#define MT6360_PMU_BC12_CTRL			(0x49)
> +#define MT6360_PMU_CHG_STAT			(0x4A)
> +#define MT6360_PMU_RESV1			(0x4B)
> +#define MT6360_PMU_TYPEC_OTP_TH_SEL_CODEH	(0x4E)
> +#define MT6360_PMU_TYPEC_OTP_TH_SEL_CODEL	(0x4F)
> +#define MT6360_PMU_TYPEC_OTP_HYST_TH		(0x50)
> +#define MT6360_PMU_TYPEC_OTP_CTRL		(0x51)
> +#define MT6360_PMU_ADC_BAT_DATA_H		(0x52)
> +#define MT6360_PMU_ADC_BAT_DATA_L		(0x53)
> +#define MT6360_PMU_IMID_BACKBST_ON		(0x54)
> +#define MT6360_PMU_IMID_BACKBST_OFF		(0x55)
> +#define MT6360_PMU_ADC_CONFIG			(0x56)
> +#define MT6360_PMU_ADC_EN2			(0x57)
> +#define MT6360_PMU_ADC_IDLE_T			(0x58)
> +#define MT6360_PMU_ADC_RPT_1			(0x5A)
> +#define MT6360_PMU_ADC_RPT_2			(0x5B)
> +#define MT6360_PMU_ADC_RPT_3			(0x5C)
> +#define MT6360_PMU_ADC_RPT_ORG1			(0x5D)
> +#define MT6360_PMU_ADC_RPT_ORG2			(0x5E)
> +#define MT6360_PMU_BAT_OVP_TH_SEL_CODEH		(0x5F)
> +#define MT6360_PMU_BAT_OVP_TH_SEL_CODEL		(0x60)
> +#define MT6360_PMU_CHG_CTRL19			(0x61)
> +#define MT6360_PMU_VDDASUPPLY			(0x62)
> +#define MT6360_PMU_BC12_MANUAL			(0x63)
> +#define MT6360_PMU_CHGDET_FUNC			(0x64)
> +#define MT6360_PMU_FOD_CTRL			(0x65)
> +#define MT6360_PMU_CHG_CTRL20			(0x66)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL26		(0x67)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL27		(0x68)
> +#define MT6360_PMU_RESV2			(0x69)
> +#define MT6360_PMU_USBID_CTRL1			(0x6D)
> +#define MT6360_PMU_USBID_CTRL2			(0x6E)
> +#define MT6360_PMU_USBID_CTRL3			(0x6F)
> +#define MT6360_PMU_FLED_CFG			(0x70)
> +#define MT6360_PMU_RESV3			(0x71)
> +#define MT6360_PMU_FLED1_CTRL			(0x72)
> +#define MT6360_PMU_FLED_STRB_CTRL		(0x73)
> +#define MT6360_PMU_FLED1_STRB_CTRL2		(0x74)
> +#define MT6360_PMU_FLED1_TOR_CTRL		(0x75)
> +#define MT6360_PMU_FLED2_CTRL			(0x76)
> +#define MT6360_PMU_RESV4			(0x77)
> +#define MT6360_PMU_FLED2_STRB_CTRL2		(0x78)
> +#define MT6360_PMU_FLED2_TOR_CTRL		(0x79)
> +#define MT6360_PMU_FLED_VMIDTRK_CTRL1		(0x7A)
> +#define MT6360_PMU_FLED_VMID_RTM		(0x7B)
> +#define MT6360_PMU_FLED_VMIDTRK_CTRL2		(0x7C)
> +#define MT6360_PMU_FLED_PWSEL			(0x7D)
> +#define MT6360_PMU_FLED_EN			(0x7E)
> +#define MT6360_PMU_FLED_Hidden1			(0x7F)
> +#define MT6360_PMU_RGB_EN			(0x80)
> +#define MT6360_PMU_RGB1_ISNK			(0x81)
> +#define MT6360_PMU_RGB2_ISNK			(0x82)
> +#define MT6360_PMU_RGB3_ISNK			(0x83)
> +#define MT6360_PMU_RGB_ML_ISNK			(0x84)
> +#define MT6360_PMU_RGB1_DIM			(0x85)
> +#define MT6360_PMU_RGB2_DIM			(0x86)
> +#define MT6360_PMU_RGB3_DIM			(0x87)
> +#define MT6360_PMU_RESV5			(0x88)
> +#define MT6360_PMU_RGB12_Freq			(0x89)
> +#define MT6360_PMU_RGB34_Freq			(0x8A)
> +#define MT6360_PMU_RGB1_Tr			(0x8B)
> +#define MT6360_PMU_RGB1_Tf			(0x8C)
> +#define MT6360_PMU_RGB1_TON_TOFF		(0x8D)
> +#define MT6360_PMU_RGB2_Tr			(0x8E)
> +#define MT6360_PMU_RGB2_Tf			(0x8F)
> +#define MT6360_PMU_RGB2_TON_TOFF		(0x90)
> +#define MT6360_PMU_RGB3_Tr			(0x91)
> +#define MT6360_PMU_RGB3_Tf			(0x92)
> +#define MT6360_PMU_RGB3_TON_TOFF		(0x93)
> +#define MT6360_PMU_RGB_Hidden_CTRL1		(0x94)
> +#define MT6360_PMU_RGB_Hidden_CTRL2		(0x95)
> +#define MT6360_PMU_RESV6			(0x97)
> +#define MT6360_PMU_SPARE1			(0x9A)
> +#define MT6360_PMU_SPARE2			(0xA0)
> +#define MT6360_PMU_SPARE3			(0xB0)
> +#define MT6360_PMU_SPARE4			(0xC0)
> +#define MT6360_PMU_CHG_IRQ1			(0xD0)
> +#define MT6360_PMU_CHG_IRQ2			(0xD1)
> +#define MT6360_PMU_CHG_IRQ3			(0xD2)
> +#define MT6360_PMU_CHG_IRQ4			(0xD3)
> +#define MT6360_PMU_CHG_IRQ5			(0xD4)
> +#define MT6360_PMU_CHG_IRQ6			(0xD5)
> +#define MT6360_PMU_QC_IRQ			(0xD6)
> +#define MT6360_PMU_FOD_IRQ			(0xD7)
> +#define MT6360_PMU_BASE_IRQ			(0xD8)
> +#define MT6360_PMU_FLED_IRQ1			(0xD9)
> +#define MT6360_PMU_FLED_IRQ2			(0xDA)
> +#define MT6360_PMU_RGB_IRQ			(0xDB)
> +#define MT6360_PMU_BUCK1_IRQ			(0xDC)
> +#define MT6360_PMU_BUCK2_IRQ			(0xDD)
> +#define MT6360_PMU_LDO_IRQ1			(0xDE)
> +#define MT6360_PMU_LDO_IRQ2			(0xDF)
> +#define MT6360_PMU_CHG_STAT1			(0xE0)
> +#define MT6360_PMU_CHG_STAT2			(0xE1)
> +#define MT6360_PMU_CHG_STAT3			(0xE2)
> +#define MT6360_PMU_CHG_STAT4			(0xE3)
> +#define MT6360_PMU_CHG_STAT5			(0xE4)
> +#define MT6360_PMU_CHG_STAT6			(0xE5)
> +#define MT6360_PMU_QC_STAT			(0xE6)
> +#define MT6360_PMU_FOD_STAT			(0xE7)
> +#define MT6360_PMU_BASE_STAT			(0xE8)
> +#define MT6360_PMU_FLED_STAT1			(0xE9)
> +#define MT6360_PMU_FLED_STAT2			(0xEA)
> +#define MT6360_PMU_RGB_STAT			(0xEB)
> +#define MT6360_PMU_BUCK1_STAT			(0xEC)
> +#define MT6360_PMU_BUCK2_STAT			(0xED)
> +#define MT6360_PMU_LDO_STAT1			(0xEE)
> +#define MT6360_PMU_LDO_STAT2			(0xEF)
> +#define MT6360_PMU_CHG_MASK1			(0xF0)
> +#define MT6360_PMU_CHG_MASK2			(0xF1)
> +#define MT6360_PMU_CHG_MASK3			(0xF2)
> +#define MT6360_PMU_CHG_MASK4			(0xF3)
> +#define MT6360_PMU_CHG_MASK5			(0xF4)
> +#define MT6360_PMU_CHG_MASK6			(0xF5)
> +#define MT6360_PMU_QC_MASK			(0xF6)
> +#define MT6360_PMU_FOD_MASK			(0xF7)
> +#define MT6360_PMU_BASE_MASK			(0xF8)
> +#define MT6360_PMU_FLED_MASK1			(0xF9)
> +#define MT6360_PMU_FLED_MASK2			(0xFA)
> +#define MT6360_PMU_FAULTB_MASK			(0xFB)
> +#define MT6360_PMU_BUCK1_MASK			(0xFC)
> +#define MT6360_PMU_BUCK2_MASK			(0xFD)
> +#define MT6360_PMU_LDO_MASK1			(0xFE)
> +#define MT6360_PMU_LDO_MASK2			(0xFF)
> +#define MT6360_PMU_MAXREG			(MT6360_PMU_LDO_MASK2)
> +
> +/* MT6360_PMU_IRQ_SET */
> +#define MT6360_PMU_IRQ_REGNUM	(MT6360_PMU_LDO_IRQ2 - MT6360_PMU_CHG_IRQ1 + 1)
> +#define MT6360_IRQ_RETRIG	BIT(2)
> +
> +#define CHIP_VEN_MASK				(0xF0)
> +#define CHIP_VEN_MT6360				(0x50)
> +#define CHIP_REV_MASK				(0x0F)
> +
> +#endif /* __MT6360_H__ */
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-03-04 14:56   ` Re: Matthias Brugger
@ 2020-03-04 15:15     ` Lee Jones
  -1 siblings, 0 replies; 1546+ messages in thread
From: Lee Jones @ 2020-03-04 15:15 UTC (permalink / raw)
  To: Matthias Brugger
  Cc: gene_chen, linux-kernel, cy_huang, linux-mediatek, Gene Chen,
	Wilma.Wu, linux-arm-kernel, shufan_lee

On Wed, 04 Mar 2020, Matthias Brugger wrote:

> Please resend with appropriate commit message.

Please refrain from top-posting and don't forget to snip.

> On 03/03/2020 16:27, Gene Chen wrote:
> > Add mfd driver for mt6360 pmic chip include

Looks like your formatting is off.

How was this patch sent?

Best practice is to use `git send-email`.

> > Battery Charger/USB_PD/Flash LED/RGB LED/LDO/Buck
> > 
> > Signed-off-by: Gene Chen <gene_chen@richtek.com
> > ---
> >  drivers/mfd/Kconfig        |  12 ++
> >  drivers/mfd/Makefile       |   1 +
> >  drivers/mfd/mt6360-core.c  | 425 +++++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/mfd/mt6360.h | 240 +++++++++++++++++++++++++
> >  4 files changed, 678 insertions(+)
> >  create mode 100644 drivers/mfd/mt6360-core.c
> >  create mode 100644 include/linux/mfd/mt6360.h

-- 
Lee Jones [李琼斯]
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-03-04 15:15     ` Re: Lee Jones
@ 2020-03-04 18:00       ` Matthias Brugger
  -1 siblings, 0 replies; 1546+ messages in thread
From: Matthias Brugger @ 2020-03-04 18:00 UTC (permalink / raw)
  To: Lee Jones
  Cc: gene_chen, linux-kernel, cy_huang, linux-mediatek, Gene Chen,
	Wilma.Wu, linux-arm-kernel, shufan_lee



On 04/03/2020 16:15, Lee Jones wrote:
> On Wed, 04 Mar 2020, Matthias Brugger wrote:
> 
>> Please resend with appropriate commit message.
> 
> Please refrain from top-posting and don't forget to snip.
> 

It's difficult to write something below a missing subject line without
top-posting. ;)

Sorry for forgetting to snip.

Regards,
Matthias

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2020-03-08 17:19 Francois Pinault
  0 siblings, 0 replies; 1546+ messages in thread
From: Francois Pinault @ 2020-03-08 17:19 UTC (permalink / raw)
  To: keyrings

A donation was made in your favour by Francois Pinault, reply for more details.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2020-03-08 17:33 ` Francois Pinault
  0 siblings, 0 replies; 1546+ messages in thread
From: Francois Pinault @ 2020-03-08 17:33 UTC (permalink / raw)
  To: linux-fbdev

A donation was made in your favour by Francois Pinault, reply for more details.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2020-03-08 19:12 Francois Pinault
  0 siblings, 0 replies; 1546+ messages in thread
From: Francois Pinault @ 2020-03-08 19:12 UTC (permalink / raw)
  To: target-devel

A donation was made in your favour by Francois Pinault, reply for more details.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-03-16 23:07 Sankalp Bhardwaj
@ 2020-03-17  9:13 ` Valdis Klētnieks
  2020-03-17 10:10   ` Re: suvrojit
  0 siblings, 1 reply; 1546+ messages in thread
From: Valdis Klētnieks @ 2020-03-17  9:13 UTC (permalink / raw)
  To: Sankalp Bhardwaj; +Cc: kernelnewbies


On Tue, 17 Mar 2020 04:37:58 +0530, Sankalp Bhardwaj said:

> Where to get started?? I am interested in understanding how the
> kernel works but have no prior knowledge... Please help!!

A good place to start is to realize that the answers often depend on what the
question is - and there's usually a difference between the question that is
asked, and the question that the person needs the answer for.  You probably
want to read this:

https://lists.kernelnewbies.org/pipermail/kernelnewbies/2017-April/017765.html

Something that you'll need is a good understanding of operating system
concepts. Almost all modern computer systems have some idea of basic concepts
such as processes, files, a directory structure, security and permissions,
scheduling, locking, and so on.  And for most of these, there is more than one
way to accomplish the goal.

So two books that are useful to read for a compare-and-contrast view are Bach's
book on the System V kernel, and McKusick's book on the BSD kernel - both go
into details of *why* some things are done the way they are.  It's really helpful to
see stuff like "We need to lock this inode while we do X, because otherwise
another thread could concurrently do Y, and then Bad Thing Z will happen".

Of course, a Linux filesystem that does things differently won't have the same
exact issues, but understanding the *sort* of things that break when you screw
up your locking is quite the useful info, especially if most of your coding has
been in userspace where single-threaded is common and libraries did their own
locking when needed.

I admit that I also learned a bunch from Tanenbaum's "Modern Operating
Systems", but that was a long long time ago in a galaxy far far away, and I
have no idea what the cool kids are reading instead these days...


_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-03-17  9:13 ` Valdis Klētnieks
@ 2020-03-17 10:10   ` suvrojit
  0 siblings, 0 replies; 1546+ messages in thread
From: suvrojit @ 2020-03-17 10:10 UTC (permalink / raw)
  To: Valdis Klētnieks; +Cc: kernelnewbies, Sankalp Bhardwaj


ULK (Understanding the Linux Kernel) by Bovet & Cesati is the book you should start reading, Sankalp

On Tue, Mar 17, 2020, 2:44 PM Valdis Klētnieks <valdis.kletnieks@vt.edu>
wrote:

> On Tue, 17 Mar 2020 04:37:58 +0530, Sankalp Bhardwaj said:
>
> > Where to get started?? I am interested in understanding how the
> > kernel works but have no prior knowledge... Please help!!
>
> A good place to start is to realize that the answers often depend on what
> the
> question is - and there's usually a difference between the question that is
> asked, and the question that the person needs the answer for.  You probably
> want to read this:
>
>
> https://lists.kernelnewbies.org/pipermail/kernelnewbies/2017-April/017765.html
>
> Something that you'll need is a good understanding of operating system
> concepts. Almost all modern computer systems have some idea of basic
> concepts
> such as processes, files, a directory structure, security and permissions,
> scheduling, locking, and so on.  And for most of these, there is more than
> one
> way to accomplish the goal.
>
> So two books that are useful to read for a compare-and-contrast view are
> Bach's
> book on the System V kernel, and McKusick's book on the BSD kernel - both go
> into details of *why* some things are done the way they are.  It's really helpful
> to
> see stuff like "We need to lock this inode while we do X, because otherwise
> another thread could concurrently do Y, and then Bad Thing Z will happen".
>
> Of course, a Linux filesystem that does things differently won't have the
> same
> exact issues, but understanding the *sort* of things that break when you
> screw
> up your locking is quite the useful info, especially if most of your
> coding has
> been in userspace where single-threaded is common and libraries did their
> own
> locking when needed.
>
> I admit that I also learned a bunch from Tanenbaum's "Modern Operating
> Systems", but that was a long long time ago in a galaxy far far away, and I
> have no idea what the cool kids are reading instead these days...
>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <CALeDE9OeBx6v6nGVjeydgF1vpfX1Bus319h3M1=49PMETdaCtw@mail.gmail.com>
@ 2020-03-20 11:49 ` Josh Boyer
  0 siblings, 0 replies; 1546+ messages in thread
From: Josh Boyer @ 2020-03-20 11:49 UTC (permalink / raw)
  To: Peter Robinson; +Cc: Linux Firmware

On Fri, Mar 20, 2020 at 7:47 AM Peter Robinson <pbrobinson@gmail.com> wrote:
>
> subscribe

linux-firmware isn't a mailing list.  It's an alias for the
maintainers.  You can find a lore instance for it though!

https://lore.kernel.org/linux-firmware/

josh

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-03-27  8:36 (unknown) chenanqing
@ 2020-03-27  8:59 ` Ilya Dryomov
  0 siblings, 0 replies; 1546+ messages in thread
From: Ilya Dryomov @ 2020-03-27  8:59 UTC (permalink / raw)
  To: chenanqing; +Cc: LKML, netdev, Ceph Development, kuba, Sage Weil, Jeff Layton

On Fri, Mar 27, 2020 at 9:36 AM <chenanqing@oppo.com> wrote:
>
> From: Chen Anqing <chenanqing@oppo.com>
> To: Ilya Dryomov <idryomov@gmail.com>
> Cc: Jeff Layton <jlayton@kernel.org>,
>         Sage Weil <sage@redhat.com>,
>         Jakub Kicinski <kuba@kernel.org>,
>         ceph-devel@vger.kernel.org,
>         netdev@vger.kernel.org,
>         linux-kernel@vger.kernel.org,
>         chenanqing@oppo.com
> Subject: [PATCH] libceph: we should take compound page into account also
> Date: Fri, 27 Mar 2020 04:36:30 -0400
> Message-Id: <20200327083630.36296-1-chenanqing@oppo.com>
> X-Mailer: git-send-email 2.18.2
>
> The patch fixes a real crash in which the slab page came
> from a compound page, so we need to take compound pages
> into account as well.
> Fixes commit 7e241f647dc7 ("libceph: fall back to sendmsg for slab pages").
>
> Signed-off-by: Chen Anqing <chenanqing@oppo.com>
> ---
>  net/ceph/messenger.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index f8ca5edc5f2c..e08c1c334cd9 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -582,7 +582,7 @@ static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
>          * coalescing neighboring slab objects into a single frag which
>          * triggers one of hardened usercopy checks.
>          */
> -       if (page_count(page) >= 1 && !PageSlab(page))
> +       if (page_count(page) >= 1 && !PageSlab(compound_head(page)))
>                 sendpage = sock->ops->sendpage;
>         else
>                 sendpage = sock_no_sendpage;

Hi Chen,

AFAICT compound pages should already be taken into account, because
PageSlab is defined as:

  __PAGEFLAG(Slab, slab, PF_NO_TAIL)

  #define __PAGEFLAG(uname, lname, policy)                       \
      TESTPAGEFLAG(uname, lname, policy)                         \
      __SETPAGEFLAG(uname, lname, policy)                        \
      __CLEARPAGEFLAG(uname, lname, policy)

  #define TESTPAGEFLAG(uname, lname, policy)                     \
  static __always_inline int Page##uname(struct page *page)      \
      { return test_bit(PG_##lname, &policy(page, 0)->flags); }

and PF_NO_TAIL policy is defined as:

  #define PF_NO_TAIL(page, enforce) ({                        \
      VM_BUG_ON_PGFLAGS(enforce && PageTail(page), page);     \
      PF_POISONED_CHECK(compound_head(page)); })

So compound_head() is called behind the scenes.

Could you please explain what crash did you observe in more detail?
Perhaps you backported this patch to an older kernel?

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-03-27  9:20 (unknown) chenanqing
@ 2020-03-27 15:53     ` Lee Duncan
  0 siblings, 0 replies; 1546+ messages in thread
From: Lee Duncan @ 2020-03-27 15:53 UTC (permalink / raw)
  To: chenanqing-Oq79sGaMObY, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA,
	open-iscsi-/JYPxA39Uh5TLH3MbocFFw,
	ceph-devel-u79uwXL29TY76Z2rM5mHXA,
	martin.petersen-QHcLZuEGTsvQT0dZR+AlfA,
	jejb-tEXmvtCZX7AybS5Ee8rs3A, cleech-H+wXaHxf7aLQT0dZR+AlfA

On 3/27/20 2:20 AM, chenanqing-Oq79sGaMObY@public.gmane.org wrote:
> From: Chen Anqing <chenanqing-Oq79sGaMObY@public.gmane.org>
> To: Lee Duncan <lduncan-IBi9RG/b67k@public.gmane.org>
> Cc: Chris Leech <cleech-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
>         "James E . J . Bottomley" <jejb-tEXmvtCZX7AybS5Ee8rs3A@public.gmane.org>,
>         "Martin K . Petersen" <martin.petersen-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
>         ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
>         open-iscsi-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org,
>         linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
>         linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
>         chenanqing-Oq79sGaMObY@public.gmane.org
> Subject: [PATCH] scsi: libiscsi: we should take compound page into account also
> Date: Fri, 27 Mar 2020 05:20:01 -0400
> Message-Id: <20200327092001.56879-1-chenanqing-Oq79sGaMObY@public.gmane.org>
> X-Mailer: git-send-email 2.18.2
> 
> The patch fixes a real crash in which the slab page came
> from a compound page, so we need to take compound pages
> into account as well.
> Fixes commit 08b11eaccfcf ("scsi: libiscsi: fall back to
> sendmsg for slab pages").
> 
> Signed-off-by: Chen Anqing <chenanqing-Oq79sGaMObY@public.gmane.org>
> ---
>  drivers/scsi/libiscsi_tcp.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/libiscsi_tcp.c b/drivers/scsi/libiscsi_tcp.c
> index 6ef93c7af954..98304e5e1f6f 100644
> --- a/drivers/scsi/libiscsi_tcp.c
> +++ b/drivers/scsi/libiscsi_tcp.c
> @@ -128,7 +128,8 @@ static void iscsi_tcp_segment_map(struct iscsi_segment *segment, int recv)
>          * coalescing neighboring slab objects into a single frag which
>          * triggers one of hardened usercopy checks.
>          */
> -       if (!recv && page_count(sg_page(sg)) >= 1 && !PageSlab(sg_page(sg)))
> +       if (!recv && page_count(sg_page(sg)) >= 1 &&
> +           !PageSlab(compound_head(sg_page(sg))))
>                 return;
> 
>         if (recv) {
> --
> 2.18.2
> 


This is missing a proper subject ...

-- 
You received this message because you are subscribed to the Google Groups "open-iscsi" group.
To unsubscribe from this group and stop receiving emails from it, send an email to open-iscsi+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/open-iscsi/5462bc04-8409-a0c3-628f-640d1c92b8c6%40suse.com.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2020-03-27 15:53     ` Lee Duncan
  0 siblings, 0 replies; 1546+ messages in thread
From: Lee Duncan @ 2020-03-27 15:53 UTC (permalink / raw)
  To: chenanqing, linux-kernel, linux-scsi, open-iscsi, ceph-devel,
	martin.petersen, jejb, cleech

On 3/27/20 2:20 AM, chenanqing@oppo.com wrote:
> From: Chen Anqing <chenanqing@oppo.com>
> To: Lee Duncan <lduncan@suse.com>
> Cc: Chris Leech <cleech@redhat.com>,
>         "James E . J . Bottomley" <jejb@linux.ibm.com>,
>         "Martin K . Petersen" <martin.petersen@oracle.com>,
>         ceph-devel@vger.kernel.org,
>         open-iscsi@googlegroups.com,
>         linux-scsi@vger.kernel.org,
>         linux-kernel@vger.kernel.org,
>         chenanqing@oppo.com
> Subject: [PATCH] scsi: libiscsi: we should take compound page into account also
> Date: Fri, 27 Mar 2020 05:20:01 -0400
> Message-Id: <20200327092001.56879-1-chenanqing@oppo.com>
> X-Mailer: git-send-email 2.18.2
> 
> The patch comes from a real crash in which the slab memory
> came from a compound page, so we need to take compound pages
> into account as well.
> Fixes commit 08b11eaccfcf ("scsi: libiscsi: fall back to
> sendmsg for slab pages").
> 
> Signed-off-by: Chen Anqing <chenanqing@oppo.com>
> ---
>  drivers/scsi/libiscsi_tcp.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/libiscsi_tcp.c b/drivers/scsi/libiscsi_tcp.c
> index 6ef93c7af954..98304e5e1f6f 100644
> --- a/drivers/scsi/libiscsi_tcp.c
> +++ b/drivers/scsi/libiscsi_tcp.c
> @@ -128,7 +128,8 @@ static void iscsi_tcp_segment_map(struct iscsi_segment *segment, int recv)
>          * coalescing neighboring slab objects into a single frag which
>          * triggers one of hardened usercopy checks.
>          */
> -       if (!recv && page_count(sg_page(sg)) >= 1 && !PageSlab(sg_page(sg)))
> +       if (!recv && page_count(sg_page(sg)) >= 1 &&
> +           !PageSlab(compound_head(sg_page(sg))))
>                 return;
> 
>         if (recv) {
> --
> 2.18.2
> 


This is missing a proper subject ...


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] ` <CAPXXXSBYcU1QamovmP-gVTXms67Xi_QpMCV=V3570q1nnuWqNw@mail.gmail.com>
@ 2020-04-04 21:05   ` Ruslan Bilovol
  2020-04-05  1:27       ` Re: Alan Stern
  0 siblings, 1 reply; 1546+ messages in thread
From: Ruslan Bilovol @ 2020-04-04 21:05 UTC (permalink / raw)
  To: Colin Williams, Linux USB, alsa-devel

Hi,

Please also CC the related mailing lists (alsa-devel, linux-usb) rather
than emailing directly - the community may also be able to help with the
issue. The thread will also be indexed by search engines, so anyone else
hitting the same issue can find answers faster.

On Fri, Apr 3, 2020 at 10:56 AM Colin Williams
<colin.williams.seattle@gmail.com> wrote:
>
> https://ubuntuforums.org/showthread.php?t=2439897
>
> On Thu, Apr 2, 2020 at 4:50 PM Colin Williams <colin.williams.seattle@gmail.com> wrote:
>>
>> Hello,
>>
>> Is it possible that one of these commits or related broke support for the Blue Mic Yeti?
>>
>> https://github.com/torvalds/linux/blame/ac438771ccb4479528594c7e19f2c39cf1814a86/sound/usb/stream.c#L816

That's a workaround to ignore the last altsetting when it is the same
as the previous one. During the UAC3 implementation I reimplemented that
workaround carefully, but I didn't have (and still do not own) any Blue
Mic USB device, and I don't know whether anyone has tested it since.

>>
>> I am getting the following when I plug my mic in:

Which kernel version is that? Have you tried latest Linux Kernel?

>>
>> [ 1283.848740] usb 1-1.2: new full-speed USB device number 82 using ehci-pci
>> [ 1283.964802] usb 1-1.2: New USB device found, idVendor=b58e, idProduct=9e84, bcdDevice= 1.00
>> [ 1283.964808] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
>> [ 1283.964810] usb 1-1.2: Product: Yeti Stereo Microphone
>> [ 1283.964812] usb 1-1.2: Manufacturer: Blue Microphones
>> [ 1284.080671] usb 1-1.3: new low-speed USB device number 83 using ehci-pci
>> [ 1284.784678] usb 1-1.3: device descriptor read/64, error -32
>> [ 1285.180674] usb 1-1.3: device descriptor read/64, error -32
>> [ 1285.992682] usb 1-1.3: new low-speed USB device number 84 using ehci-pci
>> [ 1286.696672] usb 1-1.3: device descriptor read/64, error -32
>> [ 1287.092695] usb 1-1.3: device descriptor read/64, error -32
>> [ 1287.200804] usb 1-1-port3: attempt power cycle
>> [ 1287.804662] usb 1-1.3: new low-speed USB device number 85 using ehci-pci
>> [ 1288.220686] usb 1-1.3: device not accepting address 85, error -32
>> [ 1288.508685] usb 1-1.3: new low-speed USB device number 86 using ehci-pci
>> [ 1288.924690] usb 1-1.3: device not accepting address 86, error -32
>> [ 1288.924916] usb 1-1-port3: unable to enumerate USB device
>> [ 1288.925391] usb 1-1.2: USB disconnect, device number 82
>> [ 1289.308736] usb 1-1.3: new low-speed USB device number 87 using ehci-pci
>> [ 1289.596727] usb 1-1.3: device descriptor read/64, error -32
>> [ 1289.992635] usb 1-1.3: device descriptor read/64, error -32
>> [ 1290.596683] usb 1-1.3: new low-speed USB device number 88 using ehci-pci
>> [ 1290.888718] usb 1-1.3: device descriptor read/64, error -32
>> [ 1291.284673] usb 1-1.3: device descriptor read/64, error -32
>> [ 1291.392928] usb 1-1-port3: attempt power cycle

Looking at this log, the issue seems to happen during enumeration,
so the workaround mentioned above hasn't even run yet at that point.
This looks like an issue in the USB core, not in the ALSA driver.

Thanks,
Ruslan

>>
>> Furthermore, there is some evidence this is happening to other users:
>>
>> https://askubuntu.com/questions/1027188/external-yeti-michrophone-not-detected-on-ubuntu-18-04
>>
>> Best Regards,
>>
>> Colin Williams
>>
>>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-04-04 21:05   ` Re: Ruslan Bilovol
@ 2020-04-05  1:27       ` Alan Stern
  0 siblings, 0 replies; 1546+ messages in thread
From: Alan Stern @ 2020-04-05  1:27 UTC (permalink / raw)
  To: Ruslan Bilovol; +Cc: alsa-devel, Linux USB, Colin Williams

On Sun, 5 Apr 2020, Ruslan Bilovol wrote:

> Hi,
> 
> Please also add to CC related mailing lists (alsa-devel, linux-usb) rather
> then directly emailing - community may also help with the issue. Also it can be
> googled so if somebody else have same issue it can find answers faster.
> 
> On Fri, Apr 3, 2020 at 10:56 AM Colin Williams
> <colin.williams.seattle@gmail.com> wrote:
> >
> > https://ubuntuforums.org/showthread.php?t=2439897
> >
> > On Thu, Apr 2, 2020 at 4:50 PM Colin Williams <colin.williams.seattle@gmail.com> wrote:
> >>
> >> Hello,
> >>
> >> Is it possible that one of these commits or related broke support for the Blue Mic Yeti?
> >>
> >> https://github.com/torvalds/linux/blame/ac438771ccb4479528594c7e19f2c39cf1814a86/sound/usb/stream.c#L816
> 
> Tha'ts workaround to ignore last altsetting which is the same as previous.
> During UAC3 implementation, I reimplemented that workaround carefully,
> but I didn't have (and still do not own) any Blue Mic USB device.
> I don't know whether it was tested after that by anyone.
> 
> >>
> >> I am getting the following when I plug my mic in:
> 
> Which kernel version is that? Have you tried latest Linux Kernel?
> 
> >>
> >> [ 1283.848740] usb 1-1.2: new full-speed USB device number 82 using ehci-pci
> >> [ 1283.964802] usb 1-1.2: New USB device found, idVendor=b58e, idProduct=9e84, bcdDevice= 1.00
> >> [ 1283.964808] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
> >> [ 1283.964810] usb 1-1.2: Product: Yeti Stereo Microphone
> >> [ 1283.964812] usb 1-1.2: Manufacturer: Blue Microphones
> >> [ 1284.080671] usb 1-1.3: new low-speed USB device number 83 using ehci-pci
> >> [ 1284.784678] usb 1-1.3: device descriptor read/64, error -32
> >> [ 1285.180674] usb 1-1.3: device descriptor read/64, error -32
> >> [ 1285.992682] usb 1-1.3: new low-speed USB device number 84 using ehci-pci
> >> [ 1286.696672] usb 1-1.3: device descriptor read/64, error -32
> >> [ 1287.092695] usb 1-1.3: device descriptor read/64, error -32
> >> [ 1287.200804] usb 1-1-port3: attempt power cycle
> >> [ 1287.804662] usb 1-1.3: new low-speed USB device number 85 using ehci-pci
> >> [ 1288.220686] usb 1-1.3: device not accepting address 85, error -32
> >> [ 1288.508685] usb 1-1.3: new low-speed USB device number 86 using ehci-pci
> >> [ 1288.924690] usb 1-1.3: device not accepting address 86, error -32
> >> [ 1288.924916] usb 1-1-port3: unable to enumerate USB device
> >> [ 1288.925391] usb 1-1.2: USB disconnect, device number 82
> >> [ 1289.308736] usb 1-1.3: new low-speed USB device number 87 using ehci-pci
> >> [ 1289.596727] usb 1-1.3: device descriptor read/64, error -32
> >> [ 1289.992635] usb 1-1.3: device descriptor read/64, error -32
> >> [ 1290.596683] usb 1-1.3: new low-speed USB device number 88 using ehci-pci
> >> [ 1290.888718] usb 1-1.3: device descriptor read/64, error -32
> >> [ 1291.284673] usb 1-1.3: device descriptor read/64, error -32
> >> [ 1291.392928] usb 1-1-port3: attempt power cycle
> 
> Looking at this log, it seems the issue happens during enumeration,
> so mentioned workaround isn't executed yet at this moment.
> So it seems this is related to USB core, not to ALSA driver.

All those errors were for the 1-1.3 device.  The microphone was 1-1.2.
It's not clear from the log above what the relationship between those 
two devices is, but it sure looks like the microphone was enumerated 
okay.

What shows up in /sys/kernel/debug/usb/devices?

Alan Stern


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2020-04-05  1:27       ` Alan Stern
  0 siblings, 0 replies; 1546+ messages in thread
From: Alan Stern @ 2020-04-05  1:27 UTC (permalink / raw)
  To: Ruslan Bilovol; +Cc: Colin Williams, Linux USB, alsa-devel

On Sun, 5 Apr 2020, Ruslan Bilovol wrote:

> Hi,
> 
> Please also add to CC related mailing lists (alsa-devel, linux-usb) rather
> then directly emailing - community may also help with the issue. Also it can be
> googled so if somebody else have same issue it can find answers faster.
> 
> On Fri, Apr 3, 2020 at 10:56 AM Colin Williams
> <colin.williams.seattle@gmail.com> wrote:
> >
> > https://ubuntuforums.org/showthread.php?t=2439897
> >
> > On Thu, Apr 2, 2020 at 4:50 PM Colin Williams <colin.williams.seattle@gmail.com> wrote:
> >>
> >> Hello,
> >>
> >> Is it possible that one of these commits or related broke support for the Blue Mic Yeti?
> >>
> >> https://github.com/torvalds/linux/blame/ac438771ccb4479528594c7e19f2c39cf1814a86/sound/usb/stream.c#L816
> 
> Tha'ts workaround to ignore last altsetting which is the same as previous.
> During UAC3 implementation, I reimplemented that workaround carefully,
> but I didn't have (and still do not own) any Blue Mic USB device.
> I don't know whether it was tested after that by anyone.
> 
> >>
> >> I am getting the following when I plug my mic in:
> 
> Which kernel version is that? Have you tried latest Linux Kernel?
> 
> >>
> >> [ 1283.848740] usb 1-1.2: new full-speed USB device number 82 using ehci-pci
> >> [ 1283.964802] usb 1-1.2: New USB device found, idVendor=b58e, idProduct=9e84, bcdDevice= 1.00
> >> [ 1283.964808] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
> >> [ 1283.964810] usb 1-1.2: Product: Yeti Stereo Microphone
> >> [ 1283.964812] usb 1-1.2: Manufacturer: Blue Microphones
> >> [ 1284.080671] usb 1-1.3: new low-speed USB device number 83 using ehci-pci
> >> [ 1284.784678] usb 1-1.3: device descriptor read/64, error -32
> >> [ 1285.180674] usb 1-1.3: device descriptor read/64, error -32
> >> [ 1285.992682] usb 1-1.3: new low-speed USB device number 84 using ehci-pci
> >> [ 1286.696672] usb 1-1.3: device descriptor read/64, error -32
> >> [ 1287.092695] usb 1-1.3: device descriptor read/64, error -32
> >> [ 1287.200804] usb 1-1-port3: attempt power cycle
> >> [ 1287.804662] usb 1-1.3: new low-speed USB device number 85 using ehci-pci
> >> [ 1288.220686] usb 1-1.3: device not accepting address 85, error -32
> >> [ 1288.508685] usb 1-1.3: new low-speed USB device number 86 using ehci-pci
> >> [ 1288.924690] usb 1-1.3: device not accepting address 86, error -32
> >> [ 1288.924916] usb 1-1-port3: unable to enumerate USB device
> >> [ 1288.925391] usb 1-1.2: USB disconnect, device number 82
> >> [ 1289.308736] usb 1-1.3: new low-speed USB device number 87 using ehci-pci
> >> [ 1289.596727] usb 1-1.3: device descriptor read/64, error -32
> >> [ 1289.992635] usb 1-1.3: device descriptor read/64, error -32
> >> [ 1290.596683] usb 1-1.3: new low-speed USB device number 88 using ehci-pci
> >> [ 1290.888718] usb 1-1.3: device descriptor read/64, error -32
> >> [ 1291.284673] usb 1-1.3: device descriptor read/64, error -32
> >> [ 1291.392928] usb 1-1-port3: attempt power cycle
> 
> Looking at this log, it seems the issue happens during enumeration,
> so mentioned workaround isn't executed yet at this moment.
> So it seems this is related to USB core, not to ALSA driver.

All those errors were for the 1-1.3 device.  The microphone was 1-1.2.
It's not clear from the log above what the relationship between those 
two devices is, but it sure looks like the microphone was enumerated 
okay.

What shows up in /sys/kernel/debug/usb/devices?

Alan Stern


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found]         ` <CAPXXXSAajets4AqcBKt8aRd8V1AL4bjAmCyuBOKr8qBG-AHO1A@mail.gmail.com>
@ 2020-04-05  2:51           ` Colin Williams
  0 siblings, 0 replies; 1546+ messages in thread
From: Colin Williams @ 2020-04-05  2:51 UTC (permalink / raw)
  To: Alan Stern; +Cc: Ruslan Bilovol, Linux USB, alsa-devel

Hello all


This is embarrassing but I think my issues were due to a bad USB cable.


Thank You


On Sat, Apr 4, 2020 at 7:50 PM Colin Williams
<colin.williams.seattle@gmail.com> wrote:
>
> Hello all
>
>
> This is embarrassing but I think my issues were due to a bad USB cable.
>
>
> Thank You
>
>>
>> On Sat, Apr 4, 2020 at 6:27 PM Alan Stern <stern@rowland.harvard.edu> wrote:
>>>
>>> On Sun, 5 Apr 2020, Ruslan Bilovol wrote:
>>>
>>> > Hi,
>>> >
>>> > Please also add to CC related mailing lists (alsa-devel, linux-usb) rather
>>> > then directly emailing - community may also help with the issue. Also it can be
>>> > googled so if somebody else have same issue it can find answers faster.
>>> >
>>> > On Fri, Apr 3, 2020 at 10:56 AM Colin Williams
>>> > <colin.williams.seattle@gmail.com> wrote:
>>> > >
>>> > > https://ubuntuforums.org/showthread.php?t=2439897
>>> > >
>>> > > On Thu, Apr 2, 2020 at 4:50 PM Colin Williams <colin.williams.seattle@gmail.com> wrote:
>>> > >>
>>> > >> Hello,
>>> > >>
>>> > >> Is it possible that one of these commits or related broke support for the Blue Mic Yeti?
>>> > >>
>>> > >> https://github.com/torvalds/linux/blame/ac438771ccb4479528594c7e19f2c39cf1814a86/sound/usb/stream.c#L816
>>> >
>>> > Tha'ts workaround to ignore last altsetting which is the same as previous.
>>> > During UAC3 implementation, I reimplemented that workaround carefully,
>>> > but I didn't have (and still do not own) any Blue Mic USB device.
>>> > I don't know whether it was tested after that by anyone.
>>> >
>>> > >>
>>> > >> I am getting the following when I plug my mic in:
>>> >
>>> > Which kernel version is that? Have you tried latest Linux Kernel?
>>> >
>>> > >>
>>> > >> [ 1283.848740] usb 1-1.2: new full-speed USB device number 82 using ehci-pci
>>> > >> [ 1283.964802] usb 1-1.2: New USB device found, idVendor=b58e, idProduct=9e84, bcdDevice= 1.00
>>> > >> [ 1283.964808] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
>>> > >> [ 1283.964810] usb 1-1.2: Product: Yeti Stereo Microphone
>>> > >> [ 1283.964812] usb 1-1.2: Manufacturer: Blue Microphones
>>> > >> [ 1284.080671] usb 1-1.3: new low-speed USB device number 83 using ehci-pci
>>> > >> [ 1284.784678] usb 1-1.3: device descriptor read/64, error -32
>>> > >> [ 1285.180674] usb 1-1.3: device descriptor read/64, error -32
>>> > >> [ 1285.992682] usb 1-1.3: new low-speed USB device number 84 using ehci-pci
>>> > >> [ 1286.696672] usb 1-1.3: device descriptor read/64, error -32
>>> > >> [ 1287.092695] usb 1-1.3: device descriptor read/64, error -32
>>> > >> [ 1287.200804] usb 1-1-port3: attempt power cycle
>>> > >> [ 1287.804662] usb 1-1.3: new low-speed USB device number 85 using ehci-pci
>>> > >> [ 1288.220686] usb 1-1.3: device not accepting address 85, error -32
>>> > >> [ 1288.508685] usb 1-1.3: new low-speed USB device number 86 using ehci-pci
>>> > >> [ 1288.924690] usb 1-1.3: device not accepting address 86, error -32
>>> > >> [ 1288.924916] usb 1-1-port3: unable to enumerate USB device
>>> > >> [ 1288.925391] usb 1-1.2: USB disconnect, device number 82
>>> > >> [ 1289.308736] usb 1-1.3: new low-speed USB device number 87 using ehci-pci
>>> > >> [ 1289.596727] usb 1-1.3: device descriptor read/64, error -32
>>> > >> [ 1289.992635] usb 1-1.3: device descriptor read/64, error -32
>>> > >> [ 1290.596683] usb 1-1.3: new low-speed USB device number 88 using ehci-pci
>>> > >> [ 1290.888718] usb 1-1.3: device descriptor read/64, error -32
>>> > >> [ 1291.284673] usb 1-1.3: device descriptor read/64, error -32
>>> > >> [ 1291.392928] usb 1-1-port3: attempt power cycle
>>> >
>>> > Looking at this log, it seems the issue happens during enumeration,
>>> > so mentioned workaround isn't executed yet at this moment.
>>> > So it seems this is related to USB core, not to ALSA driver.
>>>
>>> All those errors were for the 1-1.3 device.  The microphone was 1-1.2.
>>> It's not clear from the log above what the relationship between those
>>> two devices is, but it sure looks like the microphone was enumerated
>>> okay.
>>>
>>> What shows up in /sys/kernel/debug/usb/devices?
>>>
>>> Alan Stern
>>>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2020-04-18 12:26 Levi Brown
  0 siblings, 0 replies; 1546+ messages in thread
From: Levi Brown @ 2020-04-18 12:26 UTC (permalink / raw)
  To: linux-sh

--
I would like to talk with you. Did you receive my previous email?

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-05-06  5:52 Jiaxun Yang
@ 2020-05-06 17:17 ` Nick Desaulniers
  0 siblings, 0 replies; 1546+ messages in thread
From: Nick Desaulniers @ 2020-05-06 17:17 UTC (permalink / raw)
  To: Jiaxun Yang
  Cc: linux-mips, clang-built-linux, Maciej W . Rozycki, Fangrui Song,
	Kees Cook, Nathan Chancellor, Thomas Bogendoerfer, Paul Burton,
	Masahiro Yamada, Jouni Hogander, Kevin Darbyshire-Bryant,
	Borislav Petkov, Heiko Carstens, LKML

On Tue, May 5, 2020 at 10:52 PM Jiaxun Yang <jiaxun.yang@flygoat.com> wrote:
>
> Subject: [PATCH v6] MIPS: Truncate link address into 32bit for 32bit kernel
> In-Reply-To: <20200413062651.3992652-1-jiaxun.yang@flygoat.com>
>
> LLD failed to link vmlinux with 64bit load address for 32bit ELF
> while bfd will strip 64bit address into 32bit silently.
> To fix LLD build, we should truncate load address provided by platform
> into 32bit for 32bit kernel.
>
> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
> Link: https://github.com/ClangBuiltLinux/linux/issues/786
> Link: https://sourceware.org/bugzilla/show_bug.cgi?id=25784
> Reviewed-by: Fangrui Song <maskray@google.com>
> Reviewed-by: Kees Cook <keescook@chromium.org>
> Tested-by: Nathan Chancellor <natechancellor@gmail.com>
> Cc: Maciej W. Rozycki <macro@linux-mips.org>

Cool, this revision looks a bit simpler. Thanks for chasing this.
Tested-by: Nick Desaulniers <ndesaulniers@google.com>

> ---
> V2: Take MaskRay's shell magic.
>
> V3: After spent an hour on dealing with special character issue in
> Makefile, I gave up to do shell hacks and write a util in C instead.
> Thanks Maciej for pointing out Makefile variable problem.
>
> v4: Finally we managed to find a Makefile method to do it properly
> thanks to Kees. As it's too far from the initial version, I removed
> Review & Test tag from Nick and Fangrui and Cc instead.
>
> v5: Care vmlinuz as well.
>
> v6: Rename to LINKER_LOAD_ADDRESS
> ---
>  arch/mips/Makefile                 | 13 ++++++++++++-
>  arch/mips/boot/compressed/Makefile |  2 +-
>  arch/mips/kernel/vmlinux.lds.S     |  2 +-
>  3 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/arch/mips/Makefile b/arch/mips/Makefile
> index e1c44aed8156..68c0f22fefc0 100644
> --- a/arch/mips/Makefile
> +++ b/arch/mips/Makefile
> @@ -288,12 +288,23 @@ ifdef CONFIG_64BIT
>    endif
>  endif
>
> +# When linking a 32-bit executable the LLVM linker cannot cope with a
> +# 32-bit load address that has been sign-extended to 64 bits.  Simply
> +# remove the upper 32 bits then, as it is safe to do so with other
> +# linkers.
> +ifdef CONFIG_64BIT
> +       load-ld                 = $(load-y)
> +else
> +       load-ld                 = $(subst 0xffffffff,0x,$(load-y))
> +endif
> +
>  KBUILD_AFLAGS  += $(cflags-y)
>  KBUILD_CFLAGS  += $(cflags-y)
> -KBUILD_CPPFLAGS += -DVMLINUX_LOAD_ADDRESS=$(load-y)
> +KBUILD_CPPFLAGS += -DVMLINUX_LOAD_ADDRESS=$(load-y) -DLINKER_LOAD_ADDRESS=$(load-ld)
>  KBUILD_CPPFLAGS += -DDATAOFFSET=$(if $(dataoffset-y),$(dataoffset-y),0)
>
>  bootvars-y     = VMLINUX_LOAD_ADDRESS=$(load-y) \
> +                 LINKER_LOAD_ADDRESS=$(load-ld) \
>                   VMLINUX_ENTRY_ADDRESS=$(entry-y) \
>                   PLATFORM="$(platform-y)" \
>                   ITS_INPUTS="$(its-y)"
> diff --git a/arch/mips/boot/compressed/Makefile b/arch/mips/boot/compressed/Makefile
> index 0df0ee8a298d..3d391256ab7e 100644
> --- a/arch/mips/boot/compressed/Makefile
> +++ b/arch/mips/boot/compressed/Makefile
> @@ -90,7 +90,7 @@ ifneq ($(zload-y),)
>  VMLINUZ_LOAD_ADDRESS := $(zload-y)
>  else
>  VMLINUZ_LOAD_ADDRESS = $(shell $(obj)/calc_vmlinuz_load_addr \
> -               $(obj)/vmlinux.bin $(VMLINUX_LOAD_ADDRESS))
> +               $(obj)/vmlinux.bin $(LINKER_LOAD_ADDRESS))
>  endif
>  UIMAGE_LOADADDR = $(VMLINUZ_LOAD_ADDRESS)
>
> diff --git a/arch/mips/kernel/vmlinux.lds.S b/arch/mips/kernel/vmlinux.lds.S
> index a5f00ec73ea6..5226cd8e4bee 100644
> --- a/arch/mips/kernel/vmlinux.lds.S
> +++ b/arch/mips/kernel/vmlinux.lds.S
> @@ -55,7 +55,7 @@ SECTIONS
>         /* . = 0xa800000000300000; */
>         . = 0xffffffff80300000;
>  #endif
> -       . = VMLINUX_LOAD_ADDRESS;
> +       . = LINKER_LOAD_ADDRESS;
>         /* read-only */
>         _text = .;      /* Text and read-only data */
>         .text : {
>
> --

-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-05-14  8:17 Maksim Iushchenko
@ 2020-05-14 10:29 ` fboehm
  0 siblings, 0 replies; 1546+ messages in thread
From: fboehm @ 2020-05-14 10:29 UTC (permalink / raw)
  To: b.a.t.m.a.n

Maksim, for clarification:

The ATWILC3000 is a Wi-Fi module for low-end embedded systems. The module 
consists of a Wi-Fi chip plus a small processor. The processor handles 
things like authentication/registration with the Wi-Fi network, WPA 
encryption and the like. A typical use case would be adding a Wi-Fi 
interface to some sort of IoT device or computer peripheral (like a 
Wi-Fi-enabled printer or a smart speaker).

Looking at the driver code, it might not be impossible, but it's very 
unlikely that you will be happy using it in combination with batman-adv. 
You would first need to connect the module to a much more powerful 
processor that runs Linux and batman-adv. But assuming your application 
needs such a powerful processor anyway, chances are good that you can 
use a real Wi-Fi adapter (with a USB or PCIe interface) instead of such 
a Wi-Fi module.

Regards,
Franz


Am 14.05.20 um 10:17 schrieb Maksim Iushchenko:
> Hello,
> I am creating a Wi-Fi ad-hoc network based on batman-adv. I read that
> batman-adv is able to work with any types of interfaces, but I still
> have a question related to ad-hoc networking. Will Wi-Fi ad-hoc
> network (based on batman-adv) work if Wi-Fi chip does not support
> 802.11s standard?
> Unfortunately, there is no mention of ad-hoc mode support in
> documentation of many Wi-Fi chips.
>
> How to check if a Wi-Fi chip is suited to be used to create a Wi-Fi
> ad-hoc network based on batman-adv?
>
> For example, is ATWILC3000-MR110CA an appropriate chip to build a
> Wi-Fi ad-hoc network based on batman-adv? Or maybe you could suggest
> any another Wi-Fi chips?
>
> Thanks in advance


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2020-05-21  0:22 STOREBRAND
  0 siblings, 0 replies; 1546+ messages in thread
From: STOREBRAND @ 2020-05-21  0:22 UTC (permalink / raw)
  To: linux-m68k

Hello,

     I am Harald Hauge, an Investment Manager from Norway. I wish to solicit your interest in an investment project that is currently ongoing in my company (Storebrand); it is a short-term investment with good returns. Simply reply so I can confirm the validity of your email, and I shall give you comprehensive details about the project.

Best Regards,
Harald Hauge
Business Consultant

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re;
@ 2020-06-24 13:54 test02
  0 siblings, 0 replies; 1546+ messages in thread
From: test02 @ 2020-06-24 13:54 UTC (permalink / raw)
  To: Recipients

Congratulations!!!


As part of my humanitarian individual support during this hard times of fighting the Corona Virus (Convid-19); your email account was selected for a Donation of $1,000,000.00 USD for charity and community medical support in your area. 
Please contact us for more information on charles_jackson001@yahoo.com.com


Send Your Response To: charles_jackson001@yahoo.com


Best Regards,

Charles .W. Jackson Jr

-- 
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-06-30 17:56 (unknown) Vasiliy Kupriakov
@ 2020-07-10 20:36 ` Andy Shevchenko
  0 siblings, 0 replies; 1546+ messages in thread
From: Andy Shevchenko @ 2020-07-10 20:36 UTC (permalink / raw)
  To: Vasiliy Kupriakov
  Cc: Corentin Chary, Darren Hart, Andy Shevchenko,
	open list:ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS,
	open list:ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS,
	open list

On Tue, Jun 30, 2020 at 8:57 PM Vasiliy Kupriakov <rublag-ns@yandex.ru> wrote:
>
> Subject: [PATCH] platform/x86: asus-wmi: allow BAT1 battery name
>
> The battery on my laptop ASUS TUF Gaming FX706II is named BAT1.
> This patch allows battery extension to load.
>

Pushed to my review and testing queue, thanks!

> Signed-off-by: Vasiliy Kupriakov <rublag-ns@yandex.ru>
> ---
>  drivers/platform/x86/asus-wmi.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/platform/x86/asus-wmi.c b/drivers/platform/x86/asus-wmi.c
> index 877aade19497..8f4acdc06b13 100644
> --- a/drivers/platform/x86/asus-wmi.c
> +++ b/drivers/platform/x86/asus-wmi.c
> @@ -441,6 +441,7 @@ static int asus_wmi_battery_add(struct power_supply *battery)
>          * battery is named BATT.
>          */
>         if (strcmp(battery->desc->name, "BAT0") != 0 &&
> +           strcmp(battery->desc->name, "BAT1") != 0 &&
>             strcmp(battery->desc->name, "BATT") != 0)
>                 return -ENODEV;
>
> --
> 2.27.0
>


-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-07-16 21:22 Mauro Rossi
@ 2020-07-20  9:00 ` Christian König
  2020-07-20  9:59   ` Re: Mauro Rossi
  0 siblings, 1 reply; 1546+ messages in thread
From: Christian König @ 2020-07-20  9:00 UTC (permalink / raw)
  To: Mauro Rossi, amd-gfx; +Cc: alexander.deucher, harry.wentland

Hi Mauro,

I'm not deep into the whole DC design, so just some general high level 
comments on the cover letter:

1. Please add a subject line to the cover letter, my spam filter thinks 
that this is suspicious otherwise.

2. Then you should probably note how well (or badly?) that is tested. Since 
you noted it is a proof of concept, it might not even work.

3. How feature complete (HDMI audio?, Freesync?) is it?

Apart from that it looks like a rather impressive piece of work :)

Cheers,
Christian.

Am 16.07.20 um 23:22 schrieb Mauro Rossi:
> The series adds SI support to AMD DC
>
> Changelog:
>
> [RFC]
> Preliminar Proof Of Concept, with DCE8 headers still used in dce60_resources.c
>
> [PATCH v2]
> Rebase on amd-staging-drm-next dated 17-Oct-2018
>
> [PATCH v3]
> Add support for DCE6 specific headers,
> ad hoc DCE6 macros, funtions and fixes,
> rebase on current amd-staging-drm-next
>
>
> Commits [01/27]..[08/27] SI support added in various DC components
>
> [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
> [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
> [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6 support (v9b)
> [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support (v2)
> [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
> [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6 (v2)
> [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6 (v4)
> [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
>
> Commits [09/27]..[24/27] DCE6 specific code adaptions
>
> [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI parts (v2)
> [PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size to 64
> [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific macros,functions
> [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific macros
> [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific macros,functions
> [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific macros,functions
> [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6 specific macros,functions
> [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6 specific macros,functions
> [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific macros,functions
> [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6 specific macros,functions
> [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
> [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init
> [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6 macros,functions
> [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6 specific .cursor_lock
> [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add DCE6 specific functions
> [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
>
>
> Commits [25/27]..[27/27] SI support final enablements
>
> [PATCH v3 25/27] drm/amd/display: create plane rotation property for Bonarie and later
> [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
> [PATCH v3 27/27] drm/amd/display: enable SI support in the Kconfig (v2)
>
>
> Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-07-20  9:00 ` Christian König
@ 2020-07-20  9:59   ` Mauro Rossi
  2020-07-22  2:51     ` Re: Alex Deucher
  0 siblings, 1 reply; 1546+ messages in thread
From: Mauro Rossi @ 2020-07-20  9:59 UTC (permalink / raw)
  To: christian.koenig; +Cc: Deucher, Alexander, Harry Wentland, amd-gfx

Hi Christian,

On Mon, Jul 20, 2020 at 11:00 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Hi Mauro,
>
> I'm not deep into the whole DC design, so just some general high level
> comments on the cover letter:
>
> 1. Please add a subject line to the cover letter, my spam filter thinks
> that this is suspicious otherwise.

My mistake in editing the cover letter with git send-email;
I may have forgotten to keep the Subject line at the top.

>
> 2. Then you should probably note how well (badly?) is that tested. Since
> you noted proof of concept it might not even work.

The Changelog is to be read as follows:

[RFC] was the initial Proof of Concept, and [PATCH v2] was just a
rebase onto amd-staging-drm-next.

This series, [PATCH v3], has all the known changes required for DCE6
specifics, and is based on a long offline thread with Alexander Deucher
and past dri-devel chats with Harry Wentland.

It was tested as far as my hardware allows, with HD7750 and HD7950,
checking the dmesg output for the absence of "missing registers/masks"
kernel WARNINGs, and with kernel builds on Ubuntu 20.04 and android-x86.

The proposal I made to Alex is that AMD testing systems be used for
further regression testing, as part of the review and validation for
eligibility into amd-staging-drm-next.

>
> 3. How feature complete (HDMI audio?, Freesync?) is it?

All the changes in DC impacting DCE8 (the dc/dce80 path) have been
ported to DCE6 (the dc/dce60 path) over the two years since the initial
submission.

>
> Apart from that it looks like a rather impressive piece of work :)
>
> Cheers,
> Christian.

Thanks,
please consider that most of the latest DCE6 specific parts were made
possible by Alex's recent help in getting the correct DCE6 headers, and
by his suggestions and continuous feedback.

I would suggest that Alex comment on the proposed next steps.

Mauro

>
> On 16.07.20 at 23:22, Mauro Rossi wrote:
> > The series adds SI support to AMD DC
> >
> > Changelog:
> >
> > [RFC]
> > Preliminar Proof Of Concept, with DCE8 headers still used in dce60_resources.c
> >
> > [PATCH v2]
> > Rebase on amd-staging-drm-next dated 17-Oct-2018
> >
> > [PATCH v3]
> > Add support for DCE6 specific headers,
> > ad hoc DCE6 macros, funtions and fixes,
> > rebase on current amd-staging-drm-next
> >
> >
> > Commits [01/27]..[08/27] SI support added in various DC components
> >
> > [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
> > [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
> > [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6 support (v9b)
> > [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support (v2)
> > [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
> > [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6 (v2)
> > [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6 (v4)
> > [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
> >
> > Commits [09/27]..[24/27] DCE6 specific code adaptions
> >
> > [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI parts (v2)
> > [PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size to 64
> > [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific macros,functions
> > [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific macros
> > [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific macros,functions
> > [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific macros,functions
> > [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6 specific macros,functions
> > [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6 specific macros,functions
> > [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific macros,functions
> > [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6 specific macros,functions
> > [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
> > [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init
> > [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6 macros,functions
> > [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6 specific .cursor_lock
> > [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add DCE6 specific functions
> > [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
> >
> >
> > Commits [25/27]..[27/27] SI support final enablements
> >
> > [PATCH v3 25/27] drm/amd/display: create plane rotation property for Bonarie and later
> > [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
> > [PATCH v3 27/27] drm/amd/display: enable SI support in the Kconfig (v2)
> >
> >
> > Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
> >
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-07-20  9:59   ` Re: Mauro Rossi
@ 2020-07-22  2:51     ` Alex Deucher
  2020-07-22  7:56       ` Re: Mauro Rossi
  0 siblings, 1 reply; 1546+ messages in thread
From: Alex Deucher @ 2020-07-22  2:51 UTC (permalink / raw)
  To: Mauro Rossi
  Cc: Deucher, Alexander, Harry Wentland, Christian Koenig,
	amd-gfx list

On Mon, Jul 20, 2020 at 6:00 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>
> Hi Christian,
>
> On Mon, Jul 20, 2020 at 11:00 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
> >
> > Hi Mauro,
> >
> > I'm not deep into the whole DC design, so just some general high level
> > comments on the cover letter:
> >
> > 1. Please add a subject line to the cover letter, my spam filter thinks
> > that this is suspicious otherwise.
>
> My mistake in the editing of covert letter with git send-email,
> I may have forgot to keep the Subject at the top
>
> >
> > 2. Then you should probably note how well (badly?) is that tested. Since
> > you noted proof of concept it might not even work.
>
> The Changelog is to be read as:
>
> [RFC] was the initial Proof of concept was the RFC and [PATCH v2] was
> just a rebase onto amd-staging-drm-next
>
> this series [PATCH v3] has all the known changes required for DCE6 specificity
> and based on a long offline thread with Alexander Deutcher and past
> dri-devel chats with Harry Wentland.
>
> It was tested for my possibilities of testing with HD7750 and HD7950,
> with checks in dmesg output for not getting "missing registers/masks"
> kernel WARNING
> and with kernel build on Ubuntu 20.04 and with android-x86
>
> The proposal I made to Alex is that AMD testing systems will be used
> for further regression testing,
> as part of review and validation for eligibility to amd-staging-drm-next
>

We will certainly test it once it lands, but presumably this is
working on the SI cards you have access to?

> >
> > 3. How feature complete (HDMI audio?, Freesync?) is it?
>
> All the changes in DC impacting DCE8 (dc/dce80 path) were ported to
> DCE6 (dc/dce60 path) in the last two years from initial submission
>
> >
> > Apart from that it looks like a rather impressive piece of work :)
> >
> > Cheers,
> > Christian.
>
> Thanks,
> please consider that most of the latest DCE6 specific parts were
> possible due to recent Alex support in getting the correct DCE6
> headers,
> his suggestions and continuous feedback.
>
> I would suggest that Alex comments on the proposed next steps to follow.

The code looks pretty good to me.  I'd like to get some feedback from
the display team to see if they have any concerns, but beyond that I
think we can pull it into the tree and continue improving it there.
Do you have a link to a git tree I can pull directly that contains
these patches?  Is this the right branch?
https://github.com/maurossi/linux/commits/kernel-5.8rc4_si_next

Thanks!

Alex

>
> Mauro
>
> >
> > On 16.07.20 at 23:22, Mauro Rossi wrote:
> > > The series adds SI support to AMD DC
> > >
> > > Changelog:
> > >
> > > [RFC]
> > > Preliminar Proof Of Concept, with DCE8 headers still used in dce60_resources.c
> > >
> > > [PATCH v2]
> > > Rebase on amd-staging-drm-next dated 17-Oct-2018
> > >
> > > [PATCH v3]
> > > Add support for DCE6 specific headers,
> > > ad hoc DCE6 macros, funtions and fixes,
> > > rebase on current amd-staging-drm-next
> > >
> > >
> > > Commits [01/27]..[08/27] SI support added in various DC components
> > >
> > > [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
> > > [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
> > > [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6 support (v9b)
> > > [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support (v2)
> > > [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
> > > [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6 (v2)
> > > [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6 (v4)
> > > [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
> > >
> > > Commits [09/27]..[24/27] DCE6 specific code adaptions
> > >
> > > [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI parts (v2)
> > > [PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size to 64
> > > [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific macros,functions
> > > [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific macros
> > > [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific macros,functions
> > > [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific macros,functions
> > > [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6 specific macros,functions
> > > [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6 specific macros,functions
> > > [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific macros,functions
> > > [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6 specific macros,functions
> > > [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
> > > [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init
> > > [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6 macros,functions
> > > [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6 specific .cursor_lock
> > > [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add DCE6 specific functions
> > > [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
> > >
> > >
> > > Commits [25/27]..[27/27] SI support final enablements
> > >
> > > [PATCH v3 25/27] drm/amd/display: create plane rotation property for Bonarie and later
> > > [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
> > > [PATCH v3 27/27] drm/amd/display: enable SI support in the Kconfig (v2)
> > >
> > >
> > > Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
> > >
> > > _______________________________________________
> > > amd-gfx mailing list
> > > amd-gfx@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-07-22  2:51     ` Re: Alex Deucher
@ 2020-07-22  7:56       ` Mauro Rossi
  2020-07-24 18:31         ` Re: Alex Deucher
  0 siblings, 1 reply; 1546+ messages in thread
From: Mauro Rossi @ 2020-07-22  7:56 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Deucher, Alexander, Harry Wentland, Christian Koenig,
	amd-gfx list


[-- Attachment #1.1: Type: text/plain, Size: 6915 bytes --]

Hello,
re-sending and copying full DL

On Wed, Jul 22, 2020 at 4:51 AM Alex Deucher <alexdeucher@gmail.com> wrote:

> On Mon, Jul 20, 2020 at 6:00 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
> >
> > Hi Christian,
> >
> > On Mon, Jul 20, 2020 at 11:00 AM Christian König
> > <ckoenig.leichtzumerken@gmail.com> wrote:
> > >
> > > Hi Mauro,
> > >
> > > I'm not deep into the whole DC design, so just some general high level
> > > comments on the cover letter:
> > >
> > > 1. Please add a subject line to the cover letter, my spam filter thinks
> > > that this is suspicious otherwise.
> >
> > My mistake in the editing of covert letter with git send-email,
> > I may have forgot to keep the Subject at the top
> >
> > >
> > > 2. Then you should probably note how well (badly?) is that tested.
> Since
> > > you noted proof of concept it might not even work.
> >
> > The Changelog is to be read as:
> >
> > [RFC] was the initial Proof of concept was the RFC and [PATCH v2] was
> > just a rebase onto amd-staging-drm-next
> >
> > this series [PATCH v3] has all the known changes required for DCE6
> specificity
> > and based on a long offline thread with Alexander Deutcher and past
> > dri-devel chats with Harry Wentland.
> >
> > It was tested for my possibilities of testing with HD7750 and HD7950,
> > with checks in dmesg output for not getting "missing registers/masks"
> > kernel WARNING
> > and with kernel build on Ubuntu 20.04 and with android-x86
> >
> > The proposal I made to Alex is that AMD testing systems will be used
> > for further regression testing,
> > as part of review and validation for eligibility to amd-staging-drm-next
> >
>
> We will certainly test it once it lands, but presumably this is
> working on the SI cards you have access to?
>

Yes, most of my testing was done with android-x86, running the Android
CTS (EGL, GLES2, GLES3, VK).

I am also in contact with a person with a FirePro W5130M who is running
a piglit session.

I had bought an HD7850 to test with Pitcairn, but it arrived defective,
so I could not test that ASIC.



> > >
> > > 3. How feature complete (HDMI audio?, Freesync?) is it?
> >
> > All the changes in DC impacting DCE8 (dc/dce80 path) were ported to
> > DCE6 (dc/dce60 path) in the last two years from initial submission
> >
> > >
> > > Apart from that it looks like a rather impressive piece of work :)
> > >
> > > Cheers,
> > > Christian.
> >
> > Thanks,
> > please consider that most of the latest DCE6 specific parts were
> > possible due to recent Alex support in getting the correct DCE6
> > headers,
> > his suggestions and continuous feedback.
> >
> > I would suggest that Alex comments on the proposed next steps to follow.
>
> The code looks pretty good to me.  I'd like to get some feedback from
> the display team to see if they have any concerns, but beyond that I
> think we can pull it into the tree and continue improving it there.
> Do you have a link to a git tree I can pull directly that contains
> these patches?  Is this the right branch?
> https://github.com/maurossi/linux/commits/kernel-5.8rc4_si_next
>
> Thanks!
>
> Alex
>

The following branch was pushed with the series on top of
amd-staging-drm-next:

https://github.com/maurossi/linux/commits/kernel-5.6_si_drm-next


>
> >
> > Mauro
> >
> > >
> > > On 16.07.20 at 23:22, Mauro Rossi wrote:
> > > > The series adds SI support to AMD DC
> > > >
> > > > Changelog:
> > > >
> > > > [RFC]
> > > > Preliminar Proof Of Concept, with DCE8 headers still used in
> dce60_resources.c
> > > >
> > > > [PATCH v2]
> > > > Rebase on amd-staging-drm-next dated 17-Oct-2018
> > > >
> > > > [PATCH v3]
> > > > Add support for DCE6 specific headers,
> > > > ad hoc DCE6 macros, funtions and fixes,
> > > > rebase on current amd-staging-drm-next
> > > >
> > > >
> > > > Commits [01/27]..[08/27] SI support added in various DC components
> > > >
> > > > [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
> > > > [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
> > > > [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6 support
> (v9b)
> > > > [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support (v2)
> > > > [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
> > > > [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6 (v2)
> > > > [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6 (v4)
> > > > [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
> > > >
> > > > Commits [09/27]..[24/27] DCE6 specific code adaptions
> > > >
> > > > [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI
> parts (v2)
> > > > [PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size to 64
> > > > [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific
> macros,functions
> > > > [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific macros
> > > > [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific
> macros,functions
> > > > [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific
> macros,functions
> > > > [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6
> specific macros,functions
> > > > [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6 specific
> macros,functions
> > > > [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific
> macros,functions
> > > > [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6 specific
> macros,functions
> > > > [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
> > > > [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling
> Horizontal Filter Init
> > > > [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6
> macros,functions
> > > > [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6
> specific .cursor_lock
> > > > [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add DCE6
> specific functions
> > > > [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
> > > >
> > > >
> > > > Commits [25/27]..[27/27] SI support final enablements
> > > >
> > > > [PATCH v3 25/27] drm/amd/display: create plane rotation property for
> Bonarie and later
> > > > [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
> > > > [PATCH v3 27/27] drm/amd/display: enable SI support in the Kconfig
> (v2)
> > > >
> > > >
> > > > Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
> > > >
> > > > _______________________________________________
> > > > amd-gfx mailing list
> > > > amd-gfx@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> > >
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>

[-- Attachment #1.2: Type: text/html, Size: 9472 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-07-22  7:56       ` Re: Mauro Rossi
@ 2020-07-24 18:31         ` Alex Deucher
  2020-07-26 15:31           ` Re: Mauro Rossi
  0 siblings, 1 reply; 1546+ messages in thread
From: Alex Deucher @ 2020-07-24 18:31 UTC (permalink / raw)
  To: Mauro Rossi
  Cc: Deucher, Alexander, Harry Wentland, Christian Koenig,
	amd-gfx list

[-- Attachment #1: Type: text/plain, Size: 7470 bytes --]

On Wed, Jul 22, 2020 at 3:57 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>
> Hello,
> re-sending and copying full DL
>
> On Wed, Jul 22, 2020 at 4:51 AM Alex Deucher <alexdeucher@gmail.com> wrote:
>>
>> On Mon, Jul 20, 2020 at 6:00 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>> >
>> > Hi Christian,
>> >
>> > On Mon, Jul 20, 2020 at 11:00 AM Christian König
>> > <ckoenig.leichtzumerken@gmail.com> wrote:
>> > >
>> > > Hi Mauro,
>> > >
>> > > I'm not deep into the whole DC design, so just some general high level
>> > > comments on the cover letter:
>> > >
>> > > 1. Please add a subject line to the cover letter, my spam filter thinks
>> > > that this is suspicious otherwise.
>> >
>> > My mistake in the editing of covert letter with git send-email,
>> > I may have forgot to keep the Subject at the top
>> >
>> > >
>> > > 2. Then you should probably note how well (badly?) is that tested. Since
>> > > you noted proof of concept it might not even work.
>> >
>> > The Changelog is to be read as:
>> >
>> > [RFC] was the initial Proof of concept was the RFC and [PATCH v2] was
>> > just a rebase onto amd-staging-drm-next
>> >
>> > this series [PATCH v3] has all the known changes required for DCE6 specificity
>> > and based on a long offline thread with Alexander Deutcher and past
>> > dri-devel chats with Harry Wentland.
>> >
>> > It was tested for my possibilities of testing with HD7750 and HD7950,
>> > with checks in dmesg output for not getting "missing registers/masks"
>> > kernel WARNING
>> > and with kernel build on Ubuntu 20.04 and with android-x86
>> >
>> > The proposal I made to Alex is that AMD testing systems will be used
>> > for further regression testing,
>> > as part of review and validation for eligibility to amd-staging-drm-next
>> >
>>
>> We will certainly test it once it lands, but presumably this is
>> working on the SI cards you have access to?
>
>
> Yes, most of my testing was done with android-x86  Android CTS (EGL, GLES2, GLES3, VK)
>
> I am also in contact with a person with Firepro W5130M who is running a piglit session
>
> I had bought an HD7850 to test with Pitcairn, but it arrived as defective so I could not test with Pitcair
>
>
>>
>> > >
>> > > 3. How feature complete (HDMI audio?, Freesync?) is it?
>> >
>> > All the changes in DC impacting DCE8 (dc/dce80 path) were ported to
>> > DCE6 (dc/dce60 path) in the last two years from initial submission
>> >
>> > >
>> > > Apart from that it looks like a rather impressive piece of work :)
>> > >
>> > > Cheers,
>> > > Christian.
>> >
>> > Thanks,
>> > please consider that most of the latest DCE6 specific parts were
>> > possible due to recent Alex support in getting the correct DCE6
>> > headers,
>> > his suggestions and continuous feedback.
>> >
>> > I would suggest that Alex comments on the proposed next steps to follow.
>>
>> The code looks pretty good to me.  I'd like to get some feedback from
>> the display team to see if they have any concerns, but beyond that I
>> think we can pull it into the tree and continue improving it there.
>> Do you have a link to a git tree I can pull directly that contains
>> these patches?  Is this the right branch?
>> https://github.com/maurossi/linux/commits/kernel-5.8rc4_si_next
>>
>> Thanks!
>>
>> Alex
>
>
> The following branch was pushed with the series on top of amd-staging-drm-next
>
> https://github.com/maurossi/linux/commits/kernel-5.6_si_drm-next

I gave this a quick test on all of the SI ASICs and the various
monitors I had available, and it looks good.  A few minor fixes I
noticed are attached as patches.  If they look good to you, I'll squash
them into the series when I commit it.  I've pushed it to my fdo tree
as well:
https://cgit.freedesktop.org/~agd5f/linux/log/?h=si_dc_support

Thanks!

Alex

>
>>
>>
>> >
>> > Mauro
>> >
>> > >
>> > > On 16.07.20 at 23:22, Mauro Rossi wrote:
>> > > > The series adds SI support to AMD DC
>> > > >
>> > > > Changelog:
>> > > >
>> > > > [RFC]
>> > > > Preliminar Proof Of Concept, with DCE8 headers still used in dce60_resources.c
>> > > >
>> > > > [PATCH v2]
>> > > > Rebase on amd-staging-drm-next dated 17-Oct-2018
>> > > >
>> > > > [PATCH v3]
>> > > > Add support for DCE6 specific headers,
>> > > > ad hoc DCE6 macros, funtions and fixes,
>> > > > rebase on current amd-staging-drm-next
>> > > >
>> > > >
>> > > > Commits [01/27]..[08/27] SI support added in various DC components
>> > > >
>> > > > [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
>> > > > [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
>> > > > [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6 support (v9b)
>> > > > [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support (v2)
>> > > > [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
>> > > > [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6 (v2)
>> > > > [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6 (v4)
>> > > > [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
>> > > >
>> > > > Commits [09/27]..[24/27] DCE6 specific code adaptions
>> > > >
>> > > > [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI parts (v2)
>> > > > [PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size to 64
>> > > > [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific macros,functions
>> > > > [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific macros
>> > > > [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific macros,functions
>> > > > [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific macros,functions
>> > > > [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6 specific macros,functions
>> > > > [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6 specific macros,functions
>> > > > [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific macros,functions
>> > > > [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6 specific macros,functions
>> > > > [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
>> > > > [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init
>> > > > [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6 macros,functions
>> > > > [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6 specific .cursor_lock
>> > > > [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add DCE6 specific functions
>> > > > [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
>> > > >
>> > > >
>> > > > Commits [25/27]..[27/27] SI support final enablements
>> > > >
>> > > > [PATCH v3 25/27] drm/amd/display: create plane rotation property for Bonarie and later
>> > > > [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
>> > > > [PATCH v3 27/27] drm/amd/display: enable SI support in the Kconfig (v2)
>> > > >
>> > > >
>> > > > Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
>> > > >
>> > > > _______________________________________________
>> > > > amd-gfx mailing list
>> > > > amd-gfx@lists.freedesktop.org
>> > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>> > >
>> > _______________________________________________
>> > amd-gfx mailing list
>> > amd-gfx@lists.freedesktop.org
>> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[-- Attachment #2: 0002-drm-amdgpu-display-addming-return-type-for-dce60_pro.patch --]
[-- Type: text/x-patch, Size: 982 bytes --]

From 782fea4387d22686856c87b8ac0491a43a4d944c Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Thu, 23 Jul 2020 21:05:41 -0400
Subject: [PATCH 2/3] drm/amdgpu/display: add missing return type for
 dce60_program_front_end_for_pipe

Probably a copy/paste typo.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/display/dc/dce60/dce60_hw_sequencer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce60/dce60_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dce60/dce60_hw_sequencer.c
index 66e5a1ba2a58..920c7ae29d53 100644
--- a/drivers/gpu/drm/amd/display/dc/dce60/dce60_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce60/dce60_hw_sequencer.c
@@ -266,7 +266,7 @@ static void dce60_program_scaler(const struct dc *dc,
 		&pipe_ctx->plane_res.scl_data);
 }
 
-
+static void
 dce60_program_front_end_for_pipe(
 		struct dc *dc, struct pipe_ctx *pipe_ctx)
 {
-- 
2.25.4


[-- Attachment #3: 0003-drm-amdgpu-display-Fix-up-PLL-handling-for-DCE6.patch --]
[-- Type: text/x-patch, Size: 1855 bytes --]

From 2b18098918717d9ee4c69a47be3527d1cc812b7f Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Fri, 24 Jul 2020 11:41:31 -0400
Subject: [PATCH 3/3] drm/amdgpu/display: Fix up PLL handling for DCE6

DCE6.0 supports 2 PLLs.  DCE6.1 supports 3 PLLs.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/display/dc/dce60/dce60_resource.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce60/dce60_resource.c b/drivers/gpu/drm/amd/display/dc/dce60/dce60_resource.c
index 261333afc936..5a5a9cb77acb 100644
--- a/drivers/gpu/drm/amd/display/dc/dce60/dce60_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce60/dce60_resource.c
@@ -379,7 +379,7 @@ static const struct resource_caps res_cap_61 = {
 		.num_timing_generator = 4,
 		.num_audio = 6,
 		.num_stream_encoder = 6,
-		.num_pll = 2,
+		.num_pll = 3,
 		.num_ddc = 6,
 };
 
@@ -983,9 +983,7 @@ static bool dce60_construct(
 				dce60_clock_source_create(ctx, bp, CLOCK_SOURCE_ID_PLL0, &clk_src_regs[0], false);
 		pool->base.clock_sources[1] =
 				dce60_clock_source_create(ctx, bp, CLOCK_SOURCE_ID_PLL1, &clk_src_regs[1], false);
-		pool->base.clock_sources[2] =
-				dce60_clock_source_create(ctx, bp, CLOCK_SOURCE_ID_PLL2, &clk_src_regs[2], false);
-		pool->base.clk_src_count = 3;
+		pool->base.clk_src_count = 2;
 
 	} else {
 		pool->base.dp_clock_source =
@@ -993,9 +991,7 @@ static bool dce60_construct(
 
 		pool->base.clock_sources[0] =
 				dce60_clock_source_create(ctx, bp, CLOCK_SOURCE_ID_PLL1, &clk_src_regs[1], false);
-		pool->base.clock_sources[1] =
-				dce60_clock_source_create(ctx, bp, CLOCK_SOURCE_ID_PLL2, &clk_src_regs[2], false);
-		pool->base.clk_src_count = 2;
+		pool->base.clk_src_count = 1;
 	}
 
 	if (pool->base.dp_clock_source == NULL) {
-- 
2.25.4


[-- Attachment #4: 0001-drm-amdgpu-display-remove-unused-variable-in-dce60_c.patch --]
[-- Type: text/x-patch, Size: 1084 bytes --]

From 2ced8e528937051e4d8536718c6dc776e0b46314 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Thu, 23 Jul 2020 21:02:14 -0400
Subject: [PATCH 1/3] drm/amdgpu/display: remove unused variable in
 dce60_configure_crc

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/display/dc/dce60/dce60_timing_generator.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce60/dce60_timing_generator.c b/drivers/gpu/drm/amd/display/dc/dce60/dce60_timing_generator.c
index 4a5b7a0940c6..fc1af0ff0ca4 100644
--- a/drivers/gpu/drm/amd/display/dc/dce60/dce60_timing_generator.c
+++ b/drivers/gpu/drm/amd/display/dc/dce60/dce60_timing_generator.c
@@ -192,8 +192,6 @@ static bool dce60_is_tg_enabled(struct timing_generator *tg)
 bool dce60_configure_crc(struct timing_generator *tg,
 			  const struct crc_params *params)
 {
-	struct dce110_timing_generator *tg110 = DCE110TG_FROM_TG(tg);
-
 	/* Cannot configure crc on a CRTC that is disabled */
 	if (!dce60_is_tg_enabled(tg))
 		return false;
-- 
2.25.4


[-- Attachment #5: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 1546+ messages in thread

* Re:
  2020-07-24 18:31         ` Re: Alex Deucher
@ 2020-07-26 15:31           ` Mauro Rossi
  2020-07-27 18:31             ` Re: Alex Deucher
  0 siblings, 1 reply; 1546+ messages in thread
From: Mauro Rossi @ 2020-07-26 15:31 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Deucher, Alexander, Harry Wentland, Christian Koenig,
	amd-gfx list


[-- Attachment #1.1: Type: text/plain, Size: 9396 bytes --]

Hello,

On Fri, Jul 24, 2020 at 8:31 PM Alex Deucher <alexdeucher@gmail.com> wrote:

> On Wed, Jul 22, 2020 at 3:57 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
> >
> > Hello,
> > re-sending and copying full DL
> >
> > On Wed, Jul 22, 2020 at 4:51 AM Alex Deucher <alexdeucher@gmail.com>
> wrote:
> >>
> >> On Mon, Jul 20, 2020 at 6:00 AM Mauro Rossi <issor.oruam@gmail.com>
> wrote:
> >> >
> >> > Hi Christian,
> >> >
> >> > On Mon, Jul 20, 2020 at 11:00 AM Christian König
> >> > <ckoenig.leichtzumerken@gmail.com> wrote:
> >> > >
> >> > > Hi Mauro,
> >> > >
> >> > > I'm not deep into the whole DC design, so just some general high
> level
> >> > > comments on the cover letter:
> >> > >
> >> > > 1. Please add a subject line to the cover letter, my spam filter
> thinks
> >> > > that this is suspicious otherwise.
> >> >
> >> > My mistake in the editing of the cover letter with git send-email;
> >> > I may have forgotten to keep the Subject at the top.
> >> >
> >> > >
> >> > > 2. Then you should probably note how well (or badly?) that is
> >> > > tested. Since you noted proof of concept it might not even work.
> >> >
> >> > The Changelog is to be read as:
> >> >
> >> > [RFC] was the initial proof of concept, and [PATCH v2] was
> >> > just a rebase onto amd-staging-drm-next.
> >> >
> >> > This series [PATCH v3] has all the known changes required for DCE6
> >> > specificity,
> >> > and is based on a long offline thread with Alexander Deucher and past
> >> > dri-devel chats with Harry Wentland.
> >> >
> >> > It was tested to the extent of my testing possibilities with HD7750
> >> > and HD7950, checking the dmesg output for the absence of "missing
> >> > registers/masks" kernel WARNINGs,
> >> > and with kernel builds on Ubuntu 20.04 and with android-x86.
> >> >
> >> > The proposal I made to Alex is that AMD testing systems will be used
> >> > for further regression testing,
> >> > as part of review and validation for eligibility for
> >> > amd-staging-drm-next
> >> >
> >>
> >> We will certainly test it once it lands, but presumably this is
> >> working on the SI cards you have access to?
> >
> >
> > Yes, most of my testing was done with android-x86  Android CTS (EGL,
> GLES2, GLES3, VK)
> >
> > I am also in contact with a person with Firepro W5130M who is running a
> piglit session
> >
> > I had bought an HD7850 to test with Pitcairn, but it arrived as
> defective so I could not test with Pitcairn
> >
> >
> >>
> >> > >
> >> > > 3. How feature complete (HDMI audio?, Freesync?) is it?
> >> >
> >> > All the changes in DC impacting DCE8 (dc/dce80 path) were ported to
> >> > DCE6 (dc/dce60 path) in the last two years from initial submission
> >> >
> >> > >
> >> > > Apart from that it looks like a rather impressive piece of work :)
> >> > >
> >> > > Cheers,
> >> > > Christian.
> >> >
> >> > Thanks,
> >> > please consider that most of the latest DCE6 specific parts were
> >> > possible due to Alex's recent support in getting the correct DCE6
> >> > headers,
> >> > and his suggestions and continuous feedback.
> >> >
> >> > I would suggest that Alex comments on the proposed next steps to
> follow.
> >>
> >> The code looks pretty good to me.  I'd like to get some feedback from
> >> the display team to see if they have any concerns, but beyond that I
> >> think we can pull it into the tree and continue improving it there.
> >> Do you have a link to a git tree I can pull directly that contains
> >> these patches?  Is this the right branch?
> >> https://github.com/maurossi/linux/commits/kernel-5.8rc4_si_next
> >>
> >> Thanks!
> >>
> >> Alex
> >
> >
> > The following branch was pushed with the series on top of
> amd-staging-drm-next
> >
> > https://github.com/maurossi/linux/commits/kernel-5.6_si_drm-next
>
> I gave this a quick test on all of the SI asics and the various
> monitors I had available and it looks good.  A few minor patches I
> noticed are attached.  If they look good to you, I'll squash them into
> the series when I commit it.  I've pushed it to my fdo tree as well:
> https://cgit.freedesktop.org/~agd5f/linux/log/?h=si_dc_support
>
> Thanks!
>
> Alex
>

The new patches are OK, and with the following information about the piglit
tests,
the series may be good to go.

I have performed piglit tests on a Tahiti HD7950 on kernel 5.8.0-rc6 with
AMD DC support for SI,
and a comparison with vanilla kernel 5.8.0-rc6.

Results are the following

[piglit gpu tests with kernel 5.8.0-rc6-amddcsi]

utente@utente-desktop:~/piglit$ ./piglit run gpu .
[26714/26714] skip: 1731, pass: 24669, warn: 15, fail: 288, crash: 11
Thank you for running Piglit!
Results have been written to /home/utente/piglit

[piglit gpu tests with vanilla 5.8.0-rc6]

utente@utente-desktop:~/piglit$ ./piglit run gpu .
[26714/26714] skip: 1731, pass: 24673, warn: 13, fail: 283, crash: 14
Thank you for running Piglit!
Results have been written to /home/utente/piglit
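As a side note, the per-category delta between the two summary lines above can be computed with a small sketch; the `parse_summary`/`delta` helpers below are hypothetical, written only for this comparison (piglit's own `piglit summary console` command is the proper tool for comparing full result sets):

```python
import re

def parse_summary(line):
    """Turn 'skip: 1731, pass: 24669, ...' into a dict of ints."""
    return {k: int(v) for k, v in re.findall(r"(\w+): (\d+)", line)}

def delta(patched, vanilla):
    """Per-category difference: patched minus vanilla."""
    return {k: patched[k] - vanilla[k] for k in patched}

# The two aggregate lines quoted above.
amddcsi = parse_summary("skip: 1731, pass: 24669, warn: 15, fail: 288, crash: 11")
vanilla = parse_summary("skip: 1731, pass: 24673, warn: 13, fail: 283, crash: 14")

print(delta(amddcsi, vanilla))
# → {'skip': 0, 'pass': -4, 'warn': 2, 'fail': 5, 'crash': -3}
```

i.e. 5 more failures but 3 fewer crashes than vanilla, within run-to-run noise for a 26714-test suite.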

Attached is the comparison of "5.8.0-rc6-amddcsi" vs vanilla "5.8.0-rc6"
and vice versa; I see no significant regression, and in the delta of failed
tests I don't recognize any DC-related test cases,
but you may also have a look.

dmesg for "5.8.0-rc6-amddcsi" is also provided to check the crashes.

Regarding the other user testing the series with a Firepro W5130M:
he found a pre-existing issue with amdgpu si_support=1 which is
independent of my series and matches a problem already reported. [1]

Mauro

[1] https://bbs.archlinux.org/viewtopic.php?id=249097


>
> >
> >>
> >>
> >> >
> >> > Mauro
> >> >
> >> > >
> >> > > Am 16.07.20 um 23:22 schrieb Mauro Rossi:
> >> > > > The series adds SI support to AMD DC
> >> > > >
> >> > > > Changelog:
> >> > > >
> >> > > > [RFC]
> >> > > > Preliminary Proof of Concept, with DCE8 headers still used in
> dce60_resources.c
> >> > > >
> >> > > > [PATCH v2]
> >> > > > Rebase on amd-staging-drm-next dated 17-Oct-2018
> >> > > >
> >> > > > [PATCH v3]
> >> > > > Add support for DCE6 specific headers,
> >> > > > ad hoc DCE6 macros, functions and fixes,
> >> > > > rebase on current amd-staging-drm-next
> >> > > >
> >> > > >
> >> > > > Commits [01/27]..[08/27] SI support added in various DC components
> >> > > >
> >> > > > [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
> >> > > > [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
> >> > > > [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6
> support (v9b)
> >> > > > [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support
> (v2)
> >> > > > [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
> >> > > > [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6
> (v2)
> >> > > > [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6
> (v4)
> >> > > > [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
> >> > > >
> >> > > > Commits [09/27]..[24/27] DCE6 specific code adaptions
> >> > > >
> >> > > > [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI
> parts (v2)
> >> > > > [PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size
> to 64
> >> > > > [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific
> macros,functions
> >> > > > [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific
> macros
> >> > > > [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific
> macros,functions
> >> > > > [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific
> macros,functions
> >> > > > [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6
> specific macros,functions
> >> > > > [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6
> specific macros,functions
> >> > > > [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific
> macros,functions
> >> > > > [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6
> specific macros,functions
> >> > > > [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
> >> > > > [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling
> Horizontal Filter Init
> >> > > > [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6
> macros,functions
> >> > > > [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6
> specific .cursor_lock
> >> > > > [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add
> DCE6 specific functions
> >> > > > [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
> >> > > >
> >> > > >
> >> > > > Commits [25/27]..[27/27] SI support final enablements
> >> > > >
> >> > > > [PATCH v3 25/27] drm/amd/display: create plane rotation property
> for Bonarie and later
> >> > > > [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
> >> > > > [PATCH v3 27/27] drm/amd/display: enable SI support in the
> Kconfig (v2)
> >> > > >
> >> > > >
> >> > > > Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
> >> > > >
> >> > > > _______________________________________________
> >> > > > amd-gfx mailing list
> >> > > > amd-gfx@lists.freedesktop.org
> >> > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >> > >
> >> > _______________________________________________
> >> > amd-gfx mailing list
> >> > amd-gfx@lists.freedesktop.org
> >> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>

[-- Attachment #1.2: Type: text/html, Size: 13512 bytes --]

[-- Attachment #2: dmesg_kernel-5.8.0-rc6_amddcsi.txt --]
[-- Type: text/plain, Size: 87504 bytes --]

[    0.000000] microcode: microcode updated early to revision 0x21, date = 2019-02-13
[    0.000000] Linux version 5.8.0-050800rc6-generic (kernel@kathleen) (gcc (Ubuntu 9.3.0-13ubuntu1) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34.90.20200716) #202007192331 SMP Sun Jul 19 23:33:45 UTC 2020
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.8.0-050800rc6-generic root=UUID=833ac3c7-4d08-47b5-807f-9a8ddeb3a8d2 ro quiet splash radeon.si_support=0 amdgpu.si_support=1 vt.handoff=7
[    0.000000] KERNEL supported cpus:
[    0.000000]   Intel GenuineIntel
[    0.000000]   AMD AuthenticAMD
[    0.000000]   Hygon HygonGenuine
[    0.000000]   Centaur CentaurHauls
[    0.000000]   zhaoxin   Shanghai  
[    0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[    0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d7ff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009d800-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000dd907fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000dd908000-0x00000000de08cfff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000de08d000-0x00000000de116fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000de117000-0x00000000de1b6fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000de1b7000-0x00000000de9a5fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000de9a6000-0x00000000de9a6fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000de9a7000-0x00000000de9e9fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000de9ea000-0x00000000df407fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000df408000-0x00000000df7f0fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000df7f1000-0x00000000df7fffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000f8000000-0x00000000fbffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed00000-0x00000000fed03fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000021effffff] usable
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] SMBIOS 2.7 present.
[    0.000000] DMI: To Be Filled By O.E.M. To Be Filled By O.E.M./H77 Pro4/MVP, BIOS P1.70 08/07/2013
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 3392.425 MHz processor
[    0.000891] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[    0.000892] e820: remove [mem 0x000a0000-0x000fffff] usable
[    0.000897] last_pfn = 0x21f000 max_arch_pfn = 0x400000000
[    0.000901] MTRR default type: uncachable
[    0.000901] MTRR fixed ranges enabled:
[    0.000902]   00000-9FFFF write-back
[    0.000903]   A0000-BFFFF uncachable
[    0.000903]   C0000-CFFFF write-protect
[    0.000904]   D0000-E7FFF uncachable
[    0.000904]   E8000-FFFFF write-protect
[    0.000905] MTRR variable ranges enabled:
[    0.000906]   0 base 000000000 mask E00000000 write-back
[    0.000907]   1 base 200000000 mask FF0000000 write-back
[    0.000907]   2 base 210000000 mask FF8000000 write-back
[    0.000908]   3 base 218000000 mask FFC000000 write-back
[    0.000908]   4 base 21C000000 mask FFE000000 write-back
[    0.000909]   5 base 21E000000 mask FFF000000 write-back
[    0.000910]   6 base 0E0000000 mask FE0000000 uncachable
[    0.000910]   7 disabled
[    0.000910]   8 disabled
[    0.000911]   9 disabled
[    0.001158] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT  
[    0.001265] total RAM covered: 8176M
[    0.001638] Found optimal setting for mtrr clean up
[    0.001639]  gran_size: 64K 	chunk_size: 32M 	num_reg: 6  	lose cover RAM: 0G
[    0.001863] e820: update [mem 0xe0000000-0xffffffff] usable ==> reserved
[    0.001866] last_pfn = 0xdf800 max_arch_pfn = 0x400000000
[    0.008892] found SMP MP-table at [mem 0x000fd8d0-0x000fd8df]
[    0.030371] check: Scanning 1 areas for low memory corruption
[    0.030771] RAMDISK: [mem 0x3172d000-0x34b8dfff]
[    0.030777] ACPI: Early table checksum verification disabled
[    0.030780] ACPI: RSDP 0x00000000000F0490 000024 (v02 ALASKA)
[    0.030782] ACPI: XSDT 0x00000000DE19B080 00007C (v01 ALASKA A M I    01072009 AMI  00010013)
[    0.030787] ACPI: FACP 0x00000000DE1A4DC0 00010C (v05 ALASKA A M I    01072009 AMI  00010013)
[    0.030791] ACPI: DSDT 0x00000000DE19B190 009C2D (v02 ALASKA A M I    00000022 INTL 20051117)
[    0.030794] ACPI: FACS 0x00000000DE1B5080 000040
[    0.030796] ACPI: APIC 0x00000000DE1A4ED0 000072 (v03 ALASKA A M I    01072009 AMI  00010013)
[    0.030798] ACPI: FPDT 0x00000000DE1A4F48 000044 (v01 ALASKA A M I    01072009 AMI  00010013)
[    0.030800] ACPI: MCFG 0x00000000DE1A4F90 00003C (v01 ALASKA A M I    01072009 MSFT 00000097)
[    0.030802] ACPI: SSDT 0x00000000DE1A4FD0 0007E1 (v01 Intel_ AoacTabl 00001000 INTL 20091112)
[    0.030804] ACPI: AAFT 0x00000000DE1A57B8 000112 (v01 ALASKA OEMAAFT  01072009 MSFT 00000097)
[    0.030806] ACPI: HPET 0x00000000DE1A58D0 000038 (v01 ALASKA A M I    01072009 AMI. 00000005)
[    0.030808] ACPI: SSDT 0x00000000DE1A5908 00036D (v01 SataRe SataTabl 00001000 INTL 20091112)
[    0.030811] ACPI: SSDT 0x00000000DE1A5C78 0009AA (v01 PmRef  Cpu0Ist  00003000 INTL 20051117)
[    0.030813] ACPI: SSDT 0x00000000DE1A6628 000A92 (v01 PmRef  CpuPm    00003000 INTL 20051117)
[    0.030815] ACPI: BGRT 0x00000000DE1A70C0 000038 (v00 ALASKA A M I    01072009 AMI  00010013)
[    0.030822] ACPI: Local APIC address 0xfee00000
[    0.030892] No NUMA configuration found
[    0.030893] Faking a node at [mem 0x0000000000000000-0x000000021effffff]
[    0.030901] NODE_DATA(0) allocated [mem 0x21efd1000-0x21effafff]
[    0.031211] Zone ranges:
[    0.031211]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
[    0.031212]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
[    0.031213]   Normal   [mem 0x0000000100000000-0x000000021effffff]
[    0.031214]   Device   empty
[    0.031214] Movable zone start for each node
[    0.031217] Early memory node ranges
[    0.031217]   node   0: [mem 0x0000000000001000-0x000000000009cfff]
[    0.031218]   node   0: [mem 0x0000000000100000-0x00000000dd907fff]
[    0.031219]   node   0: [mem 0x00000000de08d000-0x00000000de116fff]
[    0.031219]   node   0: [mem 0x00000000de9a6000-0x00000000de9a6fff]
[    0.031220]   node   0: [mem 0x00000000de9ea000-0x00000000df407fff]
[    0.031220]   node   0: [mem 0x00000000df7f1000-0x00000000df7fffff]
[    0.031221]   node   0: [mem 0x0000000100000000-0x000000021effffff]
[    0.031310] Zeroed struct page in unavailable ranges: 11428 pages
[    0.031311] Initmem setup node 0 [mem 0x0000000000001000-0x000000021effffff]
[    0.031312] On node 0 totalpages: 2085724
[    0.031313]   DMA zone: 64 pages used for memmap
[    0.031313]   DMA zone: 21 pages reserved
[    0.031314]   DMA zone: 3996 pages, LIFO batch:0
[    0.031341]   DMA32 zone: 14159 pages used for memmap
[    0.031341]   DMA32 zone: 906176 pages, LIFO batch:63
[    0.042718]   Normal zone: 18368 pages used for memmap
[    0.042720]   Normal zone: 1175552 pages, LIFO batch:63
[    0.058107] ACPI: PM-Timer IO Port: 0x408
[    0.058109] ACPI: Local APIC address 0xfee00000
[    0.058116] ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
[    0.058126] IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
[    0.058128] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.058129] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.058130] ACPI: IRQ0 used by override.
[    0.058131] ACPI: IRQ9 used by override.
[    0.058133] Using ACPI (MADT) for SMP configuration information
[    0.058134] ACPI: HPET id: 0x8086a701 base: 0xfed00000
[    0.058139] TSC deadline timer available
[    0.058140] smpboot: Allowing 4 CPUs, 0 hotplug CPUs
[    0.058155] PM: hibernation: Registered nosave memory: [mem 0x00000000-0x00000fff]
[    0.058157] PM: hibernation: Registered nosave memory: [mem 0x0009d000-0x0009dfff]
[    0.058157] PM: hibernation: Registered nosave memory: [mem 0x0009e000-0x0009ffff]
[    0.058157] PM: hibernation: Registered nosave memory: [mem 0x000a0000-0x000dffff]
[    0.058158] PM: hibernation: Registered nosave memory: [mem 0x000e0000-0x000fffff]
[    0.058159] PM: hibernation: Registered nosave memory: [mem 0xdd908000-0xde08cfff]
[    0.058160] PM: hibernation: Registered nosave memory: [mem 0xde117000-0xde1b6fff]
[    0.058161] PM: hibernation: Registered nosave memory: [mem 0xde1b7000-0xde9a5fff]
[    0.058162] PM: hibernation: Registered nosave memory: [mem 0xde9a7000-0xde9e9fff]
[    0.058163] PM: hibernation: Registered nosave memory: [mem 0xdf408000-0xdf7f0fff]
[    0.058164] PM: hibernation: Registered nosave memory: [mem 0xdf800000-0xf7ffffff]
[    0.058164] PM: hibernation: Registered nosave memory: [mem 0xf8000000-0xfbffffff]
[    0.058165] PM: hibernation: Registered nosave memory: [mem 0xfc000000-0xfebfffff]
[    0.058165] PM: hibernation: Registered nosave memory: [mem 0xfec00000-0xfec00fff]
[    0.058166] PM: hibernation: Registered nosave memory: [mem 0xfec01000-0xfecfffff]
[    0.058166] PM: hibernation: Registered nosave memory: [mem 0xfed00000-0xfed03fff]
[    0.058166] PM: hibernation: Registered nosave memory: [mem 0xfed04000-0xfed1bfff]
[    0.058167] PM: hibernation: Registered nosave memory: [mem 0xfed1c000-0xfed1ffff]
[    0.058167] PM: hibernation: Registered nosave memory: [mem 0xfed20000-0xfedfffff]
[    0.058168] PM: hibernation: Registered nosave memory: [mem 0xfee00000-0xfee00fff]
[    0.058168] PM: hibernation: Registered nosave memory: [mem 0xfee01000-0xfeffffff]
[    0.058168] PM: hibernation: Registered nosave memory: [mem 0xff000000-0xffffffff]
[    0.058170] [mem 0xdf800000-0xf7ffffff] available for PCI devices
[    0.058171] Booting paravirtualized kernel on bare hardware
[    0.058173] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[    0.058179] setup_percpu: NR_CPUS:8192 nr_cpumask_bits:4 nr_cpu_ids:4 nr_node_ids:1
[    0.058466] percpu: Embedded 56 pages/cpu s192512 r8192 d28672 u524288
[    0.058471] pcpu-alloc: s192512 r8192 d28672 u524288 alloc=1*2097152
[    0.058471] pcpu-alloc: [0] 0 1 2 3 
[    0.058495] Built 1 zonelists, mobility grouping on.  Total pages: 2053112
[    0.058495] Policy zone: Normal
[    0.058497] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.8.0-050800rc6-generic root=UUID=833ac3c7-4d08-47b5-807f-9a8ddeb3a8d2 ro quiet splash radeon.si_support=0 amdgpu.si_support=1 vt.handoff=7
[    0.059394] Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes, linear)
[    0.059799] Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes, linear)
[    0.059841] mem auto-init: stack:off, heap alloc:on, heap free:off
[    0.098003] Memory: 8041840K/8342896K available (14339K kernel code, 2555K rwdata, 8736K rodata, 2632K init, 4912K bss, 301056K reserved, 0K cma-reserved)
[    0.098010] random: get_random_u64 called from kmem_cache_open+0x2d/0x410 with crng_init=0
[    0.098116] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.098128] Kernel/User page tables isolation: enabled
[    0.098144] ftrace: allocating 46071 entries in 180 pages
[    0.111578] ftrace: allocated 180 pages with 4 groups
[    0.111684] rcu: Hierarchical RCU implementation.
[    0.111685] rcu: 	RCU restricting CPUs from NR_CPUS=8192 to nr_cpu_ids=4.
[    0.111686] 	Trampoline variant of Tasks RCU enabled.
[    0.111686] 	Rude variant of Tasks RCU enabled.
[    0.111687] 	Tracing variant of Tasks RCU enabled.
[    0.111687] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[    0.111688] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
[    0.114400] NR_IRQS: 524544, nr_irqs: 456, preallocated irqs: 16
[    0.114611] random: crng done (trusting CPU's manufacturer)
[    0.114630] Console: colour dummy device 80x25
[    0.114634] printk: console [tty0] enabled
[    0.114648] ACPI: Core revision 20200528
[    0.114745] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 133484882848 ns
[    0.114755] APIC: Switch to symmetric I/O mode setup
[    0.114825] x2apic: IRQ remapping doesn't support X2APIC mode
[    0.115236] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[    0.134755] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x30e65a81c66, max_idle_ns: 440795263477 ns
[    0.134758] Calibrating delay loop (skipped), value calculated using timer frequency.. 6784.85 BogoMIPS (lpj=13569700)
[    0.134760] pid_max: default: 32768 minimum: 301
[    0.134781] LSM: Security Framework initializing
[    0.134788] Yama: becoming mindful.
[    0.134810] AppArmor: AppArmor initialized
[    0.134854] Mount-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[    0.134874] Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[    0.135089] mce: CPU0: Thermal monitoring enabled (TM1)
[    0.135099] process: using mwait in idle threads
[    0.135101] Last level iTLB entries: 4KB 512, 2MB 8, 4MB 8
[    0.135101] Last level dTLB entries: 4KB 512, 2MB 32, 4MB 32, 1GB 0
[    0.135103] Spectre V1 : Mitigation: usercopy/swapgs barriers and __user pointer sanitization
[    0.135104] Spectre V2 : Mitigation: Full generic retpoline
[    0.135105] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[    0.135105] Spectre V2 : Enabling Restricted Speculation for firmware calls
[    0.135106] Spectre V2 : mitigation: Enabling conditional Indirect Branch Prediction Barrier
[    0.135107] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl and seccomp
[    0.135109] SRBDS: Vulnerable: No microcode
[    0.135110] MDS: Mitigation: Clear CPU buffers
[    0.135279] Freeing SMP alternatives memory: 40K
[    0.138821] smpboot: CPU0: Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz (family: 0x6, model: 0x3a, stepping: 0x9)
[    0.138915] Performance Events: PEBS fmt1+, IvyBridge events, 16-deep LBR, full-width counters, Intel PMU driver.
[    0.138921] ... version:                3
[    0.138921] ... bit width:              48
[    0.138922] ... generic registers:      8
[    0.138922] ... value mask:             0000ffffffffffff
[    0.138922] ... max period:             00007fffffffffff
[    0.138923] ... fixed-purpose events:   3
[    0.138923] ... event mask:             00000007000000ff
[    0.138953] rcu: Hierarchical SRCU implementation.
[    0.139601] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
[    0.139647] smp: Bringing up secondary CPUs ...
[    0.139724] x86: Booting SMP configuration:
[    0.139725] .... node  #0, CPUs:      #1 #2 #3
[    0.146807] smp: Brought up 1 node, 4 CPUs
[    0.146807] smpboot: Max logical packages: 1
[    0.146807] smpboot: Total of 4 processors activated (27139.40 BogoMIPS)
[    0.147882] devtmpfs: initialized
[    0.147882] x86/mm: Memory block size: 128MB
[    0.147882] PM: Registering ACPI NVS region [mem 0xde117000-0xde1b6fff] (655360 bytes)
[    0.147882] PM: Registering ACPI NVS region [mem 0xde9a7000-0xde9e9fff] (274432 bytes)
[    0.147882] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.147882] futex hash table entries: 1024 (order: 4, 65536 bytes, linear)
[    0.147882] pinctrl core: initialized pinctrl subsystem
[    0.147882] PM: RTC time: 12:11:09, date: 2020-07-26
[    0.147882] thermal_sys: Registered thermal governor 'fair_share'
[    0.147882] thermal_sys: Registered thermal governor 'bang_bang'
[    0.147882] thermal_sys: Registered thermal governor 'step_wise'
[    0.147882] thermal_sys: Registered thermal governor 'user_space'
[    0.147882] thermal_sys: Registered thermal governor 'power_allocator'
[    0.147882] NET: Registered protocol family 16
[    0.147882] DMA: preallocated 1024 KiB GFP_KERNEL pool for atomic allocations
[    0.147882] DMA: preallocated 1024 KiB GFP_KERNEL|GFP_DMA pool for atomic allocations
[    0.147882] DMA: preallocated 1024 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
[    0.147882] audit: initializing netlink subsys (disabled)
[    0.147882] audit: type=2000 audit(1595765468.032:1): state=initialized audit_enabled=0 res=1
[    0.147882] EISA bus registered
[    0.147882] cpuidle: using governor ladder
[    0.147882] cpuidle: using governor menu
[    0.147882] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
[    0.147882] ACPI: bus type PCI registered
[    0.147882] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[    0.147882] PCI: MMCONFIG for domain 0000 [bus 00-3f] at [mem 0xf8000000-0xfbffffff] (base 0xf8000000)
[    0.147882] PCI: MMCONFIG at [mem 0xf8000000-0xfbffffff] reserved in E820
[    0.147882] PCI: Using configuration type 1 for base access
[    0.147882] core: PMU erratum BJ122, BV98, HSD29 workaround disabled, HT off
[    0.147882] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
[    0.150780] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[    0.150835] ACPI: Added _OSI(Module Device)
[    0.150835] ACPI: Added _OSI(Processor Device)
[    0.150836] ACPI: Added _OSI(3.0 _SCP Extensions)
[    0.150837] ACPI: Added _OSI(Processor Aggregator Device)
[    0.150837] ACPI: Added _OSI(Linux-Dell-Video)
[    0.150838] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
[    0.150839] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
[    0.157639] ACPI: 5 ACPI AML tables successfully acquired and loaded
[    0.159283] ACPI: Dynamic OEM Table Load:
[    0.159288] ACPI: SSDT 0xFFFF9176154C7000 00083B (v01 PmRef  Cpu0Cst  00003001 INTL 20051117)
[    0.160055] ACPI: Dynamic OEM Table Load:
[    0.160059] ACPI: SSDT 0xFFFF9176154BE000 000303 (v01 PmRef  ApIst    00003000 INTL 20051117)
[    0.160633] ACPI: Dynamic OEM Table Load:
[    0.160636] ACPI: SSDT 0xFFFF917615082400 000119 (v01 PmRef  ApCst    00003000 INTL 20051117)
[    0.161917] ACPI: Interpreter enabled
[    0.161936] ACPI: (supports S0 S3 S4 S5)
[    0.161937] ACPI: Using IOAPIC for interrupt routing
[    0.162004] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[    0.162243] ACPI: Enabled 16 GPEs in block 00 to 3F
[    0.167082] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-3e])
[    0.167087] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
[    0.167379] acpi PNP0A08:00: _OSC: platform does not support [PCIeHotplug SHPCHotplug PME LTR]
[    0.167574] acpi PNP0A08:00: _OSC: OS now controls [AER PCIeCapability]
[    0.167574] acpi PNP0A08:00: FADT indicates ASPM is unsupported, using BIOS configuration
[    0.168078] PCI host bridge to bus 0000:00
[    0.168080] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
[    0.168081] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff window]
[    0.168082] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[    0.168083] pci_bus 0000:00: root bus resource [mem 0x000d0000-0x000d3fff window]
[    0.168083] pci_bus 0000:00: root bus resource [mem 0x000d4000-0x000d7fff window]
[    0.168084] pci_bus 0000:00: root bus resource [mem 0x000d8000-0x000dbfff window]
[    0.168085] pci_bus 0000:00: root bus resource [mem 0x000dc000-0x000dffff window]
[    0.168086] pci_bus 0000:00: root bus resource [mem 0x000e0000-0x000e3fff window]
[    0.168087] pci_bus 0000:00: root bus resource [mem 0x000e4000-0x000e7fff window]
[    0.168087] pci_bus 0000:00: root bus resource [mem 0xe0000000-0xfeafffff window]
[    0.168088] pci_bus 0000:00: root bus resource [bus 00-3e]
[    0.168096] pci 0000:00:00.0: [8086:0150] type 00 class 0x060000
[    0.168183] pci 0000:00:01.0: [8086:0151] type 01 class 0x060400
[    0.168215] pci 0000:00:01.0: PME# supported from D0 D3hot D3cold
[    0.168321] pci 0000:00:14.0: [8086:1e31] type 00 class 0x0c0330
[    0.168343] pci 0000:00:14.0: reg 0x10: [mem 0xf7f00000-0xf7f0ffff 64bit]
[    0.168408] pci 0000:00:14.0: PME# supported from D3hot D3cold
[    0.168494] pci 0000:00:16.0: [8086:1e3a] type 00 class 0x078000
[    0.168517] pci 0000:00:16.0: reg 0x10: [mem 0xf7f1a000-0xf7f1a00f 64bit]
[    0.168585] pci 0000:00:16.0: PME# supported from D0 D3hot D3cold
[    0.168668] pci 0000:00:1a.0: [8086:1e2d] type 00 class 0x0c0320
[    0.168688] pci 0000:00:1a.0: reg 0x10: [mem 0xf7f18000-0xf7f183ff]
[    0.168767] pci 0000:00:1a.0: PME# supported from D0 D3hot D3cold
[    0.168853] pci 0000:00:1b.0: [8086:1e20] type 00 class 0x040300
[    0.168872] pci 0000:00:1b.0: reg 0x10: [mem 0xf7f10000-0xf7f13fff 64bit]
[    0.168944] pci 0000:00:1b.0: PME# supported from D0 D3hot D3cold
[    0.169037] pci 0000:00:1c.0: [8086:1e10] type 01 class 0x060400
[    0.169192] pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
[    0.169312] pci 0000:00:1c.4: [8086:244e] type 01 class 0x060401
[    0.169401] pci 0000:00:1c.4: PME# supported from D0 D3hot D3cold
[    0.169496] pci 0000:00:1c.5: [8086:1e1a] type 01 class 0x060400
[    0.169586] pci 0000:00:1c.5: PME# supported from D0 D3hot D3cold
[    0.169683] pci 0000:00:1c.7: [8086:1e1e] type 01 class 0x060400
[    0.169773] pci 0000:00:1c.7: PME# supported from D0 D3hot D3cold
[    0.169870] pci 0000:00:1d.0: [8086:1e26] type 00 class 0x0c0320
[    0.169890] pci 0000:00:1d.0: reg 0x10: [mem 0xf7f17000-0xf7f173ff]
[    0.169969] pci 0000:00:1d.0: PME# supported from D0 D3hot D3cold
[    0.170057] pci 0000:00:1f.0: [8086:1e4a] type 00 class 0x060100
[    0.170234] pci 0000:00:1f.2: [8086:1e02] type 00 class 0x010601
[    0.170250] pci 0000:00:1f.2: reg 0x10: [io  0xf070-0xf077]
[    0.170256] pci 0000:00:1f.2: reg 0x14: [io  0xf060-0xf063]
[    0.170263] pci 0000:00:1f.2: reg 0x18: [io  0xf050-0xf057]
[    0.170269] pci 0000:00:1f.2: reg 0x1c: [io  0xf040-0xf043]
[    0.170275] pci 0000:00:1f.2: reg 0x20: [io  0xf020-0xf03f]
[    0.170281] pci 0000:00:1f.2: reg 0x24: [mem 0xf7f16000-0xf7f167ff]
[    0.170318] pci 0000:00:1f.2: PME# supported from D3hot
[    0.170397] pci 0000:00:1f.3: [8086:1e22] type 00 class 0x0c0500
[    0.170413] pci 0000:00:1f.3: reg 0x10: [mem 0xf7f15000-0xf7f150ff 64bit]
[    0.170431] pci 0000:00:1f.3: reg 0x20: [io  0xf000-0xf01f]
[    0.170547] pci 0000:01:00.0: [1002:679a] type 00 class 0x030000
[    0.170558] pci 0000:01:00.0: reg 0x10: [mem 0xe0000000-0xefffffff 64bit pref]
[    0.170563] pci 0000:01:00.0: reg 0x18: [mem 0xf7e00000-0xf7e3ffff 64bit]
[    0.170567] pci 0000:01:00.0: reg 0x20: [io  0xe000-0xe0ff]
[    0.170573] pci 0000:01:00.0: reg 0x30: [mem 0xf7e40000-0xf7e5ffff pref]
[    0.170576] pci 0000:01:00.0: enabling Extended Tags
[    0.170602] pci 0000:01:00.0: supports D1 D2
[    0.170603] pci 0000:01:00.0: PME# supported from D1 D2 D3hot
[    0.170651] pci 0000:01:00.1: [1002:aaa0] type 00 class 0x040300
[    0.170661] pci 0000:01:00.1: reg 0x10: [mem 0xf7e60000-0xf7e63fff 64bit]
[    0.170677] pci 0000:01:00.1: enabling Extended Tags
[    0.170699] pci 0000:01:00.1: supports D1 D2
[    0.170743] pci 0000:00:01.0: PCI bridge to [bus 01]
[    0.170744] pci 0000:00:01.0:   bridge window [io  0xe000-0xefff]
[    0.170746] pci 0000:00:01.0:   bridge window [mem 0xf7e00000-0xf7efffff]
[    0.170748] pci 0000:00:01.0:   bridge window [mem 0xe0000000-0xefffffff 64bit pref]
[    0.174800] pci 0000:00:1c.0: PCI bridge to [bus 02]
[    0.174868] pci 0000:03:00.0: [1b21:1080] type 01 class 0x060401
[    0.175050] pci 0000:00:1c.4: PCI bridge to [bus 03-04] (subtractive decode)
[    0.175059] pci 0000:00:1c.4:   bridge window [io  0x0000-0x0cf7 window] (subtractive decode)
[    0.175060] pci 0000:00:1c.4:   bridge window [io  0x0d00-0xffff window] (subtractive decode)
[    0.175061] pci 0000:00:1c.4:   bridge window [mem 0x000a0000-0x000bffff window] (subtractive decode)
[    0.175062] pci 0000:00:1c.4:   bridge window [mem 0x000d0000-0x000d3fff window] (subtractive decode)
[    0.175063] pci 0000:00:1c.4:   bridge window [mem 0x000d4000-0x000d7fff window] (subtractive decode)
[    0.175063] pci 0000:00:1c.4:   bridge window [mem 0x000d8000-0x000dbfff window] (subtractive decode)
[    0.175065] pci 0000:00:1c.4:   bridge window [mem 0x000dc000-0x000dffff window] (subtractive decode)
[    0.175066] pci 0000:00:1c.4:   bridge window [mem 0x000e0000-0x000e3fff window] (subtractive decode)
[    0.175067] pci 0000:00:1c.4:   bridge window [mem 0x000e4000-0x000e7fff window] (subtractive decode)
[    0.175068] pci 0000:00:1c.4:   bridge window [mem 0xe0000000-0xfeafffff window] (subtractive decode)
[    0.175102] pci_bus 0000:04: extended config space not accessible
[    0.175181] pci 0000:03:00.0: PCI bridge to [bus 04] (subtractive decode)
[    0.175201] pci 0000:03:00.0:   bridge window [io  0x0000-0x0cf7 window] (subtractive decode)
[    0.175202] pci 0000:03:00.0:   bridge window [io  0x0d00-0xffff window] (subtractive decode)
[    0.175203] pci 0000:03:00.0:   bridge window [mem 0x000a0000-0x000bffff window] (subtractive decode)
[    0.175204] pci 0000:03:00.0:   bridge window [mem 0x000d0000-0x000d3fff window] (subtractive decode)
[    0.175204] pci 0000:03:00.0:   bridge window [mem 0x000d4000-0x000d7fff window] (subtractive decode)
[    0.175205] pci 0000:03:00.0:   bridge window [mem 0x000d8000-0x000dbfff window] (subtractive decode)
[    0.175206] pci 0000:03:00.0:   bridge window [mem 0x000dc000-0x000dffff window] (subtractive decode)
[    0.175207] pci 0000:03:00.0:   bridge window [mem 0x000e0000-0x000e3fff window] (subtractive decode)
[    0.175208] pci 0000:03:00.0:   bridge window [mem 0x000e4000-0x000e7fff window] (subtractive decode)
[    0.175208] pci 0000:03:00.0:   bridge window [mem 0xe0000000-0xfeafffff window] (subtractive decode)
[    0.175270] pci 0000:05:00.0: [10ec:8168] type 00 class 0x020000
[    0.175303] pci 0000:05:00.0: reg 0x10: [io  0xd000-0xd0ff]
[    0.175335] pci 0000:05:00.0: reg 0x18: [mem 0xf0004000-0xf0004fff 64bit pref]
[    0.175354] pci 0000:05:00.0: reg 0x20: [mem 0xf0000000-0xf0003fff 64bit pref]
[    0.175479] pci 0000:05:00.0: supports D1 D2
[    0.175480] pci 0000:05:00.0: PME# supported from D0 D1 D2 D3hot D3cold
[    0.175606] pci 0000:00:1c.5: PCI bridge to [bus 05]
[    0.175609] pci 0000:00:1c.5:   bridge window [io  0xd000-0xdfff]
[    0.175616] pci 0000:00:1c.5:   bridge window [mem 0xf0000000-0xf00fffff 64bit pref]
[    0.175666] pci 0000:06:00.0: [1b21:0612] type 00 class 0x010601
[    0.175694] pci 0000:06:00.0: reg 0x10: [io  0xc050-0xc057]
[    0.175707] pci 0000:06:00.0: reg 0x14: [io  0xc040-0xc043]
[    0.175719] pci 0000:06:00.0: reg 0x18: [io  0xc030-0xc037]
[    0.175731] pci 0000:06:00.0: reg 0x1c: [io  0xc020-0xc023]
[    0.175743] pci 0000:06:00.0: reg 0x20: [io  0xc000-0xc01f]
[    0.175756] pci 0000:06:00.0: reg 0x24: [mem 0xf7d00000-0xf7d001ff]
[    0.175931] pci 0000:00:1c.7: PCI bridge to [bus 06]
[    0.175933] pci 0000:00:1c.7:   bridge window [io  0xc000-0xcfff]
[    0.175936] pci 0000:00:1c.7:   bridge window [mem 0xf7d00000-0xf7dfffff]
[    0.176585] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 10 *11 12 14 15)
[    0.176646] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 *10 11 12 14 15)
[    0.176705] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 *10 11 12 14 15)
[    0.176763] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 *5 6 10 11 12 14 15)
[    0.176822] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 10 11 12 14 15) *0, disabled.
[    0.176880] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 10 11 12 14 15) *0, disabled.
[    0.176938] ACPI: PCI Interrupt Link [LNKG] (IRQs *3 4 5 6 10 11 12 14 15)
[    0.176998] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 10 *11 12 14 15)
[    0.177169] iommu: Default domain type: Translated 
[    0.177169] pci 0000:01:00.0: vgaarb: setting as boot VGA device
[    0.177169] pci 0000:01:00.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[    0.177169] pci 0000:01:00.0: vgaarb: bridge control possible
[    0.177169] vgaarb: loaded
[    0.177169] SCSI subsystem initialized
[    0.177169] libata version 3.00 loaded.
[    0.177169] ACPI: bus type USB registered
[    0.177169] usbcore: registered new interface driver usbfs
[    0.177169] usbcore: registered new interface driver hub
[    0.177169] usbcore: registered new device driver usb
[    0.177169] pps_core: LinuxPPS API ver. 1 registered
[    0.177169] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[    0.177169] PTP clock support registered
[    0.177169] EDAC MC: Ver: 3.0.0
[    0.177169] NetLabel: Initializing
[    0.177169] NetLabel:  domain hash size = 128
[    0.177169] NetLabel:  protocols = UNLABELED CIPSOv4 CALIPSO
[    0.177169] NetLabel:  unlabeled traffic allowed by default
[    0.177169] PCI: Using ACPI for IRQ routing
[    0.179063] PCI: pci_cache_line_size set to 64 bytes
[    0.179109] e820: reserve RAM buffer [mem 0x0009d800-0x0009ffff]
[    0.179109] e820: reserve RAM buffer [mem 0xdd908000-0xdfffffff]
[    0.179110] e820: reserve RAM buffer [mem 0xde117000-0xdfffffff]
[    0.179111] e820: reserve RAM buffer [mem 0xde9a7000-0xdfffffff]
[    0.179112] e820: reserve RAM buffer [mem 0xdf408000-0xdfffffff]
[    0.179112] e820: reserve RAM buffer [mem 0xdf800000-0xdfffffff]
[    0.179113] e820: reserve RAM buffer [mem 0x21f000000-0x21fffffff]
[    0.179338] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0, 0, 0, 0, 0
[    0.179340] hpet0: 8 comparators, 64-bit 14.318180 MHz counter
[    0.181360] clocksource: Switched to clocksource tsc-early
[    0.190211] VFS: Disk quotas dquot_6.6.0
[    0.190224] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    0.190309] AppArmor: AppArmor Filesystem Enabled
[    0.190331] pnp: PnP ACPI init
[    0.190448] system 00:00: [mem 0xfed40000-0xfed44fff] has been reserved
[    0.190452] system 00:00: Plug and Play ACPI device, IDs PNP0c01 (active)
[    0.190536] system 00:01: [io  0x0680-0x069f] has been reserved
[    0.190537] system 00:01: [io  0x1000-0x100f] has been reserved
[    0.190538] system 00:01: [io  0xffff] has been reserved
[    0.190539] system 00:01: [io  0xffff] has been reserved
[    0.190540] system 00:01: [io  0x0400-0x0453] has been reserved
[    0.190541] system 00:01: [io  0x0458-0x047f] has been reserved
[    0.190543] system 00:01: [io  0x0500-0x057f] has been reserved
[    0.190544] system 00:01: [io  0x164e-0x164f] has been reserved
[    0.190546] system 00:01: Plug and Play ACPI device, IDs PNP0c02 (active)
[    0.190567] pnp 00:02: Plug and Play ACPI device, IDs PNP0b00 (active)
[    0.190613] system 00:03: [io  0x0454-0x0457] has been reserved
[    0.190616] system 00:03: Plug and Play ACPI device, IDs INT3f0d PNP0c02 (active)
[    0.190699] system 00:04: [io  0x0290-0x029f] has been reserved
[    0.190701] system 00:04: Plug and Play ACPI device, IDs PNP0c02 (active)
[    0.190931] system 00:05: [io  0x04d0-0x04d1] has been reserved
[    0.190934] system 00:05: Plug and Play ACPI device, IDs PNP0c02 (active)
[    0.190972] pnp 00:06: Plug and Play ACPI device, IDs PNP0303 PNP030b (active)
[    0.191125] pnp 00:07: [dma 0 disabled]
[    0.191160] pnp 00:07: Plug and Play ACPI device, IDs PNP0501 (active)
[    0.191391] system 00:08: [mem 0xfed1c000-0xfed1ffff] has been reserved
[    0.191392] system 00:08: [mem 0xfed10000-0xfed17fff] has been reserved
[    0.191393] system 00:08: [mem 0xfed18000-0xfed18fff] has been reserved
[    0.191394] system 00:08: [mem 0xfed19000-0xfed19fff] has been reserved
[    0.191395] system 00:08: [mem 0xf8000000-0xfbffffff] has been reserved
[    0.191396] system 00:08: [mem 0xfed20000-0xfed3ffff] has been reserved
[    0.191397] system 00:08: [mem 0xfed90000-0xfed93fff] has been reserved
[    0.191398] system 00:08: [mem 0xfed45000-0xfed8ffff] has been reserved
[    0.191399] system 00:08: [mem 0xff000000-0xffffffff] has been reserved
[    0.191400] system 00:08: [mem 0xfee00000-0xfeefffff] could not be reserved
[    0.191401] system 00:08: [mem 0xf0100000-0xf0100fff] has been reserved
[    0.191403] system 00:08: Plug and Play ACPI device, IDs PNP0c02 (active)
[    0.191561] pnp: PnP ACPI: found 9 devices
[    0.197011] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[    0.197057] NET: Registered protocol family 2
[    0.197198] tcp_listen_portaddr_hash hash table entries: 4096 (order: 4, 65536 bytes, linear)
[    0.197256] TCP established hash table entries: 65536 (order: 7, 524288 bytes, linear)
[    0.197413] TCP bind hash table entries: 65536 (order: 8, 1048576 bytes, linear)
[    0.197478] TCP: Hash tables configured (established 65536 bind 65536)
[    0.197550] UDP hash table entries: 4096 (order: 5, 131072 bytes, linear)
[    0.197573] UDP-Lite hash table entries: 4096 (order: 5, 131072 bytes, linear)
[    0.197615] NET: Registered protocol family 1
[    0.197619] NET: Registered protocol family 44
[    0.197629] pci 0000:00:01.0: PCI bridge to [bus 01]
[    0.197631] pci 0000:00:01.0:   bridge window [io  0xe000-0xefff]
[    0.197633] pci 0000:00:01.0:   bridge window [mem 0xf7e00000-0xf7efffff]
[    0.197635] pci 0000:00:01.0:   bridge window [mem 0xe0000000-0xefffffff 64bit pref]
[    0.197637] pci 0000:00:1c.0: PCI bridge to [bus 02]
[    0.197654] pci 0000:03:00.0: PCI bridge to [bus 04]
[    0.197672] pci 0000:00:1c.4: PCI bridge to [bus 03-04]
[    0.197682] pci 0000:00:1c.5: PCI bridge to [bus 05]
[    0.197683] pci 0000:00:1c.5:   bridge window [io  0xd000-0xdfff]
[    0.197689] pci 0000:00:1c.5:   bridge window [mem 0xf0000000-0xf00fffff 64bit pref]
[    0.197694] pci 0000:00:1c.7: PCI bridge to [bus 06]
[    0.197695] pci 0000:00:1c.7:   bridge window [io  0xc000-0xcfff]
[    0.197699] pci 0000:00:1c.7:   bridge window [mem 0xf7d00000-0xf7dfffff]
[    0.197706] pci_bus 0000:00: resource 4 [io  0x0000-0x0cf7 window]
[    0.197707] pci_bus 0000:00: resource 5 [io  0x0d00-0xffff window]
[    0.197708] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff window]
[    0.197709] pci_bus 0000:00: resource 7 [mem 0x000d0000-0x000d3fff window]
[    0.197710] pci_bus 0000:00: resource 8 [mem 0x000d4000-0x000d7fff window]
[    0.197711] pci_bus 0000:00: resource 9 [mem 0x000d8000-0x000dbfff window]
[    0.197712] pci_bus 0000:00: resource 10 [mem 0x000dc000-0x000dffff window]
[    0.197712] pci_bus 0000:00: resource 11 [mem 0x000e0000-0x000e3fff window]
[    0.197713] pci_bus 0000:00: resource 12 [mem 0x000e4000-0x000e7fff window]
[    0.197714] pci_bus 0000:00: resource 13 [mem 0xe0000000-0xfeafffff window]
[    0.197715] pci_bus 0000:01: resource 0 [io  0xe000-0xefff]
[    0.197716] pci_bus 0000:01: resource 1 [mem 0xf7e00000-0xf7efffff]
[    0.197716] pci_bus 0000:01: resource 2 [mem 0xe0000000-0xefffffff 64bit pref]
[    0.197718] pci_bus 0000:03: resource 4 [io  0x0000-0x0cf7 window]
[    0.197718] pci_bus 0000:03: resource 5 [io  0x0d00-0xffff window]
[    0.197719] pci_bus 0000:03: resource 6 [mem 0x000a0000-0x000bffff window]
[    0.197720] pci_bus 0000:03: resource 7 [mem 0x000d0000-0x000d3fff window]
[    0.197721] pci_bus 0000:03: resource 8 [mem 0x000d4000-0x000d7fff window]
[    0.197722] pci_bus 0000:03: resource 9 [mem 0x000d8000-0x000dbfff window]
[    0.197722] pci_bus 0000:03: resource 10 [mem 0x000dc000-0x000dffff window]
[    0.197723] pci_bus 0000:03: resource 11 [mem 0x000e0000-0x000e3fff window]
[    0.197724] pci_bus 0000:03: resource 12 [mem 0x000e4000-0x000e7fff window]
[    0.197725] pci_bus 0000:03: resource 13 [mem 0xe0000000-0xfeafffff window]
[    0.197726] pci_bus 0000:04: resource 4 [io  0x0000-0x0cf7 window]
[    0.197726] pci_bus 0000:04: resource 5 [io  0x0d00-0xffff window]
[    0.197727] pci_bus 0000:04: resource 6 [mem 0x000a0000-0x000bffff window]
[    0.197728] pci_bus 0000:04: resource 7 [mem 0x000d0000-0x000d3fff window]
[    0.197729] pci_bus 0000:04: resource 8 [mem 0x000d4000-0x000d7fff window]
[    0.197730] pci_bus 0000:04: resource 9 [mem 0x000d8000-0x000dbfff window]
[    0.197730] pci_bus 0000:04: resource 10 [mem 0x000dc000-0x000dffff window]
[    0.197731] pci_bus 0000:04: resource 11 [mem 0x000e0000-0x000e3fff window]
[    0.197732] pci_bus 0000:04: resource 12 [mem 0x000e4000-0x000e7fff window]
[    0.197733] pci_bus 0000:04: resource 13 [mem 0xe0000000-0xfeafffff window]
[    0.197734] pci_bus 0000:05: resource 0 [io  0xd000-0xdfff]
[    0.197735] pci_bus 0000:05: resource 2 [mem 0xf0000000-0xf00fffff 64bit pref]
[    0.197735] pci_bus 0000:06: resource 0 [io  0xc000-0xcfff]
[    0.197736] pci_bus 0000:06: resource 1 [mem 0xf7d00000-0xf7dfffff]
[    0.222890] pci 0000:00:1a.0: quirk_usb_early_handoff+0x0/0x662 took 24314 usecs
[    0.246886] pci 0000:00:1d.0: quirk_usb_early_handoff+0x0/0x662 took 23419 usecs
[    0.246898] pci 0000:01:00.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
[    0.246903] pci 0000:01:00.1: D0 power state depends on 0000:01:00.0
[    0.246910] pci 0000:03:00.0: CLS mismatch (64 != 32), using 64 bytes
[    0.246965] Trying to unpack rootfs image as initramfs...
[    0.364777] Freeing initrd memory: 53636K
[    0.364812] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[    0.364814] software IO TLB: mapped [mem 0xd6600000-0xda600000] (64MB)
[    0.365043] check: Scanning for low memory corruption every 60 seconds
[    0.365380] Initialise system trusted keyrings
[    0.365388] Key type blacklist registered
[    0.365410] workingset: timestamp_bits=36 max_order=21 bucket_order=0
[    0.366383] zbud: loaded
[    0.366576] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[    0.366695] fuse: init (API version 7.31)
[    0.366801] integrity: Platform Keyring initialized
[    0.375417] Key type asymmetric registered
[    0.375418] Asymmetric key parser 'x509' registered
[    0.375424] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 244)
[    0.375457] io scheduler mq-deadline registered
[    0.376279] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[    0.376316] vesafb: mode is 1280x1024x32, linelength=5120, pages=0
[    0.376317] vesafb: scrolling: redraw
[    0.376318] vesafb: Truecolor: size=0:8:8:8, shift=0:16:8:0
[    0.376332] vesafb: framebuffer at 0xe0000000, mapped to 0x0000000076879528, using 5120k, total 5120k
[    0.376360] fbcon: Deferring console take-over
[    0.376361] fb0: VESA VGA frame buffer device
[    0.376369] intel_idle: MWAIT substates: 0x1120
[    0.376370] intel_idle: v0.5.1 model 0x3A
[    0.376490] intel_idle: Local APIC timer is reliable in all C-states
[    0.376584] input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input0
[    0.376600] ACPI: Power Button [PWRB]
[    0.376625] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input1
[    0.376649] ACPI: Power Button [PWRF]
[    0.376964] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
[    0.397473] 00:07: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[    0.399205] Linux agpgart interface v0.103
[    0.401124] loop: module loaded
[    0.401322] libphy: Fixed MDIO Bus: probed
[    0.401323] tun: Universal TUN/TAP device driver, 1.6
[    0.401342] PPP generic driver version 2.4.2
[    0.401375] VFIO - User Level meta-driver version: 0.3
[    0.401445] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    0.401447] ehci-pci: EHCI PCI platform driver
[    0.401544] ehci-pci 0000:00:1a.0: EHCI Host Controller
[    0.401548] ehci-pci 0000:00:1a.0: new USB bus registered, assigned bus number 1
[    0.401558] ehci-pci 0000:00:1a.0: debug port 2
[    0.405470] ehci-pci 0000:00:1a.0: cache line size of 64 is not supported
[    0.405481] ehci-pci 0000:00:1a.0: irq 16, io mem 0xf7f18000
[    0.418782] ehci-pci 0000:00:1a.0: USB 2.0 started, EHCI 1.00
[    0.418858] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.08
[    0.418859] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    0.418860] usb usb1: Product: EHCI Host Controller
[    0.418861] usb usb1: Manufacturer: Linux 5.8.0-050800rc6-generic ehci_hcd
[    0.418861] usb usb1: SerialNumber: 0000:00:1a.0
[    0.419022] hub 1-0:1.0: USB hub found
[    0.419029] hub 1-0:1.0: 2 ports detected
[    0.419235] ehci-pci 0000:00:1d.0: EHCI Host Controller
[    0.419238] ehci-pci 0000:00:1d.0: new USB bus registered, assigned bus number 2
[    0.419247] ehci-pci 0000:00:1d.0: debug port 2
[    0.423140] ehci-pci 0000:00:1d.0: cache line size of 64 is not supported
[    0.423147] ehci-pci 0000:00:1d.0: irq 23, io mem 0xf7f17000
[    0.438781] ehci-pci 0000:00:1d.0: USB 2.0 started, EHCI 1.00
[    0.438850] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.08
[    0.438851] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    0.438852] usb usb2: Product: EHCI Host Controller
[    0.438853] usb usb2: Manufacturer: Linux 5.8.0-050800rc6-generic ehci_hcd
[    0.438854] usb usb2: SerialNumber: 0000:00:1d.0
[    0.439011] hub 2-0:1.0: USB hub found
[    0.439017] hub 2-0:1.0: 2 ports detected
[    0.439137] ehci-platform: EHCI generic platform driver
[    0.439143] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[    0.439146] ohci-pci: OHCI PCI platform driver
[    0.439153] ohci-platform: OHCI generic platform driver
[    0.439157] uhci_hcd: USB Universal Host Controller Interface driver
[    0.439197] i8042: PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1
[    0.439197] i8042: PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
[    0.439680] serio: i8042 KBD port at 0x60,0x64 irq 1
[    0.439888] mousedev: PS/2 mouse device common for all mice
[    0.440215] rtc_cmos 00:02: RTC can wake from S4
[    0.440407] rtc_cmos 00:02: registered as rtc0
[    0.440466] rtc_cmos 00:02: setting system clock to 2020-07-26T12:11:09 UTC (1595765469)
[    0.440478] rtc_cmos 00:02: alarms up to one month, y3k, 242 bytes nvram, hpet irqs
[    0.440483] i2c /dev entries driver
[    0.440513] device-mapper: uevent: version 1.0.3
[    0.440556] device-mapper: ioctl: 4.42.0-ioctl (2020-02-27) initialised: dm-devel@redhat.com
[    0.440571] platform eisa.0: Probing EISA bus 0
[    0.440572] platform eisa.0: EISA: Cannot allocate resource for mainboard
[    0.440573] platform eisa.0: Cannot allocate resource for EISA slot 1
[    0.440574] platform eisa.0: Cannot allocate resource for EISA slot 2
[    0.440574] platform eisa.0: Cannot allocate resource for EISA slot 3
[    0.440575] platform eisa.0: Cannot allocate resource for EISA slot 4
[    0.440576] platform eisa.0: Cannot allocate resource for EISA slot 5
[    0.440577] platform eisa.0: Cannot allocate resource for EISA slot 6
[    0.440577] platform eisa.0: Cannot allocate resource for EISA slot 7
[    0.440578] platform eisa.0: Cannot allocate resource for EISA slot 8
[    0.440579] platform eisa.0: EISA: Detected 0 cards
[    0.440583] intel_pstate: Intel P-state driver initializing
[    0.440811] ledtrig-cpu: registered to indicate activity on CPUs
[    0.440850] drop_monitor: Initializing network drop monitor service
[    0.440988] NET: Registered protocol family 10
[    0.446148] Segment Routing with IPv6
[    0.446163] NET: Registered protocol family 17
[    0.446230] Key type dns_resolver registered
[    0.446481] microcode: sig=0x306a9, pf=0x2, revision=0x21
[    0.446529] microcode: Microcode Update Driver: v2.2.
[    0.446532] IPI shorthand broadcast: enabled
[    0.446537] sched_clock: Marking stable (446362121, 164421)->(452023091, -5496549)
[    0.446596] registered taskstats version 1
[    0.446605] Loading compiled-in X.509 certificates
[    0.447191] Loaded X.509 cert 'Build time autogenerated kernel key: f5ed095bb538b9d2a07de73aa8b3b326e45d53f0'
[    0.447219] zswap: loaded using pool lzo/zbud
[    0.447327] Key type ._fscrypt registered
[    0.447328] Key type .fscrypt registered
[    0.447328] Key type fscrypt-provisioning registered
[    0.449435] Key type encrypted registered
[    0.449437] AppArmor: AppArmor sha1 policy hashing enabled
[    0.449442] ima: No TPM chip found, activating TPM-bypass!
[    0.449445] ima: Allocated hash algorithm: sha1
[    0.449452] ima: No architecture policies found
[    0.449462] evm: Initialising EVM extended attributes:
[    0.449462] evm: security.selinux
[    0.449463] evm: security.SMACK64
[    0.449463] evm: security.SMACK64EXEC
[    0.449463] evm: security.SMACK64TRANSMUTE
[    0.449464] evm: security.SMACK64MMAP
[    0.449464] evm: security.apparmor
[    0.449464] evm: security.ima
[    0.449464] evm: security.capability
[    0.449465] evm: HMAC attrs: 0x1
[    0.449711] PM:   Magic number: 12:847:178
[    0.449746] acpi device:0e: hash matches
[    0.449762]  platform: hash matches
[    0.449851] RAS: Correctable Errors collector initialized.
[    0.450788] Freeing unused decrypted memory: 2040K
[    0.451226] Freeing unused kernel image (initmem) memory: 2632K
[    0.464247] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input2
[    0.470785] Write protecting the kernel read-only data: 26624k
[    0.471421] Freeing unused kernel image (text/rodata gap) memory: 2044K
[    0.471711] Freeing unused kernel image (rodata/data gap) memory: 1504K
[    0.511328] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[    0.511329] x86/mm: Checking user space page tables
[    0.550008] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[    0.550011] Run /init as init process
[    0.550012]   with arguments:
[    0.550012]     /init
[    0.550013]     splash
[    0.550013]   with environment:
[    0.550014]     HOME=/
[    0.550014]     TERM=linux
[    0.550014]     BOOT_IMAGE=/boot/vmlinuz-5.8.0-050800rc6-generic
[    0.616201] xhci_hcd 0000:00:14.0: xHCI Host Controller
[    0.616206] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 3
[    0.617408] xhci_hcd 0000:00:14.0: hcc params 0x20007181 hci version 0x100 quirks 0x000000000000b930
[    0.617412] xhci_hcd 0000:00:14.0: cache line size of 64 is not supported
[    0.617453] ACPI Warning: SystemIO range 0x0000000000000428-0x000000000000042F conflicts with OpRegion 0x0000000000000400-0x000000000000047F (\PMIO) (20200528/utaddress-204)
[    0.617458] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[    0.617460] ACPI Warning: SystemIO range 0x0000000000000540-0x000000000000054F conflicts with OpRegion 0x0000000000000500-0x000000000000057F (\GPR2) (20200528/utaddress-204)
[    0.617463] ACPI Warning: SystemIO range 0x0000000000000540-0x000000000000054F conflicts with OpRegion 0x0000000000000500-0x0000000000000563 (\GPIO) (20200528/utaddress-204)
[    0.617465] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[    0.617465] ACPI Warning: SystemIO range 0x0000000000000530-0x000000000000053F conflicts with OpRegion 0x0000000000000500-0x000000000000057F (\GPR2) (20200528/utaddress-204)
[    0.617467] ACPI Warning: SystemIO range 0x0000000000000530-0x000000000000053F conflicts with OpRegion 0x0000000000000500-0x0000000000000563 (\GPIO) (20200528/utaddress-204)
[    0.617469] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[    0.617469] ACPI Warning: SystemIO range 0x0000000000000500-0x000000000000052F conflicts with OpRegion 0x0000000000000500-0x000000000000057F (\GPR2) (20200528/utaddress-204)
[    0.617471] ACPI Warning: SystemIO range 0x0000000000000500-0x000000000000052F conflicts with OpRegion 0x0000000000000500-0x0000000000000563 (\GPIO) (20200528/utaddress-204)
[    0.617473] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[    0.617473] lpc_ich: Resource conflict(s) found affecting gpio_ich
[    0.617550] usb usb3: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.08
[    0.617551] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    0.617551] usb usb3: Product: xHCI Host Controller
[    0.617552] usb usb3: Manufacturer: Linux 5.8.0-050800rc6-generic xhci-hcd
[    0.617553] usb usb3: SerialNumber: 0000:00:14.0
[    0.619611] ahci 0000:00:1f.2: version 3.0
[    0.619698] r8169 0000:05:00.0: can't disable ASPM; OS doesn't have ASPM control
[    0.619813] hub 3-0:1.0: USB hub found
[    0.620778] hub 3-0:1.0: 4 ports detected
[    0.630937] ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 6 Gbps 0x3f impl SATA mode
[    0.630939] ahci 0000:00:1f.2: flags: 64bit ncq pm led clo pio slum part ems apst 
[    0.636087] i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
[    0.636977] xhci_hcd 0000:00:14.0: xHCI Host Controller
[    0.636980] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 4
[    0.636982] xhci_hcd 0000:00:14.0: Host supports USB 3.0 SuperSpeed
[    0.637007] i2c i2c-0: 2/4 memory slots populated (from DMI)
[    0.637019] usb usb4: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 5.08
[    0.637020] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    0.637021] usb usb4: Product: xHCI Host Controller
[    0.637021] usb usb4: Manufacturer: Linux 5.8.0-050800rc6-generic xhci-hcd
[    0.637022] usb usb4: SerialNumber: 0000:00:14.0
[    0.637102] hub 4-0:1.0: USB hub found
[    0.637109] hub 4-0:1.0: 4 ports detected
[    0.637356] i2c i2c-0: Successfully instantiated SPD at 0x50
[    0.637656] i2c i2c-0: Successfully instantiated SPD at 0x51
[    0.650843] libphy: r8169: probed
[    0.659022] r8169 0000:05:00.0 eth0: RTL8168evl/8111evl, bc:5f:f4:99:82:b4, XID 2c9, IRQ 31
[    0.659023] r8169 0000:05:00.0 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko]
[    0.695313] scsi host0: ahci
[    0.695501] scsi host1: ahci
[    0.695605] scsi host2: ahci
[    0.695702] scsi host3: ahci
[    0.695832] scsi host4: ahci
[    0.695947] scsi host5: ahci
[    0.695978] ata1: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16100 irq 30
[    0.695979] ata2: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16180 irq 30
[    0.695981] ata3: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16200 irq 30
[    0.695982] ata4: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16280 irq 30
[    0.695983] ata5: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16300 irq 30
[    0.695984] ata6: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16380 irq 30
[    0.696142] ahci 0000:06:00.0: SSS flag set, parallel bus scan disabled
[    0.696180] ahci 0000:06:00.0: AHCI 0001.0200 32 slots 2 ports 6 Gbps 0x3 impl SATA mode
[    0.696181] ahci 0000:06:00.0: flags: 64bit ncq sntf stag led clo pmp pio slum part ccc sxs 
[    0.696361] scsi host6: ahci
[    0.696415] scsi host7: ahci
[    0.696446] ata7: SATA max UDMA/133 abar m512@0xf7d00000 port 0xf7d00100 irq 32
[    0.696448] ata8: SATA max UDMA/133 abar m512@0xf7d00000 port 0xf7d00180 irq 32
[    0.754782] usb 1-1: new high-speed USB device number 2 using ehci-pci
[    0.774790] usb 2-1: new high-speed USB device number 2 using ehci-pci
[    0.911507] usb 1-1: New USB device found, idVendor=8087, idProduct=0024, bcdDevice= 0.00
[    0.911508] usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[    0.911849] hub 1-1:1.0: USB hub found
[    0.912053] hub 1-1:1.0: 6 ports detected
[    0.931162] usb 2-1: New USB device found, idVendor=8087, idProduct=0024, bcdDevice= 0.00
[    0.931165] usb 2-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[    0.931557] hub 2-1:1.0: USB hub found
[    0.931651] hub 2-1:1.0: 8 ports detected
[    1.010804] ata7: SATA link down (SStatus 0 SControl 300)
[    1.010808] ata6: SATA link down (SStatus 0 SControl 300)
[    1.010836] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    1.010857] ata5: SATA link down (SStatus 0 SControl 300)
[    1.010883] ata2: SATA link down (SStatus 0 SControl 300)
[    1.010895] ata1: SATA link down (SStatus 0 SControl 300)
[    1.010908] ata4: SATA link down (SStatus 0 SControl 300)
[    1.012014] ata3.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[    1.012018] ata3.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[    1.012020] ata3.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[    1.047436] ata3.00: ATA-7: ST3360320AS, 3.AAM, max UDMA/133
[    1.047437] ata3.00: 703282608 sectors, multi 16: LBA48 NCQ (depth 32)
[    1.073177] ata3.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[    1.073180] ata3.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[    1.073183] ata3.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[    1.105743] ata3.00: configured for UDMA/133
[    1.105861] scsi 2:0:0:0: Direct-Access     ATA      ST3360320AS      M    PQ: 0 ANSI: 5
[    1.106002] sd 2:0:0:0: Attached scsi generic sg0 type 0
[    1.106029] sd 2:0:0:0: [sda] 703282608 512-byte logical blocks: (360 GB/335 GiB)
[    1.106036] sd 2:0:0:0: [sda] Write Protect is off
[    1.106037] sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    1.106050] sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    1.173751]  sda: sda1 sda2
[    1.174077] sd 2:0:0:0: [sda] Attached SCSI disk
[    1.178771] usb 1-1.5: new low-speed USB device number 3 using ehci-pci
[    1.302266] usb 1-1.5: New USB device found, idVendor=045e, idProduct=0040, bcdDevice= 1.21
[    1.302269] usb 1-1.5: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[    1.302279] usb 1-1.5: Product: Microsoft Wheel Mouse Optical®
[    1.302280] usb 1-1.5: Manufacturer: Microsoft
[    1.306529] hid: raw HID events driver (C) Jiri Kosina
[    1.313170] usbcore: registered new interface driver usbhid
[    1.313170] usbhid: USB HID core driver
[    1.315148] input: Microsoft Microsoft Wheel Mouse Optical® as /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.5/1-1.5:1.0/0003:045E:0040.0001/input/input3
[    1.315224] hid-generic 0003:045E:0040.0001: input,hidraw0: USB HID v1.00 Mouse [Microsoft Microsoft Wheel Mouse Optical®] on usb-0000:00:1a.0-1.5/input0
[    1.366782] tsc: Refined TSC clocksource calibration: 3392.293 MHz
[    1.366789] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x30e5de2a436, max_idle_ns: 440795285127 ns
[    1.366901] clocksource: Switched to clocksource tsc
[    1.382775] usb 1-1.6: new high-speed USB device number 4 using ehci-pci
[    1.405193] ata8: SATA link down (SStatus 0 SControl 300)
[    1.493243] usb 1-1.6: New USB device found, idVendor=05e3, idProduct=0605, bcdDevice= 6.0b
[    1.493244] usb 1-1.6: New USB device strings: Mfr=0, Product=1, SerialNumber=0
[    1.493245] usb 1-1.6: Product: USB2.0 Hub
[    1.493691] hub 1-1.6:1.0: USB hub found
[    1.494115] hub 1-1.6:1.0: 4 ports detected
[    2.119687] fbcon: Taking over console
[    2.119758] Console: switching to colour frame buffer device 160x64
[    2.192425] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
[    4.153346] systemd[1]: Inserted module 'autofs4'
[    4.317155] systemd[1]: systemd 245.4-4ubuntu3.2 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
[    4.334876] systemd[1]: Detected architecture x86-64.
[    4.360873] systemd[1]: Set hostname to <utente-desktop>.
[    7.546847] systemd[1]: /lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
[    8.593193] systemd[1]: Created slice Virtual Machine and Container Slice.
[    8.593492] systemd[1]: Created slice system-modprobe.slice.
[    8.593642] systemd[1]: Created slice User and Session Slice.
[    8.593690] systemd[1]: Started Forward Password Requests to Wall Directory Watch.
[    8.593823] systemd[1]: Set up automount Arbitrary Executable File Formats File System Automount Point.
[    8.593856] systemd[1]: Reached target User and Group Name Lookups.
[    8.593866] systemd[1]: Reached target Remote File Systems.
[    8.593872] systemd[1]: Reached target Slices.
[    8.593888] systemd[1]: Reached target Libvirt guests shutdown.
[    8.593938] systemd[1]: Listening on Device-mapper event daemon FIFOs.
[    8.594003] systemd[1]: Listening on LVM2 poll daemon socket.
[    8.605075] systemd[1]: Listening on Syslog Socket.
[    8.605141] systemd[1]: Listening on fsck to fsckd communication Socket.
[    8.605180] systemd[1]: Listening on initctl Compatibility Named Pipe.
[    8.605314] systemd[1]: Listening on Journal Audit Socket.
[    8.605367] systemd[1]: Listening on Journal Socket (/dev/log).
[    8.605436] systemd[1]: Listening on Journal Socket.
[    8.605529] systemd[1]: Listening on Network Service Netlink Socket.
[    8.605591] systemd[1]: Listening on udev Control Socket.
[    8.605632] systemd[1]: Listening on udev Kernel Socket.
[    8.606314] systemd[1]: Mounting Huge Pages File System...
[    8.607032] systemd[1]: Mounting POSIX Message Queue File System...
[    8.607828] systemd[1]: Mounting Kernel Debug File System...
[    8.608560] systemd[1]: Mounting Kernel Trace File System...
[    8.609756] systemd[1]: Starting Journal Service...
[    8.610486] systemd[1]: Starting Availability of block devices...
[    8.611470] systemd[1]: Starting Set the console keyboard layout...
[    8.612340] systemd[1]: Starting Create list of static device nodes for the current kernel...
[    8.613086] systemd[1]: Starting Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling...
[    8.613818] systemd[1]: Starting Load Kernel Module drm...
[    8.755368] systemd[1]: Condition check resulted in Set Up Additional Binary Formats being skipped.
[    8.755416] systemd[1]: Condition check resulted in File System Check on Root Device being skipped.
[    8.834411] systemd[1]: Starting Load Kernel Modules...
[    8.835159] systemd[1]: Starting Remount Root and Kernel File Systems...
[    8.835857] systemd[1]: Starting udev Coldplug all Devices...
[    8.836525] systemd[1]: Starting Uncomplicated firewall...
[    8.837906] systemd[1]: Mounted Huge Pages File System.
[    8.838007] systemd[1]: Mounted POSIX Message Queue File System.
[    8.838088] systemd[1]: Mounted Kernel Debug File System.
[    8.838167] systemd[1]: Mounted Kernel Trace File System.
[    8.838502] systemd[1]: Finished Availability of block devices.
[    8.846510] systemd[1]: Finished Create list of static device nodes for the current kernel.
[    9.003539] systemd[1]: Started Journal Service.
[    9.039225] EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro
[    9.207828] systemd-journald[295]: Received client request to flush runtime journal.
[    9.534583] lp: driver loaded but no devices found
[    9.675407] ppdev: user-space parallel port driver
[   13.179050] audit: type=1400 audit(1595765482.234:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=388 comm="apparmor_parser"
[   13.179061] audit: type=1400 audit(1595765482.234:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_filter" pid=388 comm="apparmor_parser"
[   13.179063] audit: type=1400 audit(1595765482.234:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=388 comm="apparmor_parser"
[   13.228910] audit: type=1400 audit(1595765482.282:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-oopslash" pid=390 comm="apparmor_parser"
[   13.321052] audit: type=1400 audit(1595765482.374:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="unity8-dash" pid=392 comm="apparmor_parser"
[   13.327188] audit: type=1400 audit(1595765482.382:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="content-hub-peer-picker" pid=391 comm="apparmor_parser"
[   13.391780] audit: type=1400 audit(1595765482.446:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/tcpdump" pid=387 comm="apparmor_parser"
[   13.470023] audit: type=1400 audit(1595765482.522:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/cups-browsed" pid=393 comm="apparmor_parser"
[   13.493912] audit: type=1400 audit(1595765482.546:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/cups/backend/cups-pdf" pid=389 comm="apparmor_parser"
[   13.493923] audit: type=1400 audit(1595765482.546:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/cupsd" pid=389 comm="apparmor_parser"
[   16.818546] at24 0-0050: supply vcc not found, using dummy regulator
[   16.819139] at24 0-0050: 256 byte spd EEPROM, read-only
[   16.819164] at24 0-0051: supply vcc not found, using dummy regulator
[   16.819730] at24 0-0051: 256 byte spd EEPROM, read-only
[   20.037329] RAPL PMU: API unit is 2^-32 Joules, 2 fixed counters, 163840 ms ovfl timer
[   20.037330] RAPL PMU: hw unit of domain pp0-core 2^-16 Joules
[   20.037331] RAPL PMU: hw unit of domain package 2^-16 Joules
[   21.044402] [drm] radeon kernel modesetting enabled.
[   21.044450] radeon 0000:01:00.0: SI support disabled by module param
[   21.048448] cryptd: max_cpu_qlen set to 1000
[   21.477046] AVX version of gcm_enc/dec engaged.
[   21.477048] AES CTR mode by8 optimization enabled
[   21.618260] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
[   21.618262] AMD-Vi: AMD IOMMUv2 functionality not available on this system
[   22.281348] [drm] amdgpu kernel modesetting enabled.
[   22.281415] CRAT table not found
[   22.281418] Virtual CRAT table created for CPU
[   22.281432] amdgpu: Topology: Add CPU node
[   22.281502] checking generic (e0000000 500000) vs hw (e0000000 10000000)
[   22.281503] fb0: switching to amdgpudrmfb from VESA VGA
[   22.281577] Console: switching to colour dummy device 80x25
[   22.281606] amdgpu 0000:01:00.0: vgaarb: deactivate vga console
[   22.281726] [drm] initializing kernel modesetting (TAHITI 0x1002:0x679A 0x174B:0xE207 0x00).
[   22.281728] amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
[   22.281734] [drm] register mmio base: 0xF7E00000
[   22.281734] [drm] register mmio size: 262144
[   22.281735] [drm] PCIE atomic ops is not supported
[   22.281739] [drm] add ip block number 0 <si_common>
[   22.281739] [drm] add ip block number 1 <gmc_v6_0>
[   22.281740] [drm] add ip block number 2 <si_ih>
[   22.281740] [drm] add ip block number 3 <gfx_v6_0>
[   22.281741] [drm] add ip block number 4 <si_dma>
[   22.281741] [drm] add ip block number 5 <si_dpm>
[   22.281742] [drm] add ip block number 6 <dce_v6_0>
[   22.281743] kfd kfd: TAHITI  not supported in kfd
[   22.288950] [drm] BIOS signature incorrect 0 0
[   22.288955] resource sanity check: requesting [mem 0x000c0000-0x000dffff], which spans more than PCI Bus 0000:00 [mem 0x000d0000-0x000d3fff window]
[   22.288958] caller pci_map_rom+0x71/0x18c mapping multiple BARs
[   22.288975] amdgpu: ATOM BIOS: 113-1E207200SA-T47
[   22.289285] [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
[   22.380020] snd_hda_intel 0000:01:00.1: Force to non-snoop mode
[   22.490933] input: HDA ATI HDMI HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input4
[   22.490969] input: HDA ATI HDMI HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input5
[   22.490998] input: HDA ATI HDMI HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input6
[   22.491027] input: HDA ATI HDMI HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input7
[   22.491058] input: HDA ATI HDMI HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input8
[   22.491087] input: HDA ATI HDMI HDMI/DP,pcm=11 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input9
[   22.813179] amdgpu 0000:01:00.0: amdgpu: VRAM: 3072M 0x000000F400000000 - 0x000000F4BFFFFFFF (3072M used)
[   22.813181] amdgpu 0000:01:00.0: amdgpu: GART: 1024M 0x000000FF00000000 - 0x000000FF3FFFFFFF
[   22.813190] [drm] Detected VRAM RAM=3072M, BAR=256M
[   22.813190] [drm] RAM width 384bits GDDR5
[   22.813279] [TTM] Zone  kernel: Available graphics memory: 4051868 KiB
[   22.813280] [TTM] Zone   dma32: Available graphics memory: 2097152 KiB
[   22.813280] [TTM] Initializing pool allocator
[   22.813283] [TTM] Initializing DMA pool allocator
[   22.813315] [drm] amdgpu: 3072M of VRAM memory ready
[   22.813317] [drm] amdgpu: 3072M of GTT memory ready.
[   22.813320] [drm] GART: num cpu pages 262144, num gpu pages 262144
[   22.813765] amdgpu 0000:01:00.0: amdgpu: PCIE GART of 1024M enabled (table at 0x000000F400500000).
[   22.813811] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[   22.828189] intel_rapl_common: Found RAPL domain package
[   22.828190] intel_rapl_common: Found RAPL domain core
[   23.047397] snd_hda_codec_realtek hdaudioC0D0: autoconfig for ALC892: line_outs=3 (0x14/0x15/0x16/0x0/0x0) type:line
[   23.047399] snd_hda_codec_realtek hdaudioC0D0:    speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
[   23.047400] snd_hda_codec_realtek hdaudioC0D0:    hp_outs=1 (0x1b/0x0/0x0/0x0/0x0)
[   23.047400] snd_hda_codec_realtek hdaudioC0D0:    mono: mono_out=0x0
[   23.047401] snd_hda_codec_realtek hdaudioC0D0:    dig-out=0x1e/0x0
[   23.047401] snd_hda_codec_realtek hdaudioC0D0:    inputs:
[   23.047403] snd_hda_codec_realtek hdaudioC0D0:      Front Mic=0x19
[   23.047404] snd_hda_codec_realtek hdaudioC0D0:      Rear Mic=0x18
[   23.047404] snd_hda_codec_realtek hdaudioC0D0:      Line=0x1a
[   23.060290] input: HDA Intel PCH Rear Mic as /devices/pci0000:00/0000:00:1b.0/sound/card0/input10
[   23.060326] input: HDA Intel PCH Line as /devices/pci0000:00/0000:00:1b.0/sound/card0/input11
[   23.060356] input: HDA Intel PCH Line Out Front as /devices/pci0000:00/0000:00:1b.0/sound/card0/input12
[   23.060386] input: HDA Intel PCH Line Out Surround as /devices/pci0000:00/0000:00:1b.0/sound/card0/input13
[   23.060424] input: HDA Intel PCH Line Out CLFE as /devices/pci0000:00/0000:00:1b.0/sound/card0/input14
[   23.132188] [drm] Internal thermal controller with fan control
[   23.132195] [drm] amdgpu: dpm initialized
[   23.132231] [drm] AMDGPU Display Connectors
[   23.132231] [drm] Connector 0:
[   23.132232] [drm]   DP-1
[   23.132232] [drm]   HPD5
[   23.132233] [drm]   DDC: 0x194c 0x194c 0x194d 0x194d 0x194e 0x194e 0x194f 0x194f
[   23.132233] [drm]   Encoders:
[   23.132234] [drm]     DFP1: INTERNAL_UNIPHY2
[   23.132234] [drm] Connector 1:
[   23.132234] [drm]   DP-2
[   23.132235] [drm]   HPD4
[   23.132235] [drm]   DDC: 0x1950 0x1950 0x1951 0x1951 0x1952 0x1952 0x1953 0x1953
[   23.132236] [drm]   Encoders:
[   23.132236] [drm]     DFP2: INTERNAL_UNIPHY2
[   23.132236] [drm] Connector 2:
[   23.132236] [drm]   HDMI-A-1
[   23.132237] [drm]   HPD1
[   23.132237] [drm]   DDC: 0x1954 0x1954 0x1955 0x1955 0x1956 0x1956 0x1957 0x1957
[   23.132238] [drm]   Encoders:
[   23.132238] [drm]     DFP3: INTERNAL_UNIPHY1
[   23.132238] [drm] Connector 3:
[   23.132238] [drm]   DVI-I-1
[   23.132239] [drm]   HPD3
[   23.132239] [drm]   DDC: 0x1960 0x1960 0x1961 0x1961 0x1962 0x1962 0x1963 0x1963
[   23.132240] [drm]   Encoders:
[   23.132240] [drm]     DFP4: INTERNAL_UNIPHY
[   23.132240] [drm]     CRT1: INTERNAL_KLDSCP_DAC1
[   23.132527] [drm] PCIE gen 3 link speeds already enabled
[   23.274921] amdgpu 0000:01:00.0: amdgpu: SE 2, SH per SE 2, CU per SH 8, active_cu_number 28
[   23.364927] [drm] fb mappable at 0xE0703000
[   23.364928] [drm] vram apper at 0xE0000000
[   23.364929] [drm] size 5242880
[   23.364929] [drm] fb depth is 24
[   23.364929] [drm]    pitch is 5120
[   23.365091] fbcon: amdgpudrmfb (fb0) is primary device
[   23.463699] Console: switching to colour frame buffer device 160x64
[   23.465607] amdgpu 0000:01:00.0: fb0: amdgpudrmfb frame buffer device
[   23.736585] [drm] Initialized amdgpu 3.38.0 20150101 for 0000:01:00.0 on minor 0
...
[ 7723.674495] arb_gpu_shader5[114877]: segfault at 7fbb937fe9d0 ip 00007fbbbaad8aab sp 00007fff47d256a0 error 4 in libpthread-2.31.so[7fbbbaad5000+11000]
[ 7723.674502] Code: Bad RIP value.
[ 7758.485659] arb_enhanced_la[124954]: segfault at 290001 ip 00007f73e6c3ad5a sp 00007ffdbe5d4aa8 error 4 in libc-2.31.so[7f73e6bab000+178000]
[ 7758.485664] Code: Bad RIP value.
[ 7759.173405] arb_enhanced_la[125230]: segfault at 290001 ip 00007f5ad9fa7d5a sp 00007fffd9aaa1e8 error 4 in libc-2.31.so[7f5ad9f18000+178000]
[ 7759.173411] Code: Bad RIP value.
[ 7805.053360] amdgpu 0000:01:00.0: amdgpu: GPU fault detected: 146 0x0006880c
[ 7805.053364] amdgpu 0000:01:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[ 7805.053365] amdgpu 0000:01:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0608800C
[ 7805.053367] amdgpu 0000:01:00.0: amdgpu: VM fault (0x0c, vmid 3) at page 0, read from '' (0x00000000) (136)
[ 7813.142358] [TTM] Failed to find memory space for buffer 0x00000000812205b0 eviction
[ 7813.142371] [TTM]  No space for 00000000812205b0 (524288 pages, 2097152K, 2048M)
[ 7813.142373] [TTM]    placement[0]=0x00060002 (1)
[ 7813.142374] [TTM]      has_type: 1
[ 7813.142374] [TTM]      use_type: 1
[ 7813.142375] [TTM]      flags: 0x0000000A
[ 7813.142376] [TTM]      gpu_offset: 0xFF00000000
[ 7813.142376] [TTM]      size: 786432
[ 7813.142377] [TTM]      available_caching: 0x00070000
[ 7813.142377] [TTM]      default_caching: 0x00010000
[ 7813.142379] [TTM]  0x0000000000000400-0x0000000000000402: 2: used
[ 7813.142380] [TTM]  0x0000000000000402-0x0000000000000412: 16: used
[ 7813.142381] [TTM]  0x0000000000000412-0x0000000000000414: 2: used
[ 7813.142382] [TTM]  0x0000000000000414-0x0000000000000416: 2: used
[ 7813.142383] [TTM]  0x0000000000000416-0x0000000000000418: 2: used
[ 7813.142384] [TTM]  0x0000000000000418-0x000000000000041a: 2: used
[ 7813.142384] [TTM]  0x000000000000041a-0x000000000000041c: 2: used
[ 7813.142385] [TTM]  0x000000000000041c-0x000000000000051c: 256: used
[ 7813.142386] [TTM]  0x000000000000051c-0x000000000000061c: 256: used
[ 7813.142387] [TTM]  0x000000000000061c-0x000000000000061e: 2: used
[ 7813.142388] [TTM]  0x000000000000061e-0x0000000000000620: 2: used
[ 7813.142388] [TTM]  0x0000000000000620-0x0000000000000622: 2: used
[ 7813.142389] [TTM]  0x0000000000000622-0x0000000000000624: 2: used
[ 7813.142390] [TTM]  0x0000000000000624-0x0000000000000626: 2: used
[ 7813.142391] [TTM]  0x0000000000000626-0x0000000000000628: 2: used
[ 7813.142391] [TTM]  0x0000000000000628-0x000000000000062a: 2: used
[ 7813.142392] [TTM]  0x000000000000062a-0x000000000000062c: 2: used
[ 7813.142393] [TTM]  0x000000000000062c-0x000000000000062e: 2: used
[ 7813.142393] [TTM]  0x000000000000062e-0x0000000000000630: 2: used
[ 7813.142394] [TTM]  0x0000000000000630-0x0000000000000632: 2: used
[ 7813.142395] [TTM]  0x0000000000000632-0x0000000000000634: 2: used
[ 7813.142395] [TTM]  0x0000000000000634-0x0000000000000636: 2: used
[ 7813.142396] [TTM]  0x0000000000000636-0x0000000000000638: 2: used
[ 7813.142397] [TTM]  0x0000000000000638-0x000000000000063a: 2: used
[ 7813.142398] [TTM]  0x000000000000063a-0x000000000000063c: 2: used
[ 7813.142399] [TTM]  0x000000000000063c-0x000000000000063e: 2: used
[ 7813.142400] [TTM]  0x000000000000063e-0x000000000000063f: 1: used
[ 7813.142400] [TTM]  0x000000000000063f-0x0000000000000641: 2: used
[ 7813.142401] [TTM]  0x0000000000000641-0x0000000000000643: 2: used
[ 7813.142402] [TTM]  0x0000000000000643-0x0000000000000645: 2: used
[ 7813.142402] [TTM]  0x0000000000000645-0x0000000000000647: 2: used
[ 7813.142403] [TTM]  0x0000000000000647-0x0000000000000649: 2: used
[ 7813.142404] [TTM]  0x0000000000000649-0x000000000000064b: 2: used
[ 7813.142405] [TTM]  0x000000000000064b-0x000000000000064d: 2: used
[ 7813.142406] [TTM]  0x000000000000064d-0x000000000000064f: 2: used
[ 7813.142406] [TTM]  0x000000000000064f-0x0000000000000651: 2: used
[ 7813.142407] [TTM]  0x0000000000000651-0x0000000000000653: 2: used
[ 7813.142408] [TTM]  0x0000000000000653-0x0000000000000655: 2: used
[ 7813.142409] [TTM]  0x0000000000000655-0x0000000000000657: 2: used
[ 7813.142409] [TTM]  0x0000000000000657-0x0000000000000659: 2: used
[ 7813.142410] [TTM]  0x0000000000000659-0x000000000000065b: 2: used
[ 7813.142411] [TTM]  0x000000000000065b-0x0000000000000692: 55: free
[ 7813.142411] [TTM]  0x0000000000000692-0x0000000000000694: 2: used
[ 7813.142412] [TTM]  0x0000000000000694-0x000000000000070f: 123: free
[ 7813.142413] [TTM]  0x000000000000070f-0x0000000000000711: 2: used
[ 7813.142413] [TTM]  0x0000000000000711-0x000000000000079c: 139: free
[ 7813.142414] [TTM]  0x000000000000079c-0x000000000000079e: 2: used
[ 7813.142415] [TTM]  0x000000000000079e-0x00000000000007ee: 80: free
[ 7813.142415] [TTM]  0x00000000000007ee-0x00000000000007f0: 2: used
[ 7813.142461] [TTM]  0x00000000000007f0-0x00000000000007f2: 2: used
[ 7813.142462] [TTM]  0x00000000000007f2-0x00000000000007fe: 12: free
[ 7813.142463] [TTM]  0x00000000000007fe-0x0000000000000800: 2: used
[ 7813.142463] [TTM]  0x0000000000000800-0x0000000000000806: 6: free
[ 7813.142464] [TTM]  0x0000000000000806-0x0000000000000808: 2: used
[ 7813.142464] [TTM]  0x0000000000000808-0x000000000000080e: 6: free
[ 7813.142465] [TTM]  0x000000000000080e-0x000000000000082e: 32: used
[ 7813.142465] [TTM]  0x000000000000082e-0x000000000000083a: 12: free
[ 7813.142466] [TTM]  0x000000000000083a-0x000000000000083c: 2: used
[ 7813.142467] [TTM]  0x000000000000083c-0x000000000000083e: 2: used
[ 7813.142467] [TTM]  0x000000000000083e-0x0000000000000840: 2: used
[ 7813.142469] [TTM]  0x0000000000000840-0x0000000000000842: 2: used
[ 7813.142469] [TTM]  0x0000000000000842-0x0000000000000844: 2: used
[ 7813.142470] [TTM]  0x0000000000000844-0x0000000000000846: 2: used
[ 7813.142471] [TTM]  0x0000000000000846-0x0000000000000848: 2: used
[ 7813.142472] [TTM]  0x0000000000000848-0x000000000000084a: 2: used
[ 7813.142473] [TTM]  0x000000000000084a-0x000000000000084c: 2: used
[ 7813.142473] [TTM]  0x000000000000084c-0x000000000000084e: 2: used
[ 7813.142474] [TTM]  0x000000000000084e-0x0000000000000850: 2: used
[ 7813.142475] [TTM]  0x0000000000000850-0x0000000000000852: 2: used
[ 7813.142475] [TTM]  0x0000000000000852-0x0000000000000854: 2: used
[ 7813.142476] [TTM]  0x0000000000000854-0x000000000000088a: 54: free
[ 7813.142476] [TTM]  0x000000000000088a-0x000000000000088c: 2: used
[ 7813.142477] [TTM]  0x000000000000088c-0x0000000000040000: 259956: free
[ 7813.142478] [TTM]  total: 261120, used 677 free 260443
[ 7813.142479] [TTM]  man size:786432 pages, gtt available:260443 pages, usage:2054MB
[ 7813.270091] [TTM] Failed to find memory space for buffer 0x00000000812205b0 eviction
[ 7813.270104] [TTM]  No space for 00000000812205b0 (524288 pages, 2097152K, 2048M)
[ 7813.270105] [TTM]    placement[0]=0x00060002 (1)
[ 7813.270105] [TTM]      has_type: 1
[ 7813.270106] [TTM]      use_type: 1
[ 7813.270106] [TTM]      flags: 0x0000000A
[ 7813.270107] [TTM]      gpu_offset: 0xFF00000000
[ 7813.270108] [TTM]      size: 786432
[ 7813.270108] [TTM]      available_caching: 0x00070000
[ 7813.270109] [TTM]      default_caching: 0x00010000
[ 7813.270110] [TTM]  0x0000000000000400-0x0000000000000402: 2: used
[ 7813.270111] [TTM]  0x0000000000000402-0x0000000000000412: 16: used
[ 7813.270112] [TTM]  0x0000000000000412-0x0000000000000414: 2: used
[ 7813.270113] [TTM]  0x0000000000000414-0x0000000000000416: 2: used
[ 7813.270113] [TTM]  0x0000000000000416-0x0000000000000418: 2: used
[ 7813.270114] [TTM]  0x0000000000000418-0x000000000000041a: 2: used
[ 7813.270115] [TTM]  0x000000000000041a-0x000000000000041c: 2: used
[ 7813.270116] [TTM]  0x000000000000041c-0x000000000000051c: 256: used
[ 7813.270116] [TTM]  0x000000000000051c-0x000000000000061c: 256: used
[ 7813.270117] [TTM]  0x000000000000061c-0x000000000000061e: 2: used
[ 7813.270118] [TTM]  0x000000000000061e-0x0000000000040000: 260578: free
[ 7813.270119] [TTM]  total: 261120, used 542 free 260578
[ 7813.270120] [TTM]  man size:786432 pages, gtt available:261602 pages, usage:2050MB
[ 7813.339330] [TTM] Failed to find memory space for buffer 0x00000000812205b0 eviction
[ 7813.339339] [TTM]  No space for 00000000812205b0 (524288 pages, 2097152K, 2048M)
[ 7813.339340] [TTM]    placement[0]=0x00060002 (1)
[ 7813.339341] [TTM]      has_type: 1
[ 7813.339341] [TTM]      use_type: 1
[ 7813.339342] [TTM]      flags: 0x0000000A
[ 7813.339343] [TTM]      gpu_offset: 0xFF00000000
[ 7813.339343] [TTM]      size: 786432
[ 7813.339344] [TTM]      available_caching: 0x00070000
[ 7813.339344] [TTM]      default_caching: 0x00010000
[ 7813.339347] [TTM]  0x0000000000000400-0x0000000000000402: 2: used
[ 7813.339348] [TTM]  0x0000000000000402-0x0000000000000412: 16: used
[ 7813.339348] [TTM]  0x0000000000000412-0x0000000000000414: 2: used
[ 7813.339349] [TTM]  0x0000000000000414-0x0000000000000416: 2: used
[ 7813.339350] [TTM]  0x0000000000000416-0x0000000000000418: 2: used
[ 7813.339350] [TTM]  0x0000000000000418-0x000000000000041a: 2: used
[ 7813.339351] [TTM]  0x000000000000041a-0x000000000000041c: 2: used
[ 7813.339352] [TTM]  0x000000000000041c-0x000000000000051c: 256: used
[ 7813.339353] [TTM]  0x000000000000051c-0x000000000000061c: 256: used
[ 7813.339353] [TTM]  0x000000000000061c-0x000000000000061e: 2: used
[ 7813.339354] [TTM]  0x000000000000061e-0x0000000000000620: 2: used
[ 7813.339355] [TTM]  0x0000000000000620-0x0000000000000622: 2: used
[ 7813.339356] [TTM]  0x0000000000000622-0x0000000000000624: 2: used
[ 7813.339357] [TTM]  0x0000000000000624-0x0000000000000626: 2: used
[ 7813.339357] [TTM]  0x0000000000000626-0x0000000000000628: 2: used
[ 7813.339358] [TTM]  0x0000000000000628-0x00000000000006fe: 214: free
[ 7813.339359] [TTM]  0x00000000000006fe-0x000000000000071e: 32: used
[ 7813.339360] [TTM]  0x000000000000071e-0x000000000000071f: 1: used
[ 7813.339360] [TTM]  0x000000000000071f-0x0000000000040000: 260321: free
[ 7813.339361] [TTM]  total: 261120, used 585 free 260535
[ 7813.339363] [TTM]  man size:786432 pages, gtt available:260791 pages, usage:2053MB
[ 7813.437505] [TTM] Failed to find memory space for buffer 0x00000000812205b0 eviction
[ 7813.437516] [TTM]  No space for 00000000812205b0 (524288 pages, 2097152K, 2048M)
[ 7813.437517] [TTM]    placement[0]=0x00060002 (1)
[ 7813.437518] [TTM]      has_type: 1
[ 7813.437519] [TTM]      use_type: 1
[ 7813.437519] [TTM]      flags: 0x0000000A
[ 7813.437520] [TTM]      gpu_offset: 0xFF00000000
[ 7813.437521] [TTM]      size: 786432
[ 7813.437521] [TTM]      available_caching: 0x00070000
[ 7813.437522] [TTM]      default_caching: 0x00010000
[ 7813.437523] [TTM]  0x0000000000000400-0x0000000000000402: 2: used
[ 7813.437524] [TTM]  0x0000000000000402-0x0000000000000412: 16: used
[ 7813.437525] [TTM]  0x0000000000000412-0x0000000000000414: 2: used
[ 7813.437526] [TTM]  0x0000000000000414-0x0000000000000416: 2: used
[ 7813.437527] [TTM]  0x0000000000000416-0x0000000000000418: 2: used
[ 7813.437527] [TTM]  0x0000000000000418-0x000000000000041a: 2: used
[ 7813.437528] [TTM]  0x000000000000041a-0x000000000000041c: 2: used
[ 7813.437529] [TTM]  0x000000000000041c-0x000000000000051c: 256: used
[ 7813.437529] [TTM]  0x000000000000051c-0x000000000000061c: 256: used
[ 7813.437530] [TTM]  0x000000000000061c-0x000000000000061e: 2: used
[ 7813.437531] [TTM]  0x000000000000061e-0x0000000000040000: 260578: free
[ 7813.437531] [TTM]  total: 261120, used 542 free 260578
[ 7813.437533] [TTM]  man size:786432 pages, gtt available:261602 pages, usage:2050MB
[ 7813.438518] arb_uniform_buf[143135]: segfault at 0 ip 00007f20b6f990d7 sp 00007ffdebfcc8c8 error 6 in libc-2.31.so[7f20b6eff000+178000]
[ 7813.438532] Code: Bad RIP value.
[ 7919.344885] arb_shader_stor[146734]: segfault at 0 ip 00007fe2ab5020d7 sp 00007fff6027eda8 error 6 in libc-2.31.so[7fe2ab468000+178000]
[ 7919.344894] Code: Bad RIP value.
[ 7919.897315] arb_shader_stor[146769]: segfault at 0 ip 00007f10d8fbd0d7 sp 00007ffcf8895608 error 6 in libc-2.31.so[7f10d8f23000+178000]
[ 7919.897332] Code: Bad RIP value.
[ 8009.208256] egl-copy-buffer[147619]: segfault at 18 ip 00007f968e8c9e9b sp 00007ffe7ca12200 error 4 in libEGL_mesa.so.0.0.0[7f968e8a9000+26000]
[ 8009.208263] Code: Bad RIP value.
[ 8032.266864] perf: interrupt took too long (2507 > 2500), lowering kernel.perf_event_max_sample_rate to 79750
[ 8070.875068] [TTM] Buffer eviction failed
[ 8080.462745] amdgpu 0000:01:00.0: amdgpu: GPU fault detected: 146 0x00ce8804
[ 8080.462756] amdgpu 0000:01:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00134006
[ 8080.462758] amdgpu 0000:01:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0E088004
[ 8080.462759] amdgpu 0000:01:00.0: amdgpu: VM fault (0x04, vmid 7) at page 1261574, read from '' (0x00000000) (136)
[ 8080.478266] amdgpu 0000:01:00.0: amdgpu: GPU fault detected: 146 0x00c28804
[ 8080.478271] amdgpu 0000:01:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00134006
[ 8080.478272] amdgpu 0000:01:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02088004
[ 8080.478274] amdgpu 0000:01:00.0: amdgpu: VM fault (0x04, vmid 1) at page 1261574, read from '' (0x00000000) (136)
[ 8204.864339] shader_runner[168816]: segfault at 7f64df7fe9d0 ip 00007f6506a47aab sp 00007fff3961d340 error 4
[ 8204.864343] shader_runner[168803]: segfault at 7faa7d7fa9d0 ip 00007faa941ccaab sp 00007ffca97f4490 error 4
[ 8204.864345]  in libpthread-2.31.so[7f6506a44000+11000]
[ 8204.864348] Code: Bad RIP value.
[ 8204.864349]  in libpthread-2.31.so[7faa941c9000+11000]
[ 8204.864351] Code: Bad RIP value.
[ 8204.864376] shader_runner[168801]: segfault at 7f12bf7fe9d0 ip 00007f12d3155aab sp 00007fff81846f80 error 4 in libpthread-2.31.so[7f12d3152000+11000]
[ 8204.864381] Code: Bad RIP value.
[ 8204.864501] shader_runner[168802]: segfault at 7f7225ffb9d0 ip 00007f723cfa4aab sp 00007ffda6a8a890 error 4 in libpthread-2.31.so[7f723cfa1000+11000]
[ 8204.864507] Code: Bad RIP value.
[ 8207.293001] shader_runner[168847]: segfault at 7f4781ffb9d0 ip 00007f4799379aab sp 00007ffd72820630 error 4 in libpthread-2.31.so[7f4799376000+11000]
[ 8207.293009] Code: Bad RIP value.
[ 8207.303214] shader_runner[168849]: segfault at 7f01a27fc9d0 ip 00007f01c1c58aab sp 00007ffef3fc31d0 error 4 in libpthread-2.31.so[7f01c1c55000+11000]
[ 8207.303220] Code: Bad RIP value.
[ 8207.333651] shader_runner[168872]: segfault at 7f84fffff9d0 ip 00007f852f5f4aab sp 00007ffc03821a30 error 4 in libpthread-2.31.so[7f852f5f1000+11000]
[ 8207.333656] Code: Bad RIP value.
[ 8207.339399] shader_runner[168875]: segfault at 7f5dedffb9d0 ip 00007f5e04e37aab sp 00007ffd41558ad0 error 4 in libpthread-2.31.so[7f5e04e34000+11000]
[ 8207.339405] Code: Bad RIP value.
[ 8207.515900] shader_runner[168890]: segfault at 7f3e677fe9d0 ip 00007f3e76a5baab sp 00007ffe1bdbfa30 error 4 in libpthread-2.31.so[7f3e76a58000+11000]
[ 8207.515907] Code: Bad RIP value.
[ 8207.551837] shader_runner[168915]: segfault at 7f14667fc9d0 ip 00007f147dbdbaab sp 00007ffef737bb30 error 4 in libpthread-2.31.so[7f147dbd8000+11000]
[ 8207.551842] Code: Bad RIP value.
[ 8209.900683] show_signal_msg: 38 callbacks suppressed
[ 8209.900686] shader_runner[169450]: segfault at 7fe88d1119d0 ip 00007fe897dc7aab sp 00007fff9a7994e0 error 4 in libpthread-2.31.so[7fe897dc4000+11000]
[ 8209.900695] Code: Bad RIP value.
[ 8209.958317] shader_runner[169463]: segfault at 7f05d8ff99d0 ip 00007f05e82a9aab sp 00007ffd29495db0 error 4 in libpthread-2.31.so[7f05e82a6000+11000]
[ 8209.958323] Code: Bad RIP value.
[ 8210.016780] shader_runner[169477]: segfault at 7fd1657fa9d0 ip 00007fd174c58aab sp 00007ffd46a738b0 error 4 in libpthread-2.31.so[7fd174c55000+11000]
[ 8210.016787] Code: Bad RIP value.
[ 8210.095393] shader_runner[169492]: segfault at 7f8d79d7c9d0 ip 00007f8d84a32aab sp 00007ffe83c7c320 error 4 in libpthread-2.31.so[7f8d84a2f000+11000]
[ 8210.095398] Code: Bad RIP value.
[ 8210.175068] shader_runner[169506]: segfault at 7f27877fe9d0 ip 00007f27a68b4aab sp 00007ffd39ff79a0 error 4 in libpthread-2.31.so[7f27a68b1000+11000]
[ 8210.175075] Code: Bad RIP value.
[ 8210.202147] shader_runner[169519]: segfault at 7f315a7fc9d0 ip 00007f316970daab sp 00007ffee6c3a210 error 4 in libpthread-2.31.so[7f316970a000+11000]
[ 8210.202156] Code: Bad RIP value.
[ 8210.288298] shader_runner[169534]: segfault at 7f7a3cff99d0 ip 00007f7a4c23baab sp 00007ffc087caeb0 error 4 in libpthread-2.31.so[7f7a4c238000+11000]
[ 8210.288303] Code: Bad RIP value.
[ 8210.329530] shader_runner[169547]: segfault at 7f63f57fa9d0 ip 00007f6404af5aab sp 00007ffdf3e7f790 error 4 in libpthread-2.31.so[7f6404af2000+11000]
[ 8210.329536] Code: Bad RIP value.
[ 8210.412320] shader_runner[169562]: segfault at 7f622471f9d0 ip 00007f622f3d5aab sp 00007fff6f38f6b0 error 4 in libpthread-2.31.so[7f622f3d2000+11000]
[ 8210.412325] Code: Bad RIP value.
[ 8210.455261] shader_runner[169575]: segfault at 7f0d177fe9d0 ip 00007f0d2e351aab sp 00007fff77b01400 error 4 in libpthread-2.31.so[7f0d2e34e000+11000]
[ 8210.455269] Code: Bad RIP value.
[ 8218.886289] show_signal_msg: 27 callbacks suppressed
[ 8218.886292] shader_runner[172286]: segfault at 56393e81e408 ip 00007f4feb9a3ed9 sp 00007ffe74015800 error 4 in radeonsi_dri.so[7f4feb6ad000+d49000]
[ 8218.886297] Code: Bad RIP value.
[ 8218.899687] shader_runner[172285]: segfault at 563750011378 ip 00007ff7236e4ed9 sp 00007ffe2e978e10 error 4 in radeonsi_dri.so[7ff7233ee000+d49000]
[ 8218.899692] Code: Bad RIP value.
[ 8219.001985] shader_runner[172334]: segfault at 5623ce8c4848 ip 00007fa239f2bed9 sp 00007ffcaf7c4170 error 4 in radeonsi_dri.so[7fa239c35000+d49000]
[ 8219.001991] Code: Bad RIP value.
[ 8219.490115] shader_runner[172514]: segfault at 55f2d3009314 ip 00007fad22647500 sp 00007ffe441c0120 error 4 in radeonsi_dri.so[7fad2234f000+d49000]
[ 8219.490123] Code: Bad RIP value.
[ 8219.491095] shader_runner[172516]: segfault at 563bd86d20a4 ip 00007fb9e40f9500 sp 00007ffcd77518b0 error 4 in radeonsi_dri.so[7fb9e3e01000+d49000]
[ 8219.491101] Code: Bad RIP value.
[ 8219.711083] shader_runner[172588]: segfault at 55ca9ae686a4 ip 00007fe140555500 sp 00007ffe9cae1400 error 4 in radeonsi_dri.so[7fe14025d000+d49000]
[ 8219.711090] Code: Bad RIP value.
[ 8430.203633] perf: interrupt took too long (3138 > 3133), lowering kernel.perf_event_max_sample_rate to 63500
[ 9055.012725] audit: type=1400 audit(1595774523.846:84): apparmor="ALLOWED" operation="open" profile="libreoffice-soffice" name="/usr/share/libdrm/amdgpu.ids" pid=383072 comm="soffice.bin" requested_mask="r" denied_mask="r" fsuid=1000 ouid=0
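[Aside on the log above: the shader_runner faults can be tallied per faulting object to separate the libpthread crashes from the radeonsi_dri.so ones. A minimal sketch; the helper name is illustrative and the regex targets the show_signal_msg format shown here:]

```python
import re

# Matches the kernel's show_signal_msg output, e.g.
#   shader_runner[169450]: segfault at ... error 4 in libpthread-2.31.so[7fe...+11000]
SEGV_RE = re.compile(r"(\S+)\[(\d+)\]: segfault at \S+ .* in ([^\[]+)\[")

def faulting_objects(lines):
    """Count segfaults per mapped object (library or binary)."""
    counts = {}
    for line in lines:
        m = SEGV_RE.search(line)
        if m:
            obj = m.group(3)
            counts[obj] = counts.get(obj, 0) + 1
    return counts

log = [
    "[ 8210.455261] shader_runner[169575]: segfault at 7f0d177fe9d0 ip 00007f0d2e351aab sp 00007fff77b01400 error 4 in libpthread-2.31.so[7f0d2e34e000+11000]",
    "[ 8219.711083] shader_runner[172588]: segfault at 55ca9ae686a4 ip 00007fe140555500 sp 00007ffe9cae1400 error 4 in radeonsi_dri.so[7fe14025d000+d49000]",
]
print(faulting_objects(log))  # {'libpthread-2.31.so': 1, 'radeonsi_dri.so': 1}
```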

[-- Attachment #3: piglit_tests_amddcsi.ods --]
[-- Type: application/vnd.oasis.opendocument.spreadsheet, Size: 34347 bytes --]

[-- Attachment #4: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-07-26 15:31           ` Re: Mauro Rossi
@ 2020-07-27 18:31             ` Alex Deucher
  2020-07-27 19:46               ` Re: Mauro Rossi
  0 siblings, 1 reply; 1546+ messages in thread
From: Alex Deucher @ 2020-07-27 18:31 UTC (permalink / raw)
  To: Mauro Rossi
  Cc: Deucher, Alexander, Harry Wentland, Christian Koenig,
	amd-gfx list

On Sun, Jul 26, 2020 at 11:31 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>
> Hello,
>
> On Fri, Jul 24, 2020 at 8:31 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>>
>> On Wed, Jul 22, 2020 at 3:57 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>> >
>> > Hello,
>> > re-sending and copying full DL
>> >
>> > On Wed, Jul 22, 2020 at 4:51 AM Alex Deucher <alexdeucher@gmail.com> wrote:
>> >>
>> >> On Mon, Jul 20, 2020 at 6:00 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>> >> >
>> >> > Hi Christian,
>> >> >
>> >> > On Mon, Jul 20, 2020 at 11:00 AM Christian König
>> >> > <ckoenig.leichtzumerken@gmail.com> wrote:
>> >> > >
>> >> > > Hi Mauro,
>> >> > >
>> >> > > I'm not deep into the whole DC design, so just some general high level
>> >> > > comments on the cover letter:
>> >> > >
>> >> > > 1. Please add a subject line to the cover letter, my spam filter thinks
>> >> > > that this is suspicious otherwise.
>> >> >
>> >> > My mistake in the editing of the cover letter with git send-email;
>> >> > I may have forgotten to keep the Subject at the top
>> >> >
>> >> > >
>> >> > > 2. Then you should probably note how well (badly?) is that tested. Since
>> >> > > you noted proof of concept it might not even work.
>> >> >
>> >> > The Changelog is to be read as:
>> >> >
>> >> > [RFC] was the initial proof of concept, and [PATCH v2] was
>> >> > just a rebase onto amd-staging-drm-next
>> >> >
>> >> > this series [PATCH v3] has all the known changes required for DCE6 specifics,
>> >> > and is based on a long offline thread with Alexander Deucher and past
>> >> > dri-devel chats with Harry Wentland.
>> >> >
>> >> > It was tested as far as my means allowed, with an HD7750 and an HD7950,
>> >> > checking the dmesg output for the absence of "missing registers/masks"
>> >> > kernel WARNINGs, and with kernel builds on Ubuntu 20.04 and on android-x86
>> >> >
>> >> > The proposal I made to Alex is that AMD testing systems will be used
>> >> > for further regression testing,
>> >> > as part of review and validation for eligibility to amd-staging-drm-next
>> >> >
>> >>
>> >> We will certainly test it once it lands, but presumably this is
>> >> working on the SI cards you have access to?
>> >
>> >
>> > Yes, most of my testing was done with android-x86  Android CTS (EGL, GLES2, GLES3, VK)
>> >
>> > I am also in contact with a person with Firepro W5130M who is running a piglit session
>> >
>> > I had bought an HD7850 to test with Pitcairn, but it arrived defective, so I could not test with Pitcairn
>> >
>> >
>> >>
>> >> > >
>> >> > > 3. How feature complete (HDMI audio?, Freesync?) is it?
>> >> >
>> >> > All the changes in DC impacting DCE8 (dc/dce80 path) were ported to
>> >> > DCE6 (dc/dce60 path) over the two years since the initial submission
>> >> >
>> >> > >
>> >> > > Apart from that it looks like a rather impressive piece of work :)
>> >> > >
>> >> > > Cheers,
>> >> > > Christian.
>> >> >
>> >> > Thanks,
>> >> > please consider that most of the latest DCE6 specific parts were
>> >> > possible due to recent Alex support in getting the correct DCE6
>> >> > headers,
>> >> > his suggestions and continuous feedback.
>> >> >
>> >> > I would suggest that Alex comments on the proposed next steps to follow.
>> >>
>> >> The code looks pretty good to me.  I'd like to get some feedback from
>> >> the display team to see if they have any concerns, but beyond that I
>> >> think we can pull it into the tree and continue improving it there.
>> >> Do you have a link to a git tree I can pull directly that contains
>> >> these patches?  Is this the right branch?
>> >> https://github.com/maurossi/linux/commits/kernel-5.8rc4_si_next
>> >>
>> >> Thanks!
>> >>
>> >> Alex
>> >
>> >
>> > The following branch was pushed with the series on top of amd-staging-drm-next
>> >
>> > https://github.com/maurossi/linux/commits/kernel-5.6_si_drm-next
>>
>> I gave this a quick test on all of the SI asics and the various
>> monitors I had available and it looks good.  A few minor patches I
>> noticed are attached.  If they look good to you, I'll squash them into
>> the series when I commit it.  I've pushed it to my fdo tree as well:
>> https://cgit.freedesktop.org/~agd5f/linux/log/?h=si_dc_support
>>
>> Thanks!
>>
>> Alex
>
>
> The new patches are OK, and with the following information about the piglit tests,
> the series may be good to go.
>
> I have performed piglit tests on Tahiti HD7950 on kernel 5.8.0-rc6 with AMD DC support for SI
> and comparison with vanilla kernel 5.8.0-rc6
>
> Results are the following
>
> [piglit gpu tests with kernel 5.8.0-rc6-amddcsi]
>
> utente@utente-desktop:~/piglit$ ./piglit run gpu .
> [26714/26714] skip: 1731, pass: 24669, warn: 15, fail: 288, crash: 11
> Thank you for running Piglit!
> Results have been written to /home/utente/piglit
>
> [piglit gpu tests with vanilla 5.8.0-rc6]
>
> utente@utente-desktop:~/piglit$ ./piglit run gpu .
> [26714/26714] skip: 1731, pass: 24673, warn: 13, fail: 283, crash: 14
> Thank you for running Piglit!
> Results have been written to /home/utente/piglit
>
> In the attachment the comparison of "5.8.0-rc6-amddcsi" vs "5.8.0-rc6" vanilla,
> and vice versa; I see no significant regression, and in the delta of failed tests I don't recognize any DC-related test cases,
> but you may also have a look.

Looks good to me.  The series is:
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

>
> dmesg for "5.8.0-rc6-amddcsi" is also provided to check the crashes
>
> Regarding the other user testing the series with a Firepro W5130M,
> he found a pre-existing issue with amdgpu si_support=1 which is independent of my series and matches a problem already reported. [1]
>

amdgpu does not currently implement GPU reset support for SI.

Alex
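
[For anyone reproducing the Firepro W5130M report above: SI parts default to the radeon driver, so amdgpu (and DC) has to be opted in explicitly. A sketch using the standard amdgpu/radeon module parameters; the file path is just the conventional modprobe.d location:]

```
# /etc/modprobe.d/amdgpu-si.conf (path illustrative)
options amdgpu si_support=1 dc=1
options radeon si_support=0
```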

> Mauro
>
> [1] https://bbs.archlinux.org/viewtopic.php?id=249097
>
>>
>>
>> >
>> >>
>> >>
>> >> >
>> >> > Mauro
>> >> >
>> >> > >
>> >> > > Am 16.07.20 um 23:22 schrieb Mauro Rossi:
>> >> > > > The series adds SI support to AMD DC
>> >> > > >
>> >> > > > Changelog:
>> >> > > >
>> >> > > > [RFC]
>> >> > > > Preliminary Proof Of Concept, with DCE8 headers still used in dce60_resources.c
>> >> > > >
>> >> > > > [PATCH v2]
>> >> > > > Rebase on amd-staging-drm-next dated 17-Oct-2018
>> >> > > >
>> >> > > > [PATCH v3]
>> >> > > > Add support for DCE6 specific headers,
>> >> > > > ad hoc DCE6 macros, functions and fixes,
>> >> > > > rebase on current amd-staging-drm-next
>> >> > > >
>> >> > > >
>> >> > > > Commits [01/27]..[08/27] SI support added in various DC components
>> >> > > >
>> >> > > > [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
>> >> > > > [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
>> >> > > > [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6 support (v9b)
>> >> > > > [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support (v2)
>> >> > > > [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
>> >> > > > [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6 (v2)
>> >> > > > [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6 (v4)
>> >> > > > [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
>> >> > > >
>> >> > > > Commits [09/27]..[24/27] DCE6 specific code adaptions
>> >> > > >
>> >> > > > [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI parts (v2)
>> >> > > > [PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size to 64
>> >> > > > [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific macros,functions
>> >> > > > [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific macros
>> >> > > > [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific macros,functions
>> >> > > > [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific macros,functions
>> >> > > > [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6 specific macros,functions
>> >> > > > [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6 specific macros,functions
>> >> > > > [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific macros,functions
>> >> > > > [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6 specific macros,functions
>> >> > > > [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
>> >> > > > [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init
>> >> > > > [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6 macros,functions
>> >> > > > [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6 specific .cursor_lock
>> >> > > > [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add DCE6 specific functions
>> >> > > > [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
>> >> > > >
>> >> > > >
>> >> > > > Commits [25/27]..[27/27] SI support final enablements
>> >> > > >
>> >> > > > [PATCH v3 25/27] drm/amd/display: create plane rotation property for Bonaire and later
>> >> > > > [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
>> >> > > > [PATCH v3 27/27] drm/amd/display: enable SI support in the Kconfig (v2)
>> >> > > >
>> >> > > >
>> >> > > > Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
>> >> > > >
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-07-27 18:31             ` Re: Alex Deucher
@ 2020-07-27 19:46               ` Mauro Rossi
  2020-07-27 19:54                 ` Re: Alex Deucher
  0 siblings, 1 reply; 1546+ messages in thread
From: Mauro Rossi @ 2020-07-27 19:46 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Deucher, Alexander, Harry Wentland, Christian Koenig,
	amd-gfx list


[-- Attachment #1.1: Type: text/plain, Size: 10849 bytes --]

On Mon, Jul 27, 2020 at 8:31 PM Alex Deucher <alexdeucher@gmail.com> wrote:

> On Sun, Jul 26, 2020 at 11:31 AM Mauro Rossi <issor.oruam@gmail.com>
> wrote:
> >
> > Hello,
> >
> > On Fri, Jul 24, 2020 at 8:31 PM Alex Deucher <alexdeucher@gmail.com>
> wrote:
> >>
> >> On Wed, Jul 22, 2020 at 3:57 AM Mauro Rossi <issor.oruam@gmail.com>
> wrote:
> >> >
> >> > Hello,
> >> > re-sending and copying full DL
> >> >
> >> > On Wed, Jul 22, 2020 at 4:51 AM Alex Deucher <alexdeucher@gmail.com>
> wrote:
> >> >>
> >> >> On Mon, Jul 20, 2020 at 6:00 AM Mauro Rossi <issor.oruam@gmail.com>
> wrote:
> >> >> >
> >> >> > Hi Christian,
> >> >> >
> >> >> > On Mon, Jul 20, 2020 at 11:00 AM Christian König
> >> >> > <ckoenig.leichtzumerken@gmail.com> wrote:
> >> >> > >
> >> >> > > Hi Mauro,
> >> >> > >
> >> >> > > I'm not deep into the whole DC design, so just some general high
> level
> >> >> > > comments on the cover letter:
> >> >> > >
> >> >> > > 1. Please add a subject line to the cover letter, my spam filter
> thinks
> >> >> > > that this is suspicious otherwise.
> >> >> >
> >> >> > My mistake in the editing of the cover letter with git send-email;
> >> >> > I may have forgotten to keep the Subject at the top
> >> >> >
> >> >> > >
> >> >> > > 2. Then you should probably note how well (badly?) is that
> tested. Since
> >> >> > > you noted proof of concept it might not even work.
> >> >> >
> >> >> > The Changelog is to be read as:
> >> >> >
> >> >> > [RFC] was the initial proof of concept, and [PATCH v2] was
> >> >> > just a rebase onto amd-staging-drm-next
> >> >> >
> >> >> > this series [PATCH v3] has all the known changes required for DCE6 specifics,
> >> >> > and is based on a long offline thread with Alexander Deucher and past
> >> >> > dri-devel chats with Harry Wentland.
> >> >> >
> >> >> > It was tested as far as my means allowed, with an HD7750 and an HD7950,
> >> >> > checking the dmesg output for the absence of "missing registers/masks"
> >> >> > kernel WARNINGs, and with kernel builds on Ubuntu 20.04 and on android-x86
> >> >> >
> >> >> > The proposal I made to Alex is that AMD testing systems will be
> used
> >> >> > for further regression testing,
> >> >> > as part of review and validation for eligibility to
> amd-staging-drm-next
> >> >> >
> >> >>
> >> >> We will certainly test it once it lands, but presumably this is
> >> >> working on the SI cards you have access to?
> >> >
> >> >
> >> > Yes, most of my testing was done with android-x86  Android CTS (EGL,
> GLES2, GLES3, VK)
> >> >
> >> > I am also in contact with a person with Firepro W5130M who is running
> a piglit session
> >> >
> >> > I had bought an HD7850 to test with Pitcairn, but it arrived
> defective, so I could not test with Pitcairn
> >> >
> >> >
> >> >>
> >> >> > >
> >> >> > > 3. How feature complete (HDMI audio?, Freesync?) is it?
> >> >> >
> >> >> > All the changes in DC impacting DCE8 (dc/dce80 path) were ported to
> >> >> > DCE6 (dc/dce60 path) over the two years since the initial submission
> >> >> >
> >> >> > >
> >> >> > > Apart from that it looks like a rather impressive piece of work
> :)
> >> >> > >
> >> >> > > Cheers,
> >> >> > > Christian.
> >> >> >
> >> >> > Thanks,
> >> >> > please consider that most of the latest DCE6 specific parts were
> >> >> > possible due to recent Alex support in getting the correct DCE6
> >> >> > headers,
> >> >> > his suggestions and continuous feedback.
> >> >> >
> >> >> > I would suggest that Alex comments on the proposed next steps to
> follow.
> >> >>
> >> >> The code looks pretty good to me.  I'd like to get some feedback from
> >> >> the display team to see if they have any concerns, but beyond that I
> >> >> think we can pull it into the tree and continue improving it there.
> >> >> Do you have a link to a git tree I can pull directly that contains
> >> >> these patches?  Is this the right branch?
> >> >> https://github.com/maurossi/linux/commits/kernel-5.8rc4_si_next
> >> >>
> >> >> Thanks!
> >> >>
> >> >> Alex
> >> >
> >> >
> >> > The following branch was pushed with the series on top of
> amd-staging-drm-next
> >> >
> >> > https://github.com/maurossi/linux/commits/kernel-5.6_si_drm-next
> >>
> >> I gave this a quick test on all of the SI asics and the various
> >> monitors I had available and it looks good.  A few minor patches I
> >> noticed are attached.  If they look good to you, I'll squash them into
> >> the series when I commit it.  I've pushed it to my fdo tree as well:
> >> https://cgit.freedesktop.org/~agd5f/linux/log/?h=si_dc_support
> >>
> >> Thanks!
> >>
> >> Alex
> >
> >
> > The new patches are OK, and with the following information about the piglit tests,
> > the series may be good to go.
> >
> > I have performed piglit tests on Tahiti HD7950 on kernel 5.8.0-rc6 with
> AMD DC support for SI
> > and comparison with vanilla kernel 5.8.0-rc6
> >
> > Results are the following
> >
> > [piglit gpu tests with kernel 5.8.0-rc6-amddcsi]
> >
> > utente@utente-desktop:~/piglit$ ./piglit run gpu .
> > [26714/26714] skip: 1731, pass: 24669, warn: 15, fail: 288, crash: 11
> > Thank you for running Piglit!
> > Results have been written to /home/utente/piglit
> >
> > [piglit gpu tests with vanilla 5.8.0-rc6]
> >
> > utente@utente-desktop:~/piglit$ ./piglit run gpu .
> > [26714/26714] skip: 1731, pass: 24673, warn: 13, fail: 283, crash: 14
> > Thank you for running Piglit!
> > Results have been written to /home/utente/piglit
> >
> > In the attachment the comparison of "5.8.0-rc6-amddcsi" vs "5.8.0-rc6" vanilla,
> > and vice versa; I see no significant regression, and in the delta of
> > failed tests I don't recognize any DC-related test cases,
> > but you may also have a look.
>
> Looks good to me.  The series is:
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>

Thank you Alex for the review and the help in finalizing the series,
and to Harry, who initially encouraged me and provided feedback on the
previous v2 series



>
> >
> > dmesg for "5.8.0-rc6-amddcsi" is also provided to check the crashes
> >
> > Regarding the other user testing the series with a Firepro W5130M,
> > he found a pre-existing issue with amdgpu si_support=1 which is
> > independent of my series and matches a problem already reported. [1]
> >
>
> amdgpu does not currently implement GPU reset support for SI.
>
> Alex
>

If you plan to add support and prevent those crashes,
the user would be glad to help with glxgears and piglit testing
on the Firepro W5130M

Please let me know

Mauro
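
[Aside: the failed-test deltas quoted above can be diffed mechanically rather than by eye. A minimal sketch, assuming each run has been reduced to a test-name to result mapping, roughly what piglit's JSON results provide; the function name is illustrative:]

```python
def compare_runs(baseline, candidate):
    """Return tests present in both runs whose result changed."""
    changed = {}
    for name in baseline.keys() & candidate.keys():
        if baseline[name] != candidate[name]:
            changed[name] = (baseline[name], candidate[name])
    return changed

# Tiny illustrative inputs, not real piglit data.
vanilla = {"spec@!opengl 1.1@clipflat": "pass", "glx@glx-swap-event": "fail"}
amddcsi = {"spec@!opengl 1.1@clipflat": "pass", "glx@glx-swap-event": "crash"}
print(compare_runs(vanilla, amddcsi))  # {'glx@glx-swap-event': ('fail', 'crash')}
```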


>
> > Mauro
> >
> > [1] https://bbs.archlinux.org/viewtopic.php?id=249097
> >
> >>
> >>
> >> >
> >> >>
> >> >>
> >> >> >
> >> >> > Mauro
> >> >> >
> >> >> > >
> >> >> > > Am 16.07.20 um 23:22 schrieb Mauro Rossi:
> >> >> > > > The series adds SI support to AMD DC
> >> >> > > >
> >> >> > > > Changelog:
> >> >> > > >
> >> >> > > > [RFC]
> >> >> > > > Preliminary Proof Of Concept, with DCE8 headers still used in
> dce60_resources.c
> >> >> > > >
> >> >> > > > [PATCH v2]
> >> >> > > > Rebase on amd-staging-drm-next dated 17-Oct-2018
> >> >> > > >
> >> >> > > > [PATCH v3]
> >> >> > > > Add support for DCE6 specific headers,
> >> >> > > > ad hoc DCE6 macros, functions and fixes,
> >> >> > > > rebase on current amd-staging-drm-next
> >> >> > > >
> >> >> > > >
> >> >> > > > Commits [01/27]..[08/27] SI support added in various DC
> components
> >> >> > > >
> >> >> > > > [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers
> (v6)
> >> >> > > > [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
> >> >> > > > [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6
> support (v9b)
> >> >> > > > [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support
> (v2)
> >> >> > > > [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
> >> >> > > > [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for
> DCE6 (v2)
> >> >> > > > [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6
> (v4)
> >> >> > > > [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support
> (v4)
> >> >> > > >
> >> >> > > > Commits [09/27]..[24/27] DCE6 specific code adaptions
> >> >> > > >
> >> >> > > > [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for
> SI parts (v2)
> >> >> > > > [PATCH v3 10/27] drm/amd/display: dc/dce60: set
> max_cursor_size to 64
> >> >> > > > [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific
> macros,functions
> >> >> > > > [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific
> macros
> >> >> > > > [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific
> macros,functions
> >> >> > > > [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific
> macros,functions
> >> >> > > > [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6
> specific macros,functions
> >> >> > > > [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6
> specific macros,functions
> >> >> > > > [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific
> macros,functions
> >> >> > > > [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6
> specific macros,functions
> >> >> > > > [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers
> (v7)
> >> >> > > > [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling
> Horizontal Filter Init
> >> >> > > > [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6
> macros,functions
> >> >> > > > [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6
> specific .cursor_lock
> >> >> > > > [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add
> DCE6 specific functions
> >> >> > > > [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers
> (v6)
> >> >> > > >
> >> >> > > >
> >> >> > > > Commits [25/27]..[27/27] SI support final enablements
> >> >> > > >
> >> >> > > > [PATCH v3 25/27] drm/amd/display: create plane rotation
> property for Bonaire and later
> >> >> > > > [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts
> (v2)
> >> >> > > > [PATCH v3 27/27] drm/amd/display: enable SI support in the
> Kconfig (v2)
> >> >> > > >
> >> >> > > >
> >> >> > > > Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
> >> >> > > >
>

[-- Attachment #1.2: Type: text/html, Size: 16193 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-07-27 19:46               ` Re: Mauro Rossi
@ 2020-07-27 19:54                 ` Alex Deucher
  0 siblings, 0 replies; 1546+ messages in thread
From: Alex Deucher @ 2020-07-27 19:54 UTC (permalink / raw)
  To: Mauro Rossi
  Cc: Deucher, Alexander, Harry Wentland, Christian Koenig,
	amd-gfx list

On Mon, Jul 27, 2020 at 3:46 PM Mauro Rossi <issor.oruam@gmail.com> wrote:
>
>
>
> On Mon, Jul 27, 2020 at 8:31 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>>
>> On Sun, Jul 26, 2020 at 11:31 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>> >
>> > Hello,
>> >
>> > On Fri, Jul 24, 2020 at 8:31 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>> >>
>> >> On Wed, Jul 22, 2020 at 3:57 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>> >> >
>> >> > Hello,
>> >> > re-sending and copying full DL
>> >> >
>> >> > On Wed, Jul 22, 2020 at 4:51 AM Alex Deucher <alexdeucher@gmail.com> wrote:
>> >> >>
>> >> >> On Mon, Jul 20, 2020 at 6:00 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>> >> >> >
>> >> >> > Hi Christian,
>> >> >> >
>> >> >> > On Mon, Jul 20, 2020 at 11:00 AM Christian König
>> >> >> > <ckoenig.leichtzumerken@gmail.com> wrote:
>> >> >> > >
>> >> >> > > Hi Mauro,
>> >> >> > >
>> >> >> > > I'm not deep into the whole DC design, so just some general high level
>> >> >> > > comments on the cover letter:
>> >> >> > >
>> >> >> > > 1. Please add a subject line to the cover letter, my spam filter thinks
>> >> >> > > that this is suspicious otherwise.
>> >> >> >
>> >> >> > My mistake in the editing of the cover letter with git send-email;
>> >> >> > I may have forgotten to keep the Subject at the top
>> >> >> >
>> >> >> > >
>> >> >> > > 2. Then you should probably note how well (badly?) is that tested. Since
>> >> >> > > you noted proof of concept it might not even work.
>> >> >> >
>> >> >> > The Changelog is to be read as:
>> >> >> >
>> >> >> > [RFC] was the initial proof of concept, and [PATCH v2] was
>> >> >> > just a rebase onto amd-staging-drm-next
>> >> >> >
>> >> >> > this series [PATCH v3] has all the known changes required for DCE6 specifics,
>> >> >> > and is based on a long offline thread with Alexander Deucher and past
>> >> >> > dri-devel chats with Harry Wentland.
>> >> >> >
>> >> >> > It was tested as far as my means allowed, with an HD7750 and an HD7950,
>> >> >> > checking the dmesg output for the absence of "missing registers/masks"
>> >> >> > kernel WARNINGs, and with kernel builds on Ubuntu 20.04 and on android-x86
>> >> >> >
>> >> >> > The proposal I made to Alex is that AMD testing systems will be used
>> >> >> > for further regression testing,
>> >> >> > as part of review and validation for eligibility to amd-staging-drm-next
>> >> >> >
>> >> >>
>> >> >> We will certainly test it once it lands, but presumably this is
>> >> >> working on the SI cards you have access to?
>> >> >
>> >> >
>> >> > Yes, most of my testing was done with android-x86  Android CTS (EGL, GLES2, GLES3, VK)
>> >> >
>> >> > I am also in contact with a person with Firepro W5130M who is running a piglit session
>> >> >
>> >> > I had bought an HD7850 to test with Pitcairn, but it arrived defective, so I could not test with Pitcairn
>> >> >
>> >> >
>> >> >>
>> >> >> > >
>> >> >> > > 3. How feature complete (HDMI audio?, Freesync?) is it?
>> >> >> >
>> >> >> > All the changes in DC impacting DCE8 (dc/dce80 path) were ported to
>> >> >> > DCE6 (dc/dce60 path) over the two years since the initial submission
>> >> >> >
>> >> >> > >
>> >> >> > > Apart from that it looks like a rather impressive piece of work :)
>> >> >> > >
>> >> >> > > Cheers,
>> >> >> > > Christian.
>> >> >> >
>> >> >> > Thanks,
>> >> >> > please consider that most of the latest DCE6 specific parts were
>> >> >> > possible due to recent Alex support in getting the correct DCE6
>> >> >> > headers,
>> >> >> > his suggestions and continuous feedback.
>> >> >> >
>> >> >> > I would suggest that Alex comments on the proposed next steps to follow.
>> >> >>
>> >> >> The code looks pretty good to me.  I'd like to get some feedback from
>> >> >> the display team to see if they have any concerns, but beyond that I
>> >> >> think we can pull it into the tree and continue improving it there.
>> >> >> Do you have a link to a git tree I can pull directly that contains
>> >> >> these patches?  Is this the right branch?
>> >> >> https://github.com/maurossi/linux/commits/kernel-5.8rc4_si_next
>> >> >>
>> >> >> Thanks!
>> >> >>
>> >> >> Alex
>> >> >
>> >> >
>> >> > The following branch was pushed with the series on top of amd-staging-drm-next
>> >> >
>> >> > https://github.com/maurossi/linux/commits/kernel-5.6_si_drm-next
>> >>
>> >> I gave this a quick test on all of the SI asics and the various
>> >> monitors I had available and it looks good.  A few minor patches I
>> >> noticed are attached.  If they look good to you, I'll squash them into
>> >> the series when I commit it.  I've pushed it to my fdo tree as well:
>> >> https://cgit.freedesktop.org/~agd5f/linux/log/?h=si_dc_support
>> >>
>> >> Thanks!
>> >>
>> >> Alex
>> >
>> >
>> > The new patches are OK, and with the following information about the piglit tests,
>> > the series may be good to go.
>> >
>> > I have performed piglit tests on Tahiti HD7950 on kernel 5.8.0-rc6 with AMD DC support for SI
>> > and comparison with vanilla kernel 5.8.0-rc6
>> >
>> > Results are the following
>> >
>> > [piglit gpu tests with kernel 5.8.0-rc6-amddcsi]
>> >
>> > utente@utente-desktop:~/piglit$ ./piglit run gpu .
>> > [26714/26714] skip: 1731, pass: 24669, warn: 15, fail: 288, crash: 11
>> > Thank you for running Piglit!
>> > Results have been written to /home/utente/piglit
>> >
>> > [piglit gpu tests with vanilla 5.8.0-rc6]
>> >
>> > utente@utente-desktop:~/piglit$ ./piglit run gpu .
>> > [26714/26714] skip: 1731, pass: 24673, warn: 13, fail: 283, crash: 14
>> > Thank you for running Piglit!
>> > Results have been written to /home/utente/piglit
>> >
>> > In the attachment the comparison of "5.8.0-rc6-amddcsi" vs "5.8.0-rc6" vanilla,
>> > and vice versa; I see no significant regression, and in the delta of failed tests I don't recognize any DC-related test cases,
>> > but you may also have a look.
>>
>> Looks good to me.  The series is:
>> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>
>
> Thank you Alex for the review and the help in finalizing the series,
> and to Harry, who initially encouraged me and provided feedback on the previous v2 series
>

Thanks for sticking with this!

>
>>
>>
>> >
>> > dmesg for "5.8.0-rc6-amddcsi" is also provided to check the crashes
>> >
>> > Regarding the other user testing the series with a Firepro W5130M,
>> > he found a pre-existing issue with amdgpu si_support=1 which is independent of my series and matches a problem already reported. [1]
>> >
>>
>> amdgpu does not currently implement GPU reset support for SI.
>>
>> Alex
>
>
> If you plan to add support and prevent those crashes,
> the user would be glad to help with glxgears and piglit testing on the Firepro W5130M

Initial patch here:
https://patchwork.freedesktop.org/patch/380648/

Alex

>
> Please let me know
>
> Mauro
>
>>
>>
>> > Mauro
>> >
>> > [1] https://bbs.archlinux.org/viewtopic.php?id=249097
>> >
>> >>
>> >>
>> >> >
>> >> >>
>> >> >>
>> >> >> >
>> >> >> > Mauro
>> >> >> >
>> >> >> > >
>> >> >> > > Am 16.07.20 um 23:22 schrieb Mauro Rossi:
>> >> >> > > > The series adds SI support to AMD DC
>> >> >> > > >
>> >> >> > > > Changelog:
>> >> >> > > >
>> >> >> > > > [RFC]
>> >> >> > > > Preliminary Proof Of Concept, with DCE8 headers still used in dce60_resources.c
>> >> >> > > >
>> >> >> > > > [PATCH v2]
>> >> >> > > > Rebase on amd-staging-drm-next dated 17-Oct-2018
>> >> >> > > >
>> >> >> > > > [PATCH v3]
>> >> >> > > > Add support for DCE6 specific headers,
>> >> >> > > > ad hoc DCE6 macros, functions and fixes,
>> >> >> > > > rebase on current amd-staging-drm-next
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > Commits [01/27]..[08/27] SI support added in various DC components
>> >> >> > > >
>> >> >> > > > [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
>> >> >> > > > [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
>> >> >> > > > [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6 support (v9b)
>> >> >> > > > [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support (v2)
>> >> >> > > > [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
>> >> >> > > > [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6 (v2)
>> >> >> > > > [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6 (v4)
>> >> >> > > > [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
>> >> >> > > >
>> >> >> > > > Commits [09/27]..[24/27] DCE6 specific code adaptions
>> >> >> > > >
>> >> >> > > > [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI parts (v2)
>> >> >> > > > [PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size to 64
>> >> >> > > > [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific macros,functions
>> >> >> > > > [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific macros
>> >> >> > > > [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific macros,functions
>> >> >> > > > [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific macros,functions
>> >> >> > > > [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6 specific macros,functions
>> >> >> > > > [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6 specific macros,functions
>> >> >> > > > [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific macros,functions
>> >> >> > > > [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6 specific macros,functions
>> >> >> > > > [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
>> >> >> > > > [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init
>> >> >> > > > [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6 macros,functions
>> >> >> > > > [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6 specific .cursor_lock
>> >> >> > > > [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add DCE6 specific functions
>> >> >> > > > [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > Commits [25/27]..[27/27] SI support final enablements
>> >> >> > > >
>> >> >> > > > [PATCH v3 25/27] drm/amd/display: create plane rotation property for Bonaire and later
>> >> >> > > > [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
>> >> >> > > > [PATCH v3 27/27] drm/amd/display: enable SI support in the Kconfig (v2)
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
>> >> >> > > >
>> >> >> > > > _______________________________________________
>> >> >> > > > amd-gfx mailing list
>> >> >> > > > amd-gfx@lists.freedesktop.org
>> >> >> > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>> >> >> > >
>> >> >> > _______________________________________________
>> >> >> > amd-gfx mailing list
>> >> >> > amd-gfx@lists.freedesktop.org
>> >> >> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 1546+ messages in thread


* Re:
  2020-08-06 22:31 ` Konrad Dybcio
@ 2020-08-12 13:37   ` Amit Pundir
  0 siblings, 0 replies; 1546+ messages in thread
From: Amit Pundir @ 2020-08-12 13:37 UTC (permalink / raw)
  To: Konrad Dybcio
  Cc: Andy Gross, Bjorn Andersson, dt, John Stultz, linux-arm-msm, lkml,
	Rob Herring, Sumit Semwal

On Fri, 7 Aug 2020 at 04:02, Konrad Dybcio <konradybcio@gmail.com> wrote:
>
> Subject: Re: [PATCH v4] arm64: dts: qcom: Add support for Xiaomi Poco F1 (Beryllium)
>
> >// This removed_region is needed to boot the device
> >               // TODO: Find out the user of this reserved memory
> >               removed_region: memory@88f00000 {
>
> This region seems to belong to the Trust Zone. When Linux tries to access it, TZ bites and shuts the device down.

That is totally possible. Plus, it falls right in between the TZ and
QSEE reserved-memory regions. However, I could not find any credible
source of information to confirm this, so I'm hesitant to update the
TODO item in the above comment.
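For context, firmware-owned regions like this are conventionally described with a `no-map` reserved-memory node so the kernel never maps them and the Trust Zone is never tripped. A sketch of the pattern follows; the node layout is illustrative and the region size is a placeholder, not taken from the actual Poco F1 devicetree:

```dts
reserved-memory {
	#address-cells = <2>;
	#size-cells = <2>;
	ranges;

	/* Firmware/TZ-owned region; no-map prevents the kernel from
	 * ever mapping it, avoiding the TZ-triggered shutdown.
	 * The size below is a placeholder for illustration only. */
	removed_region: memory@88f00000 {
		reg = <0x0 0x88f00000 0x0 0x1a00000>;
		no-map;
	};
};
```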

>
> Konrad

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-11-06 10:44 Luis Gerhorst
@ 2020-11-06 14:34 ` Pavel Begunkov
  0 siblings, 0 replies; 1546+ messages in thread
From: Pavel Begunkov @ 2020-11-06 14:34 UTC (permalink / raw)
  To: Luis Gerhorst; +Cc: axboe, io-uring, metze, carter.li

On 06/11/2020 10:44, Luis Gerhorst wrote:
> Hello Pavel,
> 
> I'm from a university and am searching for a project to work on in the
> upcoming year. I am looking into allowing userspace to run multiple
> system calls interleaved with application-specific logic using a single
> context switch.
> 
> I noticed that you, Jens Axboe, and Carter Li discussed the possibility
> of integrating eBPF into io_uring earlier this year [1, 2, 3]. Is there
> any WIP on this topic?

To be honest, I finally returned to it a week ago, just because I got
more free time. I had been implicitly patching/refactoring some bits
with this in mind, but rather lazily.

> If not I am considering to implement this. Besides the fact that AOT
> eBPF is only supported for privileged processes, are there any issues
> you are aware of or reasons why this was not implemented yet?

All the others I was anticipating are gone by now. It'd be really great
to think something out for non-privileged processes, but as you know
that doesn't hold us back.

> [1] https://lore.kernel.org/io-uring/67b28e66-f2f8-99a1-dfd1-14f753d11f7a@gmail.com/
> [2] https://lore.kernel.org/io-uring/8b3f182c-7c4b-da41-7ec8-bb4f22429ed1@kernel.dk/
> [3] https://github.com/axboe/liburing/issues/58
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-11-30 10:31 Oleksandr Tyshchenko
@ 2020-11-30 16:21 ` Alex Bennée
  2020-12-29 15:32   ` Re: Roger Pau Monné
  0 siblings, 1 reply; 1546+ messages in thread
From: Alex Bennée @ 2020-11-30 16:21 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Oleksandr Tyshchenko, Paul Durrant, Jan Beulich,
	Andrew Cooper, Roger Pau Monné, Wei Liu, Julien Grall,
	George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini,
	Tim Deegan, Daniel De Graaf, Volodymyr Babchuk, Jun Nakajima,
	Kevin Tian, Anthony PERARD, Bertrand Marquis, Wei Chen, Kaly Xin,
	Artem Mygaiev


Oleksandr Tyshchenko <olekstysh@gmail.com> writes:

> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
>
> Date: Sat, 28 Nov 2020 22:33:51 +0200
> Subject: [PATCH V3 00/23] IOREQ feature (+ virtio-mmio) on Arm
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Hello all.
>
> The purpose of this patch series is to add IOREQ/DM support to Xen on Arm.
> You can find an initial discussion at [1] and RFC/V1/V2 series at [2]/[3]/[4].
> Xen on Arm requires some implementation to forward guest MMIO access to a device
> model in order to implement virtio-mmio backend or even mediator outside of hypervisor.
> As Xen on x86 already contains required support this series tries to make it common
> and introduce Arm specific bits plus some new functionality. Patch series is based on
> Julien's PoC "xen/arm: Add support for Guest IO forwarding to a device emulator".
> Besides splitting existing IOREQ/DM support and introducing Arm side, the series
> also includes virtio-mmio related changes (last 2 patches for toolstack)
> for the reviewers to be able to see how the whole picture could look
> like.

Thanks for posting the latest version.

>
> According to the initial discussion there are a few open questions/concerns
> regarding security, performance in VirtIO solution:
> 1. virtio-mmio vs virtio-pci, SPI vs MSI, different use-cases require different
>    transport...

I think I'm repeating things here I've said in various ephemeral video
chats over the last few weeks but I should probably put things down on
the record.

I think the original intention of the virtio framers was that advanced
features would build on virtio-pci, because you get a bunch of things
"for free" - notably enumeration and MSI support. There is an assumption
that by the time you add these features to virtio-mmio you end up
re-creating your own less well tested version of virtio-pci. I've not
been terribly convinced by the argument that the guest implementation of
PCI presents a sufficiently large blob of code to make the simpler MMIO
desirable. My attempts to build two otherwise identical virtio kernels
(PCI/MMIO) weren't terribly conclusive either way.

That said, virtio-mmio still has life in it because slimmed-down cloud
guests moved to using it, as the enumeration of PCI is a road block to
their fast boot-up requirements. I'm sure they would also appreciate an
MSI implementation to reduce the overhead that handling notifications
currently has under trap-and-emulate.

AIUI for Xen the other downside to PCI is you would have to emulate it
in the hypervisor which would be additional code at the most privileged
level.

> 2. virtio backend is able to access all guest memory, some kind of protection
>    is needed: 'virtio-iommu in Xen' vs 'pre-shared-memory & memcpys in
>    guest'

This is also an area of interest for Project Stratos and something we
would like to be solved generally for all hypervisors. There is a good
write up of some approaches that Jean Phillipe did on the stratos
mailing list:

  From: Jean-Philippe Brucker <jean-philippe@linaro.org>
  Subject: Limited memory sharing investigation
  Message-ID: <20201002134336.GA2196245@myrica>

I suspect there is a good argument for the simplicity of a combined
virt queue but it is unlikely to be very performance orientated.

> 3. interface between toolstack and 'out-of-qemu' virtio backend, avoid using
>    Xenstore in virtio backend if possible.

I wonder how much work it would be for a Rust expert to make:

  https://github.com/slp/vhost-user-blk

handle an IOREQ signalling pathway instead of the vhost-user/eventfd
pathway? That would give a good indication of how "hypervisor blind"
these daemons could be made.

<snip>
>
> Please note, build-test passed for the following modes:
> 1. x86: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y (default)
> 2. x86: #CONFIG_HVM is not set / #CONFIG_IOREQ_SERVER is not set
> 3. Arm64: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y

Forgive my relative newness to Xen, how do I convince the hypervisor to
build with this on? I've tried variants of:

  make -j9 CROSS_COMPILE=aarch64-linux-gnu- XEN_TARGET_ARCH=arm64 menuconfig XEN_EXPERT=y [CONFIG_|XEN_|_]IOREQ_SERVER=y

with no joy...

> 4. Arm64: CONFIG_HVM=y / #CONFIG_IOREQ_SERVER is not set  (default)
> 5. Arm32: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y
> 6. Arm32: CONFIG_HVM=y / #CONFIG_IOREQ_SERVER is not set  (default)
>
> ***
>
> Any feedback/help would be highly appreciated.
<snip>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <CAGMNF6W8baS_zLYL8DwVsbfPWTP2ohzRB7xutW0X=MUzv93pbA@mail.gmail.com>
@ 2020-12-02 17:09   ` Kun Yi
  0 siblings, 0 replies; 1546+ messages in thread
From: Kun Yi @ 2020-12-02 17:09 UTC (permalink / raw)
  To: Kun Yi, Guenter Roeck, robh+dt, Venkatesh, Supreeth
  Cc: OpenBMC Maillist, linux-hwmon, linux-kernel

My apologies for the super late reply. I was out for an extended
period of time due to personal circumstances.
I have now addressed most of the comments in the v4 series.

Also cc'ed Supreeth who works on the AMD System Manageability stack.

On Wed, Dec 2, 2020 at 8:57 AM Kun Yi <kunyi@google.com> wrote:
>
> On Sat, Apr 04, 2020 at 08:01:16PM -0700, Kun Yi wrote:
> > SB Temperature Sensor Interface (SB-TSI) is an SMBus compatible
> > interface that reports AMD SoC's Ttcl (normalized temperature),
> > and resembles a typical 8-pin remote temperature sensor's I2C interface
> > to BMC.
> >
> > This commit adds basic support using this interface to read CPU
> > temperature, and read/write high/low CPU temp thresholds.
> >
> > To instantiate this driver on an AMD CPU with SB-TSI
> > support, the i2c bus number would be the bus connected from the board
> > management controller (BMC) to the CPU. The i2c address is specified in
> > Section 6.3.1 of the spec [1]: The SB-TSI address is normally 98h for socket 0
> > and 90h for socket 1, but it could vary based on hardware address select pins.
> >
> > [1]: https://www.amd.com/system/files/TechDocs/56255_OSRR.pdf
> >
> > Test status: tested reading temp1_input, and reading/writing
> > temp1_max/min.
> >
> > Signed-off-by: Kun Yi <kunyi at google.com>
> > ---
> >  drivers/hwmon/Kconfig      |  10 ++
> >  drivers/hwmon/Makefile     |   1 +
> >  drivers/hwmon/sbtsi_temp.c | 259 +++++++++++++++++++++++++++++++++++++
> >  3 files changed, 270 insertions(+)
> >  create mode 100644 drivers/hwmon/sbtsi_temp.c
> >
> > diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
> > index 05a30832c6ba..9585dcd01d1b 100644
> > --- a/drivers/hwmon/Kconfig
> > +++ b/drivers/hwmon/Kconfig
> > @@ -1412,6 +1412,16 @@ config SENSORS_RASPBERRYPI_HWMON
> >    This driver can also be built as a module. If so, the module
> >    will be called raspberrypi-hwmon.
> >
> > +config SENSORS_SBTSI
> > + tristate "Emulated SB-TSI temperature sensor"
> > + depends on I2C
> > + help
> > +  If you say yes here you get support for emulated temperature
> > +  sensors on AMD SoCs with SB-TSI interface connected to a BMC device.
> > +
> > +  This driver can also be built as a module. If so, the module will
> > +  be called sbtsi_temp.
> > +
> >  config SENSORS_SHT15
> >   tristate "Sensiron humidity and temperature sensors. SHT15 and compat."
> >   depends on GPIOLIB || COMPILE_TEST
> > diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
> > index b0b9c8e57176..cd109f003ce4 100644
> > --- a/drivers/hwmon/Makefile
> > +++ b/drivers/hwmon/Makefile
> > @@ -152,6 +152,7 @@ obj-$(CONFIG_SENSORS_POWR1220)  += powr1220.o
> >  obj-$(CONFIG_SENSORS_PWM_FAN) += pwm-fan.o
> >  obj-$(CONFIG_SENSORS_RASPBERRYPI_HWMON) += raspberrypi-hwmon.o
> >  obj-$(CONFIG_SENSORS_S3C) += s3c-hwmon.o
> > +obj-$(CONFIG_SENSORS_SBTSI) += sbtsi_temp.o
> >  obj-$(CONFIG_SENSORS_SCH56XX_COMMON)+= sch56xx-common.o
> >  obj-$(CONFIG_SENSORS_SCH5627) += sch5627.o
> >  obj-$(CONFIG_SENSORS_SCH5636) += sch5636.o
> > diff --git a/drivers/hwmon/sbtsi_temp.c b/drivers/hwmon/sbtsi_temp.c
> > new file mode 100644
> > index 000000000000..e3ad6a9f7ec1
> > --- /dev/null
> > +++ b/drivers/hwmon/sbtsi_temp.c
> > @@ -0,0 +1,259 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * sbtsi_temp.c - hwmon driver for a SBI Temperature Sensor Interface (SB-TSI)
> > + *                compliant AMD SoC temperature device.
> > + *
> > + * Copyright (c) 2020, Google Inc.
> > + * Copyright (c) 2020, Kun Yi <kunyi at google.com>
> > + */
> > +
> > +#include <linux/err.h>
> > +#include <linux/i2c.h>
> > +#include <linux/init.h>
> > +#include <linux/hwmon.h>
> > +#include <linux/module.h>
> > +#include <linux/mutex.h>
> > +#include <linux/of_device.h>
> > +#include <linux/of.h>
> > +
> > +/*
> > + * SB-TSI registers only support SMBus byte data access. "_INT" registers are
> > + * the integer part of a temperature value or limit, and "_DEC" registers are
> > + * corresponding decimal parts.
> > + */
> > +#define SBTSI_REG_TEMP_INT 0x01 /* RO */
> > +#define SBTSI_REG_STATUS 0x02 /* RO */
> > +#define SBTSI_REG_CONFIG 0x03 /* RO */
> > +#define SBTSI_REG_TEMP_HIGH_INT 0x07 /* RW */
> > +#define SBTSI_REG_TEMP_LOW_INT 0x08 /* RW */
> > +#define SBTSI_REG_TEMP_DEC 0x10 /* RW */
> > +#define SBTSI_REG_TEMP_HIGH_DEC 0x13 /* RW */
> > +#define SBTSI_REG_TEMP_LOW_DEC 0x14 /* RW */
> > +#define SBTSI_REG_REV 0xFF /* RO */
>
> The revision register is not actually used.
Thanks. Removed. I agree that the register is not well documented, at
least publicly.
It shouldn't affect functionality of this driver, so I removed the
definition altogether.
>
> > +
> > +#define SBTSI_CONFIG_READ_ORDER_SHIFT 5
> > +
> > +#define SBTSI_TEMP_MIN 0
> > +#define SBTSI_TEMP_MAX 255875
> > +#define SBTSI_REV_MAX_VALID_ID 4
>
> Not actually used, and I am not sure if it would make sense to check it.
> If at all, it would only make sense if you also check SBTSIxFE (Manufacture
> ID). Unfortunately, the actual SB-TSI specification seems to be non-public,
> so I can't check if the driver as-is supports versions 0..3 (assuming those
> exist).

Thanks. Removed.

>
> > +
> > +/* Each client has this additional data */
> > +struct sbtsi_data {
> > + struct i2c_client *client;
> > + struct mutex lock;
> > +};
> > +
> > +/*
> > + * From SB-TSI spec: CPU temperature readings and limit registers encode the
> > + * temperature in increments of 0.125 from 0 to 255.875. The "high byte"
> > + * register encodes the base-2 of the integer portion, and the upper 3 bits of
> > + * the "low byte" encode in base-2 the decimal portion.
> > + *
> > + * e.g. INT=0x19, DEC=0x20 represents 25.125 degrees Celsius
> > + *
> > + * Therefore temperature in millidegree Celsius =
> > + *   (INT + DEC / 256) * 1000 = (INT * 8 + DEC / 32) * 125
> > + */
> > +static inline int sbtsi_reg_to_mc(s32 integer, s32 decimal)
> > +{
> > + return ((integer << 3) + (decimal >> 5)) * 125;
> > +}
> > +
> > +/*
> > + * Inversely, given temperature in millidegree Celsius
> > + *   INT = (TEMP / 125) / 8
> > + *   DEC = ((TEMP / 125) % 8) * 32
> > + * Callers have to make sure temp doesn't exceed 255875, the max valid value.
> > + */
> > +static inline void sbtsi_mc_to_reg(s32 temp, u8 *integer, u8 *decimal)
> > +{
> > + temp /= 125;
> > + *integer = temp >> 3;
> > + *decimal = (temp & 0x7) << 5;
> > +}
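To make the encoding above concrete, here is a standalone Python sketch of the same two conversions (illustrative only; it mirrors the C helpers quoted above rather than being part of the driver):

```python
# SB-TSI encodes temperature in 0.125 degC steps:
#   millidegrees C = (INT * 8 + DEC / 32) * 125
# where only the top three bits of the DEC byte are significant.

def sbtsi_reg_to_mc(integer: int, decimal: int) -> int:
    """Convert raw INT/DEC register bytes to millidegrees Celsius."""
    return ((integer << 3) + (decimal >> 5)) * 125

def sbtsi_mc_to_reg(temp_mc: int) -> tuple:
    """Inverse conversion: millidegrees Celsius to (INT, DEC) bytes."""
    steps = temp_mc // 125          # number of 0.125 degC steps
    return steps >> 3, (steps & 0x7) << 5

# Example from the comment: INT=0x19, DEC=0x20 -> 25.125 degC
print(sbtsi_reg_to_mc(0x19, 0x20))  # 25125
print(sbtsi_mc_to_reg(25125))       # (25, 32)
```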
> > +
> > +static int sbtsi_read(struct device *dev, enum hwmon_sensor_types type,
> > +      u32 attr, int channel, long *val)
> > +{
> > + struct sbtsi_data *data = dev_get_drvdata(dev);
> > + s32 temp_int, temp_dec;
> > + int err, reg_int, reg_dec;
> > + u8 read_order;
> > +
> > + if (type != hwmon_temp)
> > + return -EINVAL;
> > +
> > + read_order = 0;
> > + switch (attr) {
> > + case hwmon_temp_input:
> > + /*
> > + * ReadOrder bit specifies the reading order of integer and
> > + * decimal part of CPU temp for atomic reads. If bit == 0,
> > + * reading integer part triggers latching of the decimal part,
> > + * so integer part should be read first. If bit == 1, read
> > + * order should be reversed.
> > + */
> > + err = i2c_smbus_read_byte_data(data->client, SBTSI_REG_CONFIG);
> > + if (err < 0)
> > + return err;
> > +
> As I understand it, the idea is to set this configuration bit once and then
> just use it. Any chance to do that ? This would save an i2c read operation
> each time the temperature is read, and the if/else complexity below.

Unfortunately, the read-order register bit is read-only.
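As an aside, a hypothetical Python model (not the driver's code) of the latching scheme the comment describes can illustrate why honoring ReadOrder avoids torn readings when the temperature changes between the two SMBus transfers:

```python
# Hypothetical model of SB-TSI with ReadOrder == 0: reading the integer
# register latches the decimal register, so the (INT, DEC) pair is always
# a consistent snapshot even if the temperature updates mid-read.

class SbtsiModel:
    def __init__(self, temp_mc):
        self.temp_mc = temp_mc
        self._latched_dec = None

    def _split(self):
        steps = self.temp_mc // 125
        return steps >> 3, (steps & 0x7) << 5

    def read_int(self):
        integer, dec = self._split()
        self._latched_dec = dec      # reading INT latches DEC
        return integer

    def read_dec(self):
        dec = self._latched_dec
        self._latched_dec = None
        return dec

sensor = SbtsiModel(25125)           # 25.125 degC
i = sensor.read_int()
sensor.temp_mc = 26000               # temperature changes mid-read
d = sensor.read_dec()                # still the latched value
print((i * 8 + d // 32) * 125)       # 25125, not a torn mix of old/new
```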

>
> > + read_order = (u8)err & BIT(SBTSI_CONFIG_READ_ORDER_SHIFT);
>
> Nit: typecast is unnecessary.

Done.

>
> > + reg_int = SBTSI_REG_TEMP_INT;
> > + reg_dec = SBTSI_REG_TEMP_DEC;
> > + break;
> > + case hwmon_temp_max:
> > + reg_int = SBTSI_REG_TEMP_HIGH_INT;
> > + reg_dec = SBTSI_REG_TEMP_HIGH_DEC;
> > + break;
> > + case hwmon_temp_min:
> > + reg_int = SBTSI_REG_TEMP_LOW_INT;
> > + reg_dec = SBTSI_REG_TEMP_LOW_DEC;
> > + break;
> > + default:
> > + return -EINVAL;
> > + }
> > +
> > + if (read_order == 0) {
> > + temp_int = i2c_smbus_read_byte_data(data->client, reg_int);
> > + temp_dec = i2c_smbus_read_byte_data(data->client, reg_dec);
> > + } else {
> > + temp_dec = i2c_smbus_read_byte_data(data->client, reg_dec);
> > + temp_int = i2c_smbus_read_byte_data(data->client, reg_int);
> > + }
>
> Just a thought: if you use regmap and tell it that the limit registers
> are non-volatile, this wouldn't actually read from the chip more than once.

That's a great suggestion, although in our normal use cases the limit
values are read and cached by the userspace application. It seems
changing to regmap would require some massaging of the code. Would it
be acceptable to keep the initial driver as-is and do that in a
follow-up patch?

>
> Also, since the read involves reading two registers, and the first read
> locks the value for the second, you'll need mutex protection when reading
> the current temperature (not for limits, though).

Added mutex locking before/after the temp input reading.

>
> > +
> > + if (temp_int < 0)
> > + return temp_int;
> > + if (temp_dec < 0)
> > + return temp_dec;
> > +
> > + *val = sbtsi_reg_to_mc(temp_int, temp_dec);
> > +
> > + return 0;
> > +}
> > +
> > +static int sbtsi_write(struct device *dev, enum hwmon_sensor_types type,
> > +       u32 attr, int channel, long val)
> > +{
> > + struct sbtsi_data *data = dev_get_drvdata(dev);
> > + int reg_int, reg_dec, err;
> > + u8 temp_int, temp_dec;
> > +
> > + if (type != hwmon_temp)
> > + return -EINVAL;
> > +
> > + switch (attr) {
> > + case hwmon_temp_max:
> > + reg_int = SBTSI_REG_TEMP_HIGH_INT;
> > + reg_dec = SBTSI_REG_TEMP_HIGH_DEC;
> > + break;
> > + case hwmon_temp_min:
> > + reg_int = SBTSI_REG_TEMP_LOW_INT;
> > + reg_dec = SBTSI_REG_TEMP_LOW_DEC;
> > + break;
> > + default:
> > + return -EINVAL;
> > + }
> > +
> > + val = clamp_val(val, SBTSI_TEMP_MIN, SBTSI_TEMP_MAX);
> > + mutex_lock(&data->lock);
> > + sbtsi_mc_to_reg(val, &temp_int, &temp_dec);
> > + err = i2c_smbus_write_byte_data(data->client, reg_int, temp_int);
> > + if (err)
> > + goto exit;
> > +
> > + err = i2c_smbus_write_byte_data(data->client, reg_dec, temp_dec);
> > +exit:
> > + mutex_unlock(&data->lock);
> > + return err;
> > +}
> > +
> > +static umode_t sbtsi_is_visible(const void *data,
> > + enum hwmon_sensor_types type,
> > + u32 attr, int channel)
> > +{
> > + switch (type) {
> > + case hwmon_temp:
> > + switch (attr) {
> > + case hwmon_temp_input:
> > + return 0444;
> > + case hwmon_temp_min:
> > + return 0644;
> > + case hwmon_temp_max:
> > + return 0644;
> > + }
> > + break;
> > + default:
> > + break;
> > + }
> > + return 0;
> > +}
> > +
> > +static const struct hwmon_channel_info *sbtsi_info[] = {
> > + HWMON_CHANNEL_INFO(chip,
> > +   HWMON_C_REGISTER_TZ),
> > + HWMON_CHANNEL_INFO(temp,
> > +   HWMON_T_INPUT | HWMON_T_MIN | HWMON_T_MAX),
>
> For your consideration: SB-TSI supports reporting high/low alerts.
> With this, it would be possible to implement respective alarm attributes.
> In conjunction with https://patchwork.kernel.org/patch/11277347/mbox/,
> it should also be possible to add interrupt and thus userspace notification
> for those attributes.
>
> SBTSI also supports setting the update rate (SBTSIx04) and setting
> the temperature offset (SBTSIx11, SBTSIx12), which could also be
> implemented as standard attributes.
>
> I won't require that for the initial version, just something to keep
> in mind.

Ack and thanks for the suggestions. I will keep in mind for future improvements.


>
> > + NULL
> > +};
> > +
> > +static const struct hwmon_ops sbtsi_hwmon_ops = {
> > + .is_visible = sbtsi_is_visible,
> > + .read = sbtsi_read,
> > + .write = sbtsi_write,
> > +};
> > +
> > +static const struct hwmon_chip_info sbtsi_chip_info = {
> > + .ops = &sbtsi_hwmon_ops,
> > + .info = sbtsi_info,
> > +};
> > +
> > +static int sbtsi_probe(struct i2c_client *client,
> > +       const struct i2c_device_id *id)
> > +{
> > + struct device *dev = &client->dev;
> > + struct device *hwmon_dev;
> > + struct sbtsi_data *data;
> > +
> > + data = devm_kzalloc(dev, sizeof(struct sbtsi_data), GFP_KERNEL);
> > + if (!data)
> > + return -ENOMEM;
> > +
> > + data->client = client;
> > + mutex_init(&data->lock);
> > +
> > + hwmon_dev =
> > + devm_hwmon_device_register_with_info(dev, client->name, data,
> > +     &sbtsi_chip_info, NULL);
> > +
> > + return PTR_ERR_OR_ZERO(hwmon_dev);
> > +}
> > +
> > +static const struct i2c_device_id sbtsi_id[] = {
> > + {"sbtsi", 0},
> > + {}
> > +};
> > +MODULE_DEVICE_TABLE(i2c, sbtsi_id);
> > +
> > +static const struct of_device_id __maybe_unused sbtsi_of_match[] = {
> > + {
> > + .compatible = "amd,sbtsi",
> > + },
> > + { },
> > +};
> > +MODULE_DEVICE_TABLE(of, sbtsi_of_match);
> > +
> > +static struct i2c_driver sbtsi_driver = {
> > + .class = I2C_CLASS_HWMON,
> > + .driver = {
> > + .name = "sbtsi",
> > + .of_match_table = of_match_ptr(sbtsi_of_match),
> > + },
> > + .probe = sbtsi_probe,
> > + .id_table = sbtsi_id,
> > +};
> > +
> > +module_i2c_driver(sbtsi_driver);
> > +
> > +MODULE_AUTHOR("Kun Yi <kunyi at google.com>");
> > +MODULE_DESCRIPTION("Hwmon driver for AMD SB-TSI emulated sensor");
> > +MODULE_LICENSE("GPL");



--
Regards,
Kun

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2020-12-02 17:09   ` Kun Yi
  0 siblings, 0 replies; 1546+ messages in thread
From: Kun Yi @ 2020-12-02 17:09 UTC (permalink / raw)
  To: Kun Yi, Guenter Roeck, robh+dt, Venkatesh, Supreeth
  Cc: linux-hwmon, OpenBMC Maillist, linux-kernel

Much apologies for the super late reply.. I was out for an extended
period of time due to personal circumstances.
I have now addressed most of the comments in the v4 series.

Also cc'ed Supreeth who works on the AMD System Manageability stack.

On Wed, Dec 2, 2020 at 8:57 AM Kun Yi <kunyi@google.com> wrote:
>
> On Sat, Apr 04, 2020 at 08:01:16PM -0700, Kun Yi wrote:
> > SB Temperature Sensor Interface (SB-TSI) is an SMBus compatible
> > interface that reports AMD SoC's Ttcl (normalized temperature),
> > and resembles a typical 8-pin remote temperature sensor's I2C interface
> > to BMC.
> >
> > This commit adds basic support using this interface to read CPU
> > temperature, and read/write high/low CPU temp thresholds.
> >
> > To instantiate this driver on an AMD CPU with SB-TSI
> > support, the i2c bus number would be the bus connected from the board
> > management controller (BMC) to the CPU. The i2c address is specified in
> > Section 6.3.1 of the spec [1]: The SB-TSI address is normally 98h for socket 0
> > and 90h for socket 1, but it could vary based on hardware address select pins.
> >
> > [1]: https://www.amd.com/system/files/TechDocs/56255_OSRR.pdf
> >
> > Test status: tested reading temp1_input, and reading/writing
> > temp1_max/min.
> >
> > Signed-off-by: Kun Yi <kunyi at google.com>
> > ---
> >  drivers/hwmon/Kconfig      |  10 ++
> >  drivers/hwmon/Makefile     |   1 +
> >  drivers/hwmon/sbtsi_temp.c | 259 +++++++++++++++++++++++++++++++++++++
> >  3 files changed, 270 insertions(+)
> >  create mode 100644 drivers/hwmon/sbtsi_temp.c
> >
> > diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
> > index 05a30832c6ba..9585dcd01d1b 100644
> > --- a/drivers/hwmon/Kconfig
> > +++ b/drivers/hwmon/Kconfig
> > @@ -1412,6 +1412,16 @@ config SENSORS_RASPBERRYPI_HWMON
> >    This driver can also be built as a module. If so, the module
> >    will be called raspberrypi-hwmon.
> >
> > +config SENSORS_SBTSI
> > + tristate "Emulated SB-TSI temperature sensor"
> > + depends on I2C
> > + help
> > +  If you say yes here you get support for emulated temperature
> > +  sensors on AMD SoCs with SB-TSI interface connected to a BMC device.
> > +
> > +  This driver can also be built as a module. If so, the module will
> > +  be called sbtsi_temp.
> > +
> >  config SENSORS_SHT15
> >   tristate "Sensiron humidity and temperature sensors. SHT15 and compat."
> >   depends on GPIOLIB || COMPILE_TEST
> > diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
> > index b0b9c8e57176..cd109f003ce4 100644
> > --- a/drivers/hwmon/Makefile
> > +++ b/drivers/hwmon/Makefile
> > @@ -152,6 +152,7 @@ obj-$(CONFIG_SENSORS_POWR1220)  += powr1220.o
> >  obj-$(CONFIG_SENSORS_PWM_FAN) += pwm-fan.o
> >  obj-$(CONFIG_SENSORS_RASPBERRYPI_HWMON) += raspberrypi-hwmon.o
> >  obj-$(CONFIG_SENSORS_S3C) += s3c-hwmon.o
> > +obj-$(CONFIG_SENSORS_SBTSI) += sbtsi_temp.o
> >  obj-$(CONFIG_SENSORS_SCH56XX_COMMON)+= sch56xx-common.o
> >  obj-$(CONFIG_SENSORS_SCH5627) += sch5627.o
> >  obj-$(CONFIG_SENSORS_SCH5636) += sch5636.o
> > diff --git a/drivers/hwmon/sbtsi_temp.c b/drivers/hwmon/sbtsi_temp.c
> > new file mode 100644
> > index 000000000000..e3ad6a9f7ec1
> > --- /dev/null
> > +++ b/drivers/hwmon/sbtsi_temp.c
> > @@ -0,0 +1,259 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * sbtsi_temp.c - hwmon driver for a SBI Temperature Sensor Interface (SB-TSI)
> > + *                compliant AMD SoC temperature device.
> > + *
> > + * Copyright (c) 2020, Google Inc.
> > + * Copyright (c) 2020, Kun Yi <kunyi at google.com>
> > + */
> > +
> > +#include <linux/err.h>
> > +#include <linux/i2c.h>
> > +#include <linux/init.h>
> > +#include <linux/hwmon.h>
> > +#include <linux/module.h>
> > +#include <linux/mutex.h>
> > +#include <linux/of_device.h>
> > +#include <linux/of.h>
> > +
> > +/*
> > + * SB-TSI registers only support SMBus byte data access. "_INT" registers are
> > + * the integer part of a temperature value or limit, and "_DEC" registers are
> > + * corresponding decimal parts.
> > + */
> > +#define SBTSI_REG_TEMP_INT 0x01 /* RO */
> > +#define SBTSI_REG_STATUS 0x02 /* RO */
> > +#define SBTSI_REG_CONFIG 0x03 /* RO */
> > +#define SBTSI_REG_TEMP_HIGH_INT 0x07 /* RW */
> > +#define SBTSI_REG_TEMP_LOW_INT 0x08 /* RW */
> > +#define SBTSI_REG_TEMP_DEC 0x10 /* RW */
> > +#define SBTSI_REG_TEMP_HIGH_DEC 0x13 /* RW */
> > +#define SBTSI_REG_TEMP_LOW_DEC 0x14 /* RW */
> > +#define SBTSI_REG_REV 0xFF /* RO */
>
> The revision register is not actually used.
Thanks. Removed. I agree that the register is not well documented, at
least publicly.
It shouldn't affect functionality of this driver, so I removed the
definition altogether.
>
> > +
> > +#define SBTSI_CONFIG_READ_ORDER_SHIFT 5
> > +
> > +#define SBTSI_TEMP_MIN 0
> > +#define SBTSI_TEMP_MAX 255875
> > +#define SBTSI_REV_MAX_VALID_ID 4
>
> Not actually used, and I am not sure if it would make sense to check it.
> If at all, it would only make sense if you also check SBTSIxFE (Manufacture
> ID). Unfortunately, the actual SB-TSI specification seems to be non-public,
> so I can't check if the driver as-is supports versions 0..3 (assuming those
> exist).

Thanks. Removed.

>
> > +
> > +/* Each client has this additional data */
> > +struct sbtsi_data {
> > + struct i2c_client *client;
> > + struct mutex lock;
> > +};
> > +
> > +/*
> > + * From SB-TSI spec: CPU temperature readings and limit registers encode the
> > + * temperature in increments of 0.125 from 0 to 255.875. The "high byte"
> > + * register encodes the base-2 of the integer portion, and the upper 3 bits of
> > + * the "low byte" encode in base-2 the decimal portion.
> > + *
> > + * e.g. INT=0x19, DEC=0x20 represents 25.125 degrees Celsius
> > + *
> > + * Therefore temperature in millidegree Celsius =
> > + *   (INT + DEC / 256) * 1000 = (INT * 8 + DEC / 32) * 125
> > + */
> > +static inline int sbtsi_reg_to_mc(s32 integer, s32 decimal)
> > +{
> > + return ((integer << 3) + (decimal >> 5)) * 125;
> > +}
> > +
> > +/*
> > + * Inversely, given temperature in millidegree Celsius
> > + *   INT = (TEMP / 125) / 8
> > + *   DEC = ((TEMP / 125) % 8) * 32
> > + * Caller have to make sure temp doesn't exceed 255875, the max valid value.
> > + */
> > +static inline void sbtsi_mc_to_reg(s32 temp, u8 *integer, u8 *decimal)
> > +{
> > + temp /= 125;
> > + *integer = temp >> 3;
> > + *decimal = (temp & 0x7) << 5;
> > +}
> > +
> > +static int sbtsi_read(struct device *dev, enum hwmon_sensor_types type,
> > +      u32 attr, int channel, long *val)
> > +{
> > + struct sbtsi_data *data = dev_get_drvdata(dev);
> > + s32 temp_int, temp_dec;
> > + int err, reg_int, reg_dec;
> > + u8 read_order;
> > +
> > + if (type != hwmon_temp)
> > + return -EINVAL;
> > +
> > + read_order = 0;
> > + switch (attr) {
> > + case hwmon_temp_input:
> > + /*
> > + * ReadOrder bit specifies the reading order of integer and
> > + * decimal part of CPU temp for atomic reads. If bit == 0,
> > + * reading integer part triggers latching of the decimal part,
> > + * so integer part should be read first. If bit == 1, read
> > + * order should be reversed.
> > + */
> > + err = i2c_smbus_read_byte_data(data->client, SBTSI_REG_CONFIG);
> > + if (err < 0)
> > + return err;
> > +
> As I understand it, the idea is to set this configuration bit once and then
> just use it. Any chance to do that ? This would save an i2c read operation
> each time the temperature is read, and the if/else complexity below.

Unfortunately, the read-order register bit is read-only.

>
> > + read_order = (u8)err & BIT(SBTSI_CONFIG_READ_ORDER_SHIFT);
>
> Nit: typecast is unnecessary.

Done.

>
> > + reg_int = SBTSI_REG_TEMP_INT;
> > + reg_dec = SBTSI_REG_TEMP_DEC;
> > + break;
> > + case hwmon_temp_max:
> > + reg_int = SBTSI_REG_TEMP_HIGH_INT;
> > + reg_dec = SBTSI_REG_TEMP_HIGH_DEC;
> > + break;
> > + case hwmon_temp_min:
> > + reg_int = SBTSI_REG_TEMP_LOW_INT;
> > + reg_dec = SBTSI_REG_TEMP_LOW_DEC;
> > + break;
> > + default:
> > + return -EINVAL;
> > + }
> > +
> > + if (read_order == 0) {
> > + temp_int = i2c_smbus_read_byte_data(data->client, reg_int);
> > + temp_dec = i2c_smbus_read_byte_data(data->client, reg_dec);
> > + } else {
> > + temp_dec = i2c_smbus_read_byte_data(data->client, reg_dec);
> > + temp_int = i2c_smbus_read_byte_data(data->client, reg_int);
> > + }
>
> Just a thought: if you use regmap and tell it that the limit registers
> are non-volatile, this wouldn't actually read from the chip more than once.

That's a great suggestion, although in our normal use cases the limit
values are read and cached by the userspace application. Changing to
regmap would require some massaging of the code. Would it be acceptable
to keep the initial driver as-is and do that in a follow-up patch?

>
> Also, since the read involves reading two registers, and the first read
> locks the value for the second, you'll need mutex protection when reading
> the current temperature (not for limits, though).

Added mutex locking before/after the temp input reading.

>
> > +
> > + if (temp_int < 0)
> > + return temp_int;
> > + if (temp_dec < 0)
> > + return temp_dec;
> > +
> > + *val = sbtsi_reg_to_mc(temp_int, temp_dec);
> > +
> > + return 0;
> > +}
> > +
> > +static int sbtsi_write(struct device *dev, enum hwmon_sensor_types type,
> > +       u32 attr, int channel, long val)
> > +{
> > + struct sbtsi_data *data = dev_get_drvdata(dev);
> > + int reg_int, reg_dec, err;
> > + u8 temp_int, temp_dec;
> > +
> > + if (type != hwmon_temp)
> > + return -EINVAL;
> > +
> > + switch (attr) {
> > + case hwmon_temp_max:
> > + reg_int = SBTSI_REG_TEMP_HIGH_INT;
> > + reg_dec = SBTSI_REG_TEMP_HIGH_DEC;
> > + break;
> > + case hwmon_temp_min:
> > + reg_int = SBTSI_REG_TEMP_LOW_INT;
> > + reg_dec = SBTSI_REG_TEMP_LOW_DEC;
> > + break;
> > + default:
> > + return -EINVAL;
> > + }
> > +
> > + val = clamp_val(val, SBTSI_TEMP_MIN, SBTSI_TEMP_MAX);
> > + mutex_lock(&data->lock);
> > + sbtsi_mc_to_reg(val, &temp_int, &temp_dec);
> > + err = i2c_smbus_write_byte_data(data->client, reg_int, temp_int);
> > + if (err)
> > + goto exit;
> > +
> > + err = i2c_smbus_write_byte_data(data->client, reg_dec, temp_dec);
> > +exit:
> > + mutex_unlock(&data->lock);
> > + return err;
> > +}
> > +
> > +static umode_t sbtsi_is_visible(const void *data,
> > + enum hwmon_sensor_types type,
> > + u32 attr, int channel)
> > +{
> > + switch (type) {
> > + case hwmon_temp:
> > + switch (attr) {
> > + case hwmon_temp_input:
> > + return 0444;
> > + case hwmon_temp_min:
> > + return 0644;
> > + case hwmon_temp_max:
> > + return 0644;
> > + }
> > + break;
> > + default:
> > + break;
> > + }
> > + return 0;
> > +}
> > +
> > +static const struct hwmon_channel_info *sbtsi_info[] = {
> > + HWMON_CHANNEL_INFO(chip,
> > +   HWMON_C_REGISTER_TZ),
> > + HWMON_CHANNEL_INFO(temp,
> > +   HWMON_T_INPUT | HWMON_T_MIN | HWMON_T_MAX),
>
> For your consideration: SB-TSI supports reporting high/low alerts.
> With this, it would be possible to implement respective alarm attributes.
> In conjunction with https://patchwork.kernel.org/patch/11277347/mbox/,
> it should also be possible to add interrupt and thus userspace notification
> for those attributes.
>
> SBTSI also supports setting the update rate (SBTSIx04) and setting
> the temperature offset (SBTSIx11, SBTSIx12), which could also be
> implemented as standard attributes.
>
> I won't require that for the initial version, just something to keep
> in mind.

Ack and thanks for the suggestions. I will keep them in mind for future improvements.


>
> > + NULL
> > +};
> > +
> > +static const struct hwmon_ops sbtsi_hwmon_ops = {
> > + .is_visible = sbtsi_is_visible,
> > + .read = sbtsi_read,
> > + .write = sbtsi_write,
> > +};
> > +
> > +static const struct hwmon_chip_info sbtsi_chip_info = {
> > + .ops = &sbtsi_hwmon_ops,
> > + .info = sbtsi_info,
> > +};
> > +
> > +static int sbtsi_probe(struct i2c_client *client,
> > +       const struct i2c_device_id *id)
> > +{
> > + struct device *dev = &client->dev;
> > + struct device *hwmon_dev;
> > + struct sbtsi_data *data;
> > +
> > + data = devm_kzalloc(dev, sizeof(struct sbtsi_data), GFP_KERNEL);
> > + if (!data)
> > + return -ENOMEM;
> > +
> > + data->client = client;
> > + mutex_init(&data->lock);
> > +
> > + hwmon_dev =
> > + devm_hwmon_device_register_with_info(dev, client->name, data,
> > +     &sbtsi_chip_info, NULL);
> > +
> > + return PTR_ERR_OR_ZERO(hwmon_dev);
> > +}
> > +
> > +static const struct i2c_device_id sbtsi_id[] = {
> > + {"sbtsi", 0},
> > + {}
> > +};
> > +MODULE_DEVICE_TABLE(i2c, sbtsi_id);
> > +
> > +static const struct of_device_id __maybe_unused sbtsi_of_match[] = {
> > + {
> > + .compatible = "amd,sbtsi",
> > + },
> > + { },
> > +};
> > +MODULE_DEVICE_TABLE(of, sbtsi_of_match);
> > +
> > +static struct i2c_driver sbtsi_driver = {
> > + .class = I2C_CLASS_HWMON,
> > + .driver = {
> > + .name = "sbtsi",
> > + .of_match_table = of_match_ptr(sbtsi_of_match),
> > + },
> > + .probe = sbtsi_probe,
> > + .id_table = sbtsi_id,
> > +};
> > +
> > +module_i2c_driver(sbtsi_driver);
> > +
> > +MODULE_AUTHOR("Kun Yi <kunyi at google.com>");
> > +MODULE_DESCRIPTION("Hwmon driver for AMD SB-TSI emulated sensor");
> > +MODULE_LICENSE("GPL");



--
Regards,
Kun

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-12-02 18:22         ` Yun Levi
@ 2020-12-02 21:26           ` Yury Norov
  2020-12-02 22:51             ` Yun Levi
  0 siblings, 1 reply; 1546+ messages in thread
From: Yury Norov @ 2020-12-02 21:26 UTC (permalink / raw)
  To: Yun Levi
  Cc: Rasmus Villemoes, dushistov, Arnd Bergmann, Andrew Morton,
	Gustavo A. R. Silva, William Breathitt Gray, richard.weiyang,
	joseph.qi, skalluru, Josh Poimboeuf, Linux Kernel Mailing List,
	linux-arch, Andy Shevchenko

On Wed, Dec 2, 2020 at 10:22 AM Yun Levi <ppbuk5246@gmail.com> wrote:
>
> On Thu, Dec 3, 2020 at 2:26 AM Yury Norov <yury.norov@gmail.com> wrote:
>
> > Also look at lib/find_bit_benchmark.c
> Thanks. I'll see.
>
> > We need find_next_*_bit() because find_first_*_bit() can start searching only at word-aligned
> > bits. In the case of find_last_*_bit(), we can start at any bit. So, if my understanding is correct,
> > for the purpose of reverse traversing we can go with already existing find_last_bit(),
>
> Thank you. I haven't thought that way.
> But I think if we implement reverse traversing using find_last_bit(),
> we have a problem.
> Suppose the last bit 0, 1, 2, is set.
> If we start
>     find_last_bit(bitmap, 3) ==> return 2;
>     find_last_bit(bitmap, 2) ==> return 1;
>     find_last_bit(bitmap, 1) ==> return 0;
>     find_last_bit(bitmap, 0) ===> return 0? // here we couldn't
> distinguish size 0 input or 0 is set

If you traverse backward and reach bit #0, you're done. No need to continue.

>
> and the for_each traverse routine prevent above case by returning size
> (nbits) using find_next_bit.
> So, for compatibility and the same expected return value like next traversing,
> I think we need to find_prev_*_bit routine. if my understanding is correct.
>
>
> >  I think this patch has some good catches. We definitely need to implement
> > find_last_zero_bit(), as it is used by fs/ufs, and their local implementation is not optimal.
> >
> > We also should consider adding reverse traversing macros based on find_last_*_bit(),
> > if there are proposed users.
>
> Not only this, I think 'steal_from_bitmap_to_front' can be improved
> using find_prev_zero_bit
> like
>
> diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> index af0013d3df63..9debb9707390 100644
> --- a/fs/btrfs/free-space-cache.c
> +++ b/fs/btrfs/free-space-cache.c
> @@ -2372,7 +2372,6 @@ static bool steal_from_bitmap_to_front(struct
> btrfs_free_space_ctl *ctl,
>   u64 bitmap_offset;
>   unsigned long i;
>   unsigned long j;
> - unsigned long prev_j;
>   u64 bytes;
>
>   bitmap_offset = offset_to_bitmap(ctl, info->offset);
> @@ -2388,20 +2387,15 @@ static bool steal_from_bitmap_to_front(struct
> btrfs_free_space_ctl *ctl,
>   return false;
>
>   i = offset_to_bit(bitmap->offset, ctl->unit, info->offset) - 1;
> - j = 0;
> - prev_j = (unsigned long)-1;
> - for_each_clear_bit_from(j, bitmap->bitmap, BITS_PER_BITMAP) {
> - if (j > i)
> - break;
> - prev_j = j;
> - }
> - if (prev_j == i)
> + j = find_prev_zero_bit(bitmap->bitmap, BITS_PER_BITMAP, i);

This one may be implemented with find_last_zero_bit() as well:

unsigned long j = find_last_zero_bit(bitmap, BITS_PER_BITMAP);
if (j <= i || j >= BITS_PER_BITMAP)
        return false;

I believe the latter version is better because find_last_*_bit() is simpler in
implementation (and partially exists), has less parameters, and therefore
simpler for users, and doesn't introduce functionality duplication.

The only consideration I can imagine to advocate find_prev*() is the performance
advantage in the scenario when we know for sure that first N bits of
bitmap are all
set/clear, and we can bypass traversing that area. But again, in this
case we can pass the
bitmap address with the appropriate offset, and stay with find_last_*()

> +
> + if (j == i)
>   return false;
>
> - if (prev_j == (unsigned long)-1)
> + if (j == BITS_PER_BITMAP)
>   bytes = (i + 1) * ctl->unit;
>   else
> - bytes = (i - prev_j) * ctl->unit;
> + bytes = (i - j) * ctl->unit;
>
>   info->offset -= bytes;
>   info->bytes += bytes;
>
> Thanks.
>
> HTH
> Levi.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-12-03  1:23               ` Yun Levi
@ 2020-12-03  8:33                 ` Rasmus Villemoes
  2020-12-03  9:47                   ` Re: Yun Levi
  0 siblings, 1 reply; 1546+ messages in thread
From: Rasmus Villemoes @ 2020-12-03  8:33 UTC (permalink / raw)
  To: Yun Levi, Yury Norov
  Cc: dushistov, Arnd Bergmann, Andrew Morton, Gustavo A. R. Silva,
	William Breathitt Gray, richard.weiyang, joseph.qi, skalluru,
	Josh Poimboeuf, Linux Kernel Mailing List, linux-arch,
	Andy Shevchenko

On 03/12/2020 02.23, Yun Levi wrote:
> On Thu, Dec 3, 2020 at 7:51 AM Yun Levi <ppbuk5246@gmail.com> wrote:
>>
>> On Thu, Dec 3, 2020 at 6:26 AM Yury Norov <yury.norov@gmail.com> wrote:
>>>
>>> On Wed, Dec 2, 2020 at 10:22 AM Yun Levi <ppbuk5246@gmail.com> wrote:
>>>>
>>>> On Thu, Dec 3, 2020 at 2:26 AM Yury Norov <yury.norov@gmail.com> wrote:
>>>>
>>>>> Also look at lib/find_bit_benchmark.c
>>>> Thanks. I'll see.
>>>>
>>>>> We need find_next_*_bit() because find_first_*_bit() can start searching only at word-aligned
>>>>> bits. In the case of find_last_*_bit(), we can start at any bit. So, if my understanding is correct,
>>>>> for the purpose of reverse traversing we can go with already existing find_last_bit(),
>>>>
>>>> Thank you. I haven't thought that way.
>>>> But I think if we implement reverse traversing using find_last_bit(),
>>>> we have a problem.
>>>> Suppose the last bit 0, 1, 2, is set.
>>>> If we start
>>>>     find_last_bit(bitmap, 3) ==> return 2;
>>>>     find_last_bit(bitmap, 2) ==> return 1;
>>>>     find_last_bit(bitmap, 1) ==> return 0;
>>>>     find_last_bit(bitmap, 0) ===> return 0? // here we couldn't

Either just make the return type of all find_prev/find_last be signed
int and use -1 as the sentinel to indicate "no such position exists", so
the loop condition would be foo >= 0. Or, change the condition from
"stop if we get the size returned" to "only continue if we get something
strictly less than the size we passed in (i.e., something which can
possibly be a valid bit index). In the latter case, both (unsigned)-1
aka UINT_MAX and the actual size value passed work equally well as a
sentinel.

If one uses UINT_MAX, a for_each_bit_reverse() macro would just be
something like

for (i = find_last_bit(bitmap, size); i < size; i =
find_last_bit(bitmap, i))

if one wants to use the size argument as the sentinel, the caller would
have to supply a scratch variable to keep track of the last i value:

for (j = size, i = find_last_bit(bitmap, j); i < j; j = i, i =
find_last_bit(bitmap, j))

which is probably a little less ergonomic.

Rasmus

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-12-03  8:33                 ` Rasmus Villemoes
@ 2020-12-03  9:47                   ` Yun Levi
  2020-12-03 18:46                     ` Re: Yury Norov
  0 siblings, 1 reply; 1546+ messages in thread
From: Yun Levi @ 2020-12-03  9:47 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Yury Norov, dushistov, Arnd Bergmann, Andrew Morton,
	Gustavo A. R. Silva, William Breathitt Gray, richard.weiyang,
	joseph.qi, skalluru, Josh Poimboeuf, Linux Kernel Mailing List,
	linux-arch, Andy Shevchenko

> If one uses UINT_MAX, a for_each_bit_reverse() macro would just be
> something like
>
> for (i = find_last_bit(bitmap, size); i < size; i =
> find_last_bit(bitmap, i))
>
> if one wants to use the size argument as the sentinel, the caller would
> have to supply a scratch variable to keep track of the last i value:
>
> for (j = size, i = find_last_bit(bitmap, j); i < j; j = i, i =
> find_last_bit(bitmap, j))
>
> which is probably a little less ergonomic.

Actually, because I want to avoid changing the return type of
find_last_*_bit for a new sentinel, I added find_prev_*_bit.
The big difference between find_last_bit and find_prev_bit is:
   find_last_bit doesn't check the size bit and uses the size as the
sentinel, but find_prev_bit checks the offset bit and uses as sentinel
a size passed in a separate argument.
   So with find_prev_bit we can have a clean iteration like:

  #define for_each_set_bit_reverse(bit, addr, size) \
      for ((bit) = find_last_bit((addr), (size));    \
            (bit) < (size);                                     \
            (bit) = find_prev_bit((addr), (size), (bit - 1)))

  #define for_each_set_bit_from_reverse(bit, addr, size) \
      for ((bit) = find_prev_bit((addr), (size), (bit)); \
             (bit) < (size);                                           \
             (bit) = find_prev_bit((addr), (size), (bit - 1)))

find_prev_*_bit and find_last_*_bit have the same functionality,
but they also have a small difference. I think this small difference
won't confuse users, and it helps solve the problem in a simple way
(just like the iteration above).

So I think I need the find_prev_*_bit series.

Am I missing anything?

Thanks.

Levi.

On Thu, Dec 3, 2020 at 5:33 PM Rasmus Villemoes
<linux@rasmusvillemoes.dk> wrote:
>
> On 03/12/2020 02.23, Yun Levi wrote:
> > On Thu, Dec 3, 2020 at 7:51 AM Yun Levi <ppbuk5246@gmail.com> wrote:
> >>
> >> On Thu, Dec 3, 2020 at 6:26 AM Yury Norov <yury.norov@gmail.com> wrote:
> >>>
> >>> On Wed, Dec 2, 2020 at 10:22 AM Yun Levi <ppbuk5246@gmail.com> wrote:
> >>>>
> >>>> On Thu, Dec 3, 2020 at 2:26 AM Yury Norov <yury.norov@gmail.com> wrote:
> >>>>
> >>>>> Also look at lib/find_bit_benchmark.c
> >>>> Thanks. I'll see.
> >>>>
> >>>>> We need find_next_*_bit() because find_first_*_bit() can start searching only at word-aligned
> >>>>> bits. In the case of find_last_*_bit(), we can start at any bit. So, if my understanding is correct,
> >>>>> for the purpose of reverse traversing we can go with already existing find_last_bit(),
> >>>>
> >>>> Thank you. I haven't thought that way.
> >>>> But I think if we implement reverse traversing using find_last_bit(),
> >>>> we have a problem.
> >>>> Suppose the last bit 0, 1, 2, is set.
> >>>> If we start
> >>>>     find_last_bit(bitmap, 3) ==> return 2;
> >>>>     find_last_bit(bitmap, 2) ==> return 1;
> >>>>     find_last_bit(bitmap, 1) ==> return 0;
> >>>>     find_last_bit(bitmap, 0) ===> return 0? // here we couldn't
>
> Either just make the return type of all find_prev/find_last be signed
> int and use -1 as the sentinel to indicate "no such position exists", so
> the loop condition would be foo >= 0. Or, change the condition from
> "stop if we get the size returned" to "only continue if we get something
> strictly less than the size we passed in (i.e., something which can
> possibly be a valid bit index). In the latter case, both (unsigned)-1
> aka UINT_MAX and the actual size value passed work equally well as a
> sentinel.
>
> If one uses UINT_MAX, a for_each_bit_reverse() macro would just be
> something like
>
> for (i = find_last_bit(bitmap, size); i < size; i =
> find_last_bit(bitmap, i))
>
> if one wants to use the size argument as the sentinel, the caller would
> have to supply a scratch variable to keep track of the last i value:
>
> for (j = size, i = find_last_bit(bitmap, j); i < j; j = i, i =
> find_last_bit(bitmap, j))
>
> which is probably a little less ergonomic.
>
> Rasmus

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-12-03  9:47                   ` Re: Yun Levi
@ 2020-12-03 18:46                     ` Yury Norov
  2020-12-03 18:52                       ` Re: Willy Tarreau
  2020-12-05 11:10                       ` Re: Rasmus Villemoes
  0 siblings, 2 replies; 1546+ messages in thread
From: Yury Norov @ 2020-12-03 18:46 UTC (permalink / raw)
  To: Yun Levi
  Cc: Rasmus Villemoes, dushistov, Arnd Bergmann, Andrew Morton,
	Gustavo A. R. Silva, William Breathitt Gray, richard.weiyang,
	joseph.qi, skalluru, Josh Poimboeuf, Linux Kernel Mailing List,
	linux-arch, Andy Shevchenko

Yun, could you please stop top-posting and excessive trimming in the thread?

On Thu, Dec 3, 2020 at 1:47 AM Yun Levi <ppbuk5246@gmail.com> wrote:
> > Either just make the return type of all find_prev/find_last be signed
> > int and use -1 as the sentinel to indicate "no such position exists", so
> > the loop condition would be foo >= 0. Or, change the condition from
> > "stop if we get the size returned" to "only continue if we get something
> > strictly less than the size we passed in (i.e., something which can
> > possibly be a valid bit index). In the latter case, both (unsigned)-1
> > aka UINT_MAX and the actual size value passed work equally well as a
> > sentinel.
> >
> > If one uses UINT_MAX, a for_each_bit_reverse() macro would just be
> > something like
> >
> > for (i = find_last_bit(bitmap, size); i < size; i =
> > find_last_bit(bitmap, i))
> >
> > if one wants to use the size argument as the sentinel, the caller would
> > have to supply a scratch variable to keep track of the last i value:
> >
> > for (j = size, i = find_last_bit(bitmap, j); i < j; j = i, i =
> > find_last_bit(bitmap, j))
> >
> > which is probably a little less ergonomic.
> >
> > Rasmus

I would prefer to avoid changing the find_*_bit() semantics. As of now,
if any of the find_*_bit() functions finds nothing, it returns the size
of the bitmap it was passed. Changing this for a single function would
break the consistency, and may cause problems for those who rely on the
existing behaviour.

Passing non-positive size to find_*_bit() should produce undefined
behaviour, because we cannot dereference a pointer to the bitmap in
this case; this is most probably a sign of a problem on a caller side
anyways.

Let's keep this logic unchanged?

> Actually Because I want to avoid the modification of return type of
> find_last_*_bit for new sentinel,
> I add find_prev_*_bit.
> the big difference between find_last_bit and find_prev_bit is
>    find_last_bit doesn't check the size bit and use sentinel with size.
>    but find_prev_bit check the offset bit and use sentinel with size
> which passed by another argument.
>    So if we use find_prev_bit, we could have a clear iteration if
> using find_prev_bit like.
>
>   #define for_each_set_bit_reverse(bit, addr, size) \
>       for ((bit) = find_last_bit((addr), (size));    \
>             (bit) < (size);                                     \
>             (bit) = find_prev_bit((addr), (size), (bit - 1)))
>
>   #define for_each_set_bit_from_reverse(bit, addr, size) \
>       for ((bit) = find_prev_bit((addr), (size), (bit)); \
>              (bit) < (size);                                           \
>              (bit) = find_prev_bit((addr), (size), (bit - 1)))
>
> Though find_prev_*_bit / find_last_*_bit have the same functionality.
> But they also have a small difference.
> I think this small this small difference doesn't make some of
> confusion to user but it help to solve problem
> with a simple way (just like the iteration above).
>
> So I think I need, find_prev_*_bit series.
>
> Am I missing anything?
>
> Thanks.
>
> Levi.

As you said, find_last_bit() and the proposed find_prev_*_bit() have the
same functionality.
If you really want to have find_prev_*_bit(), could you please at
least write it using find_last_bit()? Otherwise it would just be
bloat.

Regarding reverse search, we can probably do it like this (not tested,
just an idea):

#define for_each_set_bit_reverse(bit, addr, size) \
    for ((bit) = find_last_bit((addr), (size));    \
          (bit) < (size);                                     \
          (size) = (bit), (bit) = find_last_bit((addr), (bit)))

Thanks,
Yury

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-12-03 18:46                     ` Re: Yury Norov
@ 2020-12-03 18:52                       ` Willy Tarreau
  2020-12-04  1:36                         ` Re: Yun Levi
  2020-12-05 11:10                       ` Re: Rasmus Villemoes
  1 sibling, 1 reply; 1546+ messages in thread
From: Willy Tarreau @ 2020-12-03 18:52 UTC (permalink / raw)
  To: Yury Norov
  Cc: Yun Levi, Rasmus Villemoes, dushistov, Arnd Bergmann,
	Andrew Morton, Gustavo A. R. Silva, William Breathitt Gray,
	richard.weiyang, joseph.qi, skalluru, Josh Poimboeuf,
	Linux Kernel Mailing List, linux-arch, Andy Shevchenko

On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> Yun, could you please stop top-posting and excessive trimming in the thread?

And re-configure the mail agent to make the "Subject" field appear and
fill it.

Willy

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-12-03 18:52                       ` Re: Willy Tarreau
@ 2020-12-04  1:36                         ` Yun Levi
  2020-12-04 18:14                           ` Re: Yury Norov
  0 siblings, 1 reply; 1546+ messages in thread
From: Yun Levi @ 2020-12-04  1:36 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Yury Norov, Rasmus Villemoes, dushistov, Arnd Bergmann,
	Andrew Morton, Gustavo A. R. Silva, William Breathitt Gray,
	richard.weiyang, joseph.qi, skalluru, Josh Poimboeuf,
	Linux Kernel Mailing List, linux-arch, Andy Shevchenko

>On Fri, Dec 4, 2020 at 3:53 AM Willy Tarreau <w@1wt.eu> wrote:
>
> On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > Yun, could you please stop top-posting and excessive trimming in the thread?
>
> And re-configure the mail agent to make the "Subject" field appear and
> fill it.

>On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> Yun, could you please stop top-posting and excessive trimming in the thread?
Sorry to make you uncomfortable... Thanks for the advice.

>On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> As you said, find_last_bit() and proposed find_prev_*_bit() have the
> same functionality.
> If you really want to have find_prev_*_bit(), could you please at
> least write it using find_last_bit(), otherwise it would be just a
> blottering.

Actually, find_prev_*_bit calls _find_prev_bit, which is a common helper
function like _find_next_bit. As you know, this helper is required to
support little-endian searches under __BIG_ENDIAN.
find_prev_bit is actually a wrapper of _find_prev_bit that provides the
functionality of find_last_bit.

That makes for a semantic difference between find_last_bit and
find_prev_bit: find_prev_bit lets you specify where to search from, and
in a loop find_last_bit can't keep the original size as the sentinel
return value (we have to change the size argument for the next search,
which means the "NOT SET or NOT CLEAR" sentinel return value changes on
every call).

Because we need _find_prev_bit anyway, I think it's a matter of choosing
which is better to use in find_prev_bit (find_last_bit? or
_find_prev_bit?) while keeping the find_prev_bit semantics (return the
size as the sentinel, search from a given offset), if my understanding
is correct.

In my view, I prefer to use _find_prev_bit, like find_next_bit does, for
a uniform structure.

But in some of the benchmarks, find_last_bit is better than _find_prev_bit;
here is what I tested (the results look similar but sometimes differ).

              Start testing find_bit() with random-filled bitmap
[  +0.001850] find_next_bit:                  842792 ns, 163788 iterations
[  +0.000873] find_prev_bit:                  870914 ns, 163788 iterations
[  +0.000824] find_next_zero_bit:             821959 ns, 163894 iterations
[  +0.000677] find_prev_zero_bit:             676240 ns, 163894 iterations
[  +0.000777] find_last_bit:                  659103 ns, 163788 iterations
[  +0.001822] find_first_bit:                1708041 ns,  16250 iterations
[  +0.000539] find_next_and_bit:              492182 ns,  73871 iterations
[  +0.000001]
              Start testing find_bit() with sparse bitmap
[  +0.000222] find_next_bit:                   13227 ns,    654 iterations
[  +0.000013] find_prev_bit:                   11652 ns,    654 iterations
[  +0.001845] find_next_zero_bit:            1723869 ns, 327028 iterations
[  +0.001538] find_prev_zero_bit:            1355808 ns, 327028 iterations
[  +0.000010] find_last_bit:                    8114 ns,    654 iterations
[  +0.000867] find_first_bit:                 710639 ns,    654 iterations
[  +0.000006] find_next_and_bit:                4273 ns,      1 iterations
[  +0.000004] find_next_and_bit:                3278 ns,      1 iterations

              Start testing find_bit() with random-filled bitmap
[  +0.001784] find_next_bit:                  805553 ns, 164240 iterations
[  +0.000643] find_prev_bit:                  632474 ns, 164240 iterations
[  +0.000950] find_next_zero_bit:             877215 ns, 163442 iterations
[  +0.000664] find_prev_zero_bit:             662339 ns, 163442 iterations
[  +0.000680] find_last_bit:                  602204 ns, 164240 iterations
[  +0.001912] find_first_bit:                1758208 ns,  16408 iterations
[  +0.000760] find_next_and_bit:              531033 ns,  73798 iterations
[  +0.000002]
              Start testing find_bit() with sparse bitmap
[  +0.000203] find_next_bit:                   12468 ns,    656 iterations
[  +0.000205] find_prev_bit:                   10948 ns,    656 iterations
[  +0.001759] find_next_zero_bit:            1579447 ns, 327026 iterations
[  +0.001935] find_prev_zero_bit:            1931961 ns, 327026 iterations
[  +0.000013] find_last_bit:                    9543 ns,    656 iterations
[  +0.000732] find_first_bit:                 562009 ns,    656 iterations
[  +0.000217] find_next_and_bit:                6804 ns,      1 iterations
[  +0.000007] find_next_and_bit:                4367 ns,      1 iterations

Is it better to write find_prev_bit using find_last_bit?
I ask again.

Thanks for your great advice, but please forgive my faults and shortcomings.

HTH.
Levi.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-12-04  1:36                         ` Re: Yun Levi
@ 2020-12-04 18:14                           ` Yury Norov
  2020-12-05  0:45                             ` Re: Yun Levi
  0 siblings, 1 reply; 1546+ messages in thread
From: Yury Norov @ 2020-12-04 18:14 UTC (permalink / raw)
  To: Yun Levi
  Cc: Willy Tarreau, Rasmus Villemoes, dushistov, Arnd Bergmann,
	Andrew Morton, Gustavo A. R. Silva, William Breathitt Gray,
	richard.weiyang, joseph.qi, skalluru, Josh Poimboeuf,
	Linux Kernel Mailing List, linux-arch, Andy Shevchenko

On Thu, Dec 3, 2020 at 5:36 PM Yun Levi <ppbuk5246@gmail.com> wrote:
>
> >On Fri, Dec 4, 2020 at 3:53 AM Willy Tarreau <w@1wt.eu> wrote:
> >
> > On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > > Yun, could you please stop top-posting and excessive trimming in the thread?
> >
> > And re-configure the mail agent to make the "Subject" field appear and
> > fill it.
>
> >On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > Yun, could you please stop top-posting and excessive trimming in the thread?
> Sorry to make you uncomfortable... Thanks for advice.
>
> >On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > As you said, find_last_bit() and proposed find_prev_*_bit() have the
> > same functionality.
> > If you really want to have find_prev_*_bit(), could you please at
> > least write it using find_last_bit(), otherwise it would be just a
> > blottering.
>
> Actually find_prev_*_bit call _find_prev_bit which is a common helper function
> like _find_next_bit.
> As you know this function is required to support __BIGEDIAN's little
> endian search.
> find_prev_bit actually wrapper of _find_prev_bit which have a feature
> the find_last_bit.
>
> That makes the semantics difference between find_last_bit and find_prev_bit.
> -- specify where you find from and
>    In loop, find_last_bit couldn't sustain original size as sentinel
> return value
>     (we should change the size argument for next searching
>      But it means whenever we call, "NOT SET or NOT CLEAR"'s sentinel
> return value is changed per call).
>
> Because we should have _find_prev_bit,
> I think it's the matter to choose which is better to usein
> find_prev_bit (find_last_bit? or _find_prev_bit?)
> sustaining find_prev_bit feature (give size as sentinel return, from
> where I start).
> if my understanding is correct.
>
> In my view, I prefer to use _find_prev_bit like find_next_bit for
> integrated format.
>
> But In some of the benchmarking, find_last_bit is better than _find_prev_bit,
> here what I tested (look similar but sometimes have some difference).
>
>               Start testing find_bit() with random-filled bitmap
> [  +0.001850] find_next_bit:                  842792 ns, 163788 iterations
> [  +0.000873] find_prev_bit:                  870914 ns, 163788 iterations
> [  +0.000824] find_next_zero_bit:             821959 ns, 163894 iterations
> [  +0.000677] find_prev_zero_bit:             676240 ns, 163894 iterations
> [  +0.000777] find_last_bit:                  659103 ns, 163788 iterations
> [  +0.001822] find_first_bit:                1708041 ns,  16250 iterations
> [  +0.000539] find_next_and_bit:              492182 ns,  73871 iterations
> [  +0.000001]
>               Start testing find_bit() with sparse bitmap
> [  +0.000222] find_next_bit:                   13227 ns,    654 iterations
> [  +0.000013] find_prev_bit:                   11652 ns,    654 iterations
> [  +0.001845] find_next_zero_bit:            1723869 ns, 327028 iterations
> [  +0.001538] find_prev_zero_bit:            1355808 ns, 327028 iterations
> [  +0.000010] find_last_bit:                    8114 ns,    654 iterations
> [  +0.000867] find_first_bit:                 710639 ns,    654 iterations
> [  +0.000006] find_next_and_bit:                4273 ns,      1 iterations
> [  +0.000004] find_next_and_bit:                3278 ns,      1 iterations
>
>               Start testing find_bit() with random-filled bitmap
> [  +0.001784] find_next_bit:                  805553 ns, 164240 iterations
> [  +0.000643] find_prev_bit:                  632474 ns, 164240 iterations
> [  +0.000950] find_next_zero_bit:             877215 ns, 163442 iterations
> [  +0.000664] find_prev_zero_bit:             662339 ns, 163442 iterations
> [  +0.000680] find_last_bit:                  602204 ns, 164240 iterations
> [  +0.001912] find_first_bit:                1758208 ns,  16408 iterations
> [  +0.000760] find_next_and_bit:              531033 ns,  73798 iterations
> [  +0.000002]
>               Start testing find_bit() with sparse bitmap
> [  +0.000203] find_next_bit:                   12468 ns,    656 iterations
> [  +0.000205] find_prev_bit:                   10948 ns,    656 iterations
> [  +0.001759] find_next_zero_bit:            1579447 ns, 327026 iterations
> [  +0.001935] find_prev_zero_bit:            1931961 ns, 327026 iterations
> [  +0.000013] find_last_bit:                    9543 ns,    656 iterations
> [  +0.000732] find_first_bit:                 562009 ns,    656 iterations
> [  +0.000217] find_next_and_bit:                6804 ns,      1 iterations
> [  +0.000007] find_next_and_bit:                4367 ns,      1 iterations
>
> Is it better to write find_prev_bit using find_last_bit?
> I question again.

I answer again. It's better not to write find_prev_bit at all and
learn how to use existing functionality.
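One way to read "use existing functionality": a reverse walk really can be built from find_last_bit alone by shrinking the size argument on each step, at the cost of the moving sentinel Yun describes. Below is a minimal userspace model of that pattern, not the kernel code itself:

```c
#include <assert.h>
#include <stddef.h>

#define BITS_PER_LONG (8 * sizeof(unsigned long))

/* Userspace model of the kernel's find_last_bit(): returns the index
 * of the highest set bit below @size, or @size if none is set. */
static size_t find_last_bit(const unsigned long *addr, size_t size)
{
	size_t idx = (size + BITS_PER_LONG - 1) / BITS_PER_LONG;

	while (idx--) {
		unsigned long val = addr[idx];

		/* Mask off bits at or above @size in a partial top word. */
		if (idx == size / BITS_PER_LONG && size % BITS_PER_LONG)
			val &= (1UL << (size % BITS_PER_LONG)) - 1;
		if (val) {
			size_t bit = BITS_PER_LONG - 1;

			while (!(val & (1UL << bit)))
				bit--;
			return idx * BITS_PER_LONG + bit;
		}
	}
	return size;
}

/* Reverse walk over set bits by shrinking the size argument: note how
 * the "not found" sentinel is the current @limit, not the original
 * @size - exactly the moving sentinel discussed in the thread. */
static size_t count_bits_reverse(const unsigned long *addr, size_t size)
{
	size_t n = 0, limit = size;

	for (;;) {
		size_t pos = find_last_bit(addr, limit);

		if (pos == limit)	/* nothing set below @limit */
			break;
		n++;
		limit = pos;		/* next: search strictly below @pos */
	}
	return n;
}
```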

Yury

> Thanks for your great advice, but please forgive my mistakes and shortcomings.
>
> HTH.
> Levi.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-12-04 18:14                           ` Re: Yury Norov
@ 2020-12-05  0:45                             ` Yun Levi
  0 siblings, 0 replies; 1546+ messages in thread
From: Yun Levi @ 2020-12-05  0:45 UTC (permalink / raw)
  To: Yury Norov
  Cc: Willy Tarreau, Rasmus Villemoes, dushistov, Arnd Bergmann,
	Andrew Morton, Gustavo A. R. Silva, William Breathitt Gray,
	richard.weiyang, joseph.qi, skalluru, Josh Poimboeuf,
	Linux Kernel Mailing List, linux-arch, Andy Shevchenko

> I answer again. It's better not to write find_prev_bit at all and
> learn how to use existing functionality.

Thanks for the answer I'll fix and send the patch again :)

On Sat, Dec 5, 2020 at 3:14 AM Yury Norov <yury.norov@gmail.com> wrote:
>
> On Thu, Dec 3, 2020 at 5:36 PM Yun Levi <ppbuk5246@gmail.com> wrote:
> >
> > >On Fri, Dec 4, 2020 at 3:53 AM Willy Tarreau <w@1wt.eu> wrote:
> > >
> > > On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > > > Yun, could you please stop top-posting and excessive trimming in the thread?
> > >
> > > And re-configure the mail agent to make the "Subject" field appear and
> > > fill it.
> >
> > >On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > > Yun, could you please stop top-posting and excessive trimming in the thread?
> > Sorry for making you uncomfortable... Thanks for the advice.
> >
> > >On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > > As you said, find_last_bit() and proposed find_prev_*_bit() have the
> > > same functionality.
> > > If you really want to have find_prev_*_bit(), could you please at
> > > least write it using find_last_bit(), otherwise it would be just a
> > > blottering.
> >
> > Actually find_prev_*_bit calls _find_prev_bit, which is a common helper
> > function like _find_next_bit.
> > As you know, this function is required to support little-endian
> > searches on __BIG_ENDIAN systems.
> > find_prev_bit is actually a wrapper around _find_prev_bit which also
> > provides the functionality of find_last_bit.
> >
> > That makes the semantic difference between find_last_bit and find_prev_bit:
> > find_prev_bit lets you specify where the search starts from, and,
> > in a loop, find_last_bit cannot keep the original size as the sentinel
> > return value (we have to change the size argument for each subsequent
> > search, which means the "NOT SET or NOT CLEAR" sentinel return value
> > changes on every call).
> >
> > Because we need _find_prev_bit anyway, I think the question is which is
> > better to use in find_prev_bit (find_last_bit? or _find_prev_bit?)
> > while keeping the find_prev_bit semantics (return the size as the
> > sentinel, search from where I start), if my understanding is correct.
> >
> > In my view, I prefer to use _find_prev_bit, like find_next_bit, for a
> > consistent structure.
> >
> > But in some of the benchmarking, find_last_bit is better than _find_prev_bit;
> > here is what I tested (the runs look similar but sometimes differ).
> >
> >               Start testing find_bit() with random-filled bitmap
> > [  +0.001850] find_next_bit:                  842792 ns, 163788 iterations
> > [  +0.000873] find_prev_bit:                  870914 ns, 163788 iterations
> > [  +0.000824] find_next_zero_bit:             821959 ns, 163894 iterations
> > [  +0.000677] find_prev_zero_bit:             676240 ns, 163894 iterations
> > [  +0.000777] find_last_bit:                  659103 ns, 163788 iterations
> > [  +0.001822] find_first_bit:                1708041 ns,  16250 iterations
> > [  +0.000539] find_next_and_bit:              492182 ns,  73871 iterations
> > [  +0.000001]
> >               Start testing find_bit() with sparse bitmap
> > [  +0.000222] find_next_bit:                   13227 ns,    654 iterations
> > [  +0.000013] find_prev_bit:                   11652 ns,    654 iterations
> > [  +0.001845] find_next_zero_bit:            1723869 ns, 327028 iterations
> > [  +0.001538] find_prev_zero_bit:            1355808 ns, 327028 iterations
> > [  +0.000010] find_last_bit:                    8114 ns,    654 iterations
> > [  +0.000867] find_first_bit:                 710639 ns,    654 iterations
> > [  +0.000006] find_next_and_bit:                4273 ns,      1 iterations
> > [  +0.000004] find_next_and_bit:                3278 ns,      1 iterations
> >
> >               Start testing find_bit() with random-filled bitmap
> > [  +0.001784] find_next_bit:                  805553 ns, 164240 iterations
> > [  +0.000643] find_prev_bit:                  632474 ns, 164240 iterations
> > [  +0.000950] find_next_zero_bit:             877215 ns, 163442 iterations
> > [  +0.000664] find_prev_zero_bit:             662339 ns, 163442 iterations
> > [  +0.000680] find_last_bit:                  602204 ns, 164240 iterations
> > [  +0.001912] find_first_bit:                1758208 ns,  16408 iterations
> > [  +0.000760] find_next_and_bit:              531033 ns,  73798 iterations
> > [  +0.000002]
> >               Start testing find_bit() with sparse bitmap
> > [  +0.000203] find_next_bit:                   12468 ns,    656 iterations
> > [  +0.000205] find_prev_bit:                   10948 ns,    656 iterations
> > [  +0.001759] find_next_zero_bit:            1579447 ns, 327026 iterations
> > [  +0.001935] find_prev_zero_bit:            1931961 ns, 327026 iterations
> > [  +0.000013] find_last_bit:                    9543 ns,    656 iterations
> > [  +0.000732] find_first_bit:                 562009 ns,    656 iterations
> > [  +0.000217] find_next_and_bit:                6804 ns,      1 iterations
> > [  +0.000007] find_next_and_bit:                4367 ns,      1 iterations
> >
> > Is it better to write find_prev_bit using find_last_bit?
> > I question again.
>
> I answer again. It's better not to write find_prev_bit at all and
> learn how to use existing functionality.
>
> Yury
>
> > Thanks for your great advice, but please forgive my mistakes and shortcomings.
> >
> > HTH.
> > Levi.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-12-03 18:46                     ` Re: Yury Norov
  2020-12-03 18:52                       ` Re: Willy Tarreau
@ 2020-12-05 11:10                       ` Rasmus Villemoes
  2020-12-05 18:20                         ` Re: Yury Norov
  1 sibling, 1 reply; 1546+ messages in thread
From: Rasmus Villemoes @ 2020-12-05 11:10 UTC (permalink / raw)
  To: Yury Norov, Yun Levi
  Cc: dushistov, Arnd Bergmann, Andrew Morton, Gustavo A. R. Silva,
	William Breathitt Gray, richard.weiyang, joseph.qi, skalluru,
	Josh Poimboeuf, Linux Kernel Mailing List, linux-arch,
	Andy Shevchenko

On 03/12/2020 19.46, Yury Norov wrote:

> I would prefer to avoid changing the find*bit() semantics. As for now,
> if any of find_*_bit()
> finds nothing, it returns the size of the bitmap it was passed.

Yeah, we should actually try to fix that, it causes bad code generation.
It's hard, because callers of course do that "if ret == size" check. But
it's really silly that something like find_first_bit needs to do that
"min(i*BPL + __ffs(word), size)" - the caller does a comparison anyway,
that comparison might as well be "ret >= size" rather than "ret ==
size", and then we could get rid of that branch (which min() necessarily
becomes) at the end of find_next_bit.

I haven't dug very deep into this, but I could also imagine the
arch-specific parts of this might become a little easier to do if the
semantics were just "if no such bit, return an indeterminate value >=
the size".

> Changing this for
> a single function would break the consistency, and may cause problems
> for those who
> rely on existing behaviour.

True. But I think it should be possible - I suppose most users are via
the iterator macros, which could all be updated at once. Changing ret ==
size to ret >= size will still work even if the implementations have not
been switched over, so it should be doable.
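A minimal userspace sketch of why the switch-over is safe: the iterator macro terminates on `<`, so it works with either contract. (find_next_bit here is a naive model, not the kernel's optimized implementation.)

```c
#include <assert.h>
#include <stddef.h>

#define BITS_PER_LONG (8 * sizeof(unsigned long))

/* Naive model of find_next_bit(): like the real one, it returns @size
 * when no further bit is set at or above @offset. */
static size_t find_next_bit(const unsigned long *addr, size_t size,
			    size_t offset)
{
	size_t bit;

	for (bit = offset; bit < size; bit++)
		if (addr[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG)))
			return bit;
	return size;
}

/* The iterator already terminates on "(bit) < (size)", so it keeps
 * working if the underlying search is relaxed to return any value
 * >= size on "not found" - which is why the macros could all be
 * switched over at once. */
#define for_each_set_bit(bit, addr, size)			\
	for ((bit) = find_next_bit((addr), (size), 0);		\
	     (bit) < (size);					\
	     (bit) = find_next_bit((addr), (size), (bit) + 1))
```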

> 
> Passing non-positive size to find_*_bit() should produce undefined
> behaviour, because we cannot dereference a pointer to the bitmap in
> this case; this is most probably a sign of a problem on a caller side
> anyways.

No, the out-of-line bitmap functions should all handle the case of a
zero-size bitmap sensibly.

Is bitmap full? Yes (all the 0 bits are set).
Is bitmap empty? Yes (none of the 0 bits are set).
Find the first bit set (returns 0, there's no such bit)

Etc. The static inlines for small_const_nbits do assume that the pointer
can be dereferenced, which is why small_const_nbits was updated to mean
1<=bits<=BITS_PER_LONG rather than just bits<=BITS_PER_LONG.
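A userspace sketch of those vacuous nbits == 0 semantics (naive loops; the out-of-line kernel versions are optimized differently, and the small_const_nbits inlines legitimately skip the zero-size case):

```c
#include <assert.h>
#include <stddef.h>

#define BITS_PER_LONG (8 * sizeof(unsigned long))

/* A zero-size bitmap is vacuously empty: the pointer is never
 * dereferenced when nbits == 0. */
static int bitmap_empty(const unsigned long *addr, size_t nbits)
{
	size_t i, whole = nbits / BITS_PER_LONG, rem = nbits % BITS_PER_LONG;

	for (i = 0; i < whole; i++)
		if (addr[i])
			return 0;
	if (rem && (addr[whole] & ((1UL << rem) - 1)))
		return 0;
	return 1;	/* none of the nbits bits is set */
}

/* ... and vacuously full, by the same reasoning. */
static int bitmap_full(const unsigned long *addr, size_t nbits)
{
	size_t i, whole = nbits / BITS_PER_LONG, rem = nbits % BITS_PER_LONG;

	for (i = 0; i < whole; i++)
		if (~addr[i])
			return 0;
	if (rem && (~addr[whole] & ((1UL << rem) - 1)))
		return 0;
	return 1;	/* all of the nbits bits are set */
}
```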

Rasmus

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-12-05 11:10                       ` Re: Rasmus Villemoes
@ 2020-12-05 18:20                         ` Yury Norov
  0 siblings, 0 replies; 1546+ messages in thread
From: Yury Norov @ 2020-12-05 18:20 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Yun Levi, dushistov, Arnd Bergmann, Andrew Morton,
	Gustavo A. R. Silva, William Breathitt Gray, richard.weiyang,
	joseph.qi, skalluru, Josh Poimboeuf, Linux Kernel Mailing List,
	linux-arch, Andy Shevchenko

On Sat, Dec 5, 2020 at 3:10 AM Rasmus Villemoes
<linux@rasmusvillemoes.dk> wrote:
>
> On 03/12/2020 19.46, Yury Norov wrote:
>
> > I would prefer to avoid changing the find*bit() semantics. As for now,
> > if any of find_*_bit()
> > finds nothing, it returns the size of the bitmap it was passed.
>
> Yeah, we should actually try to fix that, it causes bad code generation.
> It's hard, because callers of course do that "if ret == size" check. But
> it's really silly that something like find_first_bit needs to do that
> "min(i*BPL + __ffs(word), size)" - the caller does a comparison anyway,
> that comparison might as well be "ret >= size" rather than "ret ==
> size", and then we could get rid of that branch (which min() necessarily
> becomes) at the end of find_next_bit.

We didn't do that 5 years ago because it's too invasive, and the improvement
is barely measurable; the difference is 2 instructions (on arm64).
Has something changed since then?

0000000000000000 <find_first_bit_better>:
   0:   aa0003e3        mov     x3, x0
   4:   aa0103e0        mov     x0, x1
   8:   b4000181        cbz     x1, 38 <find_first_bit_better+0x38>
   c:   f9400064        ldr     x4, [x3]
  10:   d2800802        mov     x2, #0x40                       // #64
  14:   91002063        add     x3, x3, #0x8
  18:   b40000c4        cbz     x4, 30 <find_first_bit_better+0x30>
  1c:   14000008        b       3c <find_first_bit_better+0x3c>
  20:   f8408464        ldr     x4, [x3], #8
  24:   91010045        add     x5, x2, #0x40
  28:   b50000c4        cbnz    x4, 40 <find_first_bit_better+0x40>
  2c:   aa0503e2        mov     x2, x5
  30:   eb00005f        cmp     x2, x0
  34:   54ffff63        b.cc    20 <find_first_bit_better+0x20>  // b.lo, b.ul, b.last
  38:   d65f03c0        ret
  3c:   d2800002        mov     x2, #0x0                        // #0
  40:   dac00084        rbit    x4, x4
  44:   dac01084        clz     x4, x4
  48:   8b020080        add     x0, x4, x2
  4c:   d65f03c0        ret

0000000000000050 <find_first_bit_worse>:
  50:   aa0003e4        mov     x4, x0
  54:   aa0103e0        mov     x0, x1
  58:   b4000181        cbz     x1, 88 <find_first_bit_worse+0x38>
  5c:   f9400083        ldr     x3, [x4]
  60:   d2800802        mov     x2, #0x40                       // #64
  64:   91002084        add     x4, x4, #0x8
  68:   b40000c3        cbz     x3, 80 <find_first_bit_worse+0x30>
  6c:   14000008        b       8c <find_first_bit_worse+0x3c>
  70:   f8408483        ldr     x3, [x4], #8
  74:   91010045        add     x5, x2, #0x40
  78:   b50000c3        cbnz    x3, 90 <find_first_bit_worse+0x40>
  7c:   aa0503e2        mov     x2, x5
  80:   eb02001f        cmp     x0, x2
  84:   54ffff68        b.hi    70 <find_first_bit_worse+0x20>  // b.pmore
  88:   d65f03c0        ret
  8c:   d2800002        mov     x2, #0x0                        // #0
  90:   dac00063        rbit    x3, x3
  94:   dac01063        clz     x3, x3
  98:   8b020062        add     x2, x3, x2
  9c:   eb02001f        cmp     x0, x2
  a0:   9a829000        csel    x0, x0, x2, ls  // ls = plast
  a4:   d65f03c0        ret

> I haven't dug very deep into this, but I could also imagine the
> arch-specific parts of this might become a little easier to do if the
> semantics were just "if no such bit, return an indeterminate value >=
> the size".
>
> > Changing this for
> > a single function would break the consistency, and may cause problems
> > for those who
> > rely on existing behaviour.
>
> True. But I think it should be possible - I suppose most users are via
> the iterator macros, which could all be updated at once. Changing ret ==
> size to ret >= size will still work even if the implementations have not
> been switched over, so it should be doable.

Since there are no assembler users for it, we can do just:
#define find_first_bit(bitmap, size) \
        min(better_find_first_bit((bitmap), (size)), (size))

... and deprecate find_first_bit.
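A compilable userspace sketch of this shim; better_find_first_bit is the placeholder name from the mail, stubbed here with a naive loop (__builtin_ctzl assumes GCC/Clang):

```c
#include <assert.h>
#include <stddef.h>

#define BITS_PER_LONG (8 * sizeof(unsigned long))

/* Naive min(); double-evaluates its arguments, fine for a sketch. */
#define min(a, b) ((a) < (b) ? (a) : (b))

/* Stand-in for the relaxed primitive: may return any value >= @size
 * when no bit is found below @size (here, an unclamped bit index or
 * the end of the last word scanned). */
static size_t better_find_first_bit(const unsigned long *bitmap, size_t size)
{
	size_t i;

	for (i = 0; i * BITS_PER_LONG < size; i++)
		if (bitmap[i])
			return i * BITS_PER_LONG + __builtin_ctzl(bitmap[i]);
	return i * BITS_PER_LONG;	/* >= size, not necessarily == */
}

/* The proposed compatibility shim: clamp back to the historical
 * "== size means not found" contract. */
#define find_first_bit(bitmap, size) \
	min(better_find_first_bit((bitmap), (size)), (size))
```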

> > Passing non-positive size to find_*_bit() should produce undefined
> > behaviour, because we cannot dereference a pointer to the bitmap in
> > this case; this is most probably a sign of a problem on a caller side
> > anyways.
>
> No, the out-of-line bitmap functions should all handle the case of a
> zero-size bitmap sensibly.

I can be more specific: the behaviour is defined as "don't dereference
the address and return an undefined value" (which now is always 0).

> Is bitmap full? Yes (all the 0 bits are set).
> Is bitmap empty? Yes, (none of the 0 bits are set).
> Find the first bit set (returns 0, there's no such bit)

I can't answer because this object is not a map of bits - there's no room for
bits inside.

> Etc. The static inlines for small_const_nbits do assume that the pointer
> can be dereferenced, which is why small_const_nbits was updated to mean
> 1<=bits<=BITS_PER_LONG rather than just bits<=BITS_PER_LONG.

I don't want to do something like

if (size == 0)
        return -1;

... because it legitimizes this kind of usage and hides problems on
callers' side.
Instead, I'd add WARN_ON(size == 0), but I don't think it's so
critical to bother with it.
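A userspace stand-in for that idea, with fprintf playing the role of WARN_ON (__builtin_ctzl assumes GCC/Clang): the suspicious call is flagged, but the function still returns the "nothing found" sentinel without ever touching the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define BITS_PER_LONG (8 * sizeof(unsigned long))

/* Model of find_first_bit() with the suggested zero-size guard. */
static size_t find_first_bit_guarded(const unsigned long *addr, size_t size)
{
	size_t i;

	if (size == 0) {
		/* WARN_ON() stand-in: flag the caller-side problem. */
		fprintf(stderr, "find_first_bit: called with size == 0\n");
		return 0;	/* sentinel == size; addr never dereferenced */
	}

	for (i = 0; i * BITS_PER_LONG < size; i++)
		if (addr[i]) {
			size_t bit = i * BITS_PER_LONG + __builtin_ctzl(addr[i]);

			return bit < size ? bit : size;
		}
	return size;
}
```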

Yury

> Rasmus

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2020-11-30 16:21 ` Alex Bennée
@ 2020-12-29 15:32   ` Roger Pau Monné
  0 siblings, 0 replies; 1546+ messages in thread
From: Roger Pau Monné @ 2020-12-29 15:32 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Oleksandr Tyshchenko, xen-devel, Oleksandr Tyshchenko,
	Paul Durrant, Jan Beulich, Andrew Cooper, Wei Liu, Julien Grall,
	George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini,
	Tim Deegan, Daniel De Graaf, Volodymyr Babchuk, Jun Nakajima,
	Kevin Tian, Anthony PERARD, Bertrand Marquis, Wei Chen, Kaly Xin,
	Artem Mygaiev

On Mon, Nov 30, 2020 at 04:21:59PM +0000, Alex Bennée wrote:
> 
> Oleksandr Tyshchenko <olekstysh@gmail.com> writes:
> 
> > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> >
> >
> > Date: Sat, 28 Nov 2020 22:33:51 +0200
> > Subject: [PATCH V3 00/23] IOREQ feature (+ virtio-mmio) on Arm
> > MIME-Version: 1.0
> > Content-Type: text/plain; charset=UTF-8
> > Content-Transfer-Encoding: 8bit
> >
> > Hello all.
> >
> > The purpose of this patch series is to add IOREQ/DM support to Xen on Arm.
> > You can find an initial discussion at [1] and RFC/V1/V2 series at [2]/[3]/[4].
> > Xen on Arm requires some implementation to forward guest MMIO access to a device
> > model in order to implement virtio-mmio backend or even mediator outside of hypervisor.
> > As Xen on x86 already contains required support this series tries to make it common
> > and introduce Arm specific bits plus some new functionality. Patch series is based on
> > Julien's PoC "xen/arm: Add support for Guest IO forwarding to a device emulator".
> > Besides splitting existing IOREQ/DM support and introducing Arm side, the series
> > also includes virtio-mmio related changes (last 2 patches for toolstack)
> > for the reviewers to be able to see how the whole picture could look
> > like.
> 
> Thanks for posting the latest version.
> 
> >
> > According to the initial discussion there are a few open questions/concerns
> > regarding security, performance in VirtIO solution:
> > 1. virtio-mmio vs virtio-pci, SPI vs MSI, different use-cases require different
> >    transport...
> 
> I think I'm repeating things here I've said in various ephemeral video
> chats over the last few weeks but I should probably put things down on
> the record.
> 
> I think the original intention of the virtio framers is that advanced
> features would build on virtio-pci, because you get a bunch of things
> "for free" - notably enumeration and MSI support. There is an assumption
> that by the time you add these features to virtio-mmio you end up
> re-creating your own less well tested version of virtio-pci. I've not
> been terribly convinced by the argument that the guest implementation of
> PCI presents a sufficiently large blob of code to make the simpler MMIO
> desirable. My attempts to build two virtio kernels (PCI/MMIO) with
> otherwise the same devices wasn't terribly conclusive either way.
> 
> That said virtio-mmio still has life in it because the cloudy slimmed
> down guests moved to using it because the enumeration of PCI is a road
> block to their fast boot up requirements. I'm sure they would also
> appreciate a MSI implementation to reduce the overhead that handling
> notifications currently has on trap-and-emulate.
> 
> AIUI for Xen the other downside to PCI is you would have to emulate it
> in the hypervisor which would be additional code at the most privileged
> level.

Xen already emulates (or maybe it would be better to say decodes) PCI
accesses on the hypervisor and forwards them to the appropriate device
model using the IOREQ interface, so that's not something new. It's
not really emulating the PCI config space, but just detecting accesses
and forwarding them to the device model that should handle them.

You can register different emulators in user space that handle
accesses to different PCI devices from a guest.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2021-01-08 10:35 misono.tomohiro
  0 siblings, 0 replies; 1546+ messages in thread
From: misono.tomohiro @ 2021-01-08 10:35 UTC (permalink / raw)
  To: misono.tomohiro@fujitsu.com,
	'linux-arm-kernel@lists.infradead.org',
	'soc@kernel.org'
  Cc: 'will@kernel.org', 'catalin.marinas@arm.com',
	'arnd@arndb.de', 'olof@lixom.net'

Sorry, I failed to add a proper subject to the cover letter, so it does not show up properly in the lore archive.
I will resend the whole series. Please discard this.

Tomohiro

> -----Original Message-----
> From: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
> Sent: Friday, January 8, 2021 7:32 PM
> To: linux-arm-kernel@lists.infradead.org; soc@kernel.org
> Cc: will@kernel.org; catalin.marinas@arm.com; arnd@arndb.de; olof@lixom.net; Misono, Tomohiro/味曽野 智礼
> <misono.tomohiro@fujitsu.com>
> Subject:
> 
> Subject: [RFC PATCH 00/10] Add Fujitsu A64FX soc entry/hardware barrier driver
> 
> Hello,
> 
> This series adds Fujitsu A64FX SoC entry in drivers/soc and hardware barrier driver for it.
> 
> [Driver Description]
>  The A64FX CPU has several functions for HPC workloads, and the hardware
>  barrier is one of them. It is a mechanism to realize fast synchronization
>  among PEs belonging to the same L3 cache domain by using
>  implementation-defined hardware registers.
>  For more details, see the A64FX HPC extension specification at
>  https://github.com/fujitsu/A64FX
> 
>  The driver mainly offers a set of ioctls to manipulate the related registers.
>  Patches 1-9 implement the driver code and patch 10 finally adds the Kconfig,
>  Makefile and MAINTAINERS entry for the driver.
> 
>  Also, C library and test program for this driver is available on:
>  https://github.com/fujitsu/hardware_barrier
> 
>  The driver is based on v5.11-rc2 and tested on FX700 environment.
> 
> [RFC]
>  This is the first time we have upstreamed drivers for our chip, and I want
>  to confirm the driver location and the patch submission process.
> 
>  Based on my observation, it seems the drivers/soc folder is the right place
>  to put this driver, so I added a Kconfig entry for the arm64 platform
>  config, created a soc/fujitsu folder and updated the MAINTAINERS entry
>  accordingly (last patch). Is that right?
> 
>  Also, for the final submission I think I need to 1) create some public git
>  tree to push the driver code to (github or something), and 2) make a pull
>  request to the SoC team (soc@kernel.org). Is that the correct procedure?
> 
>  I would appreciate any help/comments.
> 
> Side note: we plan to post other drivers for the A64FX HPC extension (prefetch control and cache control) soon as well.
> 
> Misono Tomohiro (10):
>   soc: fujitsu: hwb: Add hardware barrier driver init/exit code
>   soc: fujtisu: hwb: Add open operation
>   soc: fujitsu: hwb: Add IOC_BB_ALLOC ioctl
>   soc: fujitsu: hwb: Add IOC_BW_ASSIGN ioctl
>   soc: fujitsu: hwb: Add IOC_BW_UNASSIGN ioctl
>   soc: fujitsu: hwb: Add IOC_BB_FREE ioctl
>   soc: fujitsu: hwb: Add IOC_GET_PE_INFO ioctl
>   soc: fujitsu: hwb: Add release operation
>   soc: fujitsu: hwb: Add sysfs entry
>   soc: fujitsu: hwb: Add Kconfig/Makefile to build fujitsu_hwb driver
> 
>  MAINTAINERS                            |    7 +
>  arch/arm64/Kconfig.platforms           |    5 +
>  drivers/soc/Kconfig                    |    1 +
>  drivers/soc/Makefile                   |    1 +
>  drivers/soc/fujitsu/Kconfig            |   24 +
>  drivers/soc/fujitsu/Makefile           |    2 +
>  drivers/soc/fujitsu/fujitsu_hwb.c      | 1253 ++++++++++++++++++++++++
>  include/uapi/linux/fujitsu_hpc_ioctl.h |   41 +
>  8 files changed, 1334 insertions(+)
>  create mode 100644 drivers/soc/fujitsu/Kconfig  create mode 100644 drivers/soc/fujitsu/Makefile  create mode
> 100644 drivers/soc/fujitsu/fujitsu_hwb.c  create mode 100644 include/uapi/linux/fujitsu_hpc_ioctl.h
> 
> --
> 2.26.2


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2021-01-08 12:30   ` Arnd Bergmann
  0 siblings, 0 replies; 1546+ messages in thread
From: Arnd Bergmann @ 2021-01-08 12:30 UTC (permalink / raw)
  To: Misono Tomohiro
  Cc: Linux ARM, SoC Team, Will Deacon, Catalin Marinas, Arnd Bergmann,
	Olof Johansson

On Fri, Jan 8, 2021 at 11:32 AM Misono Tomohiro
<misono.tomohiro@jp.fujitsu.com> wrote:
> Subject: [RFC PATCH 00/10] Add Fujitsu A64FX soc entry/hardware barrier driver
> [RFC]
>  This is the first time we upstream drivers for our chip and I want to
>  confirm driver location and patch submission process.
>
>  Based on my observation, it seems the drivers/soc folder is the right place
>  to put this driver, so I added a Kconfig entry for the arm64 platform
>  config, created a soc/fujitsu folder and updated the MAINTAINERS entry
>  accordingly (last patch). Is that right?

This looks good as a start. It may be possible that during review, we
come up with a different location or a different user interface that may
change the code, but if it stays in drivers/soc/fujitsu, then the other
steps are absolutely right.

>  Also, for the final submission I think I need to 1) create some public git
>  tree to push the driver code to (github or something), and 2) make a pull
>  request to the SoC team (soc@kernel.org). Is that the correct procedure?

Yes. I would prefer something other than github, e.g. an account
on a fujitsu.com host, on kernel.org, or on git.linaro.org, but github
works if none of the alternatives are easy for you.

When you send a pull request, make sure you sign the tag with
a gpg key, ideally after getting it on the kernel.org keyring [1].

       Arnd

[1] https://korg.docs.kernel.org/pgpkeys.html

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <w2q9lf-sait7s-qswxlnzeof4i-7j13q0-zgu9pt-xk3x5enp994p-kewn2p-o86qyug0mutj-91m157sheva0-4k2l8v20kyjp-heu04baxqdc7op987-9zc0bxi0jcgo-wyl26layz5p9-esqncc-g48ass.1610618007875@email.android.com>
@ 2021-01-14 10:09 ` Alexander Kapshuk
  0 siblings, 0 replies; 1546+ messages in thread
From: Alexander Kapshuk @ 2021-01-14 10:09 UTC (permalink / raw)
  To: bigbird2444@163.com; +Cc: kernelnewbies@kernelnewbies.org

On Thu, Jan 14, 2021 at 11:54 AM bigbird2444@163.com
<bigbird2444@163.com> wrote:
>
> On Thu, Jan 14, 2021 at 8:01 AM Alexander Kapshuk
> <alexander.kapshuk@gmail.com> wrote:
> >
> > On Thu, Jan 14, 2021 at 8:14 AM bigbird2444@163.com <bigbird2444@163.com> wrote:
> > >
> > >
> > > I've just joined the kernelnewbies mailing list. How do I join other mailing lists? I'd like to see what other people are discussing.
> > >
> > > _______________________________________________
> > > Kernelnewbies mailing list
> > > Kernelnewbies@kernelnewbies.org
> > > https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
> >
> > Not sure what other lists you were referring to, but you may want to
> > check out these mailing lists, http://vger.kernel.org/vger-lists.html,
> > and see if that's what you were after.
>
> >If you just would like to read the mails on >the different mailing
> >list, you do not need to subscribe.
>
> >You can find all emails at >https://lore.kernel.org/lists.html, just
> >look into the various mailing lists and see >what is of interest to
> >you.
>
> >Lukas
>
>
> Thank you, how do I subscribe to other mailing lists?
>
> Liang Peng
>
>

Click on the link for the mailing list of interest, e.g.
linux-next, http://vger.kernel.org/vger-lists.html#linux-next,
then click on the subscribe link. That launches your email client,
if available, with majordomo@vger.kernel.org as the recipient and
the following email body:
subscribe name-of-mailing-list

Alternatively, you could simply send the subscription request above
using an email client of your preference.


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2021-01-19  0:10 David Howells
@ 2021-01-20 14:46 ` Jarkko Sakkinen
  0 siblings, 0 replies; 1546+ messages in thread
From: Jarkko Sakkinen @ 2021-01-20 14:46 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, Tobias Markus, Tianjia Zhang, keyrings, linux-crypto,
	linux-security-module, stable, linux-kernel

On Tue, Jan 19, 2021 at 12:10:33AM +0000, David Howells wrote:
> 
> From: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
> 
> On the following call path, `sig->pkey_algo` is not assigned
> in asymmetric_key_verify_signature(), which causes a runtime
> crash in public_key_verify_signature().
> 
>   keyctl_pkey_verify
>     asymmetric_key_verify_signature
>       verify_signature
>         public_key_verify_signature
> 
> This patch simply checks for this situation and fixes the crash
> caused by the NULL pointer.
> 
> Fixes: 215525639631 ("X.509: support OSCCA SM2-with-SM3 certificate verification")
> Reported-by: Tobias Markus <tobias@markus-regensburg.de>
> Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
> Signed-off-by: David Howells <dhowells@redhat.com>
> Reviewed-and-tested-by: Toke Høiland-Jørgensen <toke@redhat.com>
> Tested-by: João Fonseca <jpedrofonseca@ua.pt>
> Cc: stable@vger.kernel.org # v5.10+
> ---

For what it's worth

Acked-by: Jarkko Sakkinen <jarkko@kernel.org>

/Jarkko

> 
>  crypto/asymmetric_keys/public_key.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/crypto/asymmetric_keys/public_key.c b/crypto/asymmetric_keys/public_key.c
> index 8892908ad58c..788a4ba1e2e7 100644
> --- a/crypto/asymmetric_keys/public_key.c
> +++ b/crypto/asymmetric_keys/public_key.c
> @@ -356,7 +356,8 @@ int public_key_verify_signature(const struct public_key *pkey,
>  	if (ret)
>  		goto error_free_key;
>  
> -	if (strcmp(sig->pkey_algo, "sm2") == 0 && sig->data_size) {
> +	if (sig->pkey_algo && strcmp(sig->pkey_algo, "sm2") == 0 &&
> +	    sig->data_size) {
>  		ret = cert_sig_digest_update(sig, tfm);
>  		if (ret)
>  			goto error_free_key;
> 
> 

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <CAMCTd2kkax9P-OFNHYYz8nKuaKOOkz-zoJ7h2nZ6maUGmjXC-g@mail.gmail.com>
@ 2021-03-16 12:16 ` westjoshuaalan
  0 siblings, 0 replies; 1546+ messages in thread
From: westjoshuaalan @ 2021-03-16 12:16 UTC (permalink / raw)
  To: linux-rdma

subscribe linux-rdma

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2021-04-05 21:12 David Villasana Jiménez
@ 2021-04-06  5:17 ` Greg KH
  0 siblings, 0 replies; 1546+ messages in thread
From: Greg KH @ 2021-04-06  5:17 UTC (permalink / raw)
  To: David Villasana Jiménez; +Cc: mchehab, linux-media, linux-staging

On Mon, Apr 05, 2021 at 04:12:48PM -0500, David Villasana Jiménez wrote:
> linux-kernel@vger.kernel.org, outreachy-kernel@googlegroups.com
> Bcc: 
> Subject: [PATCH] staging: media: atomisp: i2c: Fix alignment to match open
>  parenthesis
> Reply-To: 

Something went wrong with your email again :(

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2021-04-05  0:01 Mitali Borkar
@ 2021-04-06  7:03 ` Arnd Bergmann
  0 siblings, 0 replies; 1546+ messages in thread
From: Arnd Bergmann @ 2021-04-06  7:03 UTC (permalink / raw)
  To: Mitali Borkar
  Cc: manish, GR-Linux-NIC-Dev, gregkh, linux-staging,
	Linux Kernel Mailing List

On Mon, Apr 5, 2021 at 2:03 AM Mitali Borkar <mitaliborkar810@gmail.com> wrote:
>
> outreachy-kernel@googlegroups.com, mitaliborkar810@gmail.com
> Bcc:
> Subject: [PATCH] staging: qlge:remove else after break
> Reply-To:
>
> Fixed Warning:- else is not needed after break
> break terminates the loop if encountered. else is unnecessary and
> increases indenatation
>
> Signed-off-by: Mitali Borkar <mitaliborkar810@gmail.com>
> ---
>  drivers/staging/qlge/qlge_mpi.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/staging/qlge/qlge_mpi.c b/drivers/staging/qlge/qlge_mpi.c
> index 2630ebf50341..3a49f187203b 100644
> --- a/drivers/staging/qlge/qlge_mpi.c
> +++ b/drivers/staging/qlge/qlge_mpi.c
> @@ -935,13 +935,11 @@ static int qlge_idc_wait(struct qlge_adapter *qdev)
>                         netif_err(qdev, drv, qdev->ndev, "IDC Success.\n");
>                         status = 0;
>                         break;
> -               } else {
> -                       netif_err(qdev, drv, qdev->ndev,
> +               }       netif_err(qdev, drv, qdev->ndev,
>                                   "IDC: Invalid State 0x%.04x.\n",
>                                   mbcp->mbox_out[0]);
>                         status = -EIO;
>                         break;
> -               }
>         }

It looks like you got this one wrong in multiple ways:

- This is not an equivalent transformation, since the error is now
  printed in the first part of the 'if()' block as well.

- The indentation is wrong now, with the netif_err() starting in the
  same line as the '}'.

- The description mentions a change in indentation, but you did not
   actually change it.

- The changelog text appears mangled.

        Arnd

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2021-04-15 13:41 Emmanuel Blot
@ 2021-04-15 16:07 ` Palmer Dabbelt
  2021-04-15 22:27 ` Re: Alistair Francis
  1 sibling, 0 replies; 1546+ messages in thread
From: Palmer Dabbelt @ 2021-04-15 16:07 UTC (permalink / raw)
  To: emmanuel.blot
  Cc: qemu-riscv, emmanuel.blot, Alistair Francis, sagark,
	Bastian Koppelmann

On Thu, 15 Apr 2021 06:41:29 PDT (-0700), emmanuel.blot@sifive.com wrote:
> Date: Tue, 13 Apr 2021 18:01:52 +0200
> Subject: [PATCH] target/riscv: fix exception index on instruction access fault
>
> When no MMU is used and the guest code attempts to fetch an instruction
> from an invalid memory location, the exception index defaults to a data
> load access fault, rather than an instruction access fault.
>
> Signed-off-by: Emmanuel Blot <emmanuel.blot@sifive.com>
>
> ---
>  target/riscv/cpu_helper.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 21c54ef5613..4e107b1bd23 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -691,8 +691,10 @@ void riscv_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr,
>
>      if (access_type == MMU_DATA_STORE) {
>          cs->exception_index = RISCV_EXCP_STORE_AMO_ACCESS_FAULT;
> -    } else {
> +    } else if (access_type == MMU_DATA_LOAD) {
>          cs->exception_index = RISCV_EXCP_LOAD_ACCESS_FAULT;
> +    } else {
> +        cs->exception_index = RISCV_EXCP_INST_ACCESS_FAULT;
>      }
>
>      env->badaddr = addr;

Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2021-04-15 13:41 Emmanuel Blot
  2021-04-15 16:07 ` Palmer Dabbelt
@ 2021-04-15 22:27 ` Alistair Francis
  1 sibling, 0 replies; 1546+ messages in thread
From: Alistair Francis @ 2021-04-15 22:27 UTC (permalink / raw)
  To: qemu-riscv@nongnu.org, emmanuel.blot@sifive.com
  Cc: palmer@dabbelt.com, kbastian@mail.uni-paderborn.de,
	sagark@eecs.berkeley.edu

On Thu, 2021-04-15 at 15:41 +0200, Emmanuel Blot wrote:
> Date: Tue, 13 Apr 2021 18:01:52 +0200
> Subject: [PATCH] target/riscv: fix exception index on instruction
> access fault
> 
> When no MMU is used and the guest code attempts to fetch an instruction
> from an invalid memory location, the exception index defaults to a data
> load access fault, rather than an instruction access fault.
> 
> Signed-off-by: Emmanuel Blot <emmanuel.blot@sifive.com>

Thanks for the patch. Can you send the patch to the QEMU mailling list?
qemu-devel@nongnu.org

Alistair

> 
> ---
>  target/riscv/cpu_helper.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 21c54ef5613..4e107b1bd23 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -691,8 +691,10 @@ void riscv_cpu_do_transaction_failed(CPUState *cs,
> hwaddr physaddr,
>  
>      if (access_type == MMU_DATA_STORE) {
>          cs->exception_index = RISCV_EXCP_STORE_AMO_ACCESS_FAULT;
> -    } else {
> +    } else if (access_type == MMU_DATA_LOAD) {
>          cs->exception_index = RISCV_EXCP_LOAD_ACCESS_FAULT;
> +    } else {
> +        cs->exception_index = RISCV_EXCP_INST_ACCESS_FAULT;
>      }
>  
>      env->badaddr = addr;


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <b84772b0-e009-3b68-4e74-525ad8531f95@gmail.com>
@ 2021-04-23 13:57 ` Ivan Koveshnikov
  2021-04-23 20:35   ` Re: Kajetan Puchalski
  0 siblings, 1 reply; 1546+ messages in thread
From: Ivan Koveshnikov @ 2021-04-23 13:57 UTC (permalink / raw)
  To: Kajetan Puchalski; +Cc: rust-for-linux

Hi Kajetan,

You need to send a `subscribe rust-for-linux` message to
<majordomo@vger.kernel.org>, not to the mailing list.
The http://vger.kernel.org/vger-lists.html page contains links that
prepare an email message with the correct format and address.

Best regards,
Ivan Koveshnikov


On Fri, 23 Apr 2021 at 02:53, Kajetan Puchalski <mrkajetanp@gmail.com> wrote:
>
> subscribe

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2021-04-23 13:57 ` Re: Ivan Koveshnikov
@ 2021-04-23 20:35   ` Kajetan Puchalski
  0 siblings, 0 replies; 1546+ messages in thread
From: Kajetan Puchalski @ 2021-04-23 20:35 UTC (permalink / raw)
  To: Ivan Koveshnikov; +Cc: rust-for-linux

On 4/23/21 2:57 PM, Ivan Koveshnikov wrote:
> Hi Kajetan,
> 
> You need to send a `subscribe rust-for-linux` message to
> <majordomo@vger.kernel.org>, not to the mailing list.
> The http://vger.kernel.org/vger-lists.html page contains links that
> prepare an email message with the correct format and address.
> 
> Best regards,
> Ivan Koveshnikov
> 
> 
> On Fri, 23 Apr 2021 at 02:53, Kajetan Puchalski <mrkajetanp@gmail.com> wrote:
>>
>> subscribe

Hi Ivan,

Thank you, apologies, I realised that I wasn't supposed to do that 
literally right after I had sent the email.

Regards,
Kajetan

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <CAJr+-6ZR2oH0J4D_Ou13JvX8HLUUK=MKQwD0Kn53cmvAuT99bg@mail.gmail.com>
@ 2021-04-27  7:56 ` Fox Chen
  0 siblings, 0 replies; 1546+ messages in thread
From: Fox Chen @ 2021-04-27  7:56 UTC (permalink / raw)
  To: Skylar Givens; +Cc: rust-for-linux

Hi Skylar,

On Tue, Apr 27, 2021 at 11:38 AM Skylar Givens <skylargivens@gmail.com> wrote:
>
> subscribe rust-for-linux

For subscribing please see:
http://vger.kernel.org/majordomo-info.html
http://vger.kernel.org/vger-lists.html


thanks,
fox

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <60a57e3a.lbqA81rLGmtH2qoy%Radisson97@gmx.de>
@ 2021-05-21 11:04 ` Alejandro Colomar (man-pages)
  0 siblings, 0 replies; 1546+ messages in thread
From: Alejandro Colomar (man-pages) @ 2021-05-21 11:04 UTC (permalink / raw)
  To: Radisson97; +Cc: linux-man, Michael Kerrisk (man-pages)

Hi Walter,

On 5/19/21 11:08 PM, Radisson97@gmx.de wrote:
>  From 765db7b7714514780b4e613c6d09d2ff454b1ef8 Mon Sep 17 00:00:00 2001
> From: Harms <wharms@bfs.de>
> Date: Wed, 19 May 2021 22:25:08 +0200
> Subject: [PATCH] gamma.3:Add reentrant functions gamma_r
> 
> Add three variants of gamma_r and explain
> the use of the second argument sig
> 
> Signed-off-by: Harms <wharms@bfs.de>

I just read the manual page about gamma, and those functions/macros are 
deprecated (use either lgamma or tgamma instead).  As far as I can tell,
those alternative functions have all the functionality one can need, so
I guess there's no reason to use gamma at all, which is a misleading
alias for lgamma.  I think I won't patch that page at all.

The glibc source code itself has a comment saying that gamma macros are 
obsolete:

[
#if defined __USE_MISC || (defined __USE_XOPEN && !defined  __USE_XOPEN2K)
# if !__MATH_DECLARING_FLOATN
/* Obsolete alias for `lgamma'.  */
__MATHCALL (gamma,, (_Mdouble_));
# endif
#endif
]

Thanks,

Alex


-- 
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
Senior SW Engineer; http://www.alejandro-colomar.es/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2021-05-15 22:57 Dmitry Baryshkov
@ 2021-06-02 21:45 ` Dmitry Baryshkov
  0 siblings, 0 replies; 1546+ messages in thread
From: Dmitry Baryshkov @ 2021-06-02 21:45 UTC (permalink / raw)
  To: Bjorn Andersson, Rob Clark, Sean Paul, Abhinav Kumar
  Cc: Jonathan Marek, Stephen Boyd, David Airlie, Daniel Vetter,
	linux-arm-msm, dri-devel, freedreno

On 16/05/2021 01:57, Dmitry Baryshkov wrote:
>  From Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # This line is ignored.
> From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> Reply-To:
> Subject: [PATCH v2 0/6] drm/msm/dpu: simplify RM code
> In-Reply-To:
> 
> There is no need to request most of the hardware blocks through the resource
> manager (RM), since typically there is a 1:1 or N:1 relationship between
> corresponding blocks. Each LM is tied to the single PP. Each MERGE_3D
> can be used by the specified pair of PPs.  Each DSPP is also tied to
> single LM. So instead of allocating them through the RM, get them via
> static configuration.
> 
> Depends on: https://lore.kernel.org/linux-arm-msm/20210515190909.1809050-1-dmitry.baryshkov@linaro.org
> 
> Changes since v1:
>   - Split into separate patch series to ease review.

Another gracious ping, now for this series.

I want to send the next version with minor changes, but I'd like to hear
your overall opinion before doing that.

> 
> ----------------------------------------------------------------
> Dmitry Baryshkov (6):
>        drm/msm/dpu: get DSPP blocks directly rather than through RM
>        drm/msm/dpu: get MERGE_3D blocks directly rather than through RM
>        drm/msm/dpu: get PINGPONG blocks directly rather than through RM
>        drm/msm/dpu: get INTF blocks directly rather than through RM
>        drm/msm/dpu: drop unused lm_max_width from RM
>        drm/msm/dpu: simplify peer LM handling
> 
>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c        |  54 +---
>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h        |   8 -
>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h   |   5 -
>   .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c   |   8 -
>   .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c   |   8 -
>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c     |   2 +-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h     |   4 +-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.c          |  14 +-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h          |   7 +-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c    |   7 +-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h    |   4 +-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c            |  53 +++-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h            |   5 +-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c             | 310 ++-------------------
>   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h             |  18 +-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h          |   9 +-
>   16 files changed, 115 insertions(+), 401 deletions(-)
> 
> 


-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2021-06-06 19:19 Davidlohr Bueso
@ 2021-06-07 16:02 ` André Almeida
  0 siblings, 0 replies; 1546+ messages in thread
From: André Almeida @ 2021-06-07 16:02 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
	linux-kernel, Steven Rostedt, Sebastian Andrzej Siewior, kernel,
	krisman, pgriffais, z.figura12, joel, malteskarupke, linux-api,
	fweimer, libc-alpha, linux-kselftest, shuah, acme, corbet,
	Peter Oskolkov, Andrey Semashev, mtk.manpages

At 16:19 on 06/06/21, Davidlohr Bueso wrote:
> Bcc:
> Subject: Re: [PATCH v4 07/15] docs: locking: futex2: Add documentation
> Reply-To:
> In-Reply-To: <20210603195924.361327-8-andrealmeid@collabora.com>
> 
> On Thu, 03 Jun 2021, André Almeida wrote:
> 
>> Add a new documentation file specifying both userspace API and internal
>> implementation details of futex2 syscalls.
> 
> I think equally important would be to provide a manpage for each new
> syscall you are introducing, and keep mkt in the loop as in the past he
> extensively documented and improved futex manpages, and overall has a
> lot of experience with dealing with kernel interfaces.

Right, I'll add the man pages in a future version and make sure to have
mkt in the loop, thanks for the tip.

> 
> Thanks,
> Davidlohr
> 
>>
>> Signed-off-by: André Almeida <andrealmeid@collabora.com>
>> ---
>> Documentation/locking/futex2.rst | 198 +++++++++++++++++++++++++++++++
>> Documentation/locking/index.rst  |   1 +
>> 2 files changed, 199 insertions(+)
>> create mode 100644 Documentation/locking/futex2.rst
>>
>> diff --git a/Documentation/locking/futex2.rst
>> b/Documentation/locking/futex2.rst
>> new file mode 100644
>> index 000000000000..2f74d7c97a55
>> --- /dev/null
>> +++ b/Documentation/locking/futex2.rst
>> @@ -0,0 +1,198 @@
>> +.. SPDX-License-Identifier: GPL-2.0
>> +
>> +======
>> +futex2
>> +======
>> +
>> +:Author: André Almeida <andrealmeid@collabora.com>
>> +
>> +futex, or fast user mutex, is a set of syscalls to allow userspace to
>> create
>> +performant synchronization mechanisms, such as mutexes, semaphores and
>> +conditional variables in userspace. C standard libraries, like glibc,
>> use it
>> +as a means to implement more high level interfaces like pthreads.
>> +
>> +The interface
>> +=============
>> +
>> +uAPI functions
>> +--------------
>> +
>> +.. kernel-doc:: kernel/futex2.c
>> +   :identifiers: sys_futex_wait sys_futex_wake sys_futex_waitv
>> sys_futex_requeue
>> +
>> +uAPI structures
>> +---------------
>> +
>> +.. kernel-doc:: include/uapi/linux/futex.h
>> +
>> +The ``flag`` argument
>> +---------------------
>> +
>> +The flag is used to specify the size of the futex word
>> +(FUTEX_[8, 16, 32, 64]). It's mandatory to define one, since there's no
>> +default size.
>> +
>> +By default, the timeout uses a monotonic clock, but can be used as a
>> realtime
>> +one by using the FUTEX_REALTIME_CLOCK flag.
>> +
>> +By default, futexes are of the private type, which means that this
>> user address
>> +will be accessed by threads that share the same memory region. This
>> allows for
>> +some internal optimizations, so they are faster. However, if the
>> address needs
>> +to be shared with different processes (like using ``mmap()`` or
>> ``shm()``), they
>> +need to be defined as shared and the flag FUTEX_SHARED_FLAG is used
>> to set that.
>> +
>> +By default, the operation has no NUMA-awareness, meaning that the
>> user can't
>> +choose the memory node where the kernel side futex data will be
>> stored. The
>> +user can choose the node where it wants to operate by setting the
>> +FUTEX_NUMA_FLAG and using the following structure (where X can be 8,
>> 16, 32 or
>> +64)::
>> +
>> + struct futexX_numa {
>> +         __uX value;
>> +         __sX hint;
>> + };
>> +
>> +This structure should be passed at the ``void *uaddr`` of futex
>> functions. The
>> +address of the structure will be used to be waited on/waken on, and the
>> +``value`` will be compared to ``val`` as usual. The ``hint`` member
>> is used to
>> +define which node the futex will use. When waiting, the futex will be
>> +registered on a kernel-side table stored on that node; when waking,
>> the futex
>> +will be searched for on that given table. That means that there's no
>> redundancy
>> +between tables, and the wrong ``hint`` value will lead to undesired
>> behavior.
>> +Userspace is responsible for dealing with node migrations issues that
>> may
>> +occur. ``hint`` can range from [0, MAX_NUMA_NODES), for specifying a
>> node, or
>> +-1, to use the same node the current process is using.
>> +
>> +When not using FUTEX_NUMA_FLAG on a NUMA system, the futex will be
>> stored on a
>> +global table on allocated on the first node.
>> +
>> +The ``timo`` argument
>> +---------------------
>> +
>> +As per the Y2038 work done in the kernel, new interfaces shouldn't
>> add timeout
>> +options known to be buggy. Given that, ``timo`` should be a 64-bit
>> timeout at
>> +all platforms, using an absolute timeout value.
>> +
>> +Implementation
>> +==============
>> +
>> +The internal implementation follows a similar design to the original
>> futex.
>> +Given that we want to replicate the same external behavior of current
>> futex,
>> +this should be somewhat expected.
>> +
>> +Waiting
>> +-------
>> +
>> +For the wait operations, they are all treated as if you want to wait
>> on N
>> +futexes, so the path for futex_wait and futex_waitv is the basically
>> the same.
>> +For both syscalls, the first step is to prepare an internal list for
>> the list
>> +of futexes to wait for (using struct futexv_head). For futex_wait()
>> calls, this
>> +list will have a single object.
>> +
>> +We have a hash table, where waiters register themselves before
>> sleeping. Then
>> +the wake function checks this table looking for waiters at uaddr. 
>> The hash
>> +bucket to be used is determined by a struct futex_key, that stores
>> information
>> +to uniquely identify an address from a given process. Given the huge
>> address
>> +space, there'll be hash collisions, so we store information to be
>> later used on
>> +collision treatment.
>> +
>> +First, for every futex we want to wait on, we check if (``*uaddr ==
>> val``).
>> +This check is done holding the bucket lock, so we are correctly
>> serialized with
>> +any futex_wake() calls. If any waiter fails the check above, we
>> dequeue all
>> +futexes. The check (``*uaddr == val``) can fail for two reasons:
>> +
>> +- The values are different, and we return -EAGAIN. However, if while
>> +  dequeueing we found that some futexes were awakened, we prioritize
>> this
>> +  and return success.
>> +
>> +- When trying to access the user address, we do so with page faults
>> +  disabled because we are holding a bucket's spin lock (and can't sleep
>> +  while holding a spin lock). If there's an error, it might be a page
>> +  fault, or an invalid address. We release the lock, dequeue everyone
>> +  (because it's illegal to sleep while there are futexes enqueued, we
>> +  could lose wakeups) and try again with page fault enabled. If we
>> +  succeed, this means that the address is valid, but we need to do
>> +  all the work again. For serialization reasons, we need to have the
>> +  spin lock when getting the user value. Additionally, for shared
>> +  futexes, we also need to recalculate the hash, since the underlying
>> +  mapping mechanisms could have changed when dealing with page fault.
>> +  If, even with page fault enabled, we can't access the address, it
>> +  means it's an invalid user address, and we return -EFAULT. For this
>> +  case, we prioritize the error, even if some futexes were awakened.
>> +
>> +If the check is OK, they are enqueued on a linked list in our bucket,
>> and
>> +proceed to the next one. If all waiters succeed, we put the thread to
>> sleep
>> +until a futex_wake() call, timeout expires or we get a signal. After
>> waking up,
>> +we dequeue everyone, and check if some futex was awakened. This
>> dequeue is done
>> +by iteratively walking at each element of struct futex_head list.
>> +
>> +All enqueuing/dequeuing operations requires to hold the bucket lock,
>> to avoid
>> +racing while modifying the list.
>> +
>> +Waking
>> +------
>> +
>> +We get the bucket that's storing the waiters at uaddr, and wake the
>> required
>> +number of waiters, checking for hash collision.
>> +
>> +There's an optimization that makes futex_wake() not take the bucket
>> lock if
>> +there's no one to be woken on that bucket. It checks an atomic
>> counter that each
>> +bucket has, if it says 0, then the syscall exits. In order for this
>> to work, the
>> +waiter thread increases it before taking the lock, so the wake thread
>> will
>> +correctly see that there's someone waiting and will continue the path
>> to take
>> +the bucket lock. To get the correct serialization, the waiter issues
>> a memory
>> +barrier after increasing the bucket counter and the waker issues a
>> memory
>> +barrier before checking it.
>> +
>> +Requeuing
>> +---------
>> +
>> +The requeue path first checks for each struct futex_requeue and their
>> flags.
>> +Then, it will compare the expected value with the one at uaddr1::uaddr.
>> +Following the same serialization explained at Waking_, we increase
>> the atomic
>> +counter for the bucket of uaddr2 before taking the lock. We need to
>> have both
>> +buckets locks at same time so we don't race with other futex
>> operation. To
>> +ensure the locks are taken in the same order for all threads (and
>> thus avoiding
>> +deadlocks), every requeue operation takes the "smaller" bucket first,
>> when
>> +comparing both addresses.
>> +
>> +If the compare with user value succeeds, we proceed by waking
>> ``nr_wake``
>> +futexes, and then requeuing ``nr_requeue`` from bucket of uaddr1 to
>> the uaddr2.
>> +This consists in a simple list deletion/addition and replacing the
>> old futex key
>> +with the new one.
>> +
>> +Futex keys
>> +----------
>> +
>> +There are two types of futexes: private and shared ones. The private
>> are futexes
>> +meant to be used by threads that share the same memory space, are
>> easier to be
>> +uniquely identified and thus can have some performance optimization. The
>> +elements for identifying one are: the start address of the page where
>> the
>> +address is, the address offset within the page and the current->mm
>> pointer.
>> +
>> +Now, for uniquely identifying a shared futex:
>> +
>> +- If the page containing the user address is an anonymous page, we can
>> +  just use the same data used for private futexes (the start address of
>> +  the page, the address offset within the page and the current->mm
>> +  pointer); that will be enough for uniquely identifying such futex. We
>> +  also set one bit at the key to differentiate if a private futex is
>> +  used on the same address (mixing shared and private calls does not
>> +  work).
>> +
>> +- If the page is file-backed, current->mm maybe isn't the same one for
>> +  every user of this futex, so we need to use other data: the
>> +  page->index, a UUID for the struct inode and the offset within the
>> +  page.
>> +
>> +Note that members of futex_key don't have any particular meaning
>> after they
>> +are part of the struct - they are just bytes to identify a futex. 
>> Given that,
>> +we don't need to use a particular name or type that matches the
>> original data,
>> +we only need to care about the bitsize of each component and make
>> both private
>> +and shared fit in the same memory space.
>> +
>> +Source code documentation
>> +=========================
>> +
>> +.. kernel-doc:: kernel/futex2.c
>> +   :no-identifiers: sys_futex_wait sys_futex_wake sys_futex_waitv
>> sys_futex_requeue
>> diff --git a/Documentation/locking/index.rst
>> b/Documentation/locking/index.rst
>> index 7003bd5aeff4..9bf03c7fa1ec 100644
>> --- a/Documentation/locking/index.rst
>> +++ b/Documentation/locking/index.rst
>> @@ -24,6 +24,7 @@ locking
>>     percpu-rw-semaphore
>>     robust-futexes
>>     robust-futex-ABI
>> +    futex2
>>
>> .. only::  subproject and html
>>
>> -- 
>> 2.31.1
>>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <CAFBCWQJX4Xy8Sot7en5JBTuKrzy=_6xFkc+QgOxJEC7G6x+jzg@mail.gmail.com>
@ 2021-06-12  3:43 ` Ammar Faizi
  0 siblings, 0 replies; 1546+ messages in thread
From: Ammar Faizi @ 2021-06-12  3:43 UTC (permalink / raw)
  To: io-uring

subscribe io-uring

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2021-07-16 17:07 Subhasmita Swain
@ 2021-07-16 18:15 ` Lukas Bulwahn
  0 siblings, 0 replies; 1546+ messages in thread
From: Lukas Bulwahn @ 2021-07-16 18:15 UTC (permalink / raw)
  To: Subhasmita Swain; +Cc: linux-kernel-mentees

On Fri, Jul 16, 2021 at 7:07 PM Subhasmita Swain
<subhasmitaofc@gmail.com> wrote:
>
> I am interested in the Mining Maintainers mentorship program and I would like to work on the tasks for the mentee selection.

Thanks for your interest.

For the mentee selection, please work on the following exercise:

First, download the kernel git repository and compile the kernel with
the x86-defconfig build configuration.

There is a file called MAINTAINERS in the root directory of the git
repository. Read the introduction at the beginning of the MAINTAINERS
file and understand the content in the file and its organisation.

Explain in your words: What is stored in the MAINTAINERS file?

Now, search for specific MAINTAINER entries; Please answer: Who are
the maintainers and reviewers of the following sections?

AMD IOMMU (AMD-VI)
DRIVER CORE, KOBJECTS, DEBUGFS AND SYSFS
DRM DRIVERS
FUTEX SUBSYSTEM
I2C SUBSYSTEM
JAILHOUSE HYPERVISOR INTERFACE
KCOV
KCSAN
LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM)
NAND FLASH SUBSYSTEM
THE REST

Please let me know about your answers and always send your responses
to the linux-kernel-mentees mailing list.

After that first exercise, exercises 2 and 3 will follow.

Lukas
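For the section-lookup part of the exercise above, a shell one-liner over the MAINTAINERS format is enough. The snippet below runs against a tiny fabricated excerpt so it is self-contained; in a real checkout the file lives in the repository root, and `./scripts/get_maintainer.pl` is the canonical tool for per-patch lookups:

```shell
#!/bin/sh
# Build a two-section sample in MAINTAINERS format (fabricated entries).
cat > /tmp/sample_maintainers <<'EOF'
KCOV
M:	Alice Example <alice@example.org>
R:	Bob Example <bob@example.org>

KCSAN
M:	Carol Example <carol@example.org>
EOF

# Print one section: start at its title line, stop at the next blank line.
awk '/^KCOV$/ { p = 1 } p && /^$/ { exit } p' /tmp/sample_maintainers
```

This prints the KCOV title plus its M: (maintainer) and R: (reviewer) lines; substituting any other section title works the same way.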
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2021-07-27 15:10 ` Darrick J. Wong
@ 2021-07-27 15:23   ` Andreas Grünbacher
  2021-07-27 15:30   ` Re: Gao Xiang
  1 sibling, 0 replies; 1546+ messages in thread
From: Andreas Grünbacher @ 2021-07-27 15:23 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Andreas Gruenbacher, LKML, Matthew Wilcox, Joseph Qi,
	Linux FS-devel Mailing List, linux-erofs, Christoph Hellwig

On Tue, 27 Jul 2021 at 17:11, Darrick J. Wong <djwong@kernel.org> wrote:
> I'll change the subject to:
>
> iomap: support reading inline data from non-zero pos

That surely works for me.

Thanks,
Andreas

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2021-07-27 15:10 ` Darrick J. Wong
  2021-07-27 15:23   ` Andreas Grünbacher
@ 2021-07-27 15:30   ` Gao Xiang
  1 sibling, 0 replies; 1546+ messages in thread
From: Gao Xiang @ 2021-07-27 15:30 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Andreas Gruenbacher, LKML, Matthew Wilcox, Joseph Qi,
	linux-fsdevel, linux-erofs, Christoph Hellwig

On Tue, Jul 27, 2021 at 08:10:51AM -0700, Darrick J. Wong wrote:
> I'll change the subject to:
> 
> iomap: support reading inline data from non-zero pos

I'm fine with this too. Many thanks for updating!

Thanks,
Gao Xiang


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <CAKPXbjesQH_k1Z7k4kNwpoAf-jYgbUaPqPCgNTJZ35peVBy_pA@mail.gmail.com>
@ 2021-08-29 12:01 ` Lukas Bulwahn
  0 siblings, 0 replies; 1546+ messages in thread
From: Lukas Bulwahn @ 2021-08-29 12:01 UTC (permalink / raw)
  To: Harshita; +Cc: linux-kernel-mentees

On Sat, Aug 28, 2021 at 4:18 AM Harshita <hrsa.kshyp@gmail.com> wrote:
>
> Hello, I'm Harshita.
>
> I am interested in the Checkpatch Documentation mentorship program and I would like to work on the tasks for the mentee selection.
>

Sorry, you were too late and missed the deadline. Please re-apply for
the next period.

Lukas
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
@ 2021-09-03 20:51 Mr. James Khmalo
  0 siblings, 0 replies; 1546+ messages in thread
From: Mr. James Khmalo @ 2021-09-03 20:51 UTC (permalink / raw)
  To: soc

Good Day,
 
I know this email might come to you as a surprise as first coming from one you haven’t met with before.
I am Mr. James Khmalo, the bank manager with ABSA bank of South Africa,  and a personal banker of Dr.Mohamed Farouk Ibrahim, an Egyptian who happened to be a medical contractor attached to the overthrown Afghan government by the Taliban government.   
Dr.Mohamed Farouk Ibrahim deposits some sum of money with our bank but passed away with his family while trying to escape from Kandahar.
The said sum can be used for an investment if you are interested.  Details relating to the funds are in my position and will present you as the Next-of-Kin because there was none, and I shall furnish you with more detail once your response.

Regards,
Mr. James Khmalo
Tel: 27-632696383
South Africa

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2021-10-12  1:23 ` James Bottomley
@ 2021-10-12  2:30   ` Bart Van Assche
  0 siblings, 0 replies; 1546+ messages in thread
From: Bart Van Assche @ 2021-10-12  2:30 UTC (permalink / raw)
  To: jejb, docfate111, linux-scsi

On 10/11/21 18:23, James Bottomley wrote:
> On Mon, 2021-10-11 at 19:15 -0400, docfate111 wrote:
>> linux-scsi@vger.kernel.org,
>> linux-kernel@vger.kernel.org,
>> martin.petersen@oracle.com
>> Bcc:
>> Subject: [PATCH] scsi_lib fix the NULL pointer dereference
>> Reply-To:
>>
>> scsi_setup_scsi_cmnd should check for the pointer before
>> scsi_command_size dereferences it.
> 
> Have you seen this?  As in do you have a trace?  This should be an
> impossible condition, so we need to see where it came from.  The patch
> as proposed is not right, because if something is setting cmd_len
> without setting the cmnd pointer we need the cause fixed rather than
> applying a band aid in scsi_setup_scsi_cmnd().

Hi James and Thelford,

This patch looks like a duplicate of a patch posted one month ago. I
think Christoph agreed to remove the cmd_len == 0 check. See also
https://lore.kernel.org/linux-scsi/20210904064534.1919476-1-qiulaibin@huawei.com/.

Thanks,

Bart.


* Re:
  2021-10-08  1:24 Dmitry Baryshkov
@ 2021-10-12 23:59 ` Linus Walleij
  2021-10-13  3:46   ` Re: Dmitry Baryshkov
  2021-10-17 16:54   ` Re: Bjorn Andersson
  2021-10-17 21:35 ` Re: Linus Walleij
  1 sibling, 2 replies; 1546+ messages in thread
From: Linus Walleij @ 2021-10-12 23:59 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Andy Gross, Bjorn Andersson, Rob Herring,
	open list:GPIO SUBSYSTEM,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, MSM

On Fri, Oct 8, 2021 at 3:25 AM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:

> In 2019 (in kernel 5.4) spmi-gpio and ssbi-gpio drivers were converted
> to hierarchical IRQ helpers, however MPP drivers were not converted at
> that moment. Complete this by converting MPP drivers.
>
> Changes since v2:
>  - Add patches fixing/updating mpps nodes in the existing device trees

Thanks a *lot* for being thorough and fixing all this properly!

I am happy to apply the pinctrl portions to the pinctrl tree, I'm
uncertain about Rob's syntax checker robot here, are there real
problems? Sometimes it complains about things being changed
in the DTS files at the same time.

I could apply all of this (including DTS changes) to an immutable
branch and offer to Bjorn if he is fine with the patches and
the general approach.

Yours,
Linus Walleij


* Re:
  2021-10-12 23:59 ` Linus Walleij
@ 2021-10-13  3:46   ` Dmitry Baryshkov
  2021-10-13 23:39     ` Re: Linus Walleij
  2021-10-17 16:54   ` Re: Bjorn Andersson
  1 sibling, 1 reply; 1546+ messages in thread
From: Dmitry Baryshkov @ 2021-10-13  3:46 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Andy Gross, Bjorn Andersson, Rob Herring,
	open list:GPIO SUBSYSTEM,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, MSM

On Wed, 13 Oct 2021 at 02:59, Linus Walleij <linus.walleij@linaro.org> wrote:
>
> On Fri, Oct 8, 2021 at 3:25 AM Dmitry Baryshkov
> <dmitry.baryshkov@linaro.org> wrote:
>
> > In 2019 (in kernel 5.4) spmi-gpio and ssbi-gpio drivers were converted
> > to hierarchical IRQ helpers, however MPP drivers were not converted at
> > that moment. Complete this by converting MPP drivers.
> >
> > Changes since v2:
> >  - Add patches fixing/updating mpps nodes in the existing device trees
>
> Thanks a *lot* for being thorough and fixing all this properly!
>
> I am happy to apply the pinctrl portions to the pinctrl tree, I'm
> uncertain about Rob's syntax checker robot here, are there real
> problems? Sometimes it complains about things being changed
> in the DTS files at the same time.

Rob's checker reports issues that are being fixed by the respective
patches. I think I've updated all dts entries for the mpp device tree
nodes.

> I could apply all of this (including DTS changes) to an immutable
> branch and offer to Bjorn if he is fine with the patches and
> the general approach.

I'm fine with either approach.

-- 
With best wishes
Dmitry


* Re:
  2021-10-13  3:46   ` Re: Dmitry Baryshkov
@ 2021-10-13 23:39     ` Linus Walleij
  0 siblings, 0 replies; 1546+ messages in thread
From: Linus Walleij @ 2021-10-13 23:39 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Andy Gross, Bjorn Andersson, Rob Herring,
	open list:GPIO SUBSYSTEM,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, MSM

On Wed, Oct 13, 2021 at 5:46 AM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
> On Wed, 13 Oct 2021 at 02:59, Linus Walleij <linus.walleij@linaro.org> wrote:

> > I am happy to apply the pinctrl portions to the pinctrl tree, I'm
> > uncertain about Rob's syntax checker robot here, are there real
> > problems? Sometimes it complains about things being changed
> > in the DTS files at the same time.
>
> Rob's checker reports issue that are being fixed by respective
> patches. I think I've updated all dts entries for the mpp devices tree
> nodes.
>
> > I could apply all of this (including DTS changes) to an immutable
> > branch and offer to Bjorn if he is fine with the patches and
> > the general approach.
>
> I'm fine with either approach.

Let's see what Bjorn says, if nothing happens poke me again and I'll
create an immutable branch and merge it.

Yours,
Linus Walleij


* Re:
  2021-10-12 23:59 ` Linus Walleij
  2021-10-13  3:46   ` Re: Dmitry Baryshkov
@ 2021-10-17 16:54   ` Bjorn Andersson
  2021-10-17 21:31     ` Re: Linus Walleij
  1 sibling, 1 reply; 1546+ messages in thread
From: Bjorn Andersson @ 2021-10-17 16:54 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Dmitry Baryshkov, Andy Gross, Rob Herring,
	open list:GPIO SUBSYSTEM,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, MSM

On Tue 12 Oct 18:59 CDT 2021, Linus Walleij wrote:

> On Fri, Oct 8, 2021 at 3:25 AM Dmitry Baryshkov
> <dmitry.baryshkov@linaro.org> wrote:
> 
> > In 2019 (in kernel 5.4) spmi-gpio and ssbi-gpio drivers were converted
> > to hierarchical IRQ helpers, however MPP drivers were not converted at
> > that moment. Complete this by converting MPP drivers.
> >
> > Changes since v2:
> >  - Add patches fixing/updating mpps nodes in the existing device trees
> 
> Thanks a *lot* for being thorough and fixing all this properly!
> 
> I am happy to apply the pinctrl portions to the pinctrl tree, I'm
> uncertain about Rob's syntax checker robot here, are there real
> problems? Sometimes it complains about things being changed
> in the DTS files at the same time.
> 
> I could apply all of this (including DTS changes) to an immutable
> branch and offer to Bjorn if he is fine with the patches and
> the general approach.
> 

I like the driver changes and I'm wrapping up a second pull for the dts
pieces in the coming few days. So if you're happy to take the driver
patches I'll include the DT changes for 5.16 as well.

Thanks,
Bjorn


* Re:
  2021-10-17 16:54   ` Re: Bjorn Andersson
@ 2021-10-17 21:31     ` Linus Walleij
  0 siblings, 0 replies; 1546+ messages in thread
From: Linus Walleij @ 2021-10-17 21:31 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: Dmitry Baryshkov, Andy Gross, Rob Herring,
	open list:GPIO SUBSYSTEM,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, MSM

On Sun, Oct 17, 2021 at 6:54 PM Bjorn Andersson
<bjorn.andersson@linaro.org> wrote:

> I like the driver changes and I'm wrapping up a second pull for the dts
> pieces in the coming few days. So if you're happy to take the driver
> patches I'll include the DT changes for 5.16 as well.

OK let's do like that. I'll queue the binding changes and driver
changes so we finally get this fixed up.

Yours,
Linus Walleij


* Re:
  2021-10-08  1:24 Dmitry Baryshkov
  2021-10-12 23:59 ` Linus Walleij
@ 2021-10-17 21:35 ` Linus Walleij
  1 sibling, 0 replies; 1546+ messages in thread
From: Linus Walleij @ 2021-10-17 21:35 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Andy Gross, Bjorn Andersson, Rob Herring,
	open list:GPIO SUBSYSTEM,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, MSM

I queued these patches in the pinctrl tree for v5.16:

On Fri, Oct 8, 2021 at 3:25 AM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:

>       dt-bindings: pinctrl: qcom,pmic-mpp: Convert qcom pmic mpp bindings to YAML
>       pinctrl: qcom: ssbi-mpp: hardcode IRQ counts
>       pinctrl: qcom: ssbi-mpp: add support for hierarchical IRQ chip
>       pinctrl: qcom: spmi-mpp: hardcode IRQ counts
>       pinctrl: qcom: spmi-mpp: add support for hierarchical IRQ chip
>       dt-bindings: pinctrl: qcom,pmic-mpp: switch to #interrupt-cells

Any breakages will be fixed when Bjorn applies the DTS changes to his
tree.

I wonder about the MFD patch; maybe Lee can expedite merging that too,
or ACK it for Bjorn to merge with the remainder.

Yours,
Linus Walleij


* Re:
       [not found] <CAGGnn3JZdc3ETS_AijasaFUqLY9e5Q1ZHK3+806rtsEBnAo5Og@mail.gmail.com>
@ 2021-11-23 17:20 ` Christian COMMARMOND
  0 siblings, 0 replies; 1546+ messages in thread
From: Christian COMMARMOND @ 2021-11-23 17:20 UTC (permalink / raw)
  To: linux-btrfs

Hi,

I use a TERRAMASTER F5-422 whose disks form a ~14TB array holding 3
btrfs partitions. After repeated power outages, the 3rd partition
mounts, but no data is visible other than the first root directory.

I tried to repair the filesystem and got this:
[root@TNAS-00E1FD ~]# btrfsck --repair /dev/mapper/vg0-lv2
enabling repair mode
...
Starting repair.
Opening filesystem to check...
Checking filesystem on /dev/mapper/vg0-lv2
UUID: a7b536f5-1827-479c-9170-eccbbc624370
[1/7] checking root items
Error: could not find btree root extent for root 257
ERROR: failed to repair root items: No such file or directory

(I put the full /var/log/messages at the end of this mail).

What can I do to get my data back?
This is a backup disk, and I am supposed to have a copy of it in
another place, but there too, Murphy's law struck: I had some disk
failures and lost some of my data.
So it would be very good to be able to recover some data from these disks.

Other information:
lsblk:
NAME          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda             8:0    0   3.7T  0 disk
|-sda1          8:1    0   285M  0 part
|-sda2          8:2    0   1.9G  0 part
| `-md9         9:9    0   1.9G  0 raid1 /
|-sda3          8:3    0   977M  0 part
| `-md8         9:8    0 976.4M  0 raid1 [SWAP]
`-sda4          8:4    0   3.7T  0 part
  `-md0         9:0    0  14.6T  0 raid5
    |-vg0-lv0 251:0    0     2T  0 lvm   /mnt/md0
    |-vg0-lv1 251:1    0   3.9T  0 lvm   /mnt/md1
    `-vg0-lv2 251:2    0   8.7T  0 lvm   /mnt/md2
sdb             8:16   0   3.7T  0 disk
|-sdb1          8:17   0   285M  0 part
|-sdb2          8:18   0   1.9G  0 part
| `-md9         9:9    0   1.9G  0 raid1 /
|-sdb3          8:19   0   977M  0 part
| `-md8         9:8    0 976.4M  0 raid1 [SWAP]
`-sdb4          8:20   0   3.7T  0 part
  `-md0         9:0    0  14.6T  0 raid5
    |-vg0-lv0 251:0    0     2T  0 lvm   /mnt/md0
    |-vg0-lv1 251:1    0   3.9T  0 lvm   /mnt/md1
    `-vg0-lv2 251:2    0   8.7T  0 lvm   /mnt/md2
sdc             8:32   0   3.7T  0 disk
|-sdc1          8:33   0   285M  0 part
|-sdc2          8:34   0   1.9G  0 part
| `-md9         9:9    0   1.9G  0 raid1 /
|-sdc3          8:35   0   977M  0 part
| `-md8         9:8    0 976.4M  0 raid1 [SWAP]
`-sdc4          8:36   0   3.7T  0 part
  `-md0         9:0    0  14.6T  0 raid5
    |-vg0-lv0 251:0    0     2T  0 lvm   /mnt/md0
    |-vg0-lv1 251:1    0   3.9T  0 lvm   /mnt/md1
    `-vg0-lv2 251:2    0   8.7T  0 lvm   /mnt/md2
sdd             8:48   0   3.7T  0 disk
|-sdd1          8:49   0   285M  0 part
|-sdd2          8:50   0   1.9G  0 part
| `-md9         9:9    0   1.9G  0 raid1 /
|-sdd3          8:51   0   977M  0 part
| `-md8         9:8    0 976.4M  0 raid1 [SWAP]
`-sdd4          8:52   0   3.7T  0 part
  `-md0         9:0    0  14.6T  0 raid5
    |-vg0-lv0 251:0    0     2T  0 lvm   /mnt/md0
    |-vg0-lv1 251:1    0   3.9T  0 lvm   /mnt/md1
    `-vg0-lv2 251:2    0   8.7T  0 lvm   /mnt/md2
sde             8:64   0   3.7T  0 disk
|-sde1          8:65   0   285M  0 part
|-sde2          8:66   0   1.9G  0 part
| `-md9         9:9    0   1.9G  0 raid1 /
|-sde3          8:67   0   977M  0 part
| `-md8         9:8    0 976.4M  0 raid1 [SWAP]
`-sde4          8:68   0   3.7T  0 part
  `-md0         9:0    0  14.6T  0 raid5
    |-vg0-lv0 251:0    0     2T  0 lvm   /mnt/md0
    |-vg0-lv1 251:1    0   3.9T  0 lvm   /mnt/md1
    `-vg0-lv2 251:2    0   8.7T  0 lvm   /mnt/md2


df -h:
Filesystem                Size      Used Available Use% Mounted on
/dev/md9                  1.8G    576.8M      1.2G  32% /
devtmpfs                  1.8G         0      1.8G   0% /dev
tmpfs                     1.8G         0      1.8G   0% /dev/shm
tmpfs                     1.8G      1.1M      1.8G   0% /tmp
tmpfs                     1.8G    236.0K      1.8G   0% /run
tmpfs                     1.8G      6.3M      1.8G   0% /opt/var
/dev/mapper/vg0-lv0       2.0T     34.5M      2.0T   0% /mnt/md0
/dev/mapper/vg0-lv1       3.9T     16.3M      3.9T   0% /mnt/md1
/dev/mapper/vg0-lv2       8.7T      2.9T      5.8T  33% /mnt/md2

These physical disks are new (a few months old) and do not show errors.

I hope there is a way to fix this.

regards,

Christian COMMARMOND


Here is the full log (restricted to 'kernel' messages), starting from
the lines where I begin to see errors:
Nov 23 17:00:46 TNAS-00E1FD kernel: [   34.540572] Detached from
scsi7, channel 0, id 0, lun 0, type 0
Nov 23 17:00:48 TNAS-00E1FD kernel: [   37.148169] md: md8 stopped.
Nov 23 17:00:48 TNAS-00E1FD kernel: [   37.154395] md/raid1:md8:
active with 1 out of 72 mirrors
Nov 23 17:00:48 TNAS-00E1FD kernel: [   37.155564] md8: detected
capacity change from 0 to 1023868928
Nov 23 17:00:49 TNAS-00E1FD kernel: [   38.240910] md: recovery of
RAID array md8
Nov 23 17:00:49 TNAS-00E1FD kernel: [   38.276712] md: md8: recovery
interrupted.
Nov 23 17:00:50 TNAS-00E1FD kernel: [   38.346552] md: recovery of
RAID array md8
Nov 23 17:00:50 TNAS-00E1FD kernel: [   38.392148] md: md8: recovery
interrupted.
Nov 23 17:00:50 TNAS-00E1FD kernel: [   38.458126] md: recovery of
RAID array md8
Nov 23 17:00:50 TNAS-00E1FD kernel: [   38.494025] md: md8: recovery
interrupted.
Nov 23 17:00:50 TNAS-00E1FD kernel: [   38.576871] md: recovery of
RAID array md8
Nov 23 17:00:50 TNAS-00E1FD kernel: [   38.837269] Adding 999868k swap
on /dev/md8.  Priority:-1 extents:1 across:999868k
Nov 23 17:00:51 TNAS-00E1FD kernel: [   39.801285] md: md0 stopped.
Nov 23 17:00:51 TNAS-00E1FD kernel: [   39.859798] md/raid:md0: device
sda4 operational as raid disk 0
Nov 23 17:00:51 TNAS-00E1FD kernel: [   39.861417] md/raid:md0: device
sde4 operational as raid disk 4
Nov 23 17:00:51 TNAS-00E1FD kernel: [   39.863675] md/raid:md0: device
sdd4 operational as raid disk 3
Nov 23 17:00:51 TNAS-00E1FD kernel: [   39.865059] md/raid:md0: device
sdc4 operational as raid disk 2
Nov 23 17:00:51 TNAS-00E1FD kernel: [   39.866373] md/raid:md0: device
sdb4 operational as raid disk 1
Nov 23 17:00:51 TNAS-00E1FD kernel: [   39.869300] md/raid:md0: raid
level 5 active with 5 out of 5 devices, algorithm 2
Nov 23 17:00:51 TNAS-00E1FD kernel: [   39.926721] md0: detected
capacity change from 0 to 15989118861312
Nov 23 17:00:57 TNAS-00E1FD kernel: [   46.111539] md: md8: recovery done.
Nov 23 17:00:57 TNAS-00E1FD kernel: [   46.269349] flashcache:
flashcache-3.1.1 initialized
Nov 23 17:00:58 TNAS-00E1FD kernel: [   46.394510] BTRFS: device fsid
bdc3dbee-00a3-4541-99b4-096cd27939f2 devid 1 transid 679
/dev/mapper/vg0-lv0
Nov 23 17:00:58 TNAS-00E1FD kernel: [   46.397072] BTRFS info (device
dm-0): metadata ratio 50
Nov 23 17:00:58 TNAS-00E1FD kernel: [   46.399122] BTRFS info (device
dm-0): using free space tree
Nov 23 17:00:58 TNAS-00E1FD kernel: [   46.400380] BTRFS info (device
dm-0): has skinny extents
Nov 23 17:00:58 TNAS-00E1FD kernel: [   46.471236] BTRFS info (device
dm-0): new size for /dev/mapper/vg0-lv0 is 2147483648000
Nov 23 17:00:58 TNAS-00E1FD kernel: [   47.087622] BTRFS: device fsid
a5828e5a-1b11-4743-891c-11d0d8aeb1ae devid 1 transid 107
/dev/mapper/vg0-lv1
Nov 23 17:00:58 TNAS-00E1FD kernel: [   47.089943] BTRFS info (device
dm-1): metadata ratio 50
Nov 23 17:00:58 TNAS-00E1FD kernel: [   47.091505] BTRFS info (device
dm-1): using free space tree
Nov 23 17:00:58 TNAS-00E1FD kernel: [   47.093062] BTRFS info (device
dm-1): has skinny extents
Nov 23 17:00:58 TNAS-00E1FD kernel: [   47.150713] BTRFS info (device
dm-1): new size for /dev/mapper/vg0-lv1 is 4294967296000
Nov 23 17:00:59 TNAS-00E1FD kernel: [   47.737119] BTRFS: device fsid
a7b536f5-1827-479c-9170-eccbbc624370 devid 1 transid 142633
/dev/mapper/vg0-lv2
Nov 23 17:00:59 TNAS-00E1FD kernel: [   47.739313] BTRFS info (device
dm-2): metadata ratio 50
Nov 23 17:00:59 TNAS-00E1FD kernel: [   47.740630] BTRFS info (device
dm-2): using free space tree
Nov 23 17:00:59 TNAS-00E1FD kernel: [   47.741892] BTRFS info (device
dm-2): has skinny extents
Nov 23 17:00:59 TNAS-00E1FD kernel: [   47.946451] BTRFS info (device
dm-2): bdev /dev/mapper/vg0-lv2 errs: wr 0, rd 0, flush 0, corrupt 0,
gen 8
Nov 23 17:01:01 TNAS-00E1FD kernel: [   49.693394] BTRFS info (device
dm-2): checking UUID tree
Nov 23 17:01:01 TNAS-00E1FD kernel: [   49.700560] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:01 TNAS-00E1FD kernel: [   49.707394] BTRFS info (device
dm-2): new size for /dev/mapper/vg0-lv2 is 9546663723008
Nov 23 17:01:01 TNAS-00E1FD kernel: [   49.713109] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:01 TNAS-00E1FD kernel: [   49.715107] BTRFS warning
(device dm-2): iterating uuid_tree failed -5
Nov 23 17:01:01 TNAS-00E1FD kernel: [   49.795716] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:01 TNAS-00E1FD kernel: [   49.798231] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:03 TNAS-00E1FD kernel: [   52.272802] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:03 TNAS-00E1FD kernel: [   52.275264] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:03 TNAS-00E1FD kernel: [   52.277208] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:03 TNAS-00E1FD kernel: [   52.278483] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:04 TNAS-00E1FD kernel: [   52.570033] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:04 TNAS-00E1FD kernel: [   52.571487] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:05 TNAS-00E1FD kernel: [   54.250527] nf_conntrack:
default automatic helper assignment has been turned off for security
reasons and CT-based  firewall rule not found. Use the iptables CT
target to attach helpers instead.
Nov 23 17:01:07 TNAS-00E1FD kernel: [   56.050418]
verify_parent_transid: 2 callbacks suppressed
Nov 23 17:01:07 TNAS-00E1FD kernel: [   56.050424] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:07 TNAS-00E1FD kernel: [   56.063012] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:07 TNAS-00E1FD kernel: [   56.166746] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:07 TNAS-00E1FD kernel: [   56.167903] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:07 TNAS-00E1FD kernel: [   56.274188] NFSD: starting
90-second grace period (net ffffffff9db5abc0)
Nov 23 17:01:09 TNAS-00E1FD kernel: [   57.524631] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:09 TNAS-00E1FD kernel: [   57.525878] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:09 TNAS-00E1FD kernel: [   57.589706] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:09 TNAS-00E1FD kernel: [   57.590882] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:10 TNAS-00E1FD kernel: [   58.315852] warning: `smbd'
uses legacy ethtool link settings API, link modes are only partially
reported
Nov 23 17:01:31 TNAS-00E1FD kernel: [   79.882060] BTRFS error (device
dm-2): incorrect extent count for 29360128; counted 740, expected 677
Nov 23 17:01:31 TNAS-00E1FD kernel: [   79.883000] BTRFS: error
(device dm-2) in convert_free_space_to_extents:457: errno=-5 IO
failure
Nov 23 17:01:31 TNAS-00E1FD kernel: [   79.883946] BTRFS info (device
dm-2): forced readonly
Nov 23 17:01:31 TNAS-00E1FD kernel: [   79.884896] BTRFS: error
(device dm-2) in add_to_free_space_tree:1052: errno=-5 IO failure
Nov 23 17:01:31 TNAS-00E1FD kernel: [   79.885863] BTRFS: error
(device dm-2) in __btrfs_free_extent:7106: errno=-5 IO failure
Nov 23 17:01:31 TNAS-00E1FD kernel: [   79.886825] BTRFS: error
(device dm-2) in btrfs_run_delayed_refs:3009: errno=-5 IO failure
Nov 23 17:01:31 TNAS-00E1FD kernel: [   79.887803] BTRFS warning
(device dm-2): Skipping commit of aborted transaction.
Nov 23 17:01:31 TNAS-00E1FD kernel: [   79.888807] BTRFS: error
(device dm-2) in cleanup_transaction:1873: errno=-5 IO failure
Nov 23 17:01:31 TNAS-00E1FD kernel: [   79.892906] BTRFS error (device
dm-2): incorrect extent count for 29360128; counted 739, expected 676
Nov 23 17:02:55 TNAS-00E1FD kernel: [  164.199509] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:02:55 TNAS-00E1FD kernel: [  164.212280] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:02:55 TNAS-00E1FD kernel: [  164.214362] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:02:55 TNAS-00E1FD kernel: [  164.216331] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:02:55 TNAS-00E1FD kernel: [  164.224184] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:02:55 TNAS-00E1FD kernel: [  164.225500] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:02:55 TNAS-00E1FD kernel: [  164.227338] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:02:55 TNAS-00E1FD kernel: [  164.228636] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [  205.915492] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [  205.936745] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [  205.938543] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [  205.940375] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [  205.951375] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [  205.952810] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [  205.972430] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [  205.973548] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [  205.974819] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [  205.975984] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:54 TNAS-00E1FD kernel: [  222.807122]
verify_parent_transid: 6 callbacks suppressed
Nov 23 17:03:54 TNAS-00E1FD kernel: [  222.807127] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:54 TNAS-00E1FD kernel: [  222.819996] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:54 TNAS-00E1FD kernel: [  222.923926] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:54 TNAS-00E1FD kernel: [  222.925434] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:54 TNAS-00E1FD kernel: [  223.061241] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:54 TNAS-00E1FD kernel: [  223.062463] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:59 TNAS-00E1FD kernel: [  227.554549] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:59 TNAS-00E1FD kernel: [  227.556100] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:04:13 TNAS-00E1FD kernel: [  242.190152] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:04:13 TNAS-00E1FD kernel: [  242.202843] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:04:13 TNAS-00E1FD kernel: [  242.215390] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:04:13 TNAS-00E1FD kernel: [  242.217241] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:05:15 TNAS-00E1FD kernel: [  303.772878] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:05:15 TNAS-00E1FD kernel: [  303.785862] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:14 TNAS-00E1FD kernel: [  362.480763] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:14 TNAS-00E1FD kernel: [  362.493848] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:43 TNAS-00E1FD kernel: [  392.055419] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:43 TNAS-00E1FD kernel: [  392.068306] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:43 TNAS-00E1FD kernel: [  392.069074] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:43 TNAS-00E1FD kernel: [  392.069862] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:43 TNAS-00E1FD kernel: [  392.076040] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:43 TNAS-00E1FD kernel: [  392.076821] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:43 TNAS-00E1FD kernel: [  392.077643] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:43 TNAS-00E1FD kernel: [  392.078360] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:14:02 TNAS-00E1FD kernel: [  830.643054] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:14:02 TNAS-00E1FD kernel: [  830.664937] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:14:11 TNAS-00E1FD kernel: [  839.988330] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:14:11 TNAS-00E1FD kernel: [  839.989850] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:14:11 TNAS-00E1FD kernel: [  839.991371] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:14:11 TNAS-00E1FD kernel: [  839.992867] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:14:12 TNAS-00E1FD kernel: [  840.488126] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:14:12 TNAS-00E1FD kernel: [  840.488998] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:16:36 TNAS-00E1FD kernel: [  985.266877] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:16:36 TNAS-00E1FD kernel: [  985.288688] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:16:36 TNAS-00E1FD kernel: [  985.289624] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:16:36 TNAS-00E1FD kernel: [  985.290454] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:16:36 TNAS-00E1FD kernel: [  985.300198] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:16:36 TNAS-00E1FD kernel: [  985.300917] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:16:36 TNAS-00E1FD kernel: [  985.301704] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:16:36 TNAS-00E1FD kernel: [  985.302318] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:34:24 TNAS-00E1FD kernel: [ 2052.815271] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:34:24 TNAS-00E1FD kernel: [ 2052.838506] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:34:52 TNAS-00E1FD kernel: [ 2081.273231] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:34:52 TNAS-00E1FD kernel: [ 2081.296585] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:39:26 TNAS-00E1FD kernel: [ 2354.866442] BTRFS error (device
dm-2): cleaner transaction attach returned -30
Nov 23 17:56:30 TNAS-00E1FD kernel: [ 3378.825461] BTRFS info (device
dm-2): using free space tree
Nov 23 17:56:30 TNAS-00E1FD kernel: [ 3378.825891] BTRFS info (device
dm-2): has skinny extents
Nov 23 17:56:30 TNAS-00E1FD kernel: [ 3378.968533] BTRFS info (device
dm-2): bdev /dev/mapper/vg0-lv2 errs: wr 0, rd 0, flush 0, corrupt 0,
gen 8
Nov 23 17:56:32 TNAS-00E1FD kernel: [ 3380.525294] BTRFS info (device
dm-2): checking UUID tree
Nov 23 17:56:32 TNAS-00E1FD kernel: [ 3380.535839] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:56:32 TNAS-00E1FD kernel: [ 3380.544791] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:56:32 TNAS-00E1FD kernel: [ 3380.545579] BTRFS warning
(device dm-2): iterating uuid_tree failed -5
Nov 23 17:56:42 TNAS-00E1FD kernel: [ 3391.302453] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:56:43 TNAS-00E1FD kernel: [ 3391.328368] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.806326] BTRFS error (device
dm-2): incorrect extent count for 29360128; counted 740, expected 677
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.806836] BTRFS: error
(device dm-2) in convert_free_space_to_extents:457: errno=-5 IO
failure
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.807367] BTRFS info (device
dm-2): forced readonly
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.807904] BTRFS: error
(device dm-2) in add_to_free_space_tree:1052: errno=-5 IO failure
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.808493] BTRFS: error
(device dm-2) in __btrfs_free_extent:7106: errno=-5 IO failure
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.809160] BTRFS: error
(device dm-2) in btrfs_run_delayed_refs:3009: errno=-5 IO failure
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.809785] BTRFS warning
(device dm-2): Skipping commit of aborted transaction.
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.810444] BTRFS: error
(device dm-2) in cleanup_transaction:1873: errno=-5 IO failure
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.814113] BTRFS error (device
dm-2): incorrect extent count for 29360128; counted 739, expected 676

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found]   ` <163588780885.2993099.2088131017920983969@swboyd.mtv.corp.google.com>
@ 2021-11-25 15:01     ` Hans de Goede
  0 siblings, 0 replies; 1546+ messages in thread
From: Hans de Goede @ 2021-11-25 15:01 UTC (permalink / raw)
  To: Stephen Boyd, Andy Shevchenko, Daniel Scally, Laurent Pinchart,
	Liam Girdwood, Mark Brown, Mark Gross, Mauro Carvalho Chehab,
	Michael Turquette, Mika Westerberg, Rafael J.Wysocki,
	Wolfram Sang
  Cc: Len Brown, linux-acpi, platform-driver-x86, linux-kernel,
	linux-i2c, Sakari Ailus, Kate Hsuan, linux-media, linux-clk

Hi,

On 11/2/21 22:16, Stephen Boyd wrote:
> Quoting Hans de Goede (2021-11-02 02:49:01)
>> diff --git a/drivers/clk/clk-tps68470.c b/drivers/clk/clk-tps68470.c
>> new file mode 100644
>> index 000000000000..2ad0ac2f4096
>> --- /dev/null
>> +++ b/drivers/clk/clk-tps68470.c
>> @@ -0,0 +1,257 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Clock driver for TPS68470 PMIC
>> + *
>> + * Copyright (c) 2021 Red Hat Inc.
>> + * Copyright (C) 2018 Intel Corporation
>> + *
>> + * Authors:
>> + *     Hans de Goede <hdegoede@redhat.com>
>> + *     Zaikuo Wang <zaikuo.wang@intel.com>
>> + *     Tianshu Qiu <tian.shu.qiu@intel.com>
>> + *     Jian Xu Zheng <jian.xu.zheng@intel.com>
>> + *     Yuning Pu <yuning.pu@intel.com>
>> + *     Antti Laakso <antti.laakso@intel.com>
>> + */
>> +
>> +#include <linux/clk-provider.h>
>> +#include <linux/clkdev.h>
>> +#include <linux/kernel.h>
>> +#include <linux/mfd/tps68470.h>
>> +#include <linux/module.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/platform_data/tps68470.h>
>> +#include <linux/regmap.h>
>> +
>> +#define TPS68470_CLK_NAME "tps68470-clk"
>> +
>> +#define to_tps68470_clkdata(clkd) \
>> +       container_of(clkd, struct tps68470_clkdata, clkout_hw)
>> +
> [...]
>> +
>> +static int tps68470_clk_set_rate(struct clk_hw *hw, unsigned long rate,
>> +                                unsigned long parent_rate)
>> +{
>> +       struct tps68470_clkdata *clkdata = to_tps68470_clkdata(hw);
>> +       unsigned int idx = tps68470_clk_cfg_lookup(rate);
>> +
>> +       if (rate != clk_freqs[idx].freq)
>> +               return -EINVAL;
>> +
>> +       clkdata->clk_cfg_idx = idx;
> 
> It deserves a comment that set_rate can only be called when the clk is
> gated. We have CLK_SET_RATE_GATE flag as well that should be set if the
> clk can't support changing rate while enabled. With that flag set, this
> function should be able to actually change hardware with the assumption
> that the framework won't call down into this clk_op when the clk is
> enabled.

Ok, for v6 I've added the CLK_SET_RATE_GATE flag plus a comment explaining why
it is used, and moved the divider programming to tps68470_clk_set_rate(),
while keeping the PLL_EN + output-enable writes in tps68470_clk_prepare()
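For reference, a minimal sketch of what the reworked set_rate might look like (the regmap call and register/field names below are illustrative assumptions, not the actual v6 code):

```c
static int tps68470_clk_set_rate(struct clk_hw *hw, unsigned long rate,
				 unsigned long parent_rate)
{
	struct tps68470_clkdata *clkdata = to_tps68470_clkdata(hw);
	unsigned int idx = tps68470_clk_cfg_lookup(rate);

	if (rate != clk_freqs[idx].freq)
		return -EINVAL;

	/*
	 * The clk is registered with CLK_SET_RATE_GATE, so the framework
	 * guarantees this is only called while the clk is gated; it is
	 * therefore safe to program the dividers here instead of in
	 * .prepare (register name below is a placeholder).
	 */
	regmap_write(clkdata->regmap, TPS68470_REG_BOOSTDIV,
		     clk_freqs[idx].boostdiv);
	clkdata->clk_cfg_idx = idx;

	return 0;
}
```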


> 
>> +
>> +       return 0;
>> +}
>> +
>> +static const struct clk_ops tps68470_clk_ops = {
>> +       .is_prepared = tps68470_clk_is_prepared,
>> +       .prepare = tps68470_clk_prepare,
>> +       .unprepare = tps68470_clk_unprepare,
>> +       .recalc_rate = tps68470_clk_recalc_rate,
>> +       .round_rate = tps68470_clk_round_rate,
>> +       .set_rate = tps68470_clk_set_rate,
>> +};
>> +
>> +static const struct clk_init_data tps68470_clk_initdata = {
> 
> Is there a reason to make this a static global? It's probably better to
> throw it on the stack so that a structure isn't sitting around after
> driver probe being unused.

Fixed for v6.

Thanks & Regards,

Hans


> 
>> +       .name = TPS68470_CLK_NAME,
>> +       .ops = &tps68470_clk_ops,
>> +};
> 


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <20211126221034.21331-1-lukasz.bartosik@semihalf.com--annotate>
@ 2021-11-29 21:59   ` sean.wang
  0 siblings, 0 replies; 1546+ messages in thread
From: sean.wang @ 2021-11-29 21:59 UTC (permalink / raw)
  To: lb
  Cc: marcel, johan.hedberg, luiz.dentz, upstream, linux-bluetooth,
	linux-mediatek, linux-kernel, Sean Wang

From: Sean Wang <sean.wang@mediatek.com>

>Enable msft opcode for btmtksdio driver.
>
>Signed-off-by: Łukasz Bartosik <lb@semihalf.com>
>---
> drivers/bluetooth/btmtksdio.c | 1 +
> 1 file changed, 1 insertion(+)
>
>diff --git a/drivers/bluetooth/btmtksdio.c b/drivers/bluetooth/btmtksdio.c index d9cf0c492e29..2a7a615663b9 100644
>--- a/drivers/bluetooth/btmtksdio.c
>+++ b/drivers/bluetooth/btmtksdio.c
>@@ -887,6 +887,7 @@ static int btmtksdio_setup(struct hci_dev *hdev)
>	if (enable_autosuspend)
>		pm_runtime_allow(bdev->dev);
>
>+	hci_set_msft_opcode(hdev, 0xFD30);

Hi Łukasz,

The msft feature is supposed to be supported only on mt7921. Could you help rework the patch to enable the msft opcode only for mt7921?

	Sean

>	bt_dev_info(hdev, "Device setup in %llu usecs", duration);
>
>	return 0;
>

^ permalink raw reply	[flat|nested] 1546+ messages in thread


* Re:
  2021-12-20  6:46 Ralf Beck
@ 2021-12-20  7:55 ` Greg KH
  2021-12-20 10:01 ` Re: Oliver Neukum
  1 sibling, 0 replies; 1546+ messages in thread
From: Greg KH @ 2021-12-20  7:55 UTC (permalink / raw)
  To: Ralf Beck; +Cc: linux-usb

On Mon, Dec 20, 2021 at 07:46:34AM +0100, Ralf Beck wrote:
> 
> Currently the usb core is disabling the use of an endpoint if the endpoint address is present in two different USB interface descriptors within the same USB configuration.
> This behaviour is obviously based on following passage in the USB specification:
> 
> "An endpoint is not shared among interfaces within a single configuration unless the endpoint is used by alternate settings of the same interface."
> 
> However, this behaviour prevents using some interfaces (in my case the Motu AVB audio devices) in their vendor specific mode.
> 
> They use a single USB configuration with two sets of interfaces, which use the same isochronous endpoint numbers.
> 
> One set with audio class specific interfaces for use by an audio class driver.
> The other set with vendor specific interfaces for use by the vendor driver.
> Obviously the class specific interfaces and vendor specific interfaces are not intended to be used by a driver simultaneously.
> 
> There must be another solution to deal with this. It is unacceptable to require users of these devices to disable the duplicate endpoint check and recompile the kernel on every update in order to use their devices in vendor mode.

The device sounds like it does not follow the USB specification, so how
does it work with any operating system?

What in-kernel driver binds to the device in vendor mode?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2021-12-20  6:46 Ralf Beck
  2021-12-20  7:55 ` Greg KH
@ 2021-12-20 10:01 ` Oliver Neukum
  1 sibling, 0 replies; 1546+ messages in thread
From: Oliver Neukum @ 2021-12-20 10:01 UTC (permalink / raw)
  To: Ralf Beck, linux-usb


On 20.12.21 07:46, Ralf Beck wrote:
> One set with audio class specific interfaces for use by an audio class driver.
> The other set with vendor specific interfaces for use by the vendor driver.
> Obviously the class specific interfaces and vendor specific interfaces are not intended to be used by a driver simultaneously.
Such devices are buggy. We usually define quirks for such devices.
> There must be another solution to deal with this. It is unacceptable to require users of these devices to disable the duplicate endpoint check and recompile the kernel on every update in order to use their devices in vendor mode.
I suggest you write a patch to introduce a quirk that disables one of the
interfaces and disregards disabled interfaces for purposes of the check.

    Regards
        Oliver

PS: Please use a subject line when you post.
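As a rough illustration of the quirk approach Oliver suggests (the flag name, bit position, and the VID:PID entry below are hypothetical placeholders, not an existing quirk):

```c
/*
 * Sketch only: a new usbcore quirk that tells the core to skip the
 * duplicate-endpoint check for interfaces of a blacklisted device.
 */
#define USB_QUIRK_ENDPOINT_IGNORE	BIT(15)	/* hypothetical flag */

/* drivers/usb/core/quirks.c */
static const struct usb_device_id usb_quirk_list[] = {
	/* ... existing entries ... */

	/* MOTU AVB series: class and vendor interfaces share endpoints
	 * (VID:PID below is a placeholder, not verified). */
	{ USB_DEVICE(0x07fd, 0x0005),
	  .driver_info = USB_QUIRK_ENDPOINT_IGNORE },

	{ }	/* terminating entry */
};
```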


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <20211229092443.GA10533@L-PF27918B-1352.localdomain>
@ 2022-01-05  6:05 ` Jason Wang
  2022-01-05  6:27   ` Re: Jason Wang
  0 siblings, 1 reply; 1546+ messages in thread
From: Jason Wang @ 2022-01-05  6:05 UTC (permalink / raw)
  To: Wu Zongyong; +Cc: virtualization

On Wed, Dec 29, 2021 at 5:31 PM Wu Zongyong
<wuzongyong@linux.alibaba.com> wrote:
>
> linux-kernel@vger.kernel.org
> Bcc:
> Subject: Should we call vdpa_config_ops->get_vq_num_{max,min} with a
>  virtqueue index?
> Reply-To: Wu Zongyong <wuzongyong@linux.alibaba.com>
>
> Hi Jason,
>
> AFAIK, a virtio device may have multiple virtqueues of different sizes.
> It is currently okay for modern devices to implement
> vdpa_config_ops->get_vq_num_max with a static number, since modern devices
> can reset the queue size. But for legacy-virtio based devices, we cannot
> allocate correct sizes for these virtqueues, since negotiating the queue
> size with the hardware is not supported.
>
> So as the title said, I wonder whether it is necessary to add a new parameter
> `index` to vdpa_config_ops->get_vq_num_{max,min} to help us get the size
> of a dedicated virtqueue.

I've posted something like this in the past here:

https://lore.kernel.org/lkml/CACycT3tMd750PQ0mgqCjHnxM4RmMcx2+Eo=2RBs2E2W3qPJang@mail.gmail.com/

>
> Or we can introduce a new callback like get_config_vq_num?
>
> What do you think?

If you wish, you can carry on my work. We can start by reusing the
current ops, if it doesn't work, we can use new.

Thanks
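For illustration, the per-virtqueue variant being discussed might look like this (the signatures are a sketch of the idea, not a merged API):

```c
/*
 * Sketch: add a virtqueue index to the size callbacks so legacy
 * devices can report a different ring size per queue.
 */
struct vdpa_config_ops {
	/* ... other callbacks ... */
	u16 (*get_vq_num_max)(struct vdpa_device *vdev, u16 idx);
	u16 (*get_vq_num_min)(struct vdpa_device *vdev, u16 idx);
	/* ... */
};
```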

>
> Thanks
>
>
>
>
>
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-01-05  6:05 ` Re: Jason Wang
@ 2022-01-05  6:27   ` Jason Wang
  0 siblings, 0 replies; 1546+ messages in thread
From: Jason Wang @ 2022-01-05  6:27 UTC (permalink / raw)
  To: Wu Zongyong; +Cc: virtualization

On Wed, Jan 5, 2022 at 2:05 PM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Dec 29, 2021 at 5:31 PM Wu Zongyong
> <wuzongyong@linux.alibaba.com> wrote:
> >
> > linux-kernel@vger.kernel.org
> > Bcc:
> > Subject: Should we call vdpa_config_ops->get_vq_num_{max,min} with a
> >  virtqueue index?
> > Reply-To: Wu Zongyong <wuzongyong@linux.alibaba.com>
> >
> > Hi Jason,
> >
> > AFAIK, a virtio device may have multiple virtqueues of different sizes.
> > It is currently okay for modern devices to implement
> > vdpa_config_ops->get_vq_num_max with a static number, since modern devices
> > can reset the queue size. But for legacy-virtio based devices, we cannot
> > allocate correct sizes for these virtqueues, since negotiating the queue
> > size with the hardware is not supported.
> >
> > So as the title said, I wonder whether it is necessary to add a new parameter
> > `index` to vdpa_config_ops->get_vq_num_{max,min} to help us get the size
> > of a dedicated virtqueue.
>
> I've posted something like this in the past here:
>
> https://lore.kernel.org/lkml/CACycT3tMd750PQ0mgqCjHnxM4RmMcx2+Eo=2RBs2E2W3qPJang@mail.gmail.com/
>
> >
> > Or we can introduce a new callback like get_config_vq_num?
> >
> > What do you think?
>
> If you wish, you can carry on my work. We can start by reusing the
> current ops, if it doesn't work, we can use new.

Just to clarify, I meant, we probably need to introduce a new uAPI on
top of the above version.

Thanks

>
> Thanks
>
> >
> > Thanks
> >
> >
> >
> >
> >
> >

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-01-13 17:53 Varun Sethi
@ 2022-01-14 17:17 ` Fabio Estevam
  0 siblings, 0 replies; 1546+ messages in thread
From: Fabio Estevam @ 2022-01-14 17:17 UTC (permalink / raw)
  To: Varun Sethi
  Cc: linux-crypto@vger.kernel.org, andrew.smirnov@gmail.com,
	Horia Geanta, Gaurav Jain, Pankaj Gupta

Hi Varun,

On Thu, Jan 13, 2022 at 2:53 PM Varun Sethi <V.Sethi@nxp.com> wrote:
>
> Hi Fabio, Andrey,
> So far we have observed this issue on i.MX6 only. Disabling prediction resistance isn't the solution for the problem. We are working on identifying the proper fix for this issue and would post the patch for the same.

Please copy me when you submit a fix for this issue.

Thanks!

Fabio Estevam

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-01-20 15:28 Myrtle Shah
@ 2022-01-20 15:37 ` Vitaly Wool
  2022-01-20 23:29   ` Re: Damien Le Moal
  2022-02-04 21:45   ` Re: Palmer Dabbelt
  0 siblings, 2 replies; 1546+ messages in thread
From: Vitaly Wool @ 2022-01-20 15:37 UTC (permalink / raw)
  To: Myrtle Shah; +Cc: linux-riscv, Paul Walmsley, Palmer Dabbelt, LKML

Hey,

On Thu, Jan 20, 2022 at 4:30 PM Myrtle Shah <gatecat@ds0.me> wrote:
>
> These are some initial patches to bugs I found attempting to
> get a XIP kernel working on hardware:
>  - 32-bit VexRiscv processor
>  - kernel in SPI flash, at 0x00200000
>  - 16MB of RAM at 0x10000000
>  - MMU enabled
>
> I still have some more debugging to do, but these at least
> get the kernel as far as initialising the MMU, and I would
> appreciate feedback if anyone else is working on RISC-V XIP.

I'll try to support you as much as I can, unfortunately I don't have
any 32-bit RISC-V around so I was rather thinking of extending the
RISC-V XIP support to 64-bit non-MMU targets.
For now just please keep in mind that there might be some inherent
assumptions that a target is 64 bit.

Best regards,
Vitaly

>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-01-20 15:37 ` Vitaly Wool
@ 2022-01-20 23:29   ` Damien Le Moal
  2022-02-04 21:45   ` Re: Palmer Dabbelt
  1 sibling, 0 replies; 1546+ messages in thread
From: Damien Le Moal @ 2022-01-20 23:29 UTC (permalink / raw)
  To: Vitaly Wool, Myrtle Shah; +Cc: linux-riscv, Paul Walmsley, Palmer Dabbelt, LKML

On 2022/01/21 0:37, Vitaly Wool wrote:
> Hey,
> 
> On Thu, Jan 20, 2022 at 4:30 PM Myrtle Shah <gatecat@ds0.me> wrote:
>>
>> These are some initial patches to bugs I found attempting to
>> get a XIP kernel working on hardware:
>>  - 32-bit VexRiscv processor
>>  - kernel in SPI flash, at 0x00200000
>>  - 16MB of RAM at 0x10000000
>>  - MMU enabled
>>
>> I still have some more debugging to do, but these at least
>> get the kernel as far as initialising the MMU, and I would
>> appreciate feedback if anyone else is working on RISC-V XIP.
> 
> I'll try to support you as much as I can, unfortunately I don't have
> any 32-bit RISC-V around so I was rather thinking of extending the
> RISC-V XIP support to 64-bit non-MMU targets.

That would be great ! I am completing the buildroot patches for the K210. Got
u-boot almost working for SD card boot too (fighting a problem with rootfs
kernel mount on boot when using u-boot though).

> For now just please keep in mind that there might be some inherent
> assumptions that a target is 64 bit.
> 
> Best regards,
> Vitaly
> 
>>
>> _______________________________________________
>> linux-riscv mailing list
>> linux-riscv@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-riscv
> 
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv


-- 
Damien Le Moal
Western Digital Research

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-01-24 12:43 Arınç ÜNAL
@ 2022-01-25 14:03 ` Sergio Paracuellos
  2022-01-25 15:24   ` Re: Arınç ÜNAL
  0 siblings, 1 reply; 1546+ messages in thread
From: Sergio Paracuellos @ 2022-01-25 14:03 UTC (permalink / raw)
  To: Arınç ÜNAL
  Cc: Greg KH, NeilBrown, DENG Qingfang, Andrew Lunn,
	Luiz Angelo Daros de Luca, linux-staging

Hi Arinc!

On Mon, Jan 24, 2022 at 1:45 PM Arınç ÜNAL <arinc.unal@arinc9.com> wrote:
>
> Hey everyone,
>
> In preparation for mainlining mt7621-dts: fix the formatting, the dtc warning on
> the switch0@0 node, and the pinctrl properties for the ethernet node in mt7621.dtsi.
> Move the GB-PC2 specific external phy configuration on the main dtsi to
> GB-PC2's devicetree, gbpc2.dts.
>
> Now that pinctrl properties are properly defined on the ethernet node,
> GMAC1 will start working.
>
> Traffic flow on GMAC1 was tested on a mt7621a board with these modes:
> External phy <-> GMAC1
> PHY 0/4 <-> GMAC1
>
> Cheers.
> Arınç

Nitpick: next time try to put also a subject like "staging:
mt7621-dts: cleanups (or whatever)" in the cover letter of the series.

>
> [0]: https://lore.kernel.org/netdev/83a35aa3-6cb8-2bc4-2ff4-64278bbcd8c8@arinc9.com/T/
>
> Arınç ÜNAL (4):
>       staging: mt7621-dts: fix formatting
>       staging: mt7621-dts: fix switch0@0 warnings
>       staging: mt7621-dts: use trgmii on gmac0 and enable flow control on port@6
>       staging: mt7621-dts: fix pinctrl properties for ethernet
>
>  drivers/staging/mt7621-dts/gbpc2.dts   | 16 +++++++++++-----
>  drivers/staging/mt7621-dts/mt7621.dtsi | 32 ++++++++++++++++----------------
>  2 files changed, 27 insertions(+), 21 deletions(-)
>
>

Thanks for doing this!

Best regards,
    Sergio Paracuellos

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-01-25 14:03 ` Sergio Paracuellos
@ 2022-01-25 15:24   ` Arınç ÜNAL
  2022-01-25 15:50     ` Re: Sergio Paracuellos
  0 siblings, 1 reply; 1546+ messages in thread
From: Arınç ÜNAL @ 2022-01-25 15:24 UTC (permalink / raw)
  To: Sergio Paracuellos
  Cc: Greg KH, NeilBrown, DENG Qingfang, Andrew Lunn,
	Luiz Angelo Daros de Luca, linux-staging

Hey Sergio,

On 25/01/2022 17:03, Sergio Paracuellos wrote:
> Hi Arinc!
> 
> On Mon, Jan 24, 2022 at 1:45 PM Arınç ÜNAL <arinc.unal@arinc9.com> wrote:
>>
>> Hey everyone,
>>
>> In preparation for mainlining mt7621-dts: fix the formatting, the dtc warning on
>> the switch0@0 node, and the pinctrl properties for the ethernet node in mt7621.dtsi.
>> Move the GB-PC2 specific external phy configuration on the main dtsi to
>> GB-PC2's devicetree, gbpc2.dts.
>>
>> Now that pinctrl properties are properly defined on the ethernet node,
>> GMAC1 will start working.
>>
>> Traffic flow on GMAC1 was tested on a mt7621a board with these modes:
>> External phy <-> GMAC1
>> PHY 0/4 <-> GMAC1
>>
>> Cheers.
>> Arınç
> 
> Nitpick: next time try to put also a subject like "staging:
> mt7621-dts: cleanups (or whatever)" in the cover letter of the series.

I had already sent v2 with that. I'll send v3 with your input on the 
series, thanks!

> 
>>
>> [0]: https://lore.kernel.org/netdev/83a35aa3-6cb8-2bc4-2ff4-64278bbcd8c8@arinc9.com/T/
>>
>> Arınç ÜNAL (4):
>>        staging: mt7621-dts: fix formatting
>>        staging: mt7621-dts: fix switch0@0 warnings
>>        staging: mt7621-dts: use trgmii on gmac0 and enable flow control on port@6
>>        staging: mt7621-dts: fix pinctrl properties for ethernet
>>
>>   drivers/staging/mt7621-dts/gbpc2.dts   | 16 +++++++++++-----
>>   drivers/staging/mt7621-dts/mt7621.dtsi | 32 ++++++++++++++++----------------
>>   2 files changed, 27 insertions(+), 21 deletions(-)
>>
>>
> 
> Thanks for doing this!
> 
> Best regards,
>      Sergio Paracuellos

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-01-25 15:24   ` Re: Arınç ÜNAL
@ 2022-01-25 15:50     ` Sergio Paracuellos
  0 siblings, 0 replies; 1546+ messages in thread
From: Sergio Paracuellos @ 2022-01-25 15:50 UTC (permalink / raw)
  To: Arınç ÜNAL
  Cc: Greg KH, NeilBrown, DENG Qingfang, Andrew Lunn,
	Luiz Angelo Daros de Luca, linux-staging

On Tue, Jan 25, 2022 at 4:24 PM Arınç ÜNAL <arinc.unal@arinc9.com> wrote:
>
> Hey Sergio,
>
> On 25/01/2022 17:03, Sergio Paracuellos wrote:
> > Hi Arinc!
> >
> > On Mon, Jan 24, 2022 at 1:45 PM Arınç ÜNAL <arinc.unal@arinc9.com> wrote:
> >>
> >> Hey everyone,
> >>
> >> In preparation for mainlining mt7621-dts: fix the formatting, the dtc warning on
> >> the switch0@0 node, and the pinctrl properties for the ethernet node in mt7621.dtsi.
> >> Move the GB-PC2 specific external phy configuration on the main dtsi to
> >> GB-PC2's devicetree, gbpc2.dts.
> >>
> >> Now that pinctrl properties are properly defined on the ethernet node,
> >> GMAC1 will start working.
> >>
> >> Traffic flow on GMAC1 was tested on a mt7621a board with these modes:
> >> External phy <-> GMAC1
> >> PHY 0/4 <-> GMAC1
> >>
> >> Cheers.
> >> Arınç
> >
> > Nitpick: next time try to put also a subject like "staging:
> > mt7621-dts: cleanups (or whatever)" in the cover letter of the series.
>
> I had already sent v2 with that. I'll send v3 with your input on the
> series, thanks!

True, sorry I missed that!

Thanks,
    Sergio Paracuellos
>
> >
> >>
> >> [0]: https://lore.kernel.org/netdev/83a35aa3-6cb8-2bc4-2ff4-64278bbcd8c8@arinc9.com/T/
> >>
> >> Arınç ÜNAL (4):
> >>        staging: mt7621-dts: fix formatting
> >>        staging: mt7621-dts: fix switch0@0 warnings
> >>        staging: mt7621-dts: use trgmii on gmac0 and enable flow control on port@6
> >>        staging: mt7621-dts: fix pinctrl properties for ethernet
> >>
> >>   drivers/staging/mt7621-dts/gbpc2.dts   | 16 +++++++++++-----
> >>   drivers/staging/mt7621-dts/mt7621.dtsi | 32 ++++++++++++++++----------------
> >>   2 files changed, 27 insertions(+), 21 deletions(-)
> >>
> >>
> >
> > Thanks for doing this!
> >
> > Best regards,
> >      Sergio Paracuellos

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-01-17 12:54 ` 转发: Caine Chen
@ 2022-02-03 11:49   ` Daniel Vacek
  0 siblings, 0 replies; 1546+ messages in thread
From: Daniel Vacek @ 2022-02-03 11:49 UTC (permalink / raw)
  To: Caine Chen; +Cc: linux-rt-users@vger.kernel.org

Hi Caine,

On Tue, Jan 18, 2022 at 4:44 AM Caine Chen <caine.chen@dji.com> wrote:
>
> Hi guys:
> We found that some IRQ threads can block in local_bh_disable() for a
> long time in some situations, and we hope to get your valuable suggestions.
> My kernel version is 5.4, and the irq delay is caused by the use of
> write_lock_bh().
> It can be described in the following figure:
> (1) Thread_1 which is a SCHED_NORMAL thread runs on CPU1,
>     and it uses read_lock_bh() to protect some data.
> (2) Thread_2 which is a SCHED_RR thread runs on CPU1 and it preempts thread_1
>     after thread_1 invoked read_lock_bh(). Thread_2 may run 60 ms in my system.
> (3) Thread_3 which is a SCHED_NORMAL thread runs on CPU0. This thread acquires
>     the writer's lock by invoking write_lock_bh(). This function first disables
>     the bottom half by invoking local_bh_disable(), but it will then block in
>     rt_write_lock(), because the read lock is held by thread_1.
> (4) At this time, if an irq thread without the IRQF_NO_THREAD flag on CPU0 tries to
>     acquire the bh_lock (it has been renamed to softirq_ctrl.lock now), the irq
>     thread will block because this lock is held by thread_3.
>
> ------------------------------------------------------------------------------------------------------------------------------------
> CPU1                                                                            CPU0
> -------------------------------------------------                    ---------------------------------------------------------------
> thread_2                       thread_1                           thread_3                               irq_thread
> --------------                  -----------                           -----------                            --------------
>                                  read_lock_bh()
>
> ......
>                                                                      write_lock_bh()
> /*do work*/                                                                                               /* irq thread block here*/
>                                                                                                               local_bh_disable()
> ......
>                                  read_unlock_bh()
>                                                                      ......
>                                                                      /* do work */
>                                                                      ......
>                                                                      write_unlock_bh()
>                                                                                                               irq_thread_fn()
> ----------------------------------------------------------------------------------------------------------------------------------
>
> In this case, if the SCHED_RR thread_2 preempts thread_1 and runs for too long, all
> irq threads on CPU0 will be blocked.
> It looks like a priority inversion problem caused by real-time thread preemption.

Not really. I guess there's one misunderstanding in your description.
Disabling the bottom half is local to the running thread, not to the
CPU which executes that thread. As an effect, preemption practically
enables the bottom half again (as long as the new thread did not
already have it disabled before, of course...).

That said, the irq_thread will _not_ be blocked, as the bottom half is not
disabled in its context. From your chart, it is disabled only in
thread_3's context and thread_1's context. But these two are independent
(due to the different thread contexts, not the different CPU
contexts as you assumed) and they do not block each other either;
it's the rw_lock serializing these threads, right?

You should be able to see this with tracing. There should be no issue,
or the issue is different from what you think it is and from what you
described here.

Hopefully the above helps you,
Daniel

> How can I avoid this problem?  I have a few thoughts:
> (1) The key point, I think, is that write_lock_bh()/read_lock_bh() will disable
>     the bottom half, which will disable some irq threads too. Could I use
>     write_lock_irq()/read_lock_irq() instead?
> (2) If my irq handler wants to get better performance, I should request a
>     threaded handler for the IRQ as Sebastian suggested in LKML
>     <RE: irq thread latency caused by softirq_ctrl.lock contention>.
>     Is a threaded handler designed for low irq delay?
> (3) Thread_2 runs for too long, so it is not suitable to give this thread a
>     high rt-priority. Should I reduce this thread's priority to
>     solve this problem?
>
> Are there better ways to avoid this problem? We hope to get your valuable
> suggestions. Thanks!
>
> Best regards,
> Caine.chen
> This email and any attachments thereto may contain private, confidential, and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by others is strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.
>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-01-20 15:37 ` Vitaly Wool
  2022-01-20 23:29   ` Re: Damien Le Moal
@ 2022-02-04 21:45   ` Palmer Dabbelt
  1 sibling, 0 replies; 1546+ messages in thread
From: Palmer Dabbelt @ 2022-02-04 21:45 UTC (permalink / raw)
  To: vitaly.wool; +Cc: gatecat, linux-riscv, Paul Walmsley, linux-kernel

On Thu, 20 Jan 2022 07:37:00 PST (-0800), vitaly.wool@konsulko.com wrote:
> Hey,
>
> On Thu, Jan 20, 2022 at 4:30 PM Myrtle Shah <gatecat@ds0.me> wrote:
>>
>> These are some initial patches to bugs I found attempting to
>> get a XIP kernel working on hardware:
>>  - 32-bit VexRiscv processor
>>  - kernel in SPI flash, at 0x00200000
>>  - 16MB of RAM at 0x10000000
>>  - MMU enabled
>>
>> I still have some more debugging to do, but these at least
>> get the kernel as far as initialising the MMU, and I would
>> appreciate feedback if anyone else is working on RISC-V XIP.
>
> I'll try to support you as much as I can, unfortunately I don't have
> any 32-bit RISC-V around so I was rather thinking of extending the
> RISC-V XIP support to 64-bit non-MMU targets.
> For now just please keep in mind that there might be some inherent
> assumptions that a target is 64 bit.

I don't test any of the XIP configs, but if you guys have something that's sane
to run in QEMU I'm happy to do so.  Given that there are now some folks finding
boot bugs, it's probably worth getting what does boot into a regression test so
it's less likely to break moving forwards.

These are on fixes, with the second one split up so it's got a better chance of
landing in the stable trees.

Thanks!

>
> Best regards,
> Vitaly
>
>>
>> _______________________________________________
>> linux-riscv mailing list
>> linux-riscv@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-riscv

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-02-10 15:00 ` Ferruh Yigit
@ 2022-02-10 16:08   ` Gaëtan Rivet
  0 siblings, 0 replies; 1546+ messages in thread
From: Gaëtan Rivet @ 2022-02-10 16:08 UTC (permalink / raw)
  To: Ferruh Yigit, madhuker.mythri; +Cc: dev

On Thu, Feb 10, 2022, at 16:00, Ferruh Yigit wrote:
> On 2/10/2022 7:10 AM, madhuker.mythri@oracle.com wrote:
>> From: Madhuker Mythri <madhuker.mythri@oracle.com>
>> 
>> The failsafe PMD started crashing with the global devargs syntax, as devargs is
>> not memset to zero. Accessing it in rte_devargs_parse() resulted in a
>> crash when called from a secondary process.
>> 
>> Bugzilla Id: 933
>> 
>> Signed-off-by: Madhuker Mythri <madhuker.mythri@oracle.com>
>> ---
>>   drivers/net/failsafe/failsafe.c | 1 +
>>   1 file changed, 1 insertion(+)
>> 
>> diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
>> index 3c754a5f66..aa93cc6000 100644
>> --- a/drivers/net/failsafe/failsafe.c
>> +++ b/drivers/net/failsafe/failsafe.c
>> @@ -360,6 +360,7 @@ rte_pmd_failsafe_probe(struct rte_vdev_device *vdev)
>>   			if (sdev->devargs.name[0] == '\0')
>>   				continue;
>>   
>> +			memset(&devargs, 0, sizeof(devargs));
>>   			/* rebuild devargs to be able to get the bus name. */
>>   			ret = rte_devargs_parse(&devargs,
>>   						sdev->devargs.name);
>
> if 'rte_devargs_parse()' requires the 'devargs' parameter to be memset,
> what do you think about doing the memset in the API itself?
> That would prevent forgotten cases like this one.

Hi,

I was looking at it this morning.
Before the last release, rte_devargs_parse() only supported the legacy syntax.
It never read from the devargs structure, only wrote to it, so it was safe to
use with a non-memset devargs.

rte_devargs_layer_parse(), however, is more complex. To allow
rte_dev_iterator_init() to call it without doing memory allocation, it reads
parts of the devargs to make decisions.

Doing a first call to rte_devargs_layer_parse() as part of rte_devargs_parse()
thus modified the contract it had with users: that it would never read from
devargs.

It is not possible to completely avoid reading from devargs in
rte_devargs_layer_parse(). It is necessary for RTE_DEV_FOREACH() to be safe to
interrupt without having to do iterator cleanup.

This is my current understanding. In that context, yes, I think it is
preferable to do the memset() within rte_devargs_parse(). It will restore the
previous part of the API contract: that calling it with a non-memset devargs
is safe.

Thanks,
-- 
Gaetan Rivet

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re: Re:
@ 2022-02-11 15:06 Caine Chen
  0 siblings, 0 replies; 1546+ messages in thread
From: Caine Chen @ 2022-02-11 15:06 UTC (permalink / raw)
  To: neelx.g@gmail.com; +Cc: linux-rt-users@vger.kernel.org

Hi Daniel:
Thanks for your reply.

> Not really. I guess there's one misunderstanding in your description.
> Disabling the bottom half is local to running thread and not to the
> CPU which executes that thread. As an effect, preemption practically
> enables the bottom half again (as long as the new thread did not have
> it already disabled before, of course...).

It's a bit confusing to me why disabling the bottom half is local to the thread
and not to the CPU. From my humble perspective, every forced-threaded
irq_thread will invoke local_bh_disable() and try to take bh_lock before it
enters the irq handler. If bh_lock (now softirq_ctrl.lock) is held by another
thread, all forced-threaded irq_threads on this CPU will wait until the lock
is released. So how does preemption enable the bottom half again?

To test this, I did an experiment in v5.4 kernel.
First, I created a kthread and bound it to CPU0:

int test_init( )
{
        ......
        p = kthread_create(my_debug_func, NULL, "my_test");
        kthread_bind(p, 0);
        wake_up_process(p);
        ......
}

This kthread will invoke local_bh_disable()/local_bh_enable() periodically:

int my_debug_func(void *arg)
{
        ......
        while(!kthread_should_stop()) {
                ......
                local_bh_disable();
                /* just do some busy work, such as memcpy, kmalloc and so on */
                do_some_work();
                local_bh_enable();
        }
        ......
}

What's more, I added some logs in some forced-threaded irq handlers to find
out when they were executed. After the "my_test" thread disabled local bh,
there were no forced-threaded irq threads running on CPU0. But after the
"my_test" thread enabled local bh, the forced-threaded irqs came again.

It seems that disabling the bottom half is local to the CPU.

> That said, the irq_thread will _not_ be blocked as bottom half is not
> disabled in it's context. From your chart, it's disabled only in
> thread_3 context and thread_1 context. But these two are independent
> (due to the different thread contexts and not the different CPU
> contexts as you misassumed) and they do not block each other either,
> it's the rw_lock serializing these threads, right?

> You should be able to see this with tracing. There should be no issue
> or the issue is different than you think it is and different than you
> described here.

> Hopefully the above helps you,
> Daniel

Thanks
Caine
This email and any attachments thereto may contain private, confidential, and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by others is strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-02-13 22:40 Ronnie Sahlberg
@ 2022-02-14  7:52 ` ronnie sahlberg
  0 siblings, 0 replies; 1546+ messages in thread
From: ronnie sahlberg @ 2022-02-14  7:52 UTC (permalink / raw)
  To: Ronnie Sahlberg; +Cc: linux-cifs, Steve French

Steve,

I have added a test to the buildbot to verify the fix,
i.e. that there are two ACEs when a file is created, one for the mode and one
for AuthenticatedUsers, and that after chmod we still have two ACEs but the
one for the mode has been updated.

The test is cifs/107,
and it also shows how we can now modify the mount options we need on a
test-by-test basis by using -o remount, ...     wooohooo new mount api :-)



On Mon, Feb 14, 2022 at 9:47 AM Ronnie Sahlberg <lsahlber@redhat.com> wrote:
>
> Steve, List,
>
> Here is a small patch that fixes an issue with modefromsid where
> it would strip off and remove all the ACEs that grants us access to the file.
> It fixes this by restoring the "allow AuthenticatedUsers access" ACE that is stripped in
>
> set_chmod_dacl():
>                 /* If it's any one of the ACE we're replacing, skip! */
>                 if (((compare_sids(&pntace->sid, &sid_unix_NFS_mode) == 0) ||
>                                 (compare_sids(&pntace->sid, pownersid) == 0) ||
>                                 (compare_sids(&pntace->sid, pgrpsid) == 0) ||
>                                 (compare_sids(&pntace->sid, &sid_everyone) == 0) ||
>                                 (compare_sids(&pntace->sid, &sid_authusers) == 0))) {
>                         goto next_ace;
>                 }
>
> This part is confusing, since for many of these cases we are NOT replacing
> all these ACEs but only some of them, yet the code unconditionally removes
> all of them, contrary to what the comment suggests.
>
> I think some of my confusion here is that afaik we don't have good documentation
> of how modefromsid, and idsfromsid, are supposed to work, what the
> restrictions are or the expected semantics.
> We need to document both modefromsid and idsfromsid and what the expected
> semantics are for when either of them or both of them are enabled.
>
>
>
>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
@ 2022-03-04  8:47 Harald Hauge
  0 siblings, 0 replies; 1546+ messages in thread
From: Harald Hauge @ 2022-03-04  8:47 UTC (permalink / raw)
  To: bpf

Hello,
I'm Harald Hauge, an Investment Manager from Norway.
I will need your assistance in executing this business from my country
in yours.

This is a short-term investment with good returns. Kindly
reply to confirm the validity of your email so I can give you comprehensive
details about the project.

Best Regards,
Harald Hauge
Business Consultant

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-06 11:10 ` Jaydeep P Das
@ 2022-03-06 11:22   ` Jaydeep Das
  0 siblings, 0 replies; 1546+ messages in thread
From: Jaydeep Das @ 2022-03-06 11:22 UTC (permalink / raw)
  To: git

Please ignore this patch. I think I made a mistake
when copy-pasting the In-reply-to code.

Sorry for the trouble. I have sent this same patch to
the appropriate thread.

Thanks,
Jaydeep.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <Yj1hkpyUqJE9sQ2p@redhat.com>
@ 2022-03-25  7:52 ` Jason Wang
  2022-03-25  9:10   ` Re: Michael S. Tsirkin
  0 siblings, 1 reply; 1546+ messages in thread
From: Jason Wang @ 2022-03-25  7:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> Bcc:
> Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> Reply-To:
> In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
>
> On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> >
> > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > This is a rework on the previous IRQ hardening that is done for
> > > > virtio-pci where several drawbacks were found and were reverted:
> > > >
> > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > >     that is used by some device such as virtio-blk
> > > > 2) done only for PCI transport
> > > >
> > > > In this patch, we try to borrow the idea from the INTX IRQ hardening
> > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > by introducing a global irq_soft_enabled variable for each
> > > > virtio_device. Then we can toggle it during
> > > > virtio_reset_device()/virtio_device_ready(). A synchronize_rcu() is
> > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > the future, we may provide config_ops for the transport that doesn't
> > > > use IRQ. With this, vring_interrupt() can check and return early if
> > > > irq_soft_enabled is false. This leads to smp_load_acquire() being used,
> > > > but the cost should be acceptable.
> > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> >
> >
> > Even if we allow the transport driver to synchornize through
> > synchronize_irq() we still need a check in the vring_interrupt().
> >
> > We do something like the following previously:
> >
> >         if (!READ_ONCE(vp_dev->intx_soft_enabled))
> >                 return IRQ_NONE;
> >
> > But it looks like a bug, since a speculative read can be done before the
> > check, in which case the interrupt handler can't see the uncommitted setup
> > done by the driver.
>
> I don't think so - if you sync after setting the value then
> you are guaranteed that any handler running afterwards
> will see the new value.

The problem is not the disable but the enable path. We use smp_store_release()
to make sure the driver commits the setup before enabling the irq. It means
the read needs to be ordered as well in vring_interrupt().

>
> Although I couldn't find anything about this in memory-barriers.txt
> which surprises me.
>
> CC Paul to help make sure I'm right.
>
>
> >
> > >
> > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > module parameter is introduced to enable the hardening so function
> > > > hardening is disabled by default.
> > > Which devices are these? How come they send an interrupt before there
> > > are any buffers in any queues?
> >
> >
> > I copied this from the commit log for 22b7050a024d7
> >
> > "
> >
> >     This change will also benefit old hypervisors (before 2009)
> >     that send interrupts without checking DRIVER_OK: previously,
> >     the callback could race with driver-specific initialization.
> > "
> >
> > If this is only for config interrupt, I can remove the above log.
>
>
> This is only for config interrupt.

Ok.

>
> >
> > >
> > > > Note that the hardening is only done for vring interrupt since the
> > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > ("virtio: defer config changed notifications"). But the method that is
> > > > used by config interrupt can't be reused by the vring interrupt
> > > > handler because it uses spinlock to do the synchronization which is
> > > > expensive.
> > > >
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > >
> > > > ---
> > > >   drivers/virtio/virtio.c       | 19 +++++++++++++++++++
> > > >   drivers/virtio/virtio_ring.c  |  9 ++++++++-
> > > >   include/linux/virtio.h        |  4 ++++
> > > >   include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > >   4 files changed, 56 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > --- a/drivers/virtio/virtio.c
> > > > +++ b/drivers/virtio/virtio.c
> > > > @@ -7,6 +7,12 @@
> > > >   #include <linux/of.h>
> > > >   #include <uapi/linux/virtio_ids.h>
> > > > +static bool irq_hardening = false;
> > > > +
> > > > +module_param(irq_hardening, bool, 0444);
> > > > +MODULE_PARM_DESC(irq_hardening,
> > > > +          "Disable IRQ software processing when it is not expected");
> > > > +
> > > >   /* Unique numbering for virtio devices. */
> > > >   static DEFINE_IDA(virtio_index_ida);
> > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > >    * */
> > > >   void virtio_reset_device(struct virtio_device *dev)
> > > >   {
> > > > + /*
> > > > +  * The below synchronize_rcu() guarantees that any
> > > > +  * interrupt for this line arriving after
> > > > +  * synchronize_rcu() has completed is guaranteed to see
> > > > +  * irq_soft_enabled == false.
> > > News to me I did not know synchronize_rcu has anything to do
> > > with interrupts. Did not you intend to use synchronize_irq?
> > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > though it's most likely is ...
> >
> >
> > According to the comment above tree RCU version of synchronize_rcu():
> >
> > """
> >
> >  * RCU read-side critical sections are delimited by rcu_read_lock()
> >  * and rcu_read_unlock(), and may be nested.  In addition, but only in
> >  * v5.0 and later, regions of code across which interrupts, preemption,
> >  * or softirqs have been disabled also serve as RCU read-side critical
> >  * sections.  This includes hardware interrupt handlers, softirq handlers,
> >  * and NMI handlers.
> > """
> >
> > So interrupt handlers are treated as read-side critical sections.
> >
> > And it has the comment for explain the barrier:
> >
> > """
> >
> >  * Note that this guarantee implies further memory-ordering guarantees.
> >  * On systems with more than one CPU, when synchronize_rcu() returns,
> >  * each CPU is guaranteed to have executed a full memory barrier since
> >  * the end of its last RCU read-side critical section whose beginning
> >  * preceded the call to synchronize_rcu().  In addition, each CPU having
> > """
> >
> > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > barrier, if the interrupt come after WRITE_ONCE() it will see the
> > irq_soft_enabled as false.
> >
>
> You are right. So then
> 1. I do not think we need load_acquire - why is it needed? Just
>    READ_ONCE should do.

See above.

> 2. isn't synchronize_irq also doing the same thing?


Yes, but it requires a config op, since the IRQ knowledge is transport
specific.

>
>
> > >
> > > > +  */
> > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > + synchronize_rcu();
> > > > +
> > > >           dev->config->reset(dev);
> > > >   }
> > > >   EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > Please add comment explaining where it will be enabled.
> > > Also, we *really* don't need to synch if it was already disabled,
> > > let's not add useless overhead to the boot sequence.
> >
> >
> > Ok.
> >
> >
> > >
> > >
> > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > >           spin_lock_init(&dev->config_lock);
> > > >           dev->config_enabled = false;
> > > >           dev->config_change_pending = false;
> > > > + dev->irq_soft_check = irq_hardening;
> > > > +
> > > > + if (dev->irq_soft_check)
> > > > +         dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > >           /* We always start by resetting the device, in case a previous
> > > >            * driver messed it up.  This also tests that code path a little. */
> > > one of the points of hardening is it's also helpful for buggy
> > > devices. this flag defeats the purpose.
> >
> >
> > Do you mean:
> >
> > 1) we need something like config_enable? This seems not easy to be
> > implemented without obvious overhead, mainly the synchronize with the
> > interrupt handlers
>
> But synchronize is only on tear-down path. That is not critical for any
> users at the moment, even less than probe.

I meant that if we have vq->irq_pending, we need to call vring_interrupt()
in virtio_device_ready() and synchronize with the IRQ handlers using a
spinlock or similar.

>
> > 2) enable this by default, so I don't object, but this may have some risk
> > for old hypervisors
>
>
> The risk if there's a driver adding buffers without setting DRIVER_OK.

Probably not; we have devices that accept random inputs from outside (net,
console, input, etc.). I've done a round of audits of the QEMU code, and it
all looks fine since day 0.

> So with this approach, how about we rename the flag "driver_ok"?
> And then add_buf can actually test it and BUG_ON if not there  (at least
> in the debug build).

This looks like a hardening of the driver in the core instead of the
device. I think it can be done but in a separate series.

>
> And going down from there, how about we cache status in the
> device? Then we don't need to keep re-reading it every time,
> speeding boot up a tiny bit.

I don't fully understand here; actually the spec requires the status to be
read back for validation in many cases.

Thanks

>
> >
> > >
> > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > --- a/drivers/virtio/virtio_ring.c
> > > > +++ b/drivers/virtio/virtio_ring.c
> > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > >           return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > >   }
> > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > >   {
> > > > + struct virtqueue *_vq = v;
> > > > + struct virtio_device *vdev = _vq->vdev;
> > > >           struct vring_virtqueue *vq = to_vvq(_vq);
> > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > +         dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > +         return IRQ_NONE;
> > > > + }
> > > > +
> > > >           if (!more_used(vq)) {
> > > >                   pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > >                   return IRQ_NONE;
> > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > index 5464f398912a..957d6ad604ac 100644
> > > > --- a/include/linux/virtio.h
> > > > +++ b/include/linux/virtio.h
> > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > >    * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > >    * @config_enabled: configuration change reporting enabled
> > > >    * @config_change_pending: configuration change reported while disabled
> > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > + * @irq_soft_enabled: callbacks enabled
> > > >    * @config_lock: protects configuration change reporting
> > > >    * @dev: underlying device.
> > > >    * @id: the device type identification (used to match it with a driver).
> > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > >           bool failed;
> > > >           bool config_enabled;
> > > >           bool config_change_pending;
> > > > + bool irq_soft_check;
> > > > + bool irq_soft_enabled;
> > > >           spinlock_t config_lock;
> > > >           spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > >           struct device dev;
> > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > --- a/include/linux/virtio_config.h
> > > > +++ b/include/linux/virtio_config.h
> > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > >           return __virtio_test_bit(vdev, fbit);
> > > >   }
> > > > +/*
> > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > + * @vdev: the device
> > > > + */
> > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > +{
> > > > + if (!vdev->irq_soft_check)
> > > > +         return true;
> > > > +
> > > > + /*
> > > > +  * Read irq_soft_enabled before reading other device specific
> > > > +  * data. Paried with smp_store_relase() in
> > > paired
> >
> >
> > Will fix.
> >
> > Thanks
> >
> >
> > >
> > > > +  * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > +  * virtio_reset_device().
> > > > +  */
> > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > +}
> > > > +
> > > >   /**
> > > >    * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > >    * @vdev: the device
> > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > >           if (dev->config->enable_cbs)
> > > >                     dev->config->enable_cbs(dev);
> > > > + /*
> > > > +  * Commit the driver setup before enabling the virtqueue
> > > > +  * callbacks. Paried with smp_load_acuqire() in
> > > > +  * virtio_irq_soft_enabled()
> > > > +  */
> > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > +
> > > >           BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > >           dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > >   }
> > > > --
> > > > 2.25.1
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-25  7:52 ` Re: Jason Wang
@ 2022-03-25  9:10   ` Michael S. Tsirkin
  2022-03-25  9:20     ` Re: Jason Wang
  0 siblings, 1 reply; 1546+ messages in thread
From: Michael S. Tsirkin @ 2022-03-25  9:10 UTC (permalink / raw)
  To: Jason Wang
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > Bcc:
> > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> > Reply-To:
> > In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
> >
> > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > >
> > > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > This is a rework on the previous IRQ hardening that is done for
> > > > > virtio-pci where several drawbacks were found and were reverted:
> > > > >
> > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > >     that is used by some device such as virtio-blk
> > > > > 2) done only for PCI transport
> > > > >
> > > > > In this patch, we try to borrow the idea from the INTX IRQ hardening
> > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > > by introducing a global irq_soft_enabled variable for each
> > > > > virtio_device. Then we can toggle it during
> > > > > virtio_reset_device()/virtio_device_ready(). A synchronize_rcu() is
> > > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > > the future, we may provide config_ops for the transport that doesn't
> > > > > use IRQ. With this, vring_interrupt() can check and return early if
> > > > > irq_soft_enabled is false. This leads to smp_load_acquire() being used,
> > > > > but the cost should be acceptable.
> > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > >
> > >
> > > Even if we allow the transport driver to synchornize through
> > > synchronize_irq() we still need a check in the vring_interrupt().
> > >
> > > We do something like the following previously:
> > >
> > >         if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > >                 return IRQ_NONE;
> > >
> > > But it looks like a bug, since a speculative read can be done before the
> > > check, in which case the interrupt handler can't see the uncommitted setup
> > > done by the driver.
> >
> > I don't think so - if you sync after setting the value then
> > you are guaranteed that any handler running afterwards
> > will see the new value.
> 
> The problem is not the disable but the enable path.

So a misbehaving device can lose interrupts? That's not a problem at all
imo.

> We use smp_store_release()
> to make sure the driver commits the setup before enabling the irq. It
> means the read needs to be ordered as well in vring_interrupt().
> 
> >
> > Although I couldn't find anything about this in memory-barriers.txt
> > which surprises me.
> >
> > CC Paul to help make sure I'm right.
> >
> >
> > >
> > > >
> > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > module parameter is introduced to enable the hardening so function
> > > > > hardening is disabled by default.
> > > > Which devices are these? How come they send an interrupt before there
> > > > are any buffers in any queues?
> > >
> > >
> > > I copied this from the commit log for 22b7050a024d7
> > >
> > > "
> > >
> > >     This change will also benefit old hypervisors (before 2009)
> > >     that send interrupts without checking DRIVER_OK: previously,
> > >     the callback could race with driver-specific initialization.
> > > "
> > >
> > > If this is only for config interrupt, I can remove the above log.
> >
> >
> > This is only for config interrupt.
> 
> Ok.
> 
> >
> > >
> > > >
> > > > > Note that the hardening is only done for vring interrupt since the
> > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > handler because it uses spinlock to do the synchronization which is
> > > > > expensive.
> > > > >
> > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > >
> > > > > ---
> > > > >   drivers/virtio/virtio.c       | 19 +++++++++++++++++++
> > > > >   drivers/virtio/virtio_ring.c  |  9 ++++++++-
> > > > >   include/linux/virtio.h        |  4 ++++
> > > > >   include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > >   4 files changed, 56 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > --- a/drivers/virtio/virtio.c
> > > > > +++ b/drivers/virtio/virtio.c
> > > > > @@ -7,6 +7,12 @@
> > > > >   #include <linux/of.h>
> > > > >   #include <uapi/linux/virtio_ids.h>
> > > > > +static bool irq_hardening = false;
> > > > > +
> > > > > +module_param(irq_hardening, bool, 0444);
> > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > +          "Disable IRQ software processing when it is not expected");
> > > > > +
> > > > >   /* Unique numbering for virtio devices. */
> > > > >   static DEFINE_IDA(virtio_index_ida);
> > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > >    * */
> > > > >   void virtio_reset_device(struct virtio_device *dev)
> > > > >   {
> > > > > + /*
> > > > > +  * The below synchronize_rcu() guarantees that any
> > > > > +  * interrupt for this line arriving after
> > > > > +  * synchronize_rcu() has completed is guaranteed to see
> > > > > +  * irq_soft_enabled == false.
> > > > News to me I did not know synchronize_rcu has anything to do
> > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > though it's most likely is ...
> > >
> > >
> > > According to the comment above tree RCU version of synchronize_rcu():
> > >
> > > """
> > >
> > >  * RCU read-side critical sections are delimited by rcu_read_lock()
> > >  * and rcu_read_unlock(), and may be nested.  In addition, but only in
> > >  * v5.0 and later, regions of code across which interrupts, preemption,
> > >  * or softirqs have been disabled also serve as RCU read-side critical
> > >  * sections.  This includes hardware interrupt handlers, softirq handlers,
> > >  * and NMI handlers.
> > > """
> > >
> > > So interrupt handlers are treated as read-side critical sections.
> > >
> > > And it has the comment for explain the barrier:
> > >
> > > """
> > >
> > >  * Note that this guarantee implies further memory-ordering guarantees.
> > >  * On systems with more than one CPU, when synchronize_rcu() returns,
> > >  * each CPU is guaranteed to have executed a full memory barrier since
> > >  * the end of its last RCU read-side critical section whose beginning
> > >  * preceded the call to synchronize_rcu().  In addition, each CPU having
> > > """
> > >
> > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > barrier, if the interrupt come after WRITE_ONCE() it will see the
> > > irq_soft_enabled as false.
> > >
> >
> > You are right. So then
> > 1. I do not think we need load_acquire - why is it needed? Just
> >    READ_ONCE should do.
> 
> See above.
> 
> > 2. isn't synchronize_irq also doing the same thing?
> 
> 
> Yes, but it requires a config op, since the IRQ knowledge is transport
> specific.
> 
> >
> >
> > > >
> > > > > +  */
> > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > + synchronize_rcu();
> > > > > +
> > > > >           dev->config->reset(dev);
> > > > >   }
> > > > >   EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > Please add comment explaining where it will be enabled.
> > > > Also, we *really* don't need to synch if it was already disabled,
> > > > let's not add useless overhead to the boot sequence.
> > >
> > >
> > > Ok.
> > >
> > >
> > > >
> > > >
> > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > >           spin_lock_init(&dev->config_lock);
> > > > >           dev->config_enabled = false;
> > > > >           dev->config_change_pending = false;
> > > > > + dev->irq_soft_check = irq_hardening;
> > > > > +
> > > > > + if (dev->irq_soft_check)
> > > > > +         dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > >           /* We always start by resetting the device, in case a previous
> > > > >            * driver messed it up.  This also tests that code path a little. */
> > > > one of the points of hardening is it's also helpful for buggy
> > > > devices. this flag defeats the purpose.
> > >
> > >
> > > Do you mean:
> > >
> > > 1) we need something like config_enable? This seems not easy to be
> > > implemented without obvious overhead, mainly the synchronize with the
> > > interrupt handlers
> >
> > But synchronize is only on tear-down path. That is not critical for any
> > users at the moment, even less than probe.
> 
> I meant that if we have vq->irq_pending, we need to call vring_interrupt()
> in virtio_device_ready() and synchronize with the IRQ handlers using a
> spinlock or similar.
> 
> >
> > > 2) enable this by default, so I don't object, but this may have some risk
> > > for old hypervisors
> >
> >
> > The risk if there's a driver adding buffers without setting DRIVER_OK.
> 
> Probably not, we have devices that accept random inputs from outside,
> net, console, input etc. I've done a round of audits of the Qemu
> codes. They look all fine since day0.
> 
> > So with this approach, how about we rename the flag "driver_ok"?
> > And then add_buf can actually test it and BUG_ON if not there  (at least
> > in the debug build).
> 
> This looks like a hardening of the driver in the core instead of the
> device. I think it can be done but in a separate series.
> 
> >
> > And going down from there, how about we cache status in the
> > device? Then we don't need to keep re-reading it every time,
> > speeding boot up a tiny bit.
> 
> I don't fully understand here, actually spec requires status to be
> read back for validation in many cases.
> 
> Thanks
> 
> >
> > >
> > > >
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > >           return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > >   }
> > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > >   {
> > > > > + struct virtqueue *_vq = v;
> > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > >           struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > +         dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > +         return IRQ_NONE;
> > > > > + }
> > > > > +
> > > > >           if (!more_used(vq)) {
> > > > >                   pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > >                   return IRQ_NONE;
> > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > --- a/include/linux/virtio.h
> > > > > +++ b/include/linux/virtio.h
> > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > >    * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > >    * @config_enabled: configuration change reporting enabled
> > > > >    * @config_change_pending: configuration change reported while disabled
> > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > + * @irq_soft_enabled: callbacks enabled
> > > > >    * @config_lock: protects configuration change reporting
> > > > >    * @dev: underlying device.
> > > > >    * @id: the device type identification (used to match it with a driver).
> > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > >           bool failed;
> > > > >           bool config_enabled;
> > > > >           bool config_change_pending;
> > > > > + bool irq_soft_check;
> > > > > + bool irq_soft_enabled;
> > > > >           spinlock_t config_lock;
> > > > >           spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > >           struct device dev;
> > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > --- a/include/linux/virtio_config.h
> > > > > +++ b/include/linux/virtio_config.h
> > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > >           return __virtio_test_bit(vdev, fbit);
> > > > >   }
> > > > > +/*
> > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > + * @vdev: the device
> > > > > + */
> > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > +{
> > > > > + if (!vdev->irq_soft_check)
> > > > > +         return true;
> > > > > +
> > > > > + /*
> > > > > +  * Read irq_soft_enabled before reading other device specific
> > > > > +  * data. Paried with smp_store_relase() in
> > > > paired
> > >
> > >
> > > Will fix.
> > >
> > > Thanks
> > >
> > >
> > > >
> > > > > +  * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > +  * virtio_reset_device().
> > > > > +  */
> > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > +}
> > > > > +
> > > > >   /**
> > > > >    * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > >    * @vdev: the device
> > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > >           if (dev->config->enable_cbs)
> > > > >                     dev->config->enable_cbs(dev);
> > > > > + /*
> > > > > +  * Commit the driver setup before enabling the virtqueue
> > > > > +  * callbacks. Paried with smp_load_acuqire() in
> > > > > +  * virtio_irq_soft_enabled()
> > > > > +  */
> > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > +
> > > > >           BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > >           dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > >   }
> > > > > --
> > > > > 2.25.1
> >

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-25  9:10   ` Re: Michael S. Tsirkin
@ 2022-03-25  9:20     ` Jason Wang
  2022-03-25 10:09       ` Re: Michael S. Tsirkin
  0 siblings, 1 reply; 1546+ messages in thread
From: Jason Wang @ 2022-03-25  9:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > Bcc:
> > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> > > Reply-To:
> > > In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
> > >
> > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > >
> > > > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > This is a rework on the previous IRQ hardening that is done for
> > > > > > virtio-pci where several drawbacks were found and were reverted:
> > > > > >
> > > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > > >     that is used by some device such as virtio-blk
> > > > > > 2) done only for PCI transport
> > > > > >
> > > > > > In this patch, we tries to borrow the idea from the INTX IRQ hardening
> > > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > > > by introducing a global irq_soft_enabled variable for each
> > > > > > virtio_device. Then we can to toggle it during
> > > > > > virtio_reset_device()/virtio_device_ready(). A synchornize_rcu() is
> > > > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > > > the future, we may provide config_ops for the transport that doesn't
> > > > > > use IRQ. With this, vring_interrupt() can return check and early if
> > > > > > irq_soft_enabled is false. This lead to smp_load_acquire() to be used
> > > > > > but the cost should be acceptable.
> > > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > > >
> > > >
> > > > Even if we allow the transport driver to synchornize through
> > > > synchronize_irq() we still need a check in the vring_interrupt().
> > > >
> > > > We do something like the following previously:
> > > >
> > > >         if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > >                 return IRQ_NONE;
> > > >
> > > > But it looks like a bug since speculative read can be done before the check
> > > > where the interrupt handler can't see the uncommitted setup which is done by
> > > > the driver.
> > >
> > > I don't think so - if you sync after setting the value then
> > > you are guaranteed that any handler running afterwards
> > > will see the new value.
> >
> > The problem is not disabled but the enable.
>
> So a misbehaving device can lose interrupts? That's not a problem at all
> imo.

It's the interrupt raised before setting irq_soft_enabled to true:

CPU 0 probe) driver specific setup (not committed)
CPU 1 IRQ handler) read the uninitialized variable
CPU 0 probe) set irq_soft_enabled to true
CPU 1 IRQ handler) read irq_soft_enabled as true
CPU 1 IRQ handler) use the uninitialized variable

Thanks

>
> > We use smp_store_relase()
> > to make sure the driver commits the setup before enabling the irq. It
> > means the read needs to be ordered as well in vring_interrupt().
> >
> > >
> > > Although I couldn't find anything about this in memory-barriers.txt
> > > which surprises me.
> > >
> > > CC Paul to help make sure I'm right.
> > >
> > >
> > > >
> > > > >
> > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > module parameter is introduced to enable the hardening so function
> > > > > > hardening is disabled by default.
> > > > > Which devices are these? How come they send an interrupt before there
> > > > > are any buffers in any queues?
> > > >
> > > >
> > > > I copied this from the commit log for 22b7050a024d7
> > > >
> > > > "
> > > >
> > > >     This change will also benefit old hypervisors (before 2009)
> > > >     that send interrupts without checking DRIVER_OK: previously,
> > > >     the callback could race with driver-specific initialization.
> > > > "
> > > >
> > > > If this is only for config interrupt, I can remove the above log.
> > >
> > >
> > > This is only for config interrupt.
> >
> > Ok.
> >
> > >
> > > >
> > > > >
> > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > expensive.
> > > > > >
> > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > >
> > > > > > ---
> > > > > >   drivers/virtio/virtio.c       | 19 +++++++++++++++++++
> > > > > >   drivers/virtio/virtio_ring.c  |  9 ++++++++-
> > > > > >   include/linux/virtio.h        |  4 ++++
> > > > > >   include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > >   4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > --- a/drivers/virtio/virtio.c
> > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > @@ -7,6 +7,12 @@
> > > > > >   #include <linux/of.h>
> > > > > >   #include <uapi/linux/virtio_ids.h>
> > > > > > +static bool irq_hardening = false;
> > > > > > +
> > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > +          "Disalbe IRQ software processing when it is not expected");
> > > > > > +
> > > > > >   /* Unique numbering for virtio devices. */
> > > > > >   static DEFINE_IDA(virtio_index_ida);
> > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > >    * */
> > > > > >   void virtio_reset_device(struct virtio_device *dev)
> > > > > >   {
> > > > > > + /*
> > > > > > +  * The below synchronize_rcu() guarantees that any
> > > > > > +  * interrupt for this line arriving after
> > > > > > +  * synchronize_rcu() has completed is guaranteed to see
> > > > > > +  * irq_soft_enabled == false.
> > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > though it's most likely is ...
> > > >
> > > >
> > > > According to the comment above tree RCU version of synchronize_rcu():
> > > >
> > > > """
> > > >
> > > >  * RCU read-side critical sections are delimited by rcu_read_lock()
> > > >  * and rcu_read_unlock(), and may be nested.  In addition, but only in
> > > >  * v5.0 and later, regions of code across which interrupts, preemption,
> > > >  * or softirqs have been disabled also serve as RCU read-side critical
> > > >  * sections.  This includes hardware interrupt handlers, softirq handlers,
> > > >  * and NMI handlers.
> > > > """
> > > >
> > > > So interrupt handlers are treated as read-side critical sections.
> > > >
> > > > And it has the comment for explain the barrier:
> > > >
> > > > """
> > > >
> > > >  * Note that this guarantee implies further memory-ordering guarantees.
> > > >  * On systems with more than one CPU, when synchronize_rcu() returns,
> > > >  * each CPU is guaranteed to have executed a full memory barrier since
> > > >  * the end of its last RCU read-side critical section whose beginning
> > > >  * preceded the call to synchronize_rcu().  In addition, each CPU having
> > > > """
> > > >
> > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > barrier, if the interrupt come after WRITE_ONCE() it will see the
> > > > irq_soft_enabled as false.
> > > >
> > >
> > > You are right. So then
> > > 1. I do not think we need load_acquire - why is it needed? Just
> > >    READ_ONCE should do.
> >
> > See above.
> >
> > > 2. isn't synchronize_irq also doing the same thing?
> >
> >
> > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> >
> > >
> > >
> > > > >
> > > > > > +  */
> > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > + synchronize_rcu();
> > > > > > +
> > > > > >           dev->config->reset(dev);
> > > > > >   }
> > > > > >   EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > Please add comment explaining where it will be enabled.
> > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > let's not add useless overhead to the boot sequence.
> > > >
> > > >
> > > > Ok.
> > > >
> > > >
> > > > >
> > > > >
> > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > >           spin_lock_init(&dev->config_lock);
> > > > > >           dev->config_enabled = false;
> > > > > >           dev->config_change_pending = false;
> > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > +
> > > > > > + if (dev->irq_soft_check)
> > > > > > +         dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > >           /* We always start by resetting the device, in case a previous
> > > > > >            * driver messed it up.  This also tests that code path a little. */
> > > > > one of the points of hardening is it's also helpful for buggy
> > > > > devices. this flag defeats the purpose.
> > > >
> > > >
> > > > Do you mean:
> > > >
> > > > 1) we need something like config_enable? This seems not easy to be
> > > > implemented without obvious overhead, mainly the synchronize with the
> > > > interrupt handlers
> > >
> > > But synchronize is only on tear-down path. That is not critical for any
> > > users at the moment, even less than probe.
> >
> > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > in the virtio_device_ready() and synchronize the IRQ handlers with
> > spinlock or others.
> >
> > >
> > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > for old hypervisors
> > >
> > >
> > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> >
> > Probably not, we have devices that accept random inputs from outside,
> > net, console, input etc. I've done a round of audits of the Qemu
> > codes. They look all fine since day0.
> >
> > > So with this approach, how about we rename the flag "driver_ok"?
> > > And then add_buf can actually test it and BUG_ON if not there  (at least
> > > in the debug build).
> >
> > This looks like a hardening of the driver in the core instead of the
> > device. I think it can be done but in a separate series.
> >
> > >
> > > And going down from there, how about we cache status in the
> > > device? Then we don't need to keep re-reading it every time,
> > > speeding boot up a tiny bit.
> >
> > I don't fully understand here, actually spec requires status to be
> > read back for validation in many cases.
> >
> > Thanks
> >
> > >
> > > >
> > > > >
> > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > >           return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > >   }
> > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > >   {
> > > > > > + struct virtqueue *_vq = v;
> > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > >           struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > +         dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > +         return IRQ_NONE;
> > > > > > + }
> > > > > > +
> > > > > >           if (!more_used(vq)) {
> > > > > >                   pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > >                   return IRQ_NONE;
> > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > --- a/include/linux/virtio.h
> > > > > > +++ b/include/linux/virtio.h
> > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > >    * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > >    * @config_enabled: configuration change reporting enabled
> > > > > >    * @config_change_pending: configuration change reported while disabled
> > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > >    * @config_lock: protects configuration change reporting
> > > > > >    * @dev: underlying device.
> > > > > >    * @id: the device type identification (used to match it with a driver).
> > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > >           bool failed;
> > > > > >           bool config_enabled;
> > > > > >           bool config_change_pending;
> > > > > > + bool irq_soft_check;
> > > > > > + bool irq_soft_enabled;
> > > > > >           spinlock_t config_lock;
> > > > > >           spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > >           struct device dev;
> > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > --- a/include/linux/virtio_config.h
> > > > > > +++ b/include/linux/virtio_config.h
> > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > >           return __virtio_test_bit(vdev, fbit);
> > > > > >   }
> > > > > > +/*
> > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > + * @vdev: the device
> > > > > > + */
> > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > +{
> > > > > > + if (!vdev->irq_soft_check)
> > > > > > +         return true;
> > > > > > +
> > > > > > + /*
> > > > > > +  * Read irq_soft_enabled before reading other device specific
> > > > > > +  * data. Paried with smp_store_relase() in
> > > > > paired
> > > >
> > > >
> > > > Will fix.
> > > >
> > > > Thanks
> > > >
> > > >
> > > > >
> > > > > > +  * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > +  * virtio_reset_device().
> > > > > > +  */
> > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > +}
> > > > > > +
> > > > > >   /**
> > > > > >    * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > >    * @vdev: the device
> > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > >           if (dev->config->enable_cbs)
> > > > > >                     dev->config->enable_cbs(dev);
> > > > > > + /*
> > > > > > +  * Commit the driver setup before enabling the virtqueue
> > > > > > +  * callbacks. Paried with smp_load_acuqire() in
> > > > > > +  * virtio_irq_soft_enabled()
> > > > > > +  */
> > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > +
> > > > > >           BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > >           dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > >   }
> > > > > > --
> > > > > > 2.25.1
> > >
>


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-25  9:20     ` Re: Jason Wang
@ 2022-03-25 10:09       ` Michael S. Tsirkin
  2022-03-28  4:56         ` Re: Jason Wang
  0 siblings, 1 reply; 1546+ messages in thread
From: Michael S. Tsirkin @ 2022-03-25 10:09 UTC (permalink / raw)
  To: Jason Wang
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Fri, Mar 25, 2022 at 05:20:19PM +0800, Jason Wang wrote:
> On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > Bcc:
> > > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > > Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> > > > Reply-To:
> > > > In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
> > > >
> > > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > > >
> > > > > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > > This is a rework on the previous IRQ hardening that is done for
> > > > > > > virtio-pci where several drawbacks were found and were reverted:
> > > > > > >
> > > > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > > > >     that is used by some device such as virtio-blk
> > > > > > > 2) done only for PCI transport
> > > > > > >
> > > > > > > In this patch, we tries to borrow the idea from the INTX IRQ hardening
> > > > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > > > > by introducing a global irq_soft_enabled variable for each
> > > > > > > virtio_device. Then we can to toggle it during
> > > > > > > virtio_reset_device()/virtio_device_ready(). A synchornize_rcu() is
> > > > > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > > > > the future, we may provide config_ops for the transport that doesn't
> > > > > > > use IRQ. With this, vring_interrupt() can return check and early if
> > > > > > > irq_soft_enabled is false. This lead to smp_load_acquire() to be used
> > > > > > > but the cost should be acceptable.
> > > > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > > > >
> > > > >
> > > > > Even if we allow the transport driver to synchornize through
> > > > > synchronize_irq() we still need a check in the vring_interrupt().
> > > > >
> > > > > We do something like the following previously:
> > > > >
> > > > >         if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > > >                 return IRQ_NONE;
> > > > >
> > > > > But it looks like a bug since speculative read can be done before the check
> > > > > where the interrupt handler can't see the uncommitted setup which is done by
> > > > > the driver.
> > > >
> > > > I don't think so - if you sync after setting the value then
> > > > you are guaranteed that any handler running afterwards
> > > > will see the new value.
> > >
> > > The problem is not disabled but the enable.
> >
> > So a misbehaving device can lose interrupts? That's not a problem at all
> > imo.
> 
> It's the interrupt raised before setting irq_soft_enabled to true:
> 
> CPU 0 probe) driver specific setup (not committed)
> CPU 1 IRQ handler) read the uninitialized variable
> CPU 0 probe) set irq_soft_enabled to true
> CPU 1 IRQ handler) read irq_soft_enabled as true
> CPU 1 IRQ handler) use the uninitialized variable
> 
> Thanks

Yea, it hurts if you do it.  So do not do it then ;).

irq_soft_enabled (I think driver_ok or status is a better name)
should be initialized to false *before* irq is requested.

And requesting an irq commits all memory; otherwise all drivers would be
broken. If it doesn't, it just needs to be fixed, not worked around in
virtio.


> >
> > > We use smp_store_relase()
> > > to make sure the driver commits the setup before enabling the irq. It
> > > means the read needs to be ordered as well in vring_interrupt().
> > >
> > > >
> > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > which surprises me.
> > > >
> > > > CC Paul to help make sure I'm right.
> > > >
> > > >
> > > > >
> > > > > >
> > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > hardening is disabled by default.
> > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > are any buffers in any queues?
> > > > >
> > > > >
> > > > > I copied this from the commit log for 22b7050a024d7
> > > > >
> > > > > "
> > > > >
> > > > >     This change will also benefit old hypervisors (before 2009)
> > > > >     that send interrupts without checking DRIVER_OK: previously,
> > > > >     the callback could race with driver-specific initialization.
> > > > > "
> > > > >
> > > > > If this is only for config interrupt, I can remove the above log.
> > > >
> > > >
> > > > This is only for config interrupt.
> > >
> > > Ok.
> > >
> > > >
> > > > >
> > > > > >
> > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > expensive.
> > > > > > >
> > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > >
> > > > > > > ---
> > > > > > >   drivers/virtio/virtio.c       | 19 +++++++++++++++++++
> > > > > > >   drivers/virtio/virtio_ring.c  |  9 ++++++++-
> > > > > > >   include/linux/virtio.h        |  4 ++++
> > > > > > >   include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > >   4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > @@ -7,6 +7,12 @@
> > > > > > >   #include <linux/of.h>
> > > > > > >   #include <uapi/linux/virtio_ids.h>
> > > > > > > +static bool irq_hardening = false;
> > > > > > > +
> > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > +          "Disalbe IRQ software processing when it is not expected");
> > > > > > > +
> > > > > > >   /* Unique numbering for virtio devices. */
> > > > > > >   static DEFINE_IDA(virtio_index_ida);
> > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > >    * */
> > > > > > >   void virtio_reset_device(struct virtio_device *dev)
> > > > > > >   {
> > > > > > > + /*
> > > > > > > +  * The below synchronize_rcu() guarantees that any
> > > > > > > +  * interrupt for this line arriving after
> > > > > > > +  * synchronize_rcu() has completed is guaranteed to see
> > > > > > > +  * irq_soft_enabled == false.
> > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > though it's most likely is ...
> > > > >
> > > > >
> > > > > According to the comment above tree RCU version of synchronize_rcu():
> > > > >
> > > > > """
> > > > >
> > > > >  * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > >  * and rcu_read_unlock(), and may be nested.  In addition, but only in
> > > > >  * v5.0 and later, regions of code across which interrupts, preemption,
> > > > >  * or softirqs have been disabled also serve as RCU read-side critical
> > > > >  * sections.  This includes hardware interrupt handlers, softirq handlers,
> > > > >  * and NMI handlers.
> > > > > """
> > > > >
> > > > > So interrupt handlers are treated as read-side critical sections.
> > > > >
> > > > > And it has the comment for explain the barrier:
> > > > >
> > > > > """
> > > > >
> > > > >  * Note that this guarantee implies further memory-ordering guarantees.
> > > > >  * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > >  * each CPU is guaranteed to have executed a full memory barrier since
> > > > >  * the end of its last RCU read-side critical section whose beginning
> > > > >  * preceded the call to synchronize_rcu().  In addition, each CPU having
> > > > > """
> > > > >
> > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > barrier, if the interrupt come after WRITE_ONCE() it will see the
> > > > > irq_soft_enabled as false.
> > > > >
> > > >
> > > > You are right. So then
> > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > >    READ_ONCE should do.
> > >
> > > See above.
> > >
> > > > 2. isn't synchronize_irq also doing the same thing?
> > >
> > >
> > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > >
> > > >
> > > >
> > > > > >
> > > > > > > +  */
> > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > + synchronize_rcu();
> > > > > > > +
> > > > > > >           dev->config->reset(dev);
> > > > > > >   }
> > > > > > >   EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > Please add comment explaining where it will be enabled.
> > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > let's not add useless overhead to the boot sequence.
> > > > >
> > > > >
> > > > > Ok.
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > >           spin_lock_init(&dev->config_lock);
> > > > > > >           dev->config_enabled = false;
> > > > > > >           dev->config_change_pending = false;
> > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > +
> > > > > > > + if (dev->irq_soft_check)
> > > > > > > +         dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > >           /* We always start by resetting the device, in case a previous
> > > > > > >            * driver messed it up.  This also tests that code path a little. */
> > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > devices. this flag defeats the purpose.
> > > > >
> > > > >
> > > > > Do you mean:
> > > > >
> > > > > 1) we need something like config_enable? This seems hard to implement
> > > > > without obvious overhead, mainly the synchronization with the
> > > > > interrupt handlers
> > > >
> > > > But synchronize is only on tear-down path. That is not critical for any
> > > > users at the moment, even less than probe.
> > >
> > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > in virtio_device_ready() and synchronize with the IRQ handlers using
> > > a spinlock or similar.
> > >
> > > >
> > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > for old hypervisors
> > > >
> > > >
> > > > The risk is that a driver adds buffers without setting DRIVER_OK.
> > >
> > > > Probably not, we have devices that accept random inputs from outside:
> > > > net, console, input etc. I've done a round of audits of the QEMU
> > > > code. It all looks fine since day 0.
> > >
> > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > And then add_buf can actually test it and BUG_ON if not there  (at least
> > > > in the debug build).
> > >
> > > This looks like a hardening of the driver in the core instead of the
> > > device. I think it can be done but in a separate series.
> > >
> > > >
> > > > And going down from there, how about we cache status in the
> > > > device? Then we don't need to keep re-reading it every time,
> > > > speeding boot up a tiny bit.
> > >
> > > I don't fully understand here, actually spec requires status to be
> > > read back for validation in many cases.
> > >
> > > Thanks
> > >
> > > >
> > > > >
> > > > > >
> > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > >           return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > >   }
> > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > >   {
> > > > > > > + struct virtqueue *_vq = v;
> > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > >           struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > +         dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > +         return IRQ_NONE;
> > > > > > > + }
> > > > > > > +
> > > > > > >           if (!more_used(vq)) {
> > > > > > >                   pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > >                   return IRQ_NONE;
> > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > --- a/include/linux/virtio.h
> > > > > > > +++ b/include/linux/virtio.h
> > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > >    * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > >    * @config_enabled: configuration change reporting enabled
> > > > > > >    * @config_change_pending: configuration change reported while disabled
> > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > >    * @config_lock: protects configuration change reporting
> > > > > > >    * @dev: underlying device.
> > > > > > >    * @id: the device type identification (used to match it with a driver).
> > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > >           bool failed;
> > > > > > >           bool config_enabled;
> > > > > > >           bool config_change_pending;
> > > > > > > + bool irq_soft_check;
> > > > > > > + bool irq_soft_enabled;
> > > > > > >           spinlock_t config_lock;
> > > > > > >           spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > >           struct device dev;
> > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > >           return __virtio_test_bit(vdev, fbit);
> > > > > > >   }
> > > > > > > +/*
> > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > + * @vdev: the device
> > > > > > > + */
> > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > +{
> > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > +         return true;
> > > > > > > +
> > > > > > > + /*
> > > > > > > +  * Read irq_soft_enabled before reading other device specific
> > > > > > > +  * data. Paried with smp_store_relase() in
> > > > > > paired
> > > > >
> > > > >
> > > > > Will fix.
> > > > >
> > > > > Thanks
> > > > >
> > > > >
> > > > > >
> > > > > > > +  * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > +  * virtio_reset_device().
> > > > > > > +  */
> > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > +}
> > > > > > > +
> > > > > > >   /**
> > > > > > >    * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > >    * @vdev: the device
> > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > >           if (dev->config->enable_cbs)
> > > > > > >                     dev->config->enable_cbs(dev);
> > > > > > > + /*
> > > > > > > +  * Commit the driver setup before enabling the virtqueue
> > > > > > > +  * callbacks. Paried with smp_load_acuqire() in
> > > > > > > +  * virtio_irq_soft_enabled()
> > > > > > > +  */
> > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > +
> > > > > > >           BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > >           dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > >   }
> > > > > > > --
> > > > > > > 2.25.1
> > > >
> >

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-25 10:09       ` Re: Michael S. Tsirkin
@ 2022-03-28  4:56         ` Jason Wang
  2022-03-28  5:59           ` Re: Michael S. Tsirkin
  0 siblings, 1 reply; 1546+ messages in thread
From: Jason Wang @ 2022-03-28  4:56 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Fri, Mar 25, 2022 at 6:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Mar 25, 2022 at 05:20:19PM +0800, Jason Wang wrote:
> > On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > > > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > Bcc:
> > > > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > > > Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> > > > > Reply-To:
> > > > > In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
> > > > >
> > > > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > > > >
> > > > > > On 2022/3/24 7:03 PM, Michael S. Tsirkin wrote:
> > > > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > > > This is a rework on the previous IRQ hardening that is done for
> > > > > > > > virtio-pci where several drawbacks were found and were reverted:
> > > > > > > >
> > > > > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > > > > >     that is used by some device such as virtio-blk
> > > > > > > > 2) done only for PCI transport
> > > > > > > >
> > > > > > > > In this patch, we try to borrow the idea from the INTX IRQ hardening
> > > > > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > > > > > by introducing a per-virtio_device irq_soft_enabled variable. Then we
> > > > > > > > can toggle it during
> > > > > > > > virtio_reset_device()/virtio_device_ready(). A synchronize_rcu() is
> > > > > > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > > > > > the future, we may provide config_ops for transports that don't
> > > > > > > > use IRQs. With this, vring_interrupt() can check and return early if
> > > > > > > > irq_soft_enabled is false. This leads to smp_load_acquire() being used,
> > > > > > > > but the cost should be acceptable.
> > > > > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > > > > >
> > > > > >
> > > > > > Even if we allow the transport driver to synchronize through
> > > > > > synchronize_irq() we still need a check in the vring_interrupt().
> > > > > >
> > > > > > We do something like the following previously:
> > > > > >
> > > > > >         if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > > > >                 return IRQ_NONE;
> > > > > >
> > > > > > But it looks like a bug, since a speculative read can be done before the
> > > > > > check, in which case the interrupt handler can't see the uncommitted setup
> > > > > > done by the driver.
> > > > >
> > > > > I don't think so - if you sync after setting the value then
> > > > > you are guaranteed that any handler running afterwards
> > > > > will see the new value.
> > > >
> > > > The problem is not the disable but the enable.
> > >
> > > So a misbehaving device can lose interrupts? That's not a problem at all
> > > imo.
> >
> > It's the interrupt raised before setting irq_soft_enabled to true:
> >
> > CPU 0 probe) driver specific setup (not committed)
> > CPU 1 IRQ handler) read the uninitialized variable
> > CPU 0 probe) set irq_soft_enabled to true
> > CPU 1 IRQ handler) read irq_soft_enabled as true
> > CPU 1 IRQ handler) use the uninitialized variable
> >
> > Thanks
>
> Yea, it hurts if you do it.  So do not do it then ;).
>
> irq_soft_enabled (I think driver_ok or status is a better name)

I can change it to driver_ok.

> should be initialized to false *before* irq is requested.
>
> And requesting irq commits all memory otherwise all drivers would be
> broken,

So I think we might be talking about different issues:

1) Whether request_irq() commits the previous setups. I think the
answer is yes, since the spin_unlock of desc->lock (a release) can
guarantee this, though there seems to be no documentation around
request_irq() that says so.

And I can see that at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c
uses smp_wmb() before the request_irq().

And even if the write is ordered, we still need the read to be ordered
to pair with it.

> if it doesn't it just needs to be fixed, not worked around in
> virtio.

2) virtio drivers might do a lot of setup between request_irq() and
virtio_device_ready():

request_irq()
driver specific setups
virtio_device_ready()

CPU 0 probe) request_irq()
CPU 1 IRQ handler) read the uninitialized variable
CPU 0 probe) driver specific setups
CPU 0 probe) smp_store_release(irq_soft_enabled, true), committing the setups
CPU 1 IRQ handler) read irq_soft_enabled as true
CPU 1 IRQ handler) use the uninitialized variable

Thanks
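[Editor's note: the publish/consume pairing in the race trace above can be sketched in a minimal userspace C program, using C11 atomics as stand-ins for the kernel's smp_store_release()/smp_load_acquire(); the names and values below are illustrative, not the actual virtio code.]

```c
#include <stdatomic.h>
#include <stdbool.h>

static int driver_data;		/* stands in for the driver specific setups  */
static atomic_bool driver_ok;	/* stands in for irq_soft_enabled/driver_ok  */

/* CPU 0, probe path: do the setup, then publish it with a release store. */
static void probe_path(void)
{
	driver_data = 42;	/* plain store: the "uninitialized variable" */
	atomic_store_explicit(&driver_ok, true, memory_order_release);
}

/* CPU 1, IRQ path: the acquire load orders the flag check before any later
 * reads, so observing true implies the setup store above is visible too. */
static int irq_path(void)
{
	if (!atomic_load_explicit(&driver_ok, memory_order_acquire))
		return -1;	/* IRQ_NONE: interrupt fired before DRIVER_OK */
	return driver_data;	/* safe: cannot observe the pre-setup value   */
}
```

With a relaxed READ_ONCE()-style load in irq_path(), the compiler or CPU could hoist the driver_data read above the flag check, which is exactly the race shown in the trace.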

>
>
> > >
> > > > We use smp_store_release()
> > > > to make sure the driver commits the setup before enabling the irq. It
> > > > means the read needs to be ordered as well in vring_interrupt().
> > > >
> > > > >
> > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > which surprises me.
> > > > >
> > > > > CC Paul to help make sure I'm right.
> > > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > > To avoid breaking legacy devices which can send IRQs before DRIVER_OK, a
> > > > > > > > module parameter is introduced to enable the hardening, so the
> > > > > > > > hardening is disabled by default.
> > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > are any buffers in any queues?
> > > > > >
> > > > > >
> > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > >
> > > > > > "
> > > > > >
> > > > > >     This change will also benefit old hypervisors (before 2009)
> > > > > >     that send interrupts without checking DRIVER_OK: previously,
> > > > > >     the callback could race with driver-specific initialization.
> > > > > > "
> > > > > >
> > > > > > If this is only for config interrupt, I can remove the above log.
> > > > >
> > > > >
> > > > > This is only for config interrupt.
> > > >
> > > > Ok.
> > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > expensive.
> > > > > > > >
> > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > >
> > > > > > > > ---
> > > > > > > >   drivers/virtio/virtio.c       | 19 +++++++++++++++++++
> > > > > > > >   drivers/virtio/virtio_ring.c  |  9 ++++++++-
> > > > > > > >   include/linux/virtio.h        |  4 ++++
> > > > > > > >   include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > >   4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > >   #include <linux/of.h>
> > > > > > > >   #include <uapi/linux/virtio_ids.h>
> > > > > > > > +static bool irq_hardening = false;
> > > > > > > > +
> > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > +          "Disalbe IRQ software processing when it is not expected");
> > > > > > > > +
> > > > > > > >   /* Unique numbering for virtio devices. */
> > > > > > > >   static DEFINE_IDA(virtio_index_ida);
> > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > >    * */
> > > > > > > >   void virtio_reset_device(struct virtio_device *dev)
> > > > > > > >   {
> > > > > > > > + /*
> > > > > > > > +  * The below synchronize_rcu() guarantees that any
> > > > > > > > +  * interrupt for this line arriving after
> > > > > > > > +  * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > +  * irq_soft_enabled == false.
> > > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > > though it's most likely is ...
> > > > > >
> > > > > >
> > > > > > According to the comment above the Tree RCU version of synchronize_rcu():
> > > > > >
> > > > > > """
> > > > > >
> > > > > >  * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > >  * and rcu_read_unlock(), and may be nested.  In addition, but only in
> > > > > >  * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > >  * or softirqs have been disabled also serve as RCU read-side critical
> > > > > >  * sections.  This includes hardware interrupt handlers, softirq handlers,
> > > > > >  * and NMI handlers.
> > > > > > """
> > > > > >
> > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > >
> > > > > > And it has this comment explaining the barrier:
> > > > > >
> > > > > > """
> > > > > >
> > > > > >  * Note that this guarantee implies further memory-ordering guarantees.
> > > > > >  * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > >  * each CPU is guaranteed to have executed a full memory barrier since
> > > > > >  * the end of its last RCU read-side critical section whose beginning
> > > > > >  * preceded the call to synchronize_rcu().  In addition, each CPU having
> > > > > > """
> > > > > >
> > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > barrier: if the interrupt comes after the WRITE_ONCE() it will see
> > > > > > irq_soft_enabled as false.
> > > > > >
> > > > >
> > > > > You are right. So then
> > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > >    READ_ONCE should do.
> > > >
> > > > See above.
> > > >
> > > > > 2. isn't synchronize_irq also doing the same thing?
> > > >
> > > >
> > > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > > >
> > > > >
> > > > >
> > > > > > >
> > > > > > > > +  */
> > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > + synchronize_rcu();
> > > > > > > > +
> > > > > > > >           dev->config->reset(dev);
> > > > > > > >   }
> > > > > > > >   EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > let's not add useless overhead to the boot sequence.
> > > > > >
> > > > > >
> > > > > > Ok.
> > > > > >
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > >           spin_lock_init(&dev->config_lock);
> > > > > > > >           dev->config_enabled = false;
> > > > > > > >           dev->config_change_pending = false;
> > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > +
> > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > +         dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > >           /* We always start by resetting the device, in case a previous
> > > > > > > >            * driver messed it up.  This also tests that code path a little. */
> > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > devices. this flag defeats the purpose.
> > > > > >
> > > > > >
> > > > > > Do you mean:
> > > > > >
> > > > > > 1) we need something like config_enable? This seems hard to implement
> > > > > > without obvious overhead, mainly the synchronization with the
> > > > > > interrupt handlers
> > > > >
> > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > users at the moment, even less than probe.
> > > >
> > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > in virtio_device_ready() and synchronize with the IRQ handlers using
> > > > a spinlock or similar.
> > > >
> > > > >
> > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > for old hypervisors
> > > > >
> > > > >
> > > > > The risk is that a driver adds buffers without setting DRIVER_OK.
> > > >
> > > > > Probably not, we have devices that accept random inputs from outside:
> > > > > net, console, input etc. I've done a round of audits of the QEMU
> > > > > code. It all looks fine since day 0.
> > > >
> > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > And then add_buf can actually test it and BUG_ON if not there  (at least
> > > > > in the debug build).
> > > >
> > > > This looks like a hardening of the driver in the core instead of the
> > > > device. I think it can be done but in a separate series.
> > > >
> > > > >
> > > > > And going down from there, how about we cache status in the
> > > > > device? Then we don't need to keep re-reading it every time,
> > > > > speeding boot up a tiny bit.
> > > >
> > > > I don't fully understand here, actually spec requires status to be
> > > > read back for validation in many cases.
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > >           return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > >   }
> > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > >   {
> > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > >           struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > +         dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > +         return IRQ_NONE;
> > > > > > > > + }
> > > > > > > > +
> > > > > > > >           if (!more_used(vq)) {
> > > > > > > >                   pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > >                   return IRQ_NONE;
> > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > >    * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > >    * @config_enabled: configuration change reporting enabled
> > > > > > > >    * @config_change_pending: configuration change reported while disabled
> > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > >    * @config_lock: protects configuration change reporting
> > > > > > > >    * @dev: underlying device.
> > > > > > > >    * @id: the device type identification (used to match it with a driver).
> > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > >           bool failed;
> > > > > > > >           bool config_enabled;
> > > > > > > >           bool config_change_pending;
> > > > > > > > + bool irq_soft_check;
> > > > > > > > + bool irq_soft_enabled;
> > > > > > > >           spinlock_t config_lock;
> > > > > > > >           spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > >           struct device dev;
> > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > >           return __virtio_test_bit(vdev, fbit);
> > > > > > > >   }
> > > > > > > > +/*
> > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > + * @vdev: the device
> > > > > > > > + */
> > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > +{
> > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > +         return true;
> > > > > > > > +
> > > > > > > > + /*
> > > > > > > > +  * Read irq_soft_enabled before reading other device specific
> > > > > > > > +  * data. Paried with smp_store_relase() in
> > > > > > > paired
> > > > > >
> > > > > >
> > > > > > Will fix.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > > +  * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > +  * virtio_reset_device().
> > > > > > > > +  */
> > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > +}
> > > > > > > > +
> > > > > > > >   /**
> > > > > > > >    * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > >    * @vdev: the device
> > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > >           if (dev->config->enable_cbs)
> > > > > > > >                     dev->config->enable_cbs(dev);
> > > > > > > > + /*
> > > > > > > > +  * Commit the driver setup before enabling the virtqueue
> > > > > > > > +  * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > +  * virtio_irq_soft_enabled()
> > > > > > > > +  */
> > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > +
> > > > > > > >           BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > >           dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > >   }
> > > > > > > > --
> > > > > > > > 2.25.1
> > > > >
> > >
>


* Re:
  2022-03-28  4:56         ` Re: Jason Wang
@ 2022-03-28  5:59           ` Michael S. Tsirkin
  2022-03-28  6:18             ` Re: Jason Wang
  0 siblings, 1 reply; 1546+ messages in thread
From: Michael S. Tsirkin @ 2022-03-28  5:59 UTC (permalink / raw)
  To: Jason Wang
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Mon, Mar 28, 2022 at 12:56:41PM +0800, Jason Wang wrote:
> On Fri, Mar 25, 2022 at 6:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Fri, Mar 25, 2022 at 05:20:19PM +0800, Jason Wang wrote:
> > > On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > > > > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > Bcc:
> > > > > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > > > > Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> > > > > > Reply-To:
> > > > > > In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
> > > > > >
> > > > > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > > > > >
> > > > > > > On 2022/3/24 7:03 PM, Michael S. Tsirkin wrote:
> > > > > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > > > > This is a rework on the previous IRQ hardening that is done for
> > > > > > > > > virtio-pci where several drawbacks were found and were reverted:
> > > > > > > > >
> > > > > > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > > > > > >     that is used by some device such as virtio-blk
> > > > > > > > > 2) done only for PCI transport
> > > > > > > > >
> > > > > > > > > In this patch, we try to borrow the idea from the INTX IRQ hardening
> > > > > > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > > > > > > by introducing a per-virtio_device irq_soft_enabled variable. Then we
> > > > > > > > > can toggle it during
> > > > > > > > > virtio_reset_device()/virtio_device_ready(). A synchronize_rcu() is
> > > > > > > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > > > > > > the future, we may provide config_ops for transports that don't
> > > > > > > > > use IRQs. With this, vring_interrupt() can check and return early if
> > > > > > > > > irq_soft_enabled is false. This leads to smp_load_acquire() being used,
> > > > > > > > > but the cost should be acceptable.
> > > > > > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > > > > > >
> > > > > > >
> > > > > > > Even if we allow the transport driver to synchronize through
> > > > > > > synchronize_irq() we still need a check in the vring_interrupt().
> > > > > > >
> > > > > > > We do something like the following previously:
> > > > > > >
> > > > > > >         if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > > > > >                 return IRQ_NONE;
> > > > > > >
> > > > > > > But it looks like a bug, since a speculative read can be done before the
> > > > > > > check, in which case the interrupt handler can't see the uncommitted setup
> > > > > > > done by the driver.
> > > > > >
> > > > > > I don't think so - if you sync after setting the value then
> > > > > > you are guaranteed that any handler running afterwards
> > > > > > will see the new value.
> > > > >
> > > > > The problem is not the disable but the enable.
> > > >
> > > > So a misbehaving device can lose interrupts? That's not a problem at all
> > > > imo.
> > >
> > > It's the interrupt raised before setting irq_soft_enabled to true:
> > >
> > > CPU 0 probe) driver specific setup (not committed)
> > > CPU 1 IRQ handler) read the uninitialized variable
> > > CPU 0 probe) set irq_soft_enabled to true
> > > CPU 1 IRQ handler) read irq_soft_enabled as true
> > > CPU 1 IRQ handler) use the uninitialized variable
> > >
> > > Thanks
> >
> > Yea, it hurts if you do it.  So do not do it then ;).
> >
> > irq_soft_enabled (I think driver_ok or status is a better name)
> 
> I can change it to driver_ok.
> 
> > should be initialized to false *before* irq is requested.
> >
> > And requesting irq commits all memory otherwise all drivers would be
> > broken,
> 
> So I think we might be talking about different issues:
> 
> 1) Whether request_irq() commits the previous setups. I think the
> answer is yes, since the spin_unlock of desc->lock (a release) can
> guarantee this, though there seems to be no documentation around
> request_irq() that says so.
> 
> And I can see that at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c
> uses smp_wmb() before the request_irq().
> 
> And even if the write is ordered, we still need the read to be ordered
> to pair with it.
> 
> > if it doesn't it just needs to be fixed, not worked around in
> > virtio.
> 
> 2) virtio drivers might do a lot of setup between request_irq() and
> virtio_device_ready():
> 
> request_irq()
> driver specific setups
> virtio_device_ready()
> 
> CPU 0 probe) request_irq()
> CPU 1 IRQ handler) read the uninitialized variable
> CPU 0 probe) driver specific setups
> CPU 0 probe) smp_store_release(irq_soft_enabled, true), committing the setups
> CPU 1 IRQ handler) read irq_soft_enabled as true
> CPU 1 IRQ handler) use the uninitialized variable
> 
> Thanks


As I said, virtio_device_ready needs to do synchronize_irq.
That will guarantee all setup is visible to the specific IRQ handler;
that is the whole point.
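[Editor's note: the guarantee invoked here can be modeled in userspace with a lock standing in for the IRQ-descriptor synchronization: synchronize_irq() behaves like taking and dropping the lock the handler runs under, so once it returns, earlier handlers have finished and later ones observe all prior writes. The names below are made up for illustration; this is not the kernel implementation.]

```c
#include <pthread.h>

static pthread_mutex_t irq_lock = PTHREAD_MUTEX_INITIALIZER;
static int device_state;	/* driver setup the handler depends on */
static int last_seen = -1;	/* what the most recent handler read   */

/* Model of the vring IRQ handler: runs under the per-IRQ lock. */
static void irq_handler_model(void)
{
	pthread_mutex_lock(&irq_lock);
	last_seen = device_state;
	pthread_mutex_unlock(&irq_lock);
}

/* Model of synchronize_irq(): wait until no handler is in flight.
 * The lock acquire/release also orders our prior stores before any
 * handler that runs afterwards. */
static void synchronize_irq_model(void)
{
	pthread_mutex_lock(&irq_lock);
	pthread_mutex_unlock(&irq_lock);
}

/* Model of virtio_device_ready(): commit setup, then synchronize,
 * so every handler from here on sees the committed state. */
static void device_ready_model(void)
{
	device_state = 1;	 /* driver specific setup */
	synchronize_irq_model();
}
```

A handler that runs before device_ready_model() may still read stale state, which is why the in-handler check discussed earlier in the thread remains necessary.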


> >
> >
> > > >
> > > > > We use smp_store_release()
> > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > >
> > > > > >
> > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > which surprises me.
> > > > > >
> > > > > > CC Paul to help make sure I'm right.
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > To avoid breaking legacy devices which can send IRQs before DRIVER_OK, a
> > > > > > > > > module parameter is introduced to enable the hardening, so the
> > > > > > > > > hardening is disabled by default.
> > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > are any buffers in any queues?
> > > > > > >
> > > > > > >
> > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > >
> > > > > > > "
> > > > > > >
> > > > > > >     This change will also benefit old hypervisors (before 2009)
> > > > > > >     that send interrupts without checking DRIVER_OK: previously,
> > > > > > >     the callback could race with driver-specific initialization.
> > > > > > > "
> > > > > > >
> > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > >
> > > > > >
> > > > > > This is only for config interrupt.
> > > > >
> > > > > Ok.
> > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > expensive.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > >
> > > > > > > > > ---
> > > > > > > > >   drivers/virtio/virtio.c       | 19 +++++++++++++++++++
> > > > > > > > >   drivers/virtio/virtio_ring.c  |  9 ++++++++-
> > > > > > > > >   include/linux/virtio.h        |  4 ++++
> > > > > > > > >   include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > >   4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > >   #include <linux/of.h>
> > > > > > > > >   #include <uapi/linux/virtio_ids.h>
> > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > +
> > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > +          "Disalbe IRQ software processing when it is not expected");
> > > > > > > > > +
> > > > > > > > >   /* Unique numbering for virtio devices. */
> > > > > > > > >   static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > >    * */
> > > > > > > > >   void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > >   {
> > > > > > > > > + /*
> > > > > > > > > +  * The below synchronize_rcu() guarantees that any
> > > > > > > > > +  * interrupt for this line arriving after
> > > > > > > > > +  * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > +  * irq_soft_enabled == false.
> > > > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > > > though it's most likely is ...
> > > > > > >
> > > > > > >
> > > > > > > According to the comment above tree RCU version of synchronize_rcu():
> > > > > > >
> > > > > > > """
> > > > > > >
> > > > > > >  * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > >  * and rcu_read_unlock(), and may be nested.  In addition, but only in
> > > > > > >  * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > >  * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > >  * sections.  This includes hardware interrupt handlers, softirq handlers,
> > > > > > >  * and NMI handlers.
> > > > > > > """
> > > > > > >
> > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > >
> > > > > > > And it has a comment explaining the barrier:
> > > > > > >
> > > > > > > """
> > > > > > >
> > > > > > >  * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > >  * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > >  * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > >  * the end of its last RCU read-side critical section whose beginning
> > > > > > >  * preceded the call to synchronize_rcu().  In addition, each CPU having
> > > > > > > """
> > > > > > >
> > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > > barrier; if the interrupt comes after the WRITE_ONCE() it will see
> > > > > > > irq_soft_enabled as false.
> > > > > > >
> > > > > >
> > > > > > You are right. So then
> > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > >    READ_ONCE should do.
> > > > >
> > > > > See above.
> > > > >
> > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > >
> > > > >
> > > > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > > > >
> > > > > >
> > > > > >
> > > > > > > >
> > > > > > > > > +  */
> > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > + synchronize_rcu();
> > > > > > > > > +
> > > > > > > > >           dev->config->reset(dev);
> > > > > > > > >   }
> > > > > > > > >   EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > >
> > > > > > >
> > > > > > > Ok.
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > >           spin_lock_init(&dev->config_lock);
> > > > > > > > >           dev->config_enabled = false;
> > > > > > > > >           dev->config_change_pending = false;
> > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > +
> > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > +         dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > >           /* We always start by resetting the device, in case a previous
> > > > > > > > >            * driver messed it up.  This also tests that code path a little. */
> > > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > > devices. this flag defeats the purpose.
> > > > > > >
> > > > > > >
> > > > > > > Do you mean:
> > > > > > >
> > > > > > > 1) we need something like config_enable? This seems not easy to be
> > > > > > > implemented without obvious overhead, mainly the synchronize with the
> > > > > > > interrupt handlers
> > > > > >
> > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > users at the moment, even less than probe.
> > > > >
> > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > > > spinlock or others.
> > > > >
> > > > > >
> > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > for old hypervisors
> > > > > >
> > > > > >
> > > > > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> > > > >
> > > > > Probably not, we have devices that accept random inputs from outside,
> > > > > net, console, input etc. I've done a round of audits of the Qemu
> > > > > codes. They look all fine since day0.
> > > > >
> > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > And then add_buf can actually test it and BUG_ON if not there  (at least
> > > > > > in the debug build).
> > > > >
> > > > > This looks like a hardening of the driver in the core instead of the
> > > > > device. I think it can be done but in a separate series.
> > > > >
> > > > > >
> > > > > > And going down from there, how about we cache status in the
> > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > speeding boot up a tiny bit.
> > > > >
> > > > > I don't fully understand here, actually spec requires status to be
> > > > > read back for validation in many cases.
> > > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > >           return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > >   }
> > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > >   {
> > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > >           struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > +         dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > +         return IRQ_NONE;
> > > > > > > > > + }
> > > > > > > > > +
> > > > > > > > >           if (!more_used(vq)) {
> > > > > > > > >                   pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > >                   return IRQ_NONE;
> > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > >    * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > >    * @config_enabled: configuration change reporting enabled
> > > > > > > > >    * @config_change_pending: configuration change reported while disabled
> > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > >    * @config_lock: protects configuration change reporting
> > > > > > > > >    * @dev: underlying device.
> > > > > > > > >    * @id: the device type identification (used to match it with a driver).
> > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > >           bool failed;
> > > > > > > > >           bool config_enabled;
> > > > > > > > >           bool config_change_pending;
> > > > > > > > > + bool irq_soft_check;
> > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > >           spinlock_t config_lock;
> > > > > > > > >           spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > >           struct device dev;
> > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > >           return __virtio_test_bit(vdev, fbit);
> > > > > > > > >   }
> > > > > > > > > +/*
> > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > + * @vdev: the device
> > > > > > > > > + */
> > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > +{
> > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > +         return true;
> > > > > > > > > +
> > > > > > > > > + /*
> > > > > > > > > +  * Read irq_soft_enabled before reading other device specific
> > > > > > > > > +  * data. Paried with smp_store_relase() in
> > > > > > > > paired
> > > > > > >
> > > > > > >
> > > > > > > Will fix.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > +  * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > +  * virtio_reset_device().
> > > > > > > > > +  */
> > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > >   /**
> > > > > > > > >    * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > >    * @vdev: the device
> > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > >           if (dev->config->enable_cbs)
> > > > > > > > >                     dev->config->enable_cbs(dev);
> > > > > > > > > + /*
> > > > > > > > > +  * Commit the driver setup before enabling the virtqueue
> > > > > > > > > +  * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > > +  * virtio_irq_soft_enabled()
> > > > > > > > > +  */
> > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > +
> > > > > > > > >           BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > >           dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > >   }
> > > > > > > > > --
> > > > > > > > > 2.25.1
> > > > > >
> > > >
> >

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-28  5:59           ` Re: Michael S. Tsirkin
@ 2022-03-28  6:18             ` Jason Wang
  2022-03-28 10:40               ` Re: Michael S. Tsirkin
  0 siblings, 1 reply; 1546+ messages in thread
From: Jason Wang @ 2022-03-28  6:18 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Mon, Mar 28, 2022 at 1:59 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Mar 28, 2022 at 12:56:41PM +0800, Jason Wang wrote:
> > On Fri, Mar 25, 2022 at 6:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Fri, Mar 25, 2022 at 05:20:19PM +0800, Jason Wang wrote:
> > > > On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > > > > > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > Bcc:
> > > > > > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > > > > > Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> > > > > > > Reply-To:
> > > > > > > In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
> > > > > > >
> > > > > > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > > > > > >
> > > > > > > > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > > > > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > > > > > This is a rework on the previous IRQ hardening that is done for
> > > > > > > > > > virtio-pci where several drawbacks were found and were reverted:
> > > > > > > > > >
> > > > > > > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > > > > > > >     that is used by some device such as virtio-blk
> > > > > > > > > > 2) done only for PCI transport
> > > > > > > > > >
> > > > > > > > > > In this patch, we try to borrow the idea from the INTX IRQ hardening
> > > > > > > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > > > > > > > by introducing a global irq_soft_enabled variable for each
> > > > > > > > > > virtio_device. Then we can toggle it during
> > > > > > > > > > virtio_reset_device()/virtio_device_ready(). A synchronize_rcu() is
> > > > > > > > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > > > > > > > the future, we may provide config_ops for transports that don't
> > > > > > > > > > use IRQs. With this, vring_interrupt() can check and return early if
> > > > > > > > > > irq_soft_enabled is false. This leads to smp_load_acquire() being used,
> > > > > > > > > > but the cost should be acceptable.
> > > > > > > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > > > > > > >
> > > > > > > >
> > > > > > > > Even if we allow the transport driver to synchronize through
> > > > > > > > synchronize_irq() we still need a check in the vring_interrupt().
> > > > > > > >
> > > > > > > > We do something like the following previously:
> > > > > > > >
> > > > > > > >         if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > > > > > >                 return IRQ_NONE;
> > > > > > > >
> > > > > > > > But it looks like a bug since speculative read can be done before the check
> > > > > > > > where the interrupt handler can't see the uncommitted setup which is done by
> > > > > > > > the driver.
> > > > > > >
> > > > > > > I don't think so - if you sync after setting the value then
> > > > > > > you are guaranteed that any handler running afterwards
> > > > > > > will see the new value.
> > > > > >
> > > > > > The problem is not the disable path but the enable path.
> > > > >
> > > > > So a misbehaving device can lose interrupts? That's not a problem at all
> > > > > imo.
> > > >
> > > > It's the interrupt raised before setting irq_soft_enabled to true:
> > > >
> > > > CPU 0 probe) driver specific setup (not commited)
> > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > CPU 0 probe) set irq_soft_enabled to true
> > > > CPU 1 IRQ handler) read irq_soft_enable as true
> > > > CPU 1 IRQ handler) use the uninitialized variable
> > > >
> > > > Thanks
> > >
> > > Yea, it hurts if you do it.  So do not do it then ;).
> > >
> > > irq_soft_enabled (I think driver_ok or status is a better name)
> >
> > I can change it to driver_ok.
> >
> > > should be initialized to false *before* irq is requested.
> > >
> > > And requesting irq commits all memory otherwise all drivers would be
> > > broken,
> >
> > So I think we might talk different issues:
> >
> > 1) Whether request_irq() commits the previous setups, I think the
> > answer is yes, since the spin_unlock of desc->lock (release) can
> > guarantee this though there seems no documentation around
> > request_irq() to say this.
> >
> > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > using smp_wmb() before the request_irq().
> >
> > And even if write is ordered we still need read to be ordered to be
> > paired with that.
> >
> > > if it doesn't it just needs to be fixed, not worked around in
> > > virtio.
> >
> > 2) virtio drivers might do a lot of setups between request_irq() and
> > virtio_device_ready():
> >
> > request_irq()
> > driver specific setups
> > virtio_device_ready()
> >
> > CPU 0 probe) request_irq()
> > CPU 1 IRQ handler) read the uninitialized variable
> > CPU 0 probe) driver specific setups
> > CPU 0 probe) smp_store_release(intr_soft_enabled, true), commit the setups
> > CPU 1 IRQ handler) read irq_soft_enable as true
> > CPU 1 IRQ handler) use the uninitialized variable
> >
> > Thanks
>
>
> As I said, virtio_device_ready needs to do synchronize_irq.
> That will guarantee all setup is visible to the specific IRQ,

That only holds for handlers that start after synchronize_irq() returns.

> this
> is the point of it.

What happens if an interrupt is raised in the middle like:

smp_store_release(&dev->irq_soft_enabled, true)
IRQ handler
synchronize_irq()

If we don't enforce ordering on the read side, the IRQ handler may still
see the uninitialized variable.

Thanks

>
>
> > >
> > >
> > > > >
> > > > > > We use smp_store_release()
> > > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > > >
> > > > > > >
> > > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > > which surprises me.
> > > > > > >
> > > > > > > CC Paul to help make sure I'm right.
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > > > > hardening is disabled by default.
> > > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > > are any buffers in any queues?
> > > > > > > >
> > > > > > > >
> > > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > > >
> > > > > > > > "
> > > > > > > >
> > > > > > > >     This change will also benefit old hypervisors (before 2009)
> > > > > > > >     that send interrupts without checking DRIVER_OK: previously,
> > > > > > > >     the callback could race with driver-specific initialization.
> > > > > > > > "
> > > > > > > >
> > > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > > >
> > > > > > >
> > > > > > > This is only for config interrupt.
> > > > > >
> > > > > > Ok.
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > > expensive.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > >
> > > > > > > > > > ---
> > > > > > > > > >   drivers/virtio/virtio.c       | 19 +++++++++++++++++++
> > > > > > > > > >   drivers/virtio/virtio_ring.c  |  9 ++++++++-
> > > > > > > > > >   include/linux/virtio.h        |  4 ++++
> > > > > > > > > >   include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > >   4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > >   #include <linux/of.h>
> > > > > > > > > >   #include <uapi/linux/virtio_ids.h>
> > > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > > +
> > > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > > +          "Disalbe IRQ software processing when it is not expected");
> > > > > > > > > > +
> > > > > > > > > >   /* Unique numbering for virtio devices. */
> > > > > > > > > >   static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > >    * */
> > > > > > > > > >   void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > >   {
> > > > > > > > > > + /*
> > > > > > > > > > +  * The below synchronize_rcu() guarantees that any
> > > > > > > > > > +  * interrupt for this line arriving after
> > > > > > > > > > +  * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > > +  * irq_soft_enabled == false.
> > > > > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > > > > though it's most likely is ...
> > > > > > > >
> > > > > > > >
> > > > > > > > According to the comment above tree RCU version of synchronize_rcu():
> > > > > > > >
> > > > > > > > """
> > > > > > > >
> > > > > > > >  * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > >  * and rcu_read_unlock(), and may be nested.  In addition, but only in
> > > > > > > >  * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > >  * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > >  * sections.  This includes hardware interrupt handlers, softirq handlers,
> > > > > > > >  * and NMI handlers.
> > > > > > > > """
> > > > > > > >
> > > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > > >
> > > > > > > > And it has a comment explaining the barrier:
> > > > > > > >
> > > > > > > > """
> > > > > > > >
> > > > > > > >  * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > >  * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > >  * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > >  * the end of its last RCU read-side critical section whose beginning
> > > > > > > >  * preceded the call to synchronize_rcu().  In addition, each CPU having
> > > > > > > > """
> > > > > > > >
> > > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > > > barrier; if the interrupt comes after the WRITE_ONCE() it will see
> > > > > > > > irq_soft_enabled as false.
> > > > > > > >
> > > > > > >
> > > > > > > You are right. So then
> > > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > >    READ_ONCE should do.
> > > > > >
> > > > > > See above.
> > > > > >
> > > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > > >
> > > > > >
> > > > > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > > > +  */
> > > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > +
> > > > > > > > > >           dev->config->reset(dev);
> > > > > > > > > >   }
> > > > > > > > > >   EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > > >
> > > > > > > >
> > > > > > > > Ok.
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > >           spin_lock_init(&dev->config_lock);
> > > > > > > > > >           dev->config_enabled = false;
> > > > > > > > > >           dev->config_change_pending = false;
> > > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > > +
> > > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > > +         dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > >           /* We always start by resetting the device, in case a previous
> > > > > > > > > >            * driver messed it up.  This also tests that code path a little. */
> > > > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > > > devices. this flag defeats the purpose.
> > > > > > > >
> > > > > > > >
> > > > > > > > Do you mean:
> > > > > > > >
> > > > > > > > 1) we need something like config_enable? This seems not easy to be
> > > > > > > > implemented without obvious overhead, mainly the synchronize with the
> > > > > > > > interrupt handlers
> > > > > > >
> > > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > > users at the moment, even less than probe.
> > > > > >
> > > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > > > > spinlock or others.
> > > > > >
> > > > > > >
> > > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > > for old hypervisors
> > > > > > >
> > > > > > >
> > > > > > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> > > > > >
> > > > > > Probably not, we have devices that accept random inputs from outside,
> > > > > > net, console, input etc. I've done a round of audits of the Qemu
> > > > > > codes. They look all fine since day0.
> > > > > >
> > > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > > And then add_buf can actually test it and BUG_ON if not there  (at least
> > > > > > > in the debug build).
> > > > > >
> > > > > > This looks like a hardening of the driver in the core instead of the
> > > > > > device. I think it can be done but in a separate series.
> > > > > >
> > > > > > >
> > > > > > > And going down from there, how about we cache status in the
> > > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > > speeding boot up a tiny bit.
> > > > > >
> > > > > > I don't fully understand here, actually spec requires status to be
> > > > > > read back for validation in many cases.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > >           return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > >   }
> > > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > >   {
> > > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > >           struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > > +         dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > > +         return IRQ_NONE;
> > > > > > > > > > + }
> > > > > > > > > > +
> > > > > > > > > >           if (!more_used(vq)) {
> > > > > > > > > >                   pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > >                   return IRQ_NONE;
> > > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > >    * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > >    * @config_enabled: configuration change reporting enabled
> > > > > > > > > >    * @config_change_pending: configuration change reported while disabled
> > > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > > >    * @config_lock: protects configuration change reporting
> > > > > > > > > >    * @dev: underlying device.
> > > > > > > > > >    * @id: the device type identification (used to match it with a driver).
> > > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > > >           bool failed;
> > > > > > > > > >           bool config_enabled;
> > > > > > > > > >           bool config_change_pending;
> > > > > > > > > > + bool irq_soft_check;
> > > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > > >           spinlock_t config_lock;
> > > > > > > > > >           spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > > >           struct device dev;
> > > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > > >           return __virtio_test_bit(vdev, fbit);
> > > > > > > > > >   }
> > > > > > > > > > +/*
> > > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > > + * @vdev: the device
> > > > > > > > > > + */
> > > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > > +{
> > > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > > +         return true;
> > > > > > > > > > +
> > > > > > > > > > + /*
> > > > > > > > > > +  * Read irq_soft_enabled before reading other device specific
> > > > > > > > > > +  * data. Paried with smp_store_relase() in
> > > > > > > > > paired
> > > > > > > >
> > > > > > > >
> > > > > > > > Will fix.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > +  * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > > +  * virtio_reset_device().
> > > > > > > > > > +  */
> > > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > > +}
> > > > > > > > > > +
> > > > > > > > > >   /**
> > > > > > > > > >    * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > > >    * @vdev: the device
> > > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > >           if (dev->config->enable_cbs)
> > > > > > > > > >                     dev->config->enable_cbs(dev);
> > > > > > > > > > + /*
> > > > > > > > > > +  * Commit the driver setup before enabling the virtqueue
> > > > > > > > > > +  * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > > > +  * virtio_irq_soft_enabled()
> > > > > > > > > > +  */
> > > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > > +
> > > > > > > > > >           BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > >           dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > >   }
> > > > > > > > > > --
> > > > > > > > > > 2.25.1
> > > > > > >
> > > > >
> > >
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-28  6:18             ` Re: Jason Wang
@ 2022-03-28 10:40               ` Michael S. Tsirkin
  2022-03-29  7:12                 ` Re: Jason Wang
  2022-03-29  8:35                 ` Re: Thomas Gleixner
  0 siblings, 2 replies; 1546+ messages in thread
From: Michael S. Tsirkin @ 2022-03-28 10:40 UTC (permalink / raw)
  To: Jason Wang
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Mon, Mar 28, 2022 at 02:18:22PM +0800, Jason Wang wrote:
> On Mon, Mar 28, 2022 at 1:59 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Mar 28, 2022 at 12:56:41PM +0800, Jason Wang wrote:
> > > On Fri, Mar 25, 2022 at 6:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Fri, Mar 25, 2022 at 05:20:19PM +0800, Jason Wang wrote:
> > > > > On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > > > > > > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > Bcc:
> > > > > > > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > > > > > > Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> > > > > > > > Reply-To:
> > > > > > > > In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
> > > > > > > >
> > > > > > > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > > > > > > >
> > > > > > > > > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > > > > > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > > > > > > This is a rework on the previous IRQ hardening that is done for
> > > > > > > > > > > virtio-pci where several drawbacks were found and were reverted:
> > > > > > > > > > >
> > > > > > > > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > > > > > > > >     that is used by some device such as virtio-blk
> > > > > > > > > > > 2) done only for PCI transport
> > > > > > > > > > >
> > > > > > > > > > > In this patch, we tries to borrow the idea from the INTX IRQ hardening
> > > > > > > > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > > > > > > > > by introducing a global irq_soft_enabled variable for each
> > > > > > > > > > > virtio_device. Then we can to toggle it during
> > > > > > > > > > > virtio_reset_device()/virtio_device_ready(). A synchornize_rcu() is
> > > > > > > > > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > > > > > > > > the future, we may provide config_ops for the transport that doesn't
> > > > > > > > > > > use IRQ. With this, vring_interrupt() can return check and early if
> > > > > > > > > > > irq_soft_enabled is false. This lead to smp_load_acquire() to be used
> > > > > > > > > > > but the cost should be acceptable.
> > > > > > > > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Even if we allow the transport driver to synchornize through
> > > > > > > > > synchronize_irq() we still need a check in the vring_interrupt().
> > > > > > > > >
> > > > > > > > > We do something like the following previously:
> > > > > > > > >
> > > > > > > > >         if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > > > > > > >                 return IRQ_NONE;
> > > > > > > > >
> > > > > > > > > But it looks like a bug since speculative read can be done before the check
> > > > > > > > > where the interrupt handler can't see the uncommitted setup which is done by
> > > > > > > > > the driver.
> > > > > > > >
> > > > > > > > I don't think so - if you sync after setting the value then
> > > > > > > > you are guaranteed that any handler running afterwards
> > > > > > > > will see the new value.
> > > > > > >
> > > > > > > The problem is not disabled but the enable.
> > > > > >
> > > > > > So a misbehaving device can lose interrupts? That's not a problem at all
> > > > > > imo.
> > > > >
> > > > > It's the interrupt raised before setting irq_soft_enabled to true:
> > > > >
> > > > > CPU 0 probe) driver specific setup (not commited)
> > > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > > CPU 0 probe) set irq_soft_enabled to true
> > > > > CPU 1 IRQ handler) read irq_soft_enable as true
> > > > > CPU 1 IRQ handler) use the uninitialized variable
> > > > >
> > > > > Thanks
> > > >
> > > > Yea, it hurts if you do it.  So do not do it then ;).
> > > >
> > > > irq_soft_enabled (I think driver_ok or status is a better name)
> > >
> > > I can change it to driver_ok.
> > >
> > > > should be initialized to false *before* irq is requested.
> > > >
> > > > And requesting irq commits all memory otherwise all drivers would be
> > > > broken,
> > >
> > > So I think we might talk different issues:
> > >
> > > 1) Whether request_irq() commits the previous setups, I think the
> > > answer is yes, since the spin_unlock of desc->lock (release) can
> > > guarantee this though there seems no documentation around
> > > request_irq() to say this.
> > >
> > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > > using smp_wmb() before the request_irq().
> > >
> > > And even if write is ordered we still need read to be ordered to be
> > > paired with that.

IMO it synchronizes with the CPU to which the irq is
delivered. Otherwise basically all drivers would be broken,
wouldn't they?
I don't know whether that's correct on all platforms, but if not
we need to fix request_irq.

> > >
> > > > if it doesn't it just needs to be fixed, not worked around in
> > > > virtio.
> > >
> > > 2) virtio drivers might do a lot of setups between request_irq() and
> > > virtio_device_ready():
> > >
> > > request_irq()
> > > driver specific setups
> > > virtio_device_ready()
> > >
> > > CPU 0 probe) request_irq()
> > > CPU 1 IRQ handler) read the uninitialized variable
> > > CPU 0 probe) driver specific setups
> > > CPU 0 probe) smp_store_release(intr_soft_enabled, true), commit the setups
> > > CPU 1 IRQ handler) read irq_soft_enable as true
> > > CPU 1 IRQ handler) use the uninitialized variable
> > >
> > > Thanks
> >
> >
> > As I said, virtio_device_ready needs to do synchronize_irq.
> > That will guarantee all setup is visible to the specific IRQ,
> 
> Only the interrupt after synchronize_irq() returns.

Anything else is a buggy device though.

> >this
> > is what it's point is.
> 
> What happens if an interrupt is raised in the middle like:
> 
> smp_store_release(dev->irq_soft_enabled, true)
> IRQ handler
> synchornize_irq()
> 
> If we don't enforce a reading order, the IRQ handler may still see the
> uninitialized variable.
> 
> Thanks

IMHO variables should be initialized before request_irq
to a value meaning "not a valid interrupt".
Specifically driver_ok = false.
The handler in the scenario you describe will then see !driver_ok
and exit immediately.


> >
> >
> > > >
> > > >
> > > > > >
> > > > > > > We use smp_store_relase()
> > > > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > > > >
> > > > > > > >
> > > > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > > > which surprises me.
> > > > > > > >
> > > > > > > > CC Paul to help make sure I'm right.
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > > > > > hardening is disabled by default.
> > > > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > > > are any buffers in any queues?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > > > >
> > > > > > > > > "
> > > > > > > > >
> > > > > > > > >     This change will also benefit old hypervisors (before 2009)
> > > > > > > > >     that send interrupts without checking DRIVER_OK: previously,
> > > > > > > > >     the callback could race with driver-specific initialization.
> > > > > > > > > "
> > > > > > > > >
> > > > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > > > >
> > > > > > > >
> > > > > > > > This is only for config interrupt.
> > > > > > >
> > > > > > > Ok.
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > > > expensive.
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > >
> > > > > > > > > > > ---
> > > > > > > > > > >   drivers/virtio/virtio.c       | 19 +++++++++++++++++++
> > > > > > > > > > >   drivers/virtio/virtio_ring.c  |  9 ++++++++-
> > > > > > > > > > >   include/linux/virtio.h        |  4 ++++
> > > > > > > > > > >   include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > > >   4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > > >   #include <linux/of.h>
> > > > > > > > > > >   #include <uapi/linux/virtio_ids.h>
> > > > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > > > +
> > > > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > > > +          "Disalbe IRQ software processing when it is not expected");
> > > > > > > > > > > +
> > > > > > > > > > >   /* Unique numbering for virtio devices. */
> > > > > > > > > > >   static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > > >    * */
> > > > > > > > > > >   void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > > >   {
> > > > > > > > > > > + /*
> > > > > > > > > > > +  * The below synchronize_rcu() guarantees that any
> > > > > > > > > > > +  * interrupt for this line arriving after
> > > > > > > > > > > +  * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > > > +  * irq_soft_enabled == false.
> > > > > > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > > > > > though it's most likely is ...
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > According to the comment above tree RCU version of synchronize_rcu():
> > > > > > > > >
> > > > > > > > > """
> > > > > > > > >
> > > > > > > > >  * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > > >  * and rcu_read_unlock(), and may be nested.  In addition, but only in
> > > > > > > > >  * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > > >  * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > > >  * sections.  This includes hardware interrupt handlers, softirq handlers,
> > > > > > > > >  * and NMI handlers.
> > > > > > > > > """
> > > > > > > > >
> > > > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > > > >
> > > > > > > > > And it has the comment for explain the barrier:
> > > > > > > > >
> > > > > > > > > """
> > > > > > > > >
> > > > > > > > >  * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > > >  * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > > >  * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > > >  * the end of its last RCU read-side critical section whose beginning
> > > > > > > > >  * preceded the call to synchronize_rcu().  In addition, each CPU having
> > > > > > > > > """
> > > > > > > > >
> > > > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > > > > barrier, if the interrupt come after WRITE_ONCE() it will see the
> > > > > > > > > irq_soft_enabled as false.
> > > > > > > > >
> > > > > > > >
> > > > > > > > You are right. So then
> > > > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > > >    READ_ONCE should do.
> > > > > > >
> > > > > > > See above.
> > > > > > >
> > > > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > > > >
> > > > > > >
> > > > > > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > +  */
> > > > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > > +
> > > > > > > > > > >           dev->config->reset(dev);
> > > > > > > > > > >   }
> > > > > > > > > > >   EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Ok.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > > >           spin_lock_init(&dev->config_lock);
> > > > > > > > > > >           dev->config_enabled = false;
> > > > > > > > > > >           dev->config_change_pending = false;
> > > > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > > > +
> > > > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > > > +         dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > > >           /* We always start by resetting the device, in case a previous
> > > > > > > > > > >            * driver messed it up.  This also tests that code path a little. */
> > > > > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > > > > devices. this flag defeats the purpose.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Do you mean:
> > > > > > > > >
> > > > > > > > > 1) we need something like config_enable? This seems not easy to be
> > > > > > > > > implemented without obvious overhead, mainly the synchronize with the
> > > > > > > > > interrupt handlers
> > > > > > > >
> > > > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > > > users at the moment, even less than probe.
> > > > > > >
> > > > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > > > > > spinlock or others.
> > > > > > >
> > > > > > > >
> > > > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > > > for old hypervisors
> > > > > > > >
> > > > > > > >
> > > > > > > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> > > > > > >
> > > > > > > Probably not, we have devices that accept random inputs from outside,
> > > > > > > net, console, input etc. I've done a round of audits of the Qemu
> > > > > > > codes. They look all fine since day0.
> > > > > > >
> > > > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > > > And then add_buf can actually test it and BUG_ON if not there  (at least
> > > > > > > > in the debug build).
> > > > > > >
> > > > > > > This looks like a hardening of the driver in the core instead of the
> > > > > > > device. I think it can be done but in a separate series.
> > > > > > >
> > > > > > > >
> > > > > > > > And going down from there, how about we cache status in the
> > > > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > > > speeding boot up a tiny bit.
> > > > > > >
> > > > > > > I don't fully understand here, actually spec requires status to be
> > > > > > > read back for validation in many cases.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > > >           return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > > >   }
> > > > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > > >   {
> > > > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > > >           struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > > > +         dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > > > +         return IRQ_NONE;
> > > > > > > > > > > + }
> > > > > > > > > > > +
> > > > > > > > > > >           if (!more_used(vq)) {
> > > > > > > > > > >                   pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > > >                   return IRQ_NONE;
> > > > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > > >    * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > > >    * @config_enabled: configuration change reporting enabled
> > > > > > > > > > >    * @config_change_pending: configuration change reported while disabled
> > > > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > > > >    * @config_lock: protects configuration change reporting
> > > > > > > > > > >    * @dev: underlying device.
> > > > > > > > > > >    * @id: the device type identification (used to match it with a driver).
> > > > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > > > >           bool failed;
> > > > > > > > > > >           bool config_enabled;
> > > > > > > > > > >           bool config_change_pending;
> > > > > > > > > > > + bool irq_soft_check;
> > > > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > > > >           spinlock_t config_lock;
> > > > > > > > > > >           spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > > > >           struct device dev;
> > > > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > > > >           return __virtio_test_bit(vdev, fbit);
> > > > > > > > > > >   }
> > > > > > > > > > > +/*
> > > > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > > > + * @vdev: the device
> > > > > > > > > > > + */
> > > > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > > > +{
> > > > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > > > +         return true;
> > > > > > > > > > > +
> > > > > > > > > > > + /*
> > > > > > > > > > > +  * Read irq_soft_enabled before reading other device specific
> > > > > > > > > > > +  * data. Paried with smp_store_relase() in
> > > > > > > > > > paired
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Will fix.
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > +  * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > > > +  * virtio_reset_device().
> > > > > > > > > > > +  */
> > > > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > > > +}
> > > > > > > > > > > +
> > > > > > > > > > >   /**
> > > > > > > > > > >    * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > > > >    * @vdev: the device
> > > > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > > >           if (dev->config->enable_cbs)
> > > > > > > > > > >                     dev->config->enable_cbs(dev);
> > > > > > > > > > > + /*
> > > > > > > > > > > +  * Commit the driver setup before enabling the virtqueue
> > > > > > > > > > > +  * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > > > > +  * virtio_irq_soft_enabled()
> > > > > > > > > > > +  */
> > > > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > > > +
> > > > > > > > > > >           BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > >           dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > >   }
> > > > > > > > > > > --
> > > > > > > > > > > 2.25.1
> > > > > > > >
> > > > > >
> > > >
> >

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-28 10:40               ` Re: Michael S. Tsirkin
@ 2022-03-29  7:12                 ` Jason Wang
  2022-03-29 14:08                   ` Re: Michael S. Tsirkin
  2022-03-29  8:35                 ` Re: Thomas Gleixner
  1 sibling, 1 reply; 1546+ messages in thread
From: Jason Wang @ 2022-03-29  7:12 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Mon, Mar 28, 2022 at 6:41 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Mar 28, 2022 at 02:18:22PM +0800, Jason Wang wrote:
> > On Mon, Mar 28, 2022 at 1:59 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Mar 28, 2022 at 12:56:41PM +0800, Jason Wang wrote:
> > > > On Fri, Mar 25, 2022 at 6:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Fri, Mar 25, 2022 at 05:20:19PM +0800, Jason Wang wrote:
> > > > > > On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > > > > > > > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > Bcc:
> > > > > > > > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > > > > > > > Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> > > > > > > > > Reply-To:
> > > > > > > > > In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
> > > > > > > > >
> > > > > > > > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > > > > > > > >
> > > > > > > > > > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > > > > > > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > > > > > > > This is a rework on the previous IRQ hardening that is done for
> > > > > > > > > > > > virtio-pci where several drawbacks were found and were reverted:
> > > > > > > > > > > >
> > > > > > > > > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > > > > > > > > >     that is used by some device such as virtio-blk
> > > > > > > > > > > > 2) done only for PCI transport
> > > > > > > > > > > >
> > > > > > > > > > > > In this patch, we tries to borrow the idea from the INTX IRQ hardening
> > > > > > > > > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > > > > > > > > > by introducing a global irq_soft_enabled variable for each
> > > > > > > > > > > > virtio_device. Then we can to toggle it during
> > > > > > > > > > > > virtio_reset_device()/virtio_device_ready(). A synchornize_rcu() is
> > > > > > > > > > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > > > > > > > > > the future, we may provide config_ops for the transport that doesn't
> > > > > > > > > > > > use IRQ. With this, vring_interrupt() can return check and early if
> > > > > > > > > > > > irq_soft_enabled is false. This lead to smp_load_acquire() to be used
> > > > > > > > > > > > but the cost should be acceptable.
> > > > > > > > > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Even if we allow the transport driver to synchornize through
> > > > > > > > > > synchronize_irq() we still need a check in the vring_interrupt().
> > > > > > > > > >
> > > > > > > > > > We do something like the following previously:
> > > > > > > > > >
> > > > > > > > > >         if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > > > > > > > >                 return IRQ_NONE;
> > > > > > > > > >
> > > > > > > > > > But it looks like a bug since speculative read can be done before the check
> > > > > > > > > > where the interrupt handler can't see the uncommitted setup which is done by
> > > > > > > > > > the driver.
> > > > > > > > >
> > > > > > > > > I don't think so - if you sync after setting the value then
> > > > > > > > > you are guaranteed that any handler running afterwards
> > > > > > > > > will see the new value.
> > > > > > > >
> > > > > > > > The problem is not disabled but the enable.
> > > > > > >
> > > > > > > So a misbehaving device can lose interrupts? That's not a problem at all
> > > > > > > imo.
> > > > > >
> > > > > > It's the interrupt raised before setting irq_soft_enabled to true:
> > > > > >
> > > > > > CPU 0 probe) driver specific setup (not commited)
> > > > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > > > CPU 0 probe) set irq_soft_enabled to true
> > > > > > CPU 1 IRQ handler) read irq_soft_enable as true
> > > > > > CPU 1 IRQ handler) use the uninitialized variable
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > Yea, it hurts if you do it.  So do not do it then ;).
> > > > >
> > > > > irq_soft_enabled (I think driver_ok or status is a better name)
> > > >
> > > > I can change it to driver_ok.
> > > >
> > > > > should be initialized to false *before* irq is requested.
> > > > >
> > > > > And requesting irq commits all memory otherwise all drivers would be
> > > > > broken,
> > > >
> > > > So I think we might talk different issues:
> > > >
> > > > 1) Whether request_irq() commits the previous setups, I think the
> > > > answer is yes, since the spin_unlock of desc->lock (release) can
> > > > guarantee this though there seems no documentation around
> > > > request_irq() to say this.
> > > >
> > > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > > > using smp_wmb() before the request_irq().
> > > >
> > > > And even if write is ordered we still need read to be ordered to be
> > > > paired with that.
>
> IMO it synchronizes with the CPU to which irq is
> delivered. Otherwise basically all drivers would be broken,
> wouldn't they be?

I guess it's because most drivers don't care much about
buggy/malicious devices. And most devices require an extra
step to enable the device IRQ after request_irq(), or it's the
driver's responsibility to do the synchronization.

> I don't know whether it's correct on all platforms, but if not
> we need to fix request_irq.
>
> > > >
> > > > > if it doesn't it just needs to be fixed, not worked around in
> > > > > virtio.
> > > >
> > > > 2) virtio drivers might do a lot of setups between request_irq() and
> > > > virtio_device_ready():
> > > >
> > > > request_irq()
> > > > driver specific setups
> > > > virtio_device_ready()
> > > >
> > > > CPU 0 probe) request_irq()
> > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > CPU 0 probe) driver specific setups
> > > > CPU 0 probe) smp_store_release(intr_soft_enabled, true), commit the setups
> > > > CPU 1 IRQ handler) read irq_soft_enable as true
> > > > CPU 1 IRQ handler) use the uninitialized variable
> > > >
> > > > Thanks
> > >
> > >
> > > As I said, virtio_device_ready needs to do synchronize_irq.
> > > That will guarantee all setup is visible to the specific IRQ,
> >
> > Only the interrupt after synchronize_irq() returns.
>
> Anything else is a buggy device though.

Yes, but the goal of this patch is to prevent possible attacks from
buggy (malicious) devices.

>
> > >this
> > > is what it's point is.
> >
> > What happens if an interrupt is raised in the middle like:
> >
> > smp_store_release(dev->irq_soft_enabled, true)
> > IRQ handler
> > synchronize_irq()
> >
> > If we don't enforce a reading order, the IRQ handler may still see the
> > uninitialized variable.
> >
> > Thanks
>
> IMHO variables should be initialized before request_irq
> to a value meaning "not a valid interrupt".
> Specifically driver_ok = false.
> Handler in the scenario you describe will then see !driver_ok
> and exit immediately.

So just to make sure we're on the same page.

1) virtio_reset_device() will set driver_ok to false;
2) virtio_device_ready() will set driver_ok to true

So virtio drivers often do:

1) virtio_reset_device()
2) find_vqs() which will call request_irq()
3) other driver specific setups
4) virtio_device_ready()

In virtio_device_ready(), the patch currently performs the following:

smp_store_release(driver_ok, true);
set_status(DRIVER_OK);

Per your suggestion, we add synchronize_irq() after
smp_store_release(), so we have:

smp_store_release(driver_ok, true);
synchronize_irq();
set_status(DRIVER_OK);

Suppose an interrupt is raised before the synchronize_irq(). If we do:

if (READ_ONCE(driver_ok)) {
      vq->callback()
}

It will see driver_ok as true, but how can we make sure
vq->callback sees the driver-specific setups in (3) above?

And an example is virtio_scsi():

virtio_reset_device()
virtscsi_probe()
    virtscsi_init()
        virtio_find_vqs()
        ...
        virtscsi_init_vq(&vscsi->event_vq, vqs[1])
    ....
    virtio_device_ready()

In virtscsi_event_done():

virtscsi_event_done():
    virtscsi_vq_done(vscsi, &vscsi->event_vq, ...);

We need to make sure virtscsi_event_done() reads driver_ok before reading vscsi->event_vq.

Thanks

>
>
> > >
> > >
> > > > >
> > > > >
> > > > > > >
> > > > > > > > We use smp_store_relase()
> > > > > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > > > > which surprises me.
> > > > > > > > >
> > > > > > > > > CC Paul to help make sure I'm right.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > > > > > > hardening is disabled by default.
> > > > > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > > > > are any buffers in any queues?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > > > > >
> > > > > > > > > > "
> > > > > > > > > >
> > > > > > > > > >     This change will also benefit old hypervisors (before 2009)
> > > > > > > > > >     that send interrupts without checking DRIVER_OK: previously,
> > > > > > > > > >     the callback could race with driver-specific initialization.
> > > > > > > > > > "
> > > > > > > > > >
> > > > > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > This is only for config interrupt.
> > > > > > > >
> > > > > > > > Ok.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > > > > expensive.
> > > > > > > > > > > >
> > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > >
> > > > > > > > > > > > ---
> > > > > > > > > > > >   drivers/virtio/virtio.c       | 19 +++++++++++++++++++
> > > > > > > > > > > >   drivers/virtio/virtio_ring.c  |  9 ++++++++-
> > > > > > > > > > > >   include/linux/virtio.h        |  4 ++++
> > > > > > > > > > > >   include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > > > >   4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > > > > >
> > > > > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > > > >   #include <linux/of.h>
> > > > > > > > > > > >   #include <uapi/linux/virtio_ids.h>
> > > > > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > > > > +
> > > > > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > > > > +          "Disalbe IRQ software processing when it is not expected");
> > > > > > > > > > > > +
> > > > > > > > > > > >   /* Unique numbering for virtio devices. */
> > > > > > > > > > > >   static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > > > >    * */
> > > > > > > > > > > >   void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > > > >   {
> > > > > > > > > > > > + /*
> > > > > > > > > > > > +  * The below synchronize_rcu() guarantees that any
> > > > > > > > > > > > +  * interrupt for this line arriving after
> > > > > > > > > > > > +  * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > > > > +  * irq_soft_enabled == false.
> > > > > > > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > > > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > > > > > > though it's most likely is ...
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > According to the comment above tree RCU version of synchronize_rcu():
> > > > > > > > > >
> > > > > > > > > > """
> > > > > > > > > >
> > > > > > > > > >  * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > > > >  * and rcu_read_unlock(), and may be nested.  In addition, but only in
> > > > > > > > > >  * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > > > >  * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > > > >  * sections.  This includes hardware interrupt handlers, softirq handlers,
> > > > > > > > > >  * and NMI handlers.
> > > > > > > > > > """
> > > > > > > > > >
> > > > > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > > > > >
> > > > > > > > > > And it has the comment for explain the barrier:
> > > > > > > > > >
> > > > > > > > > > """
> > > > > > > > > >
> > > > > > > > > >  * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > > > >  * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > > > >  * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > > > >  * the end of its last RCU read-side critical section whose beginning
> > > > > > > > > >  * preceded the call to synchronize_rcu().  In addition, each CPU having
> > > > > > > > > > """
> > > > > > > > > >
> > > > > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > > > > > barrier, if the interrupt come after WRITE_ONCE() it will see the
> > > > > > > > > > irq_soft_enabled as false.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > You are right. So then
> > > > > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > > > >    READ_ONCE should do.
> > > > > > > >
> > > > > > > > See above.
> > > > > > > >
> > > > > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > > > > >
> > > > > > > >
> > > > > > > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > +  */
> > > > > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > > > +
> > > > > > > > > > > >           dev->config->reset(dev);
> > > > > > > > > > > >   }
> > > > > > > > > > > >   EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Ok.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > > > >           spin_lock_init(&dev->config_lock);
> > > > > > > > > > > >           dev->config_enabled = false;
> > > > > > > > > > > >           dev->config_change_pending = false;
> > > > > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > > > > +
> > > > > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > > > > +         dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > > > >           /* We always start by resetting the device, in case a previous
> > > > > > > > > > > >            * driver messed it up.  This also tests that code path a little. */
> > > > > > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > > > > > devices. this flag defeats the purpose.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Do you mean:
> > > > > > > > > >
> > > > > > > > > > 1) we need something like config_enable? This seems not easy to be
> > > > > > > > > > implemented without obvious overhead, mainly the synchronize with the
> > > > > > > > > > interrupt handlers
> > > > > > > > >
> > > > > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > > > > users at the moment, even less than probe.
> > > > > > > >
> > > > > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > > > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > > > > > > spinlock or others.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > > > > for old hypervisors
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> > > > > > > >
> > > > > > > > Probably not, we have devices that accept random inputs from outside,
> > > > > > > > net, console, input etc. I've done a round of audits of the Qemu
> > > > > > > > codes. They look all fine since day0.
> > > > > > > >
> > > > > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > > > > And then add_buf can actually test it and BUG_ON if not there  (at least
> > > > > > > > > in the debug build).
> > > > > > > >
> > > > > > > > This looks like a hardening of the driver in the core instead of the
> > > > > > > > device. I think it can be done but in a separate series.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > And going down from there, how about we cache status in the
> > > > > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > > > > speeding boot up a tiny bit.
> > > > > > > >
> > > > > > > > I don't fully understand here, actually spec requires status to be
> > > > > > > > read back for validation in many cases.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > > > >           return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > > > >   }
> > > > > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > > > >   {
> > > > > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > > > >           struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > > > > +         dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > > > > +         return IRQ_NONE;
> > > > > > > > > > > > + }
> > > > > > > > > > > > +
> > > > > > > > > > > >           if (!more_used(vq)) {
> > > > > > > > > > > >                   pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > > > >                   return IRQ_NONE;
> > > > > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > > > >    * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > > > >    * @config_enabled: configuration change reporting enabled
> > > > > > > > > > > >    * @config_change_pending: configuration change reported while disabled
> > > > > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > > > > >    * @config_lock: protects configuration change reporting
> > > > > > > > > > > >    * @dev: underlying device.
> > > > > > > > > > > >    * @id: the device type identification (used to match it with a driver).
> > > > > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > > > > >           bool failed;
> > > > > > > > > > > >           bool config_enabled;
> > > > > > > > > > > >           bool config_change_pending;
> > > > > > > > > > > > + bool irq_soft_check;
> > > > > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > > > > >           spinlock_t config_lock;
> > > > > > > > > > > >           spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > > > > >           struct device dev;
> > > > > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > > > > >           return __virtio_test_bit(vdev, fbit);
> > > > > > > > > > > >   }
> > > > > > > > > > > > +/*
> > > > > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > > > > + * @vdev: the device
> > > > > > > > > > > > + */
> > > > > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > > > > +{
> > > > > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > > > > +         return true;
> > > > > > > > > > > > +
> > > > > > > > > > > > + /*
> > > > > > > > > > > > +  * Read irq_soft_enabled before reading other device specific
> > > > > > > > > > > > +  * data. Paried with smp_store_relase() in
> > > > > > > > > > > paired
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Will fix.
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > +  * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > > > > +  * virtio_reset_device().
> > > > > > > > > > > > +  */
> > > > > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > > > > +}
> > > > > > > > > > > > +
> > > > > > > > > > > >   /**
> > > > > > > > > > > >    * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > > > > >    * @vdev: the device
> > > > > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > > > >           if (dev->config->enable_cbs)
> > > > > > > > > > > >                     dev->config->enable_cbs(dev);
> > > > > > > > > > > > + /*
> > > > > > > > > > > > +  * Commit the driver setup before enabling the virtqueue
> > > > > > > > > > > > +  * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > > > > > +  * virtio_irq_soft_enabled()
> > > > > > > > > > > > +  */
> > > > > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > > > > +
> > > > > > > > > > > >           BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > >           dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > >   }
> > > > > > > > > > > > --
> > > > > > > > > > > > 2.25.1
> > > > > > > > >
> > > > > > >
> > > > >
> > >
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-28 10:40               ` Re: Michael S. Tsirkin
  2022-03-29  7:12                 ` Re: Jason Wang
@ 2022-03-29  8:35                 ` Thomas Gleixner
  2022-03-29 14:37                   ` Re: Michael S. Tsirkin
  2022-04-12  6:55                   ` Re: Michael S. Tsirkin
  1 sibling, 2 replies; 1546+ messages in thread
From: Thomas Gleixner @ 2022-03-29  8:35 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization

On Mon, Mar 28 2022 at 06:40, Michael S. Tsirkin wrote:
> On Mon, Mar 28, 2022 at 02:18:22PM +0800, Jason Wang wrote:
>> > > So I think we might talk different issues:
>> > >
>> > > 1) Whether request_irq() commits the previous setups, I think the
>> > > answer is yes, since the spin_unlock of desc->lock (release) can
>> > > guarantee this though there seems no documentation around
>> > > request_irq() to say this.
>> > >
>> > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
>> > > using smp_wmb() before the request_irq().

That's a completely bogus example, especially as there is not a single
smp_rmb() which pairs with the smp_wmb().

>> > > And even if write is ordered we still need read to be ordered to be
>> > > paired with that.
>
> IMO it synchronizes with the CPU to which irq is
> delivered. Otherwise basically all drivers would be broken,
> wouldn't they be?
> I don't know whether it's correct on all platforms, but if not
> we need to fix request_irq.

There is nothing to fix:

request_irq()
   raw_spin_lock_irq(desc->lock);       // ACQUIRE
   ....
   raw_spin_unlock_irq(desc->lock);     // RELEASE

interrupt()
   raw_spin_lock(desc->lock);           // ACQUIRE
   set status to IN_PROGRESS
   raw_spin_unlock(desc->lock);         // RELEASE
   invoke handler()

So anything which the driver set up _before_ request_irq() is visible to
the interrupt handler. No?

>> What happens if an interrupt is raised in the middle like:
>> 
>> smp_store_release(dev->irq_soft_enabled, true)
>> IRQ handler
>> synchronize_irq()

This is bogus. The obvious order of things is:

    dev->ok = false;
    request_irq();

    moar_setup();
    synchronize_irq();  // ACQUIRE + RELEASE
    dev->ok = true;

The reverse operation on teardown:

    dev->ok = false;
    synchronize_irq();  // ACQUIRE + RELEASE

    teardown();

So in both cases a simple check in the handler is sufficient:

handler()
    if (!dev->ok)
    	return;

I'm not understanding what you folks are trying to "fix" here. If any
driver does this in the wrong order, then the driver is broken.

Sure, you can do the same with:

    dev->ok = false;
    request_irq();
    moar_setup();
    smp_wmb();
    dev->ok = true;

for the price of a smp_rmb() in the interrupt handler:

handler()
    if (!dev->ok)
    	return;
    smp_rmb();

but that only works correctly for the setup case, not for
teardown.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-29  7:12                 ` Re: Jason Wang
@ 2022-03-29 14:08                   ` Michael S. Tsirkin
  2022-03-30  2:40                     ` Re: Jason Wang
  0 siblings, 1 reply; 1546+ messages in thread
From: Michael S. Tsirkin @ 2022-03-29 14:08 UTC (permalink / raw)
  To: Jason Wang
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Tue, Mar 29, 2022 at 03:12:14PM +0800, Jason Wang wrote:
> > > > > > And requesting irq commits all memory otherwise all drivers would be
> > > > > > broken,
> > > > >
> > > > > So I think we might talk different issues:
> > > > >
> > > > > 1) Whether request_irq() commits the previous setups, I think the
> > > > > answer is yes, since the spin_unlock of desc->lock (release) can
> > > > > guarantee this though there seems no documentation around
> > > > > request_irq() to say this.
> > > > >
> > > > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > > > > using smp_wmb() before the request_irq().
> > > > >
> > > > > And even if write is ordered we still need read to be ordered to be
> > > > > paired with that.
> >
> > IMO it synchronizes with the CPU to which irq is
> > delivered. Otherwise basically all drivers would be broken,
> > wouldn't they be?
> 
> I guess it's because most drivers don't care much about buggy or
> malicious devices.  And most devices may require an extra step to
> enable the device IRQ after request_irq(). Or it's the responsibility
> of the driver to do the synchronization.

It is true that the use-case of malicious devices is somewhat boutique.
But I think most drivers do want their hotplug routines to be
robust, yes.

> > I don't know whether it's correct on all platforms, but if not
> > we need to fix request_irq.
> >
> > > > >
> > > > > > if it doesn't it just needs to be fixed, not worked around in
> > > > > > virtio.
> > > > >
> > > > > 2) virtio drivers might do a lot of setups between request_irq() and
> > > > > virtio_device_ready():
> > > > >
> > > > > request_irq()
> > > > > driver specific setups
> > > > > virtio_device_ready()
> > > > >
> > > > > CPU 0 probe) request_irq()
> > > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > > CPU 0 probe) driver specific setups
> > > > > CPU 0 probe) smp_store_release(intr_soft_enabled, true), commit the setups
> > > > > CPU 1 IRQ handler) read irq_soft_enable as true
> > > > > CPU 1 IRQ handler) use the uninitialized variable
> > > > >
> > > > > Thanks
> > > >
> > > >
> > > > As I said, virtio_device_ready needs to do synchronize_irq.
> > > > That will guarantee all setup is visible to the specific IRQ,
> > >
> > > Only the interrupt after synchronize_irq() returns.
> >
> > Anything else is a buggy device though.
> 
> Yes, but the goal of this patch is to prevent the possible attack from
> buggy(malicious) devices.

Right. However, if the driver of a *buggy* device somehow sees driver_ok =
false even though it's actually initialized, that is not a deal breaker,
as it does not open us up to an attack.

> >
> > > >this
> > > > is what it's point is.
> > >
> > > What happens if an interrupt is raised in the middle like:
> > >
> > > smp_store_release(dev->irq_soft_enabled, true)
> > > IRQ handler
> > > synchronize_irq()
> > >
> > > If we don't enforce a reading order, the IRQ handler may still see the
> > > uninitialized variable.
> > >
> > > Thanks
> >
> > IMHO variables should be initialized before request_irq
> > to a value meaning "not a valid interrupt".
> > Specifically driver_ok = false.
> > Handler in the scenario you describe will then see !driver_ok
> > and exit immediately.
> 
> So just to make sure we're on the same page.
> 
> 1) virtio_reset_device() will set the driver_ok to false;
> 2) virtio_device_ready() will set the driver_ok to true
> 
> So virtio drivers often do:
> 
> 1) virtio_reset_device()
> 2) find_vqs() which will call request_irq()
> 3) other driver specific setups
> 4) virtio_device_ready()
> 
> In virtio_device_ready(), the patch currently performs the following:
> 
> smp_store_release(driver_ok, true);
> set_status(DRIVER_OK);
> 
> Per your suggestion, we add synchronize_irq() after
> smp_store_release(), so we have:
> 
> smp_store_release(driver_ok, true);
> synchronize_irq();
> set_status(DRIVER_OK);
> 
> Suppose an interrupt is raised before the synchronize_irq(). If we do:
> 
> if (READ_ONCE(driver_ok)) {
>       vq->callback()
> }
> 
> It will see driver_ok as true, but how can we make sure
> vq->callback sees the driver-specific setups in (3) above?
> 
> And an example is virtio_scsi():
> 
> virtio_reset_device()
> virtscsi_probe()
>     virtscsi_init()
>         virtio_find_vqs()
>         ...
>         virtscsi_init_vq(&vscsi->event_vq, vqs[1])
>     ....
>     virtio_device_ready()
> 
> In virtscsi_event_done():
> 
> virtscsi_event_done():
>     virtscsi_vq_done(vscsi, &vscsi->event_vq, ...);
> 
> We need to make sure virtscsi_event_done() reads driver_ok before reading vscsi->event_vq.
> 
> Thanks


See the response by Thomas. A simple if (!dev->driver_ok) check should be
enough; it's all under a lock.

> >
> >
> > > >
> > > >
> > > > > >
> > > > > >
> > > > > > > >
> > > > > > > > > We use smp_store_relase()
> > > > > > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > > > > > which surprises me.
> > > > > > > > > >
> > > > > > > > > > CC Paul to help make sure I'm right.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > > > > > > > hardening is disabled by default.
> > > > > > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > > > > > are any buffers in any queues?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > > > > > >
> > > > > > > > > > > "
> > > > > > > > > > >
> > > > > > > > > > >     This change will also benefit old hypervisors (before 2009)
> > > > > > > > > > >     that send interrupts without checking DRIVER_OK: previously,
> > > > > > > > > > >     the callback could race with driver-specific initialization.
> > > > > > > > > > > "
> > > > > > > > > > >
> > > > > > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > This is only for config interrupt.
> > > > > > > > >
> > > > > > > > > Ok.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > > > > > expensive.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > > >
> > > > > > > > > > > > > ---
> > > > > > > > > > > > >   drivers/virtio/virtio.c       | 19 +++++++++++++++++++
> > > > > > > > > > > > >   drivers/virtio/virtio_ring.c  |  9 ++++++++-
> > > > > > > > > > > > >   include/linux/virtio.h        |  4 ++++
> > > > > > > > > > > > >   include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > > > > >   4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > > > > > >
> > > > > > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > > > > >   #include <linux/of.h>
> > > > > > > > > > > > >   #include <uapi/linux/virtio_ids.h>
> > > > > > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > > > > > +
> > > > > > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > > > > > +          "Disalbe IRQ software processing when it is not expected");
> > > > > > > > > > > > > +
> > > > > > > > > > > > >   /* Unique numbering for virtio devices. */
> > > > > > > > > > > > >   static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > > > > >    * */
> > > > > > > > > > > > >   void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > > > > >   {
> > > > > > > > > > > > > + /*
> > > > > > > > > > > > > +  * The below synchronize_rcu() guarantees that any
> > > > > > > > > > > > > +  * interrupt for this line arriving after
> > > > > > > > > > > > > +  * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > > > > > +  * irq_soft_enabled == false.
> > > > > > > > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > > > > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > > > > > > > though it's most likely is ...
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > According to the comment above tree RCU version of synchronize_rcu():
> > > > > > > > > > >
> > > > > > > > > > > """
> > > > > > > > > > >
> > > > > > > > > > >  * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > > > > >  * and rcu_read_unlock(), and may be nested.  In addition, but only in
> > > > > > > > > > >  * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > > > > >  * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > > > > >  * sections.  This includes hardware interrupt handlers, softirq handlers,
> > > > > > > > > > >  * and NMI handlers.
> > > > > > > > > > > """
> > > > > > > > > > >
> > > > > > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > > > > > >
> > > > > > > > > > > And it has a comment explaining the barrier:
> > > > > > > > > > >
> > > > > > > > > > > """
> > > > > > > > > > >
> > > > > > > > > > >  * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > > > > >  * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > > > > >  * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > > > > >  * the end of its last RCU read-side critical section whose beginning
> > > > > > > > > > >  * preceded the call to synchronize_rcu().  In addition, each CPU having
> > > > > > > > > > > """
> > > > > > > > > > >
> > > > > > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need
> > > > > > > > > > > the barrier: if the interrupt comes after the WRITE_ONCE() it will see
> > > > > > > > > > > irq_soft_enabled as false.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > You are right. So then
> > > > > > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > > > > >    READ_ONCE should do.
> > > > > > > > >
> > > > > > > > > See above.
> > > > > > > > >
> > > > > > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Yes, but it requires a config op since the IRQ knowledge is transport-specific.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > +  */
> > > > > > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > > > > +
> > > > > > > > > > > > >           dev->config->reset(dev);
> > > > > > > > > > > > >   }
> > > > > > > > > > > > >   EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Ok.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > > > > >           spin_lock_init(&dev->config_lock);
> > > > > > > > > > > > >           dev->config_enabled = false;
> > > > > > > > > > > > >           dev->config_change_pending = false;
> > > > > > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > > > > > +
> > > > > > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > > > > > +         dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > > > > >           /* We always start by resetting the device, in case a previous
> > > > > > > > > > > > >            * driver messed it up.  This also tests that code path a little. */
> > > > > > > > > > > > One of the points of hardening is that it's also helpful for buggy
> > > > > > > > > > > > devices. This flag defeats the purpose.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Do you mean:
> > > > > > > > > > >
> > > > > > > > > > > 1) we need something like config_enable? This seems hard to implement
> > > > > > > > > > > without obvious overhead, mainly the synchronization with the
> > > > > > > > > > > interrupt handlers
> > > > > > > > > >
> > > > > > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > > > > > users at the moment, even less than probe.
> > > > > > > > >
> > > > > > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > > > > > in virtio_device_ready() and synchronize with the IRQ handlers using a
> > > > > > > > > spinlock or similar.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > > > > > for old hypervisors
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> > > > > > > > >
> > > > > > > > > Probably not; we have devices that accept random inputs from outside:
> > > > > > > > > net, console, input, etc. I've done a round of audits of the QEMU
> > > > > > > > > code, and it all looks fine since day 0.
> > > > > > > > >
> > > > > > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > > > > > And then add_buf can actually test it and BUG_ON if not there  (at least
> > > > > > > > > > in the debug build).
> > > > > > > > >
> > > > > > > > > This looks like a hardening of the driver in the core instead of the
> > > > > > > > > device. I think it can be done but in a separate series.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > And going down from there, how about we cache status in the
> > > > > > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > > > > > speeding boot up a tiny bit.
> > > > > > > > >
> > > > > > > > > I don't fully understand here; the spec actually requires the status
> > > > > > > > > to be read back for validation in many cases.
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > > > > >           return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > > > > >   }
> > > > > > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > > > > >   {
> > > > > > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > > > > >           struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > > > > > +         dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > > > > > +         return IRQ_NONE;
> > > > > > > > > > > > > + }
> > > > > > > > > > > > > +
> > > > > > > > > > > > >           if (!more_used(vq)) {
> > > > > > > > > > > > >                   pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > > > > >                   return IRQ_NONE;
> > > > > > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > > > > >    * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > > > > >    * @config_enabled: configuration change reporting enabled
> > > > > > > > > > > > >    * @config_change_pending: configuration change reported while disabled
> > > > > > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > > > > > >    * @config_lock: protects configuration change reporting
> > > > > > > > > > > > >    * @dev: underlying device.
> > > > > > > > > > > > >    * @id: the device type identification (used to match it with a driver).
> > > > > > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > > > > > >           bool failed;
> > > > > > > > > > > > >           bool config_enabled;
> > > > > > > > > > > > >           bool config_change_pending;
> > > > > > > > > > > > > + bool irq_soft_check;
> > > > > > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > > > > > >           spinlock_t config_lock;
> > > > > > > > > > > > >           spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > > > > > >           struct device dev;
> > > > > > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > > > > > >           return __virtio_test_bit(vdev, fbit);
> > > > > > > > > > > > >   }
> > > > > > > > > > > > > +/*
> > > > > > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > > > > > + * @vdev: the device
> > > > > > > > > > > > > + */
> > > > > > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > > > > > +{
> > > > > > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > > > > > +         return true;
> > > > > > > > > > > > > +
> > > > > > > > > > > > > + /*
> > > > > > > > > > > > > +  * Read irq_soft_enabled before reading other device specific
> > > > > > > > > > > > > +  * data. Paried with smp_store_relase() in
> > > > > > > > > > > > paired
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Will fix.
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > +  * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > > > > > +  * virtio_reset_device().
> > > > > > > > > > > > > +  */
> > > > > > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > > > > > +}
> > > > > > > > > > > > > +
> > > > > > > > > > > > >   /**
> > > > > > > > > > > > >    * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > > > > > >    * @vdev: the device
> > > > > > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > > > > >           if (dev->config->enable_cbs)
> > > > > > > > > > > > >                     dev->config->enable_cbs(dev);
> > > > > > > > > > > > > + /*
> > > > > > > > > > > > > +  * Commit the driver setup before enabling the virtqueue
> > > > > > > > > > > > > > +  * callbacks. Paired with smp_load_acquire() in
> > > > > > > > > > > > > +  * virtio_irq_soft_enabled()
> > > > > > > > > > > > > +  */
> > > > > > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > > > > > +
> > > > > > > > > > > > >           BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > >           dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > >   }
> > > > > > > > > > > > > --
> > > > > > > > > > > > > 2.25.1
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> >

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-29  8:35                 ` Re: Thomas Gleixner
@ 2022-03-29 14:37                   ` Michael S. Tsirkin
  2022-03-29 18:13                     ` Re: Thomas Gleixner
  2022-04-12  6:55                   ` Re: Michael S. Tsirkin
  1 sibling, 1 reply; 1546+ messages in thread
From: Michael S. Tsirkin @ 2022-03-29 14:37 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization

On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> On Mon, Mar 28 2022 at 06:40, Michael S. Tsirkin wrote:
> > On Mon, Mar 28, 2022 at 02:18:22PM +0800, Jason Wang wrote:
> >> > > So I think we might talk different issues:
> >> > >
> >> > > 1) Whether request_irq() commits the previous setups, I think the
> >> > > answer is yes, since the spin_unlock of desc->lock (release) can
> >> > > guarantee this though there seems no documentation around
> >> > > request_irq() to say this.
> >> > >
> >> > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> >> > > using smp_wmb() before the request_irq().
> 
> That's a complete bogus example especially as there is not a single
> smp_rmb() which pairs with the smp_wmb().
> 
> >> > > And even if write is ordered we still need read to be ordered to be
> >> > > paired with that.
> >
> > IMO it synchronizes with the CPU to which irq is
> > delivered. Otherwise basically all drivers would be broken,
> > wouldn't they be?
> > I don't know whether it's correct on all platforms, but if not
> > we need to fix request_irq.
> 
> There is nothing to fix:
> 
> request_irq()
>    raw_spin_lock_irq(desc->lock);       // ACQUIRE
>    ....
>    raw_spin_unlock_irq(desc->lock);     // RELEASE
> 
> interrupt()
>    raw_spin_lock(desc->lock);           // ACQUIRE
>    set status to IN_PROGRESS
>    raw_spin_unlock(desc->lock);         // RELEASE
>    invoke handler()
> 
> So anything which the driver set up _before_ request_irq() is visible to
> the interrupt handler. No?
> >> What happens if an interrupt is raised in the middle like:
> >> 
> >> smp_store_release(dev->irq_soft_enabled, true)
> >> IRQ handler
> >> synchronize_irq()
> 
> This is bogus. The obvious order of things is:
> 
>     dev->ok = false;
>     request_irq();
> 
>     moar_setup();
>     synchronize_irq();  // ACQUIRE + RELEASE
>     dev->ok = true;
> 
> The reverse operation on teardown:
> 
>     dev->ok = false;
>     synchronize_irq();  // ACQUIRE + RELEASE
> 
>     teardown();
> 
> So in both cases a simple check in the handler is sufficient:
> 
> handler()
>     if (!dev->ok)
>     	return;


Thanks a lot for the analysis Thomas. This is more or less what I was
thinking.

> 
> I'm not understanding what you folks are trying to "fix" here.

We are trying to fix the driver since at the moment it does not
have the dev->ok flag at all.


And I suspect virtio is not alone in that.
So it would have been nice if there was a standard flag
replacing the driver-specific dev->ok above, and ideally
would also handle the case of an interrupt triggering
too early by deferring the interrupt until the flag is set.

And in fact, it does kind of exist: IRQF_NO_AUTOEN, and you would call
enable_irq instead of dev->ok = true, except
- it doesn't work with affinity managed IRQs
- it does not work with shared IRQs

So using dev->ok as you propose above seems better at this point.

> If any
> driver does this in the wrong order, then the driver is broken.

I agree, however:
$ git grep synchronize_irq `git grep -l request_irq drivers/net/`|wc -l
113
$ git grep -l request_irq drivers/net/|wc -l
397

I suspect there are more drivers which in theory need the
synchronize_irq dance but in practice do not execute it.


> Sure, you can do the same with:
> 
>     dev->ok = false;
>     request_irq();
>     moar_setup();
>     smp_wmb();
>     dev->ok = true;
> 
> for the price of a smp_rmb() in the interrupt handler:
> 
> handler()
>     if (!dev->ok)
>     	return;
>     smp_rmb();
> 
> but that's only working for the setup case correctly and not for
> teardown.
> 
> Thanks,
> 
>         tglx

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-29 14:37                   ` Re: Michael S. Tsirkin
@ 2022-03-29 18:13                     ` Thomas Gleixner
  2022-03-29 22:04                       ` Re: Michael S. Tsirkin
  0 siblings, 1 reply; 1546+ messages in thread
From: Thomas Gleixner @ 2022-03-29 18:13 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization

On Tue, Mar 29 2022 at 10:37, Michael S. Tsirkin wrote:
> On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> We are trying to fix the driver since at the moment it does not
> have the dev->ok flag at all.
>
> And I suspect virtio is not alone in that.
> So it would have been nice if there was a standard flag
> replacing the driver-specific dev->ok above, and ideally
> would also handle the case of an interrupt triggering
> too early by deferring the interrupt until the flag is set.
>
> And in fact, it does kind of exist: IRQF_NO_AUTOEN, and you would call
> enable_irq instead of dev->ok = true, except
> - it doesn't work with affinity managed IRQs
> - it does not work with shared IRQs
>
> So using dev->ok as you propose above seems better at this point.

Unless there is a big enough amount of drivers which could make use of a
generic mechanism for that.

>> If any driver does this in the wrong order, then the driver is
>> broken.
> 
> I agree, however:
> $ git grep synchronize_irq `git grep -l request_irq drivers/net/`|wc -l
> 113
> $ git grep -l request_irq drivers/net/|wc -l
> 397
>
> I suspect there are more drivers which in theory need the
> synchronize_irq dance but in practice do not execute it.

That really depends on when the driver requests the interrupt, when
it actually enables the interrupt in the device itself and how the
interrupt service routine works.

So just doing that grep dance does not tell much. You really have to do
a case by case analysis.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-29 18:13                     ` Re: Thomas Gleixner
@ 2022-03-29 22:04                       ` Michael S. Tsirkin
  2022-03-30  2:38                         ` Re: Jason Wang
  0 siblings, 1 reply; 1546+ messages in thread
From: Michael S. Tsirkin @ 2022-03-29 22:04 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization

On Tue, Mar 29, 2022 at 08:13:57PM +0200, Thomas Gleixner wrote:
> On Tue, Mar 29 2022 at 10:37, Michael S. Tsirkin wrote:
> > On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> > We are trying to fix the driver since at the moment it does not
> > have the dev->ok flag at all.
> >
> > And I suspect virtio is not alone in that.
> > So it would have been nice if there was a standard flag
> > replacing the driver-specific dev->ok above, and ideally
> > would also handle the case of an interrupt triggering
> > too early by deferring the interrupt until the flag is set.
> >
> > And in fact, it does kind of exist: IRQF_NO_AUTOEN, and you would call
> > enable_irq instead of dev->ok = true, except
> > - it doesn't work with affinity managed IRQs
> > - it does not work with shared IRQs
> >
> > So using dev->ok as you propose above seems better at this point.
> 
> Unless there is a big enough amount of drivers which could make use of a
> generic mechanism for that.
> 
> >> If any driver does this in the wrong order, then the driver is
> >> broken.
> > 
> > I agree, however:
> > $ git grep synchronize_irq `git grep -l request_irq drivers/net/`|wc -l
> > 113
> > $ git grep -l request_irq drivers/net/|wc -l
> > 397
> >
> > I suspect there are more drivers which in theory need the
> > synchronize_irq dance but in practice do not execute it.
> 
> That really depends on when the driver requests the interrupt, when
> it actually enables the interrupt in the device itself

This last point does not matter since we are talking about protecting
against buggy/malicious devices. They can inject the interrupt anyway
even if the driver did not configure it.

> and how the
> interrupt service routine works.
> 
> So just doing that grep dance does not tell much. You really have to do
> a case by case analysis.
> 
> Thanks,
> 
>         tglx


I agree. In fact, at least for network drivers the standard approach is to
request interrupts in the open call; virtio-net is unusual
in doing it in probe. We should consider changing that.
Jason?

-- 
MST

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-29 22:04                       ` Re: Michael S. Tsirkin
@ 2022-03-30  2:38                         ` Jason Wang
  2022-03-30  5:09                           ` Re: Michael S. Tsirkin
  0 siblings, 1 reply; 1546+ messages in thread
From: Jason Wang @ 2022-03-30  2:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Wed, Mar 30, 2022 at 6:04 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Mar 29, 2022 at 08:13:57PM +0200, Thomas Gleixner wrote:
> > On Tue, Mar 29 2022 at 10:37, Michael S. Tsirkin wrote:
> > > On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> > > We are trying to fix the driver since at the moment it does not
> > > have the dev->ok flag at all.
> > >
> > > And I suspect virtio is not alone in that.
> > > So it would have been nice if there was a standard flag
> > > replacing the driver-specific dev->ok above, and ideally
> > > would also handle the case of an interrupt triggering
> > > too early by deferring the interrupt until the flag is set.
> > >
> > > And in fact, it does kind of exist: IRQF_NO_AUTOEN, and you would call
> > > enable_irq instead of dev->ok = true, except
> > > - it doesn't work with affinity managed IRQs
> > > - it does not work with shared IRQs
> > >
> > > So using dev->ok as you propose above seems better at this point.
> >
> > Unless there is a big enough amount of drivers which could make use of a
> > generic mechanism for that.
> >
> > >> If any driver does this in the wrong order, then the driver is
> > >> broken.
> > >
> > > I agree, however:
> > > $ git grep synchronize_irq `git grep -l request_irq drivers/net/`|wc -l
> > > 113
> > > $ git grep -l request_irq drivers/net/|wc -l
> > > 397
> > >
> > > I suspect there are more drivers which in theory need the
> > > synchronize_irq dance but in practice do not execute it.
> >
> > That really depends on when the driver requests the interrupt, when
> > it actually enables the interrupt in the device itself
>
> This last point does not matter since we are talking about protecting
> against buggy/malicious devices. They can inject the interrupt anyway
> even if the driver did not configure it.
>
> > and how the
> > interrupt service routine works.
> >
> > So just doing that grep dance does not tell much. You really have to do
> > a case by case analysis.
> >
> > Thanks,
> >
> >         tglx
>
>
> I agree. In fact, at least for network drivers the standard approach is to
> request interrupts in the open call; virtio-net is unusual
> in doing it in probe. We should consider changing that.
> Jason?

This probably works only for virtio-net, and it looks non-trivial
since we don't have a specific core API for requesting interrupts.

Thanks

>
> --
> MST
>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-29 14:08                   ` Re: Michael S. Tsirkin
@ 2022-03-30  2:40                     ` Jason Wang
  2022-03-30  5:14                       ` Re: Michael S. Tsirkin
  0 siblings, 1 reply; 1546+ messages in thread
From: Jason Wang @ 2022-03-30  2:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Tue, Mar 29, 2022 at 10:09 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Mar 29, 2022 at 03:12:14PM +0800, Jason Wang wrote:
> > > > > > > And requesting irq commits all memory otherwise all drivers would be
> > > > > > > broken,
> > > > > >
> > > > > > So I think we might talk different issues:
> > > > > >
> > > > > > 1) Whether request_irq() commits the previous setups, I think the
> > > > > > answer is yes, since the spin_unlock of desc->lock (release) can
> > > > > > guarantee this though there seems no documentation around
> > > > > > request_irq() to say this.
> > > > > >
> > > > > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > > > > > using smp_wmb() before the request_irq().
> > > > > >
> > > > > > And even if write is ordered we still need read to be ordered to be
> > > > > > paired with that.
> > >
> > > IMO it synchronizes with the CPU to which irq is
> > > delivered. Otherwise basically all drivers would be broken,
> > > wouldn't they be?
> >
> > I guess it's because most of the drivers don't care much about the
> > buggy/malicious device.  And most of the devices may require an extra
> > step to enable the device IRQ after request_irq(). Or it's the driver's
> > responsibility to do the synchronization.
>
> It is true that the use-case of malicious devices is somewhat boutique.
> But I think most drivers do want to have their hotplug routines to be
> robust, yes.
>
> > > I don't know whether it's correct on all platforms, but if not
> > > we need to fix request_irq.
> > >
> > > > > >
> > > > > > > if it doesn't it just needs to be fixed, not worked around in
> > > > > > > virtio.
> > > > > >
> > > > > > 2) virtio drivers might do a lot of setups between request_irq() and
> > > > > > virtio_device_ready():
> > > > > >
> > > > > > request_irq()
> > > > > > driver specific setups
> > > > > > virtio_device_ready()
> > > > > >
> > > > > > CPU 0 probe) request_irq()
> > > > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > > > CPU 0 probe) driver specific setups
> > > > > > CPU 0 probe) smp_store_release(intr_soft_enabled, true), commit the setups
> > > > > > CPU 1 IRQ handler) read irq_soft_enable as true
> > > > > > CPU 1 IRQ handler) use the uninitialized variable
> > > > > >
> > > > > > Thanks
> > > > >
> > > > >
> > > > > As I said, virtio_device_ready needs to do synchronize_irq.
> > > > > That will guarantee all setup is visible to the specific IRQ,
> > > >
> > > > Only the interrupt after synchronize_irq() returns.
> > >
> > > Anything else is a buggy device though.
> >
> > Yes, but the goal of this patch is to prevent the possible attack from
> > buggy(malicious) devices.
>
> Right. However if a driver of a *buggy* device somehow sees driver_ok =
> false even though it's actually initialized, that is not a deal breaker
> as that does not open us up to an attack.
>
> > >
> > > > >this
> > > > > is what it's point is.
> > > >
> > > > What happens if an interrupt is raised in the middle like:
> > > >
> > > > smp_store_release(dev->irq_soft_enabled, true)
> > > > IRQ handler
> > > > synchronize_irq()
> > > >
> > > > If we don't enforce a reading order, the IRQ handler may still see the
> > > > uninitialized variable.
> > > >
> > > > Thanks
> > >
> > > IMHO variables should be initialized before request_irq
> > > to a value meaning "not a valid interrupt".
> > > Specifically driver_ok = false.
> > > Handler in the scenario you describe will then see !driver_ok
> > > and exit immediately.
> >
> > So just to make sure we're on the same page.
> >
> > 1) virtio_reset_device() will set the driver_ok to false;
> > 2) virtio_device_ready() will set the driver_ok to true
> >
> > So for virtio drivers, it often did:
> >
> > 1) virtio_reset_device()
> > 2) find_vqs() which will call request_irq()
> > 3) other driver specific setups
> > 4) virtio_device_ready()
> >
> > In virtio_device_ready(), the patch perform the following currently:
> >
> > smp_store_release(driver_ok, true);
> > set_status(DRIVER_OK);
> >
> > Per your suggestion, to add synchronize_irq() after
> > smp_store_release() so we had
> >
> > smp_store_release(driver_ok, true);
> > synchronize_irq()
> > set_status(DRIVER_OK)
> >
> > Suppose there's an interrupt raised before the synchronize_irq(); if we do:
> >
> > if (READ_ONCE(driver_ok)) {
> >       vq->callback()
> > }
> >
> > It will see the driver_ok as true but how can we make sure
> > vq->callback sees the driver-specific setups in (3) above?
> >
> > And an example is virtio_scsi():
> >
> > virtio_reset_device()
> > virtscsi_probe()
> >     virtscsi_init()
> >         virtio_find_vqs()
> >         ...
> >         virtscsi_init_vq(&vscsi->event_vq, vqs[1])
> >     ....
> >     virtio_device_ready()
> >
> > In virtscsi_event_done():
> >
> > virtscsi_event_done():
> >     virtscsi_vq_done(vscsi, &vscsi->event_vq, ...);
> >
> > We need to make sure virtscsi_event_done() reads driver_ok before reading vscsi->event_vq.
> >
> > Thanks
>
>
> See response by Thomas. A simple if (!dev->driver_ok) should be enough,
> it's all under a lock.

It's ordered through ACQUIRE+RELEASE, actually, since the IRQ handler is
not running under the lock.

Another question, for synchronize_irq() do you prefer

1) transport specific callbacks
or
2) a simple synchronize_rcu()

Thanks

>
> > >
> > >
> > > > >
> > > > >
> > > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > > > We use smp_store_release()
> > > > > > > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > > > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > > > > > > which surprises me.
> > > > > > > > > > >
> > > > > > > > > > > CC Paul to help make sure I'm right.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > > > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > > > > > > > > hardening is disabled by default.
> > > > > > > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > > > > > > are any buffers in any queues?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > > > > > > >
> > > > > > > > > > > > "
> > > > > > > > > > > >
> > > > > > > > > > > >     This change will also benefit old hypervisors (before 2009)
> > > > > > > > > > > >     that send interrupts without checking DRIVER_OK: previously,
> > > > > > > > > > > >     the callback could race with driver-specific initialization.
> > > > > > > > > > > > "
> > > > > > > > > > > >
> > > > > > > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > This is only for config interrupt.
> > > > > > > > > >
> > > > > > > > > > Ok.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > > > > > > expensive.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > > > >
> > > > > > > > > > > > > > ---
> > > > > > > > > > > > > >   drivers/virtio/virtio.c       | 19 +++++++++++++++++++
> > > > > > > > > > > > > >   drivers/virtio/virtio_ring.c  |  9 ++++++++-
> > > > > > > > > > > > > >   include/linux/virtio.h        |  4 ++++
> > > > > > > > > > > > > >   include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > > > > > >   4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > > > > > >   #include <linux/of.h>
> > > > > > > > > > > > > >   #include <uapi/linux/virtio_ids.h>
> > > > > > > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > > > > > > +          "Disalbe IRQ software processing when it is not expected");
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > >   /* Unique numbering for virtio devices. */
> > > > > > > > > > > > > >   static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > > > > > >    * */
> > > > > > > > > > > > > >   void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > > > > > >   {
> > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > +  * The below synchronize_rcu() guarantees that any
> > > > > > > > > > > > > > +  * interrupt for this line arriving after
> > > > > > > > > > > > > > +  * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > > > > > > +  * irq_soft_enabled == false.
> > > > > > > > > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > > > > > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > > > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > > > > > > > > though it's most likely is ...
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > According to the comment above tree RCU version of synchronize_rcu():
> > > > > > > > > > > >
> > > > > > > > > > > > """
> > > > > > > > > > > >
> > > > > > > > > > > >  * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > > > > > >  * and rcu_read_unlock(), and may be nested.  In addition, but only in
> > > > > > > > > > > >  * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > > > > > >  * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > > > > > >  * sections.  This includes hardware interrupt handlers, softirq handlers,
> > > > > > > > > > > >  * and NMI handlers.
> > > > > > > > > > > > """
> > > > > > > > > > > >
> > > > > > > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > > > > > > >
> > > > > > > > > > > > And it has the comment for explain the barrier:
> > > > > > > > > > > >
> > > > > > > > > > > > """
> > > > > > > > > > > >
> > > > > > > > > > > >  * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > > > > > >  * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > > > > > >  * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > > > > > >  * the end of its last RCU read-side critical section whose beginning
> > > > > > > > > > > >  * preceded the call to synchronize_rcu().  In addition, each CPU having
> > > > > > > > > > > > """
> > > > > > > > > > > >
> > > > > > > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > > > > > > > barrier, if the interrupt come after WRITE_ONCE() it will see the
> > > > > > > > > > > > irq_soft_enabled as false.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > You are right. So then
> > > > > > > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > > > > > >    READ_ONCE should do.
> > > > > > > > > >
> > > > > > > > > > See above.
> > > > > > > > > >
> > > > > > > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > +  */
> > > > > > > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > >           dev->config->reset(dev);
> > > > > > > > > > > > > >   }
> > > > > > > > > > > > > >   EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Ok.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > > > > > >           spin_lock_init(&dev->config_lock);
> > > > > > > > > > > > > >           dev->config_enabled = false;
> > > > > > > > > > > > > >           dev->config_change_pending = false;
> > > > > > > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > > > > > > +         dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > > > > > >           /* We always start by resetting the device, in case a previous
> > > > > > > > > > > > > >            * driver messed it up.  This also tests that code path a little. */
> > > > > > > > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > > > > > > > devices. this flag defeats the purpose.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Do you mean:
> > > > > > > > > > > >
> > > > > > > > > > > > 1) we need something like config_enable? This seems not easy to be
> > > > > > > > > > > > implemented without obvious overhead, mainly the synchronize with the
> > > > > > > > > > > > interrupt handlers
> > > > > > > > > > >
> > > > > > > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > > > > > > users at the moment, even less than probe.
> > > > > > > > > >
> > > > > > > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > > > > > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > > > > > > > > spinlock or others.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > > > > > > for old hypervisors
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> > > > > > > > > >
> > > > > > > > > > Probably not, we have devices that accept random inputs from outside,
> > > > > > > > > > net, console, input etc. I've done a round of audits of the Qemu
> > > > > > > > > > codes. They look all fine since day0.
> > > > > > > > > >
> > > > > > > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > > > > > > And then add_buf can actually test it and BUG_ON if not there  (at least
> > > > > > > > > > > in the debug build).
> > > > > > > > > >
> > > > > > > > > > This looks like a hardening of the driver in the core instead of the
> > > > > > > > > > device. I think it can be done but in a separate series.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > And going down from there, how about we cache status in the
> > > > > > > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > > > > > > speeding boot up a tiny bit.
> > > > > > > > > >
> > > > > > > > > > I don't fully understand here, actually spec requires status to be
> > > > > > > > > > read back for validation in many cases.
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > > > > > >           return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > > > > > >   {
> > > > > > > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > > > > > >           struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > > > > > > +         dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > > > > > > +         return IRQ_NONE;
> > > > > > > > > > > > > > + }
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > >           if (!more_used(vq)) {
> > > > > > > > > > > > > >                   pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > > > > > >                   return IRQ_NONE;
> > > > > > > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > > > > > >    * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > > > > > >    * @config_enabled: configuration change reporting enabled
> > > > > > > > > > > > > >    * @config_change_pending: configuration change reported while disabled
> > > > > > > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > > > > > > >    * @config_lock: protects configuration change reporting
> > > > > > > > > > > > > >    * @dev: underlying device.
> > > > > > > > > > > > > >    * @id: the device type identification (used to match it with a driver).
> > > > > > > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > > > > > > >           bool failed;
> > > > > > > > > > > > > >           bool config_enabled;
> > > > > > > > > > > > > >           bool config_change_pending;
> > > > > > > > > > > > > > + bool irq_soft_check;
> > > > > > > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > > > > > > >           spinlock_t config_lock;
> > > > > > > > > > > > > >           spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > > > > > > >           struct device dev;
> > > > > > > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > > > > > > >           return __virtio_test_bit(vdev, fbit);
> > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > +/*
> > > > > > > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > > > > > > + * @vdev: the device
> > > > > > > > > > > > > > + */
> > > > > > > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > > > > > > +{
> > > > > > > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > > > > > > +         return true;
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > +  * Read irq_soft_enabled before reading other device specific
> > > > > > > > > > > > > > +  * data. Paried with smp_store_relase() in
> > > > > > > > > > > > > paired
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Will fix.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > +  * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > > > > > > +  * virtio_reset_device().
> > > > > > > > > > > > > > +  */
> > > > > > > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > > > > > > +}
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > >   /**
> > > > > > > > > > > > > >    * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > > > > > > >    * @vdev: the device
> > > > > > > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > > > > > >           if (dev->config->enable_cbs)
> > > > > > > > > > > > > >                     dev->config->enable_cbs(dev);
> > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > +  * Commit the driver setup before enabling the virtqueue
> > > > > > > > > > > > > > +  * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > > > > > > > +  * virtio_irq_soft_enabled()
> > > > > > > > > > > > > > +  */
> > > > > > > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > >           BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > > >           dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > 2.25.1
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-30  2:38                         ` Re: Jason Wang
@ 2022-03-30  5:09                           ` Michael S. Tsirkin
  2022-03-30  5:53                             ` Re: Jason Wang
  0 siblings, 1 reply; 1546+ messages in thread
From: Michael S. Tsirkin @ 2022-03-30  5:09 UTC (permalink / raw)
  To: Jason Wang
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Wed, Mar 30, 2022 at 10:38:06AM +0800, Jason Wang wrote:
> On Wed, Mar 30, 2022 at 6:04 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Mar 29, 2022 at 08:13:57PM +0200, Thomas Gleixner wrote:
> > > On Tue, Mar 29 2022 at 10:37, Michael S. Tsirkin wrote:
> > > > On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> > > > We are trying to fix the driver since at the moment it does not
> > > > have the dev->ok flag at all.
> > > >
> > > > And I suspect virtio is not alone in that.
> > > > So it would have been nice if there was a standard flag
> > > > replacing the driver-specific dev->ok above, and ideally
> > > > would also handle the case of an interrupt triggering
> > > > too early by deferring the interrupt until the flag is set.
> > > >
> > > > And in fact, it does kind of exist: IRQF_NO_AUTOEN, and you would call
> > > > enable_irq instead of dev->ok = true, except
> > > > - it doesn't work with affinity managed IRQs
> > > > - it does not work with shared IRQs
> > > >
> > > > So using dev->ok as you propose above seems better at this point.
> > >
> > > Unless there is a big enough amount of drivers which could make use of a
> > > generic mechanism for that.
> > >
> > > >> If any driver does this in the wrong order, then the driver is
> > > >> broken.
> > > >
> > > > I agree, however:
> > > > $ git grep synchronize_irq `git grep -l request_irq drivers/net/`|wc -l
> > > > 113
> > > > $ git grep -l request_irq drivers/net/|wc -l
> > > > 397
> > > >
> > > > I suspect there are more drivers which in theory need the
> > > > synchronize_irq dance but in practice do not execute it.
> > >
> > > That really depends on when the driver requests the interrupt, when
> > > it actually enables the interrupt in the device itself
> >
> > This last point does not matter since we are talking about protecting
> > against buggy/malicious devices. They can inject the interrupt anyway
> > even if driver did not configure it.
> >
> > > and how the
> > > interrupt service routine works.
> > >
> > > So just doing that grep dance does not tell much. You really have to do
> > > a case by case analysis.
> > >
> > > Thanks,
> > >
> > >         tglx
> >
> >
> > I agree. In fact, at least for network the standard approach is to
> > request interrupts in the open call, virtio net is unusual
> > in doing it in probe. We should consider changing that.
> > Jason?
> 
> This probably works only for virtio-net and it looks like not trivial
> since we don't have a specific core API to request interrupts.
> 
> Thanks

We'll need a new API, for sure. E.g.  find vqs with no
callback on probe, and then virtio_request_vq_callbacks separately.

The existing API that specifies callbacks during find vqs
can be used by other drivers.

> >
> > --
> > MST
> >


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-30  2:40                     ` Re: Jason Wang
@ 2022-03-30  5:14                       ` Michael S. Tsirkin
  2022-03-30  5:53                         ` Re: Jason Wang
  0 siblings, 1 reply; 1546+ messages in thread
From: Michael S. Tsirkin @ 2022-03-30  5:14 UTC (permalink / raw)
  To: Jason Wang
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Wed, Mar 30, 2022 at 10:40:59AM +0800, Jason Wang wrote:
> On Tue, Mar 29, 2022 at 10:09 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Mar 29, 2022 at 03:12:14PM +0800, Jason Wang wrote:
> > > > > > > > And requesting irq commits all memory otherwise all drivers would be
> > > > > > > > broken,
> > > > > > >
> > > > > > > So I think we might talk different issues:
> > > > > > >
> > > > > > > 1) Whether request_irq() commits the previous setups, I think the
> > > > > > > answer is yes, since the spin_unlock of desc->lock (release) can
> > > > > > > guarantee this though there seems no documentation around
> > > > > > > request_irq() to say this.
> > > > > > >
> > > > > > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > > > > > > using smp_wmb() before the request_irq().
> > > > > > >
> > > > > > > And even if write is ordered we still need read to be ordered to be
> > > > > > > paired with that.
> > > >
> > > > IMO it synchronizes with the CPU to which irq is
> > > > delivered. Otherwise basically all drivers would be broken,
> > > > wouldn't they be?
> > >
> > > I guess it's because most of the drivers don't care much about the
> > > buggy/malicious device.  And most of the devices may require an extra
> > > step to enable device IRQ after request_irq(). Or it's the charge of
> > > the driver to do the synchronization.
> >
> > It is true that the use-case of malicious devices is somewhat boutique.
> > But I think most drivers do want to have their hotplug routines to be
> > robust, yes.
> >
> > > > I don't know whether it's correct on all platforms, but if not
> > > > we need to fix request_irq.
> > > >
> > > > > > >
> > > > > > > > if it doesn't it just needs to be fixed, not worked around in
> > > > > > > > virtio.
> > > > > > >
> > > > > > > 2) virtio drivers might do a lot of setups between request_irq() and
> > > > > > > virtio_device_ready():
> > > > > > >
> > > > > > > request_irq()
> > > > > > > driver specific setups
> > > > > > > virtio_device_ready()
> > > > > > >
> > > > > > > CPU 0 probe) request_irq()
> > > > > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > > > > CPU 0 probe) driver specific setups
> > > > > > > CPU 0 probe) smp_store_release(intr_soft_enabled, true), commit the setups
> > > > > > > CPU 1 IRQ handler) read irq_soft_enable as true
> > > > > > > CPU 1 IRQ handler) use the uninitialized variable
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > >
> > > > > > As I said, virtio_device_ready needs to do synchronize_irq.
> > > > > > That will guarantee all setup is visible to the specific IRQ,
> > > > >
> > > > > Only the interrupt after synchronize_irq() returns.
> > > >
> > > > Anything else is a buggy device though.
> > >
> > > Yes, but the goal of this patch is to prevent the possible attack from
> > > buggy(malicious) devices.
> >
> > Right. However if a driver of a *buggy* device somehow sees driver_ok =
> > false even though it's actually initialized, that is not a deal breaker
> > as that does not open us up to an attack.
> >
> > > >
> > > > > >this
> > > > > > is what it's point is.
> > > > >
> > > > > What happens if an interrupt is raised in the middle like:
> > > > >
> > > > > smp_store_release(dev->irq_soft_enabled, true)
> > > > > IRQ handler
> > > > > synchornize_irq()
> > > > >
> > > > > If we don't enforce a reading order, the IRQ handler may still see the
> > > > > uninitialized variable.
> > > > >
> > > > > Thanks
> > > >
> > > > IMHO variables should be initialized before request_irq
> > > > to a value meaning "not a valid interrupt".
> > > > Specifically driver_ok = false.
> > > > Handler in the scenario you describe will then see !driver_ok
> > > > and exit immediately.
> > >
> > > So just to make sure we're on the same page.
> > >
> > > 1) virtio_reset_device() will set the driver_ok to false;
> > > 2) virtio_device_ready() will set the driver_ok to true
> > >
> > > So for virtio drivers, it often did:
> > >
> > > 1) virtio_reset_device()
> > > 2) find_vqs() which will call request_irq()
> > > 3) other driver specific setups
> > > 4) virtio_device_ready()
> > >
> > > In virtio_device_ready(), the patch perform the following currently:
> > >
> > > smp_store_release(driver_ok, true);
> > > set_status(DRIVER_OK);
> > >
> > > Per your suggestion, to add synchronize_irq() after
> > > smp_store_release() so we had
> > >
> > > smp_store_release(driver_ok, true);
> > > synchornize_irq()
> > > set_status(DRIVER_OK)
> > >
> > > Suppose there's a interrupt raised before the synchronize_irq(), if we do:
> > >
> > > if (READ_ONCE(driver_ok)) {
> > >       vq->callback()
> > > }
> > >
> > > It will see the driver_ok as true but how can we make sure
> > > vq->callback sees the driver specific setups (3) above?
> > >
> > > And an example is virtio_scsi():
> > >
> > > virtio_reset_device()
> > > virtscsi_probe()
> > >     virtscsi_init()
> > >         virtio_find_vqs()
> > >         ...
> > >         virtscsi_init_vq(&vscsi->event_vq, vqs[1])
> > >     ....
> > >     virtio_device_ready()
> > >
> > > In virtscsi_event_done():
> > >
> > > virtscsi_event_done():
> > >     virtscsi_vq_done(vscsi, &vscsi->event_vq, ...);
> > >
> > > We need to make sure the even_done reads driver_ok before read vscsi->event_vq.
> > >
> > > Thanks
> >
> >
> > See response by Thomas. A simple if (!dev->driver_ok) should be enough,
> > it's all under a lock.
> 
> Ordered through ACQUIRE+RELEASE actually since the irq handler is not
> running under the lock.
> 
> Another question, for synchronize_irq() do you prefer
> 
> 1) transport specific callbacks
> or
> 2) a simple synchronize_rcu()
> 
> Thanks


1) I think, and I'd add a wrapper so we can switch to 2) if we really
want to. But for now, synchronizing the specific irq is obviously designed
to make any changes to memory visible to this irq. That seems cleaner and
easier to understand than memory ordering tricks and relying on side
effects of synchronize_rcu(), even though internally this all boils down
to memory ordering, since memory is what's used to implement locks :).
Not to mention, synchronize_irq() just scales much better from a
performance POV.


> >
> > > >
> > > >
> > > > > >
> > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > We use smp_store_relase()
> > > > > > > > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > > > > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > > > > > > > which surprises me.
> > > > > > > > > > > >
> > > > > > > > > > > > CC Paul to help make sure I'm right.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > > > > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > > > > > > > > > hardening is disabled by default.
> > > > > > > > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > > > > > > > are any buffers in any queues?
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > > > > > > > >
> > > > > > > > > > > > > "
> > > > > > > > > > > > >
> > > > > > > > > > > > >     This change will also benefit old hypervisors (before 2009)
> > > > > > > > > > > > >     that send interrupts without checking DRIVER_OK: previously,
> > > > > > > > > > > > >     the callback could race with driver-specific initialization.
> > > > > > > > > > > > > "
> > > > > > > > > > > > >
> > > > > > > > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > This is only for config interrupt.
> > > > > > > > > > >
> > > > > > > > > > > Ok.
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > > > > > > > expensive.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > >   drivers/virtio/virtio.c       | 19 +++++++++++++++++++
> > > > > > > > > > > > > > >   drivers/virtio/virtio_ring.c  |  9 ++++++++-
> > > > > > > > > > > > > > >   include/linux/virtio.h        |  4 ++++
> > > > > > > > > > > > > > >   include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > > > > > > >   4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > > > > > > >   #include <linux/of.h>
> > > > > > > > > > > > > > >   #include <uapi/linux/virtio_ids.h>
> > > > > > > > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > > > > > > > +          "Disalbe IRQ software processing when it is not expected");
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > >   /* Unique numbering for virtio devices. */
> > > > > > > > > > > > > > >   static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > > > > > > >    * */
> > > > > > > > > > > > > > >   void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > > > > > > >   {
> > > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > > +  * The below synchronize_rcu() guarantees that any
> > > > > > > > > > > > > > > +  * interrupt for this line arriving after
> > > > > > > > > > > > > > > +  * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > > > > > > > +  * irq_soft_enabled == false.
> > > > > > > > > > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > > > > > > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > > > > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > > > > > > > > > though it's most likely is ...
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > According to the comment above tree RCU version of synchronize_rcu():
> > > > > > > > > > > > >
> > > > > > > > > > > > > """
> > > > > > > > > > > > >
> > > > > > > > > > > > >  * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > > > > > > >  * and rcu_read_unlock(), and may be nested.  In addition, but only in
> > > > > > > > > > > > >  * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > > > > > > >  * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > > > > > > >  * sections.  This includes hardware interrupt handlers, softirq handlers,
> > > > > > > > > > > > >  * and NMI handlers.
> > > > > > > > > > > > > """
> > > > > > > > > > > > >
> > > > > > > > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > > > > > > > >
> > > > > > > > > > > > > And it has the comment for explain the barrier:
> > > > > > > > > > > > >
> > > > > > > > > > > > > """
> > > > > > > > > > > > >
> > > > > > > > > > > > >  * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > > > > > > >  * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > > > > > > >  * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > > > > > > >  * the end of its last RCU read-side critical section whose beginning
> > > > > > > > > > > > >  * preceded the call to synchronize_rcu().  In addition, each CPU having
> > > > > > > > > > > > > """
> > > > > > > > > > > > >
> > > > > > > > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > > > > > > > > barrier: if the interrupt comes after the WRITE_ONCE() it will see
> > > > > > > > > > > > > irq_soft_enabled as false.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > You are right. So then
> > > > > > > > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > > > > > > >    READ_ONCE should do.
> > > > > > > > > > >
> > > > > > > > > > > See above.
> > > > > > > > > > >
> > > > > > > > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > +  */
> > > > > > > > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > >           dev->config->reset(dev);
> > > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > >   EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Ok.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > > > > > > >           spin_lock_init(&dev->config_lock);
> > > > > > > > > > > > > > >           dev->config_enabled = false;
> > > > > > > > > > > > > > >           dev->config_change_pending = false;
> > > > > > > > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > > > > > > > +         dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > > > > > > >           /* We always start by resetting the device, in case a previous
> > > > > > > > > > > > > > >            * driver messed it up.  This also tests that code path a little. */
> > > > > > > > > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > > > > > > > > devices. this flag defeats the purpose.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Do you mean:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1) we need something like config_enable? This seems hard to implement
> > > > > > > > > > > > > without obvious overhead, mainly synchronizing with the interrupt
> > > > > > > > > > > > > handlers
> > > > > > > > > > > >
> > > > > > > > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > > > > > > > users at the moment, even less than probe.
> > > > > > > > > > >
> > > > > > > > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > > > > > > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > > > > > > > > > a spinlock or similar.
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > > > > > > > for old hypervisors
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> > > > > > > > > > >
> > > > > > > > > > > Probably not, we have devices that accept random inputs from outside,
> > > > > > > > > > > net, console, input, etc. I've done a round of audits of the QEMU
> > > > > > > > > > > code; it all looks fine since day 0.
> > > > > > > > > > >
> > > > > > > > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > > > > > > > And then add_buf can actually test it and BUG_ON if not there  (at least
> > > > > > > > > > > > in the debug build).
> > > > > > > > > > >
> > > > > > > > > > > This looks like a hardening of the driver in the core instead of the
> > > > > > > > > > > device. I think it can be done but in a separate series.
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > And going down from there, how about we cache status in the
> > > > > > > > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > > > > > > > speeding boot up a tiny bit.
> > > > > > > > > > >
> > > > > > > > > > > I don't fully understand here; actually the spec requires the status
> > > > > > > > > > > to be read back for validation in many cases.
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > > > > > > >           return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > > > > > > >   {
> > > > > > > > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > > > > > > >           struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > > > > > > > +         dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > > > > > > > +         return IRQ_NONE;
> > > > > > > > > > > > > > > + }
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > >           if (!more_used(vq)) {
> > > > > > > > > > > > > > >                   pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > > > > > > >                   return IRQ_NONE;
> > > > > > > > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > > > > > > >    * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > > > > > > >    * @config_enabled: configuration change reporting enabled
> > > > > > > > > > > > > > >    * @config_change_pending: configuration change reported while disabled
> > > > > > > > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > > > > > > > >    * @config_lock: protects configuration change reporting
> > > > > > > > > > > > > > >    * @dev: underlying device.
> > > > > > > > > > > > > > >    * @id: the device type identification (used to match it with a driver).
> > > > > > > > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > > > > > > > >           bool failed;
> > > > > > > > > > > > > > >           bool config_enabled;
> > > > > > > > > > > > > > >           bool config_change_pending;
> > > > > > > > > > > > > > > + bool irq_soft_check;
> > > > > > > > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > > > > > > > >           spinlock_t config_lock;
> > > > > > > > > > > > > > >           spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > > > > > > > >           struct device dev;
> > > > > > > > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > > > > > > > >           return __virtio_test_bit(vdev, fbit);
> > > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > > +/*
> > > > > > > > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > > > > > > > + * @vdev: the device
> > > > > > > > > > > > > > > + */
> > > > > > > > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > > > > > > > +{
> > > > > > > > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > > > > > > > +         return true;
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > > +  * Read irq_soft_enabled before reading other device specific
> > > > > > > > > > > > > > > +  * data. Paried with smp_store_relase() in
> > > > > > > > > > > > > > paired
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Will fix.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > +  * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > > > > > > > +  * virtio_reset_device().
> > > > > > > > > > > > > > > +  */
> > > > > > > > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > > > > > > > +}
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > >   /**
> > > > > > > > > > > > > > >    * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > > > > > > > >    * @vdev: the device
> > > > > > > > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > > > > > > >           if (dev->config->enable_cbs)
> > > > > > > > > > > > > > >                     dev->config->enable_cbs(dev);
> > > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > > +  * Commit the driver setup before enabling the virtqueue
> > > > > > > > > > > > > > > > +  * callbacks. Paired with smp_load_acquire() in
> > > > > > > > > > > > > > > +  * virtio_irq_soft_enabled()
> > > > > > > > > > > > > > > +  */
> > > > > > > > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > >           BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > > > >           dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > 2.25.1
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> >

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-30  5:09                           ` Re: Michael S. Tsirkin
@ 2022-03-30  5:53                             ` Jason Wang
  0 siblings, 0 replies; 1546+ messages in thread
From: Jason Wang @ 2022-03-30  5:53 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Wed, Mar 30, 2022 at 1:09 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Mar 30, 2022 at 10:38:06AM +0800, Jason Wang wrote:
> > On Wed, Mar 30, 2022 at 6:04 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Mar 29, 2022 at 08:13:57PM +0200, Thomas Gleixner wrote:
> > > > On Tue, Mar 29 2022 at 10:37, Michael S. Tsirkin wrote:
> > > > > On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> > > > > We are trying to fix the driver since at the moment it does not
> > > > > have the dev->ok flag at all.
> > > > >
> > > > > And I suspect virtio is not alone in that.
> > > > > So it would have been nice if there was a standard flag
> > > > > replacing the driver-specific dev->ok above, and ideally
> > > > > would also handle the case of an interrupt triggering
> > > > > too early by deferring the interrupt until the flag is set.
> > > > >
> > > > > And in fact, it does kind of exist: IRQF_NO_AUTOEN, and you would call
> > > > > enable_irq instead of dev->ok = true, except
> > > > > - it doesn't work with affinity managed IRQs
> > > > > - it does not work with shared IRQs
> > > > >
> > > > > So using dev->ok as you propose above seems better at this point.
> > > >
> > > > Unless there is a big enough amount of drivers which could make use of a
> > > > generic mechanism for that.
> > > >
> > > > >> If any driver does this in the wrong order, then the driver is
> > > > >> broken.
> > > > >
> > > > > I agree, however:
> > > > > $ git grep synchronize_irq `git grep -l request_irq drivers/net/`|wc -l
> > > > > 113
> > > > > $ git grep -l request_irq drivers/net/|wc -l
> > > > > 397
> > > > >
> > > > > I suspect there are more drivers which in theory need the
> > > > > synchronize_irq dance but in practice do not execute it.
> > > >
> > > > That really depends on when the driver requests the interrupt, when
> > > > it actually enables the interrupt in the device itself
> > >
> > > This last point does not matter since we are talking about protecting
> > > against buggy/malicious devices. They can inject the interrupt anyway
> > > even if driver did not configure it.
> > >
> > > > and how the
> > > > interrupt service routine works.
> > > >
> > > > So just doing that grep dance does not tell much. You really have to do
> > > > a case by case analysis.
> > > >
> > > > Thanks,
> > > >
> > > >         tglx
> > >
> > >
> > > I agree. In fact, at least for network the standard approach is to
> > > request interrupts in the open call, virtio net is unusual
> > > in doing it in probe. We should consider changing that.
> > > Jason?
> >
> > This probably works only for virtio-net, and it looks non-trivial
> > since we don't have a specific core API to request interrupts.
> >
> > Thanks
>
> We'll need a new API, for sure. E.g.  find vqs with no
> callback on probe, and then virtio_request_vq_callbacks separately.
>
> The existing API that specifies callbacks during find vqs
> can be used by other drivers.

Ok, I will do it.

Thanks

>
> > >
> > > --
> > > MST
> > >
>


* Re:
  2022-03-30  5:14                       ` Re: Michael S. Tsirkin
@ 2022-03-30  5:53                         ` Jason Wang
  0 siblings, 0 replies; 1546+ messages in thread
From: Jason Wang @ 2022-03-30  5:53 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Wed, Mar 30, 2022 at 1:14 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Mar 30, 2022 at 10:40:59AM +0800, Jason Wang wrote:
> > On Tue, Mar 29, 2022 at 10:09 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Mar 29, 2022 at 03:12:14PM +0800, Jason Wang wrote:
> > > > > > > > > And requesting irq commits all memory otherwise all drivers would be
> > > > > > > > > broken,
> > > > > > > >
> > > > > > > > So I think we might talk different issues:
> > > > > > > >
> > > > > > > > 1) Whether request_irq() commits the previous setups, I think the
> > > > > > > > answer is yes, since the spin_unlock of desc->lock (release) can
> > > > > > > > guarantee this though there seems no documentation around
> > > > > > > > request_irq() to say this.
> > > > > > > >
> > > > > > > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > > > > > > > using smp_wmb() before the request_irq().
> > > > > > > >
> > > > > > > > And even if write is ordered we still need read to be ordered to be
> > > > > > > > paired with that.
> > > > >
> > > > > IMO it synchronizes with the CPU to which irq is
> > > > > delivered. Otherwise basically all drivers would be broken,
> > > > > wouldn't they be?
> > > >
> > > > I guess it's because most of the drivers don't care much about the
> > > > buggy/malicious device.  And most of the devices may require an extra
> > > > step to enable the device IRQ after request_irq(). Or it's the job of
> > > > the driver to do the synchronization.
> > >
> > > It is true that the use-case of malicious devices is somewhat boutique.
> > > But I think most drivers do want to have their hotplug routines to be
> > > robust, yes.
> > >
> > > > > I don't know whether it's correct on all platforms, but if not
> > > > > we need to fix request_irq.
> > > > >
> > > > > > > >
> > > > > > > > > if it doesn't it just needs to be fixed, not worked around in
> > > > > > > > > virtio.
> > > > > > > >
> > > > > > > > 2) virtio drivers might do a lot of setups between request_irq() and
> > > > > > > > virtio_device_ready():
> > > > > > > >
> > > > > > > > request_irq()
> > > > > > > > driver specific setups
> > > > > > > > virtio_device_ready()
> > > > > > > >
> > > > > > > > CPU 0 probe) request_irq()
> > > > > > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > > > > > CPU 0 probe) driver specific setups
> > > > > > > > CPU 0 probe) smp_store_release(irq_soft_enabled, true), committing the setups
> > > > > > > > CPU 1 IRQ handler) read irq_soft_enabled as true
> > > > > > > > CPU 1 IRQ handler) use the uninitialized variable
> > > > > > > >
> > > > > > > > Thanks
> > > > > > >
> > > > > > >
> > > > > > > As I said, virtio_device_ready needs to do synchronize_irq.
> > > > > > > That will guarantee all setup is visible to the specific IRQ,
> > > > > >
> > > > > > Only the interrupt after synchronize_irq() returns.
> > > > >
> > > > > Anything else is a buggy device though.
> > > >
> > > > Yes, but the goal of this patch is to prevent the possible attack from
> > > > buggy (malicious) devices.
> > >
> > > Right. However if a driver of a *buggy* device somehow sees driver_ok =
> > > false even though it's actually initialized, that is not a deal breaker
> > > as that does not open us up to an attack.
> > >
> > > > >
> > > > > > >this
> > > > > > > is what it's point is.
> > > > > >
> > > > > > What happens if an interrupt is raised in the middle like:
> > > > > >
> > > > > > smp_store_release(dev->irq_soft_enabled, true)
> > > > > > IRQ handler
> > > > > > synchronize_irq()
> > > > > >
> > > > > > If we don't enforce a reading order, the IRQ handler may still see the
> > > > > > uninitialized variable.
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > IMHO variables should be initialized before request_irq
> > > > > to a value meaning "not a valid interrupt".
> > > > > Specifically driver_ok = false.
> > > > > Handler in the scenario you describe will then see !driver_ok
> > > > > and exit immediately.
> > > >
> > > > So just to make sure we're on the same page.
> > > >
> > > > 1) virtio_reset_device() will set the driver_ok to false;
> > > > 2) virtio_device_ready() will set the driver_ok to true
> > > >
> > > > So for virtio drivers, it often did:
> > > >
> > > > 1) virtio_reset_device()
> > > > 2) find_vqs() which will call request_irq()
> > > > 3) other driver specific setups
> > > > 4) virtio_device_ready()
> > > >
> > > > In virtio_device_ready(), the patch perform the following currently:
> > > >
> > > > smp_store_release(driver_ok, true);
> > > > set_status(DRIVER_OK);
> > > >
> > > > Per your suggestion, to add synchronize_irq() after
> > > > smp_store_release() so we had
> > > >
> > > > smp_store_release(driver_ok, true);
> > > > synchronize_irq()
> > > > set_status(DRIVER_OK)
> > > >
> > > > Suppose there's an interrupt raised before the synchronize_irq(). If we do:
> > > >
> > > > if (READ_ONCE(driver_ok)) {
> > > >       vq->callback()
> > > > }
> > > >
> > > > It will see the driver_ok as true but how can we make sure
> > > > vq->callback sees the driver specific setups (3) above?
> > > >
> > > > And an example is virtio_scsi():
> > > >
> > > > virtio_reset_device()
> > > > virtscsi_probe()
> > > >     virtscsi_init()
> > > >         virtio_find_vqs()
> > > >         ...
> > > >         virtscsi_init_vq(&vscsi->event_vq, vqs[1])
> > > >     ....
> > > >     virtio_device_ready()
> > > >
> > > > In virtscsi_event_done():
> > > >
> > > > virtscsi_event_done():
> > > >     virtscsi_vq_done(vscsi, &vscsi->event_vq, ...);
> > > >
> > > > We need to make sure the even_done reads driver_ok before read vscsi->event_vq.
> > > >
> > > > Thanks
> > >
> > >
> > > See response by Thomas. A simple if (!dev->driver_ok) should be enough,
> > > it's all under a lock.
> >
> > Ordered through ACQUIRE+RELEASE actually since the irq handler is not
> > running under the lock.
> >
> > Another question, for synchronize_irq() do you prefer
> >
> > 1) transport specific callbacks
> > or
> > 2) a simple synchronize_rcu()
> >
> > Thanks
>
>
> 1) I think, and I'd add a wrapper so we can switch to 2 if we really
> want to. But for now synchronizing the specific irq is obviously designed to
> make any changes to memory visible to this irq. That
> seems cleaner and easier to understand than memory ordering tricks
> and relying on side effects of synchronize_rcu, even though
> internally this all boils down to memory ordering since
> memory is what's used to implement locks :).
> Not to mention, synchronize_irq just scales much better from performance
> POV.

Ok. Let me try to do that in V2.

Thanks

>
>
> > >
> > > > >
> > > > >
> > > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > We use smp_store_release()
> > > > > > > > > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > > > > > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > > > > > > > > which surprises me.
> > > > > > > > > > > > >
> > > > > > > > > > > > > CC Paul to help make sure I'm right.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > > > > > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > > > > > > > > > > hardening is disabled by default.
> > > > > > > > > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > > > > > > > > are any buffers in any queues?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > "
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >     This change will also benefit old hypervisors (before 2009)
> > > > > > > > > > > > > >     that send interrupts without checking DRIVER_OK: previously,
> > > > > > > > > > > > > >     the callback could race with driver-specific initialization.
> > > > > > > > > > > > > > "
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > This is only for config interrupt.
> > > > > > > > > > > >
> > > > > > > > > > > > Ok.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > > > > > > > > expensive.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > > >   drivers/virtio/virtio.c       | 19 +++++++++++++++++++
> > > > > > > > > > > > > > > >   drivers/virtio/virtio_ring.c  |  9 ++++++++-
> > > > > > > > > > > > > > > >   include/linux/virtio.h        |  4 ++++
> > > > > > > > > > > > > > > >   include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > > > > > > > >   4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > > > > > > > >   #include <linux/of.h>
> > > > > > > > > > > > > > > >   #include <uapi/linux/virtio_ids.h>
> > > > > > > > > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > > > > > > > > +          "Disable IRQ software processing when it is not expected");
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > >   /* Unique numbering for virtio devices. */
> > > > > > > > > > > > > > > >   static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > > > > > > > >    * */
> > > > > > > > > > > > > > > >   void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > > > > > > > >   {
> > > > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > > > +  * The below synchronize_rcu() guarantees that any
> > > > > > > > > > > > > > > > +  * interrupt for this line arriving after
> > > > > > > > > > > > > > > > +  * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > > > > > > > > +  * irq_soft_enabled == false.
> > > > > > > > > > > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > > > > > > > > > > with interrupts. Did you not intend to use synchronize_irq?
> > > > > > > > > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > > > > > > > > > > though it most likely is ...
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > According to the comment above the tree RCU version of synchronize_rcu():
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > """
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >  * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > > > > > > > >  * and rcu_read_unlock(), and may be nested.  In addition, but only in
> > > > > > > > > > > > > >  * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > > > > > > > >  * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > > > > > > > >  * sections.  This includes hardware interrupt handlers, softirq handlers,
> > > > > > > > > > > > > >  * and NMI handlers.
> > > > > > > > > > > > > > """
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > And it has this comment explaining the barrier:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > """
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >  * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > > > > > > > >  * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > > > > > > > >  * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > > > > > > > >  * the end of its last RCU read-side critical section whose beginning
> > > > > > > > > > > > > >  * preceded the call to synchronize_rcu().  In addition, each CPU having
> > > > > > > > > > > > > > """
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > > > > > > > > > barrier: if the interrupt comes after the WRITE_ONCE() it will see
> > > > > > > > > > > > > > irq_soft_enabled as false.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > You are right. So then
> > > > > > > > > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > > > > > > > >    READ_ONCE should do.
> > > > > > > > > > > >
> > > > > > > > > > > > See above.
> > > > > > > > > > > >
> > > > > > > > > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > +  */
> > > > > > > > > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > >           dev->config->reset(dev);
> > > > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > > >   EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Ok.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > > > > > > > >           spin_lock_init(&dev->config_lock);
> > > > > > > > > > > > > > > >           dev->config_enabled = false;
> > > > > > > > > > > > > > > >           dev->config_change_pending = false;
> > > > > > > > > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > > > > > > > > +         dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > > > > > > > >           /* We always start by resetting the device, in case a previous
> > > > > > > > > > > > > > > >            * driver messed it up.  This also tests that code path a little. */
> > > > > > > > > > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > > > > > > > > > devices. this flag defeats the purpose.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Do you mean:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1) we need something like config_enable? This seems not easy to be
> > > > > > > > > > > > > > implemented without obvious overhead, mainly the synchronize with the
> > > > > > > > > > > > > > interrupt handlers
> > > > > > > > > > > > >
> > > > > > > > > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > > > > > > > > users at the moment, even less than probe.
> > > > > > > > > > > >
> > > > > > > > > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > > > > > > > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > > > > > > > > > > spinlock or others.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > > > > > > > > for old hypervisors
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> > > > > > > > > > > >
> > > > > > > > > > > > Probably not, we have devices that accept random inputs from outside,
> > > > > > > > > > > > net, console, input etc. I've done a round of audits of the Qemu
> > > > > > > > > > > > codes. They look all fine since day0.
> > > > > > > > > > > >
> > > > > > > > > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > > > > > > > > And then add_buf can actually test it and BUG_ON if not there  (at least
> > > > > > > > > > > > > in the debug build).
> > > > > > > > > > > >
> > > > > > > > > > > > This looks like a hardening of the driver in the core instead of the
> > > > > > > > > > > > device. I think it can be done but in a separate series.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > And going down from there, how about we cache status in the
> > > > > > > > > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > > > > > > > > speeding boot up a tiny bit.
> > > > > > > > > > > >
> > > > > > > > > > > > I don't fully understand here, actually spec requires status to be
> > > > > > > > > > > > read back for validation in many cases.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > > > > > > > >           return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > > > > > > > >   {
> > > > > > > > > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > > > > > > > >           struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > > > > > > > > +         dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > > > > > > > > +         return IRQ_NONE;
> > > > > > > > > > > > > > > > + }
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > >           if (!more_used(vq)) {
> > > > > > > > > > > > > > > >                   pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > > > > > > > >                   return IRQ_NONE;
> > > > > > > > > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > > > > > > > >    * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > > > > > > > >    * @config_enabled: configuration change reporting enabled
> > > > > > > > > > > > > > > >    * @config_change_pending: configuration change reported while disabled
> > > > > > > > > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > > > > > > > > >    * @config_lock: protects configuration change reporting
> > > > > > > > > > > > > > > >    * @dev: underlying device.
> > > > > > > > > > > > > > > >    * @id: the device type identification (used to match it with a driver).
> > > > > > > > > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > > > > > > > > >           bool failed;
> > > > > > > > > > > > > > > >           bool config_enabled;
> > > > > > > > > > > > > > > >           bool config_change_pending;
> > > > > > > > > > > > > > > > + bool irq_soft_check;
> > > > > > > > > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > > > > > > > > >           spinlock_t config_lock;
> > > > > > > > > > > > > > > >           spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > > > > > > > > >           struct device dev;
> > > > > > > > > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > > > > > > > > >           return __virtio_test_bit(vdev, fbit);
> > > > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > > > +/*
> > > > > > > > > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > > > > > > > > + * @vdev: the device
> > > > > > > > > > > > > > > > + */
> > > > > > > > > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > > > > > > > > +{
> > > > > > > > > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > > > > > > > > +         return true;
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > > > +  * Read irq_soft_enabled before reading other device specific
> > > > > > > > > > > > > > > > +  * data. Paried with smp_store_relase() in
> > > > > > > > > > > > > > > paired
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Will fix.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > +  * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > > > > > > > > +  * virtio_reset_device().
> > > > > > > > > > > > > > > > +  */
> > > > > > > > > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > > > > > > > > +}
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > >   /**
> > > > > > > > > > > > > > > >    * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > > > > > > > > >    * @vdev: the device
> > > > > > > > > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > > > > > > > >           if (dev->config->enable_cbs)
> > > > > > > > > > > > > > > >                     dev->config->enable_cbs(dev);
> > > > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > > > +  * Commit the driver setup before enabling the virtqueue
> > > > > > > > > > > > > > > > +  * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > > > > > > > > > +  * virtio_irq_soft_enabled()
> > > > > > > > > > > > > > > > +  */
> > > > > > > > > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > >           BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > > > > >           dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > 2.25.1
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-03-29  8:35                 ` Re: Thomas Gleixner
  2022-03-29 14:37                   ` Re: Michael S. Tsirkin
@ 2022-04-12  6:55                   ` Michael S. Tsirkin
  1 sibling, 0 replies; 1546+ messages in thread
From: Michael S. Tsirkin @ 2022-04-12  6:55 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization

On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> On Mon, Mar 28 2022 at 06:40, Michael S. Tsirkin wrote:
> > On Mon, Mar 28, 2022 at 02:18:22PM +0800, Jason Wang wrote:
> >> > > So I think we might talk different issues:
> >> > >
> >> > > 1) Whether request_irq() commits the previous setups, I think the
> >> > > answer is yes, since the spin_unlock of desc->lock (release) can
> >> > > guarantee this though there seems no documentation around
> >> > > request_irq() to say this.
> >> > >
> >> > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> >> > > using smp_wmb() before the request_irq().
> 
> That's a completely bogus example, especially as there is not a single
> smp_rmb() which pairs with the smp_wmb().
> 
> >> > > And even if write is ordered we still need read to be ordered to be
> >> > > paired with that.
> >
> > IMO it synchronizes with the CPU to which irq is
> > delivered. Otherwise basically all drivers would be broken,
> > wouldn't they be?
> > I don't know whether it's correct on all platforms, but if not
> > we need to fix request_irq.
> 
> There is nothing to fix:
> 
> request_irq()
>    raw_spin_lock_irq(desc->lock);       // ACQUIRE
>    ....
>    raw_spin_unlock_irq(desc->lock);     // RELEASE
> 
> interrupt()
>    raw_spin_lock(desc->lock);           // ACQUIRE
>    set status to IN_PROGRESS
>    raw_spin_unlock(desc->lock);         // RELEASE
>    invoke handler()
> 
> So anything which the driver set up _before_ request_irq() is visible to
> the interrupt handler. No?
> 
> >> What happens if an interrupt is raised in the middle like:
> >> 
> >> smp_store_release(dev->irq_soft_enabled, true)
> >> IRQ handler
> >> synchornize_irq()
> 
> This is bogus. The obvious order of things is:
> 
>     dev->ok = false;
>     request_irq();
> 
>     moar_setup();
>     synchronize_irq();  // ACQUIRE + RELEASE
>     dev->ok = true;
> 
> The reverse operation on teardown:
> 
>     dev->ok = false;
>     synchronize_irq();  // ACQUIRE + RELEASE
> 
>     teardown();
> 
> So in both cases a simple check in the handler is sufficient:
> 
> handler()
>     if (!dev->ok)
>     	return;

Does this need to be if (!READ_ONCE(dev->ok)) ?



> I'm not understanding what you folks are trying to "fix" here. If any
> driver does this in the wrong order, then the driver is broken.
> 
> Sure, you can do the same with:
> 
>     dev->ok = false;
>     request_irq();
>     moar_setup();
>     smp_wmb();
>     dev->ok = true;
> 
> for the price of a smp_rmb() in the interrupt handler:
> 
> handler()
>     if (!dev->ok)
>     	return;
>     smp_rmb();
> 
> but that's only working for the setup case correctly and not for
> teardown.
> 
> Thanks,
> 
>         tglx



* Re:
  2022-04-17 17:43 ` [PATCH v3 06/60] target/arm: Change CPUArchState.aarch64 to bool Richard Henderson
@ 2022-04-19 11:17   ` Alex Bennée
  0 siblings, 0 replies; 1546+ messages in thread
From: Alex Bennée @ 2022-04-19 11:17 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm


Richard Henderson <richard.henderson@linaro.org> writes:

> Bool is a more appropriate type for this value.
> Adjust the assignments to use true/false.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


* Re:
  2022-04-13  5:11         ` Nicholas Piggin
@ 2022-04-22 15:53             ` Thomas Gleixner
  0 siblings, 0 replies; 1546+ messages in thread
From: Thomas Gleixner @ 2022-04-22 15:53 UTC (permalink / raw)
  To: Nicholas Piggin, Michael Ellerman, paulmck, Zhouyi Zhou
  Cc: Viresh Kumar, Daniel Lezcano, linux-kernel, rcu, Miguel Ojeda,
	linuxppc-dev

On Wed, Apr 13 2022 at 15:11, Nicholas Piggin wrote:
> So we traced the problem down to possibly a misunderstanding between 
> decrementer clock event device and core code.
>
> The decrementer is only oneshot*ish*. It actually needs to either be 
> reprogrammed or shut down otherwise it just continues to cause 
> interrupts.

I always thought that PPC had sane timers. That's really disillusioning.

> Before commit 35de589cb879, it was sort of two-shot. The initial 
> interrupt at the programmed time would set its internal next_tb variable 
> to ~0 and call the ->event_handler(). If that did not set_next_event or 
> stop the timer, the interrupt will fire again immediately, notice 
> next_tb is ~0, and only then stop the decrementer interrupt.
>
> So that was already kind of ugly, this patch just turned it into a hang.
>
> The problem happens when the tick is stopped with an event still 
> pending, then tick_nohz_handler() is called, but it bails out because 
> tick_stopped == 1 so the device never gets programmed again, and so it 
> keeps firing.
>
> How to fix it? Before commit a7cba02deced, powerpc's decrementer was 
> really oneshot, but we would like to avoid doing that because it requires 
> additional programming of the hardware on each timer interrupt. We have 
> the ONESHOT_STOPPED state which seems to be just about what we want.
>
> Did the ONESHOT_STOPPED patch just miss this case, or is there a reason 
> we don't stop it here? This patch seems to fix the hang (not heavily
> tested though).

This was definitely overlooked, but it's arguable it is not required
for real oneshot clockevent devices. This should only handle the case
where the interrupt was already pending.

The ONESHOT_STOPPED state was introduced to handle the case where the
last timer gets canceled, so the already programmed event does not fire.

It was not necessarily meant to "fix" clockevent devices which are
pretending to be ONESHOT, but keep firing over and over.

That said, I'm fine with the change along with a big fat comment why
this is required.

Thanks,

        tglx



* Re:
  2022-04-22 15:53             ` Re: Thomas Gleixner
@ 2022-04-23  2:29               ` Nicholas Piggin
  -1 siblings, 0 replies; 1546+ messages in thread
From: Nicholas Piggin @ 2022-04-23  2:29 UTC (permalink / raw)
  To: Michael Ellerman, paulmck, Thomas Gleixner, Zhouyi Zhou
  Cc: Viresh Kumar, Daniel Lezcano, linux-kernel, rcu, Miguel Ojeda,
	linuxppc-dev

Excerpts from Thomas Gleixner's message of April 23, 2022 1:53 am:
> On Wed, Apr 13 2022 at 15:11, Nicholas Piggin wrote:
>> So we traced the problem down to possibly a misunderstanding between 
>> decrementer clock event device and core code.
>>
>> The decrementer is only oneshot*ish*. It actually needs to either be 
>> reprogrammed or shut down otherwise it just continues to cause 
>> interrupts.
> 
> I always thought that PPC had sane timers. That's really disillusioning.

My comment was probably a bit of a misleading explanation of the whole
situation. This weirdness is actually in software in the powerpc
clock event driver due to a recent change I made assuming the clock 
event goes to oneshot-stopped.

The hardware is relatively sane I think, global synchronized constant
rate high frequency clock distributed to the CPUs so reads don't
go off-core. And per-CPU "decrementer" event interrupt at the same
frequency as the clock -- program it to a +ve value and it decrements
until zero then creates basically a level triggered interrupt.

Before my change, the decrementer interrupt would always clear the
interrupt at entry. The event_handler usually programs another
timer in so I tried to avoid that first clear counting on the
oneshot_stopped callback to clear the interrupt if there was no
other timer.

>> Before commit 35de589cb879, it was sort of two-shot. The initial 
>> interrupt at the programmed time would set its internal next_tb variable 
>> to ~0 and call the ->event_handler(). If that did not set_next_event or 
>> stop the timer, the interrupt will fire again immediately, notice 
>> next_tb is ~0, and only then stop the decrementer interrupt.
>>
>> So that was already kind of ugly, this patch just turned it into a hang.
>>
>> The problem happens when the tick is stopped with an event still 
>> pending, then tick_nohz_handler() is called, but it bails out because 
>> tick_stopped == 1 so the device never gets programmed again, and so it 
>> keeps firing.
>>
>> How to fix it? Before commit a7cba02deced, powerpc's decrementer was 
>> really oneshot, but we would like to avoid doing that because it requires 
>> additional programming of the hardware on each timer interrupt. We have 
>> the ONESHOT_STOPPED state which seems to be just about what we want.
>>
>> Did the ONESHOT_STOPPED patch just miss this case, or is there a reason 
>> we don't stop it here? This patch seems to fix the hang (not heavily
>> tested though).
> 
> This was definitely overlooked, but it's arguable it is is not required
> for real oneshot clockevent devices. This should only handle the case
> where the interrupt was already pending.
> 
> The ONESHOT_STOPPED state was introduced to handle the case where the
> last timer gets canceled, so the already programmed event does not fire.
> 
> It was not necessarily meant to "fix" clockevent devices which are
> pretending to be ONESHOT, but keep firing over and over.
> 
> That, said. I'm fine with the change along with a big fat comment why
> this is required.

Thanks for taking a look and confirming. I just sent a patch with a
comment and what looks like another missed case. Hopefully it's okay.

Thanks,
Nick



* Re:
  2022-06-06  5:33 Fenil Jain
@ 2022-06-06  5:51 ` Greg Kroah-Hartman
  0 siblings, 0 replies; 1546+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-06  5:51 UTC (permalink / raw)
  To: Fenil Jain; +Cc: Shuah Khan, stable

On Mon, Jun 06, 2022 at 11:03:24AM +0530, Fenil Jain wrote:
> On Fri, Jun 03, 2022 at 07:43:01PM +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.18.2 release.
> > There are 67 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Sun, 05 Jun 2022 17:38:05 +0000.
> > Anything received after that time might be too late.
> 
> Hey Greg,
> 
> Ran tests and boot tested on my system, no regression found
> 
> Tested-by: Fenil Jain<fkjainco@gmail.com>

Thanks for the testing, but something went wrong with your email client
and it lost the Subject: line, making this impossible to be picked up by
our tools.

Also, please include an extra ' ' before the '<' character in your
tested-by line.

thanks,

greg k-h


* Re:
  2022-08-26 22:03 Zach O'Keefe
@ 2022-08-31 21:47 ` Yang Shi
  2022-09-01  0:24   ` Re: Zach O'Keefe
  0 siblings, 1 reply; 1546+ messages in thread
From: Yang Shi @ 2022-08-31 21:47 UTC (permalink / raw)
  To: Zach O'Keefe
  Cc: linux-mm, Andrew Morton, linux-api, Axel Rasmussen,
	James Houghton, Hugh Dickins, Miaohe Lin, David Hildenbrand,
	David Rientjes, Matthew Wilcox, Pasha Tatashin, Peter Xu,
	Rongwei Wang, SeongJae Park, Song Liu, Vlastimil Babka,
	Chris Kennelly, Kirill A. Shutemov, Minchan Kim, Patrick Xia

Hi Zach,

I did a quick look at the series, basically no show stopper to me. But
I didn't find time to review them thoroughly yet, quite busy on
something else. Just a heads up, I didn't mean to ignore you. I will
review them when I find some time.

Thanks,
Yang

On Fri, Aug 26, 2022 at 3:03 PM Zach O'Keefe <zokeefe@google.com> wrote:
>
> Subject: [PATCH mm-unstable v2 0/9] mm: add file/shmem support to MADV_COLLAPSE
>
> v2 Forward
>
> Mostly a RESEND: rebase on latest mm-unstable + minor bug fixes from
> kernel test robot.
> --------------------------------
>
> This series builds on top of the previous "mm: userspace hugepage collapse"
> series which introduced the MADV_COLLAPSE madvise mode and added support
> for private, anonymous mappings[1], by adding support for file and shmem
> backed memory to CONFIG_READ_ONLY_THP_FOR_FS=y kernels.
>
> File and shmem support have been added with effort to align with existing
> MADV_COLLAPSE semantics and policy decisions[2].  Collapse of shmem-backed
> memory ignores kernel-guiding directives and heuristics including all
> sysfs settings (transparent_hugepage/shmem_enabled), and tmpfs huge= mount
> options (shmem always supports large folios).  Like anonymous mappings, on
> successful return of MADV_COLLAPSE on file/shmem memory, the contents of
> memory mapped by the addresses provided will be synchronously pmd-mapped
> THPs.
>
> This functionality unlocks two important uses:
>
> (1)     Immediately back executable text by THPs.  Current support provided
>         by CONFIG_READ_ONLY_THP_FOR_FS may take a long time on a large
>         system which might impair services from serving at their full rated
>         load after (re)starting.  Tricks like mremap(2)'ing text onto
>         anonymous memory to immediately realize iTLB performance prevents
>         page sharing and demand paging, both of which increase steady state
>         memory footprint.  Now, we can have the best of both worlds: Peak
>         upfront performance and lower RAM footprints.
>
> (2)     userfaultfd-based live migration of virtual machines satisfy UFFD
>         faults by fetching native-sized pages over the network (to avoid
>         latency of transferring an entire hugepage).  However, after guest
>         memory has been fully copied to the new host, MADV_COLLAPSE can
>         be used to immediately increase guest performance.
>
> khugepaged has received a small improvement by association and can now
> detect and collapse pte-mapped THPs.  However, there is still work to be
> done along the file collapse path.  Compound pages of arbitrary order still
> needs to be supported and THP collapse needs to be converted to using
> folios in general.  Eventually, we'd like to move away from the read-only
> and executable-mapped constraints currently imposed on eligible files and
> support any inode claiming huge folio support.  That said, I think the
> series as-is covers enough to claim that MADV_COLLAPSE supports file/shmem
> memory.
>
> Patches 1-3     Implement the guts of the series.
> Patch 4         Is a tracepoint for debugging.
> Patches 5-8     Refactor existing khugepaged selftests to work with new
>                 memory types.
> Patch 9         Adds a userfaultfd selftest mode to mimic a functional test
>                 of UFFDIO_REGISTER_MODE_MINOR+MADV_COLLAPSE live migration.
>
> Applies against mm-unstable.
>
> [1] https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@google.com/
> [2] https://lore.kernel.org/linux-mm/YtBmhaiPHUTkJml8@google.com/
>
> v1 -> v2:
> - Add missing definition for khugepaged_add_pte_mapped_thp() in
>   !CONFIG_SHEM builds, in "mm/khugepaged: attempt to map
>   file/shmem-backed pte-mapped THPs by pmds"
> - Minor bugfixes in "mm/madvise: add file and shmem support to
>   MADV_COLLAPSE" for !CONFIG_SHMEM, !CONFIG_TRANSPARENT_HUGEPAGE and some
>   compiler settings.
> - Rebased on latest mm-unstable
>
> Zach O'Keefe (9):
>   mm/shmem: add flag to enforce shmem THP in hugepage_vma_check()
>   mm/khugepaged: attempt to map file/shmem-backed pte-mapped THPs by
>     pmds
>   mm/madvise: add file and shmem support to MADV_COLLAPSE
>   mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
>   selftests/vm: dedup THP helpers
>   selftests/vm: modularize thp collapse memory operations
>   selftests/vm: add thp collapse file and tmpfs testing
>   selftests/vm: add thp collapse shmem testing
>   selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
>
>  include/linux/khugepaged.h                    |  13 +-
>  include/linux/shmem_fs.h                      |  10 +-
>  include/trace/events/huge_memory.h            |  36 +
>  kernel/events/uprobes.c                       |   2 +-
>  mm/huge_memory.c                              |   2 +-
>  mm/khugepaged.c                               | 289 ++++--
>  mm/shmem.c                                    |  18 +-
>  tools/testing/selftests/vm/Makefile           |   2 +
>  tools/testing/selftests/vm/khugepaged.c       | 828 ++++++++++++------
>  tools/testing/selftests/vm/soft-dirty.c       |   2 +-
>  .../selftests/vm/split_huge_page_test.c       |  12 +-
>  tools/testing/selftests/vm/userfaultfd.c      | 171 +++-
>  tools/testing/selftests/vm/vm_util.c          |  36 +-
>  tools/testing/selftests/vm/vm_util.h          |   5 +-
>  14 files changed, 1040 insertions(+), 386 deletions(-)
>
> --
> 2.37.2.672.g94769d06f0-goog
>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-08-31 21:47 ` Yang Shi
@ 2022-09-01  0:24   ` Zach O'Keefe
  0 siblings, 0 replies; 1546+ messages in thread
From: Zach O'Keefe @ 2022-09-01  0:24 UTC (permalink / raw)
  To: Yang Shi
  Cc: linux-mm, Andrew Morton, linux-api, Axel Rasmussen,
	James Houghton, Hugh Dickins, Miaohe Lin, David Hildenbrand,
	David Rientjes, Matthew Wilcox, Pasha Tatashin, Peter Xu,
	Rongwei Wang, SeongJae Park, Song Liu, Vlastimil Babka,
	Chris Kennelly, Kirill A. Shutemov, Minchan Kim, Patrick Xia

On Wed, Aug 31, 2022 at 2:47 PM Yang Shi <shy828301@gmail.com> wrote:
>
> Hi Zach,
>
> I did a quick look at the series, basically no show stopper to me. But
> I didn't find time to review them thoroughly yet, quite busy on
> something else. Just a heads up, I didn't mean to ignore you. I will
> review them when I find some time.
>
> Thanks,
> Yang

Hey Yang,

Thanks for taking the time to look through, and thanks for giving me a
heads up, and no rush!

In the last day or so, while porting this series around, I encountered
some subtle edge cases I wanted to clean up / address - so it's good
you didn't do a thorough review yet. I was *hoping* to have a v3 out
last night (which evidently did not happen) and it does not seem like
it will happen today, so I'll leave this message as a request for
reviewers to hold off on a thorough review until v3.

Thanks for your time as always,
Zach

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-08-28 21:01 Nick Neumann
@ 2022-09-01 17:44 ` Nick Neumann
  0 siblings, 0 replies; 1546+ messages in thread
From: Nick Neumann @ 2022-09-01 17:44 UTC (permalink / raw)
  To: fio

PR for this is up now.

On Sun, Aug 28, 2022 at 4:01 PM Nick Neumann <nick@pcpartpicker.com> wrote:
>
> I've filed the issue on github, but just thought I'd mention here too.
> In real-world use it appears to be intermittent. I'm not yet sure how
> intermittent, but I could see it being used in production and not
> caught right away. I got lucky and stumbled on it when looking at
> graphs of runs and noticed 15 seconds of no activity.
>
> https://github.com/axboe/fio/issues/1457
>
> With the null ioengine, I can make it reproduce very reliably, which
> is encouraging as I move to debug.
>
> I had just moved to using log compression as it is really powerful,
> and the only way to store per I/O logs for a long run without pushing
> up against the amount of physical memory in a system.
>
> (Without compression, a GB of sequential writes at 128K block size is
> on the order of 245KB of memory per log, so a TB is 245MB per log. Now
> run a job to fill a 20TB drive and you're at 4.9GB for one log file.
> If you record all 3 latency numbers too, you're talking close to
> 20GB.)

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-09-12 12:36 Christian König
@ 2022-09-13  2:04 ` Alex Deucher
  0 siblings, 0 replies; 1546+ messages in thread
From: Alex Deucher @ 2022-09-13  2:04 UTC (permalink / raw)
  To: Christian König; +Cc: alexander.deucher, amd-gfx

On Mon, Sep 12, 2022 at 8:36 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Hey Alex,
>
> I've decided to split this patch set into two because we still can't
> figure out where the VCN regressions come from.
>
> Ruijing tested them and confirmed that they don't regress VCN.
>
> Can you and maybe Felix take a look and review them?

Looks good to me.  Series is:
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

>
> Thanks,
> Christian.
>
>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-09-14 13:12 Amjad Ouled-Ameur
@ 2022-09-14 13:18   ` Amjad Ouled-Ameur
  0 siblings, 0 replies; 1546+ messages in thread
From: Amjad Ouled-Ameur @ 2022-09-14 13:18 UTC (permalink / raw)
  To: Rob Herring
  Cc: Krzysztof Kozlowski, Matthias Brugger, devicetree,
	linux-arm-kernel, linux-mediatek, linux-kernel

Hi,

The subject has not been parsed correctly; I resent a proper patch here:

https://patchwork.kernel.org/project/linux-mediatek/patch/20220914131339.18348-1-aouledameur@baylibre.com/


Sorry for the noise.

Regards,

Amjad

On 9/14/22 15:12, Amjad Ouled-Ameur wrote:
> Subject: [PATCH] arm64: dts: mediatek: mt8183: remove thermal zones without
>   trips.
>
> Thermal zones without trip point are not registered by thermal core.
>
> tzts1 ~ tzts6 zones of mt8183 were initially introduced for test-purpose
> only but are not supposed to remain on DT.
>
> Remove the zones above and keep only cpu_thermal.
>
> Signed-off-by: Amjad Ouled-Ameur <aouledameur@baylibre.com>
> ---
>   arch/arm64/boot/dts/mediatek/mt8183.dtsi | 57 ------------------------
>   1 file changed, 57 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
> index 9d32871973a2..f65fae8939de 100644
> --- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
> +++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
> @@ -1182,63 +1182,6 @@ THERMAL_NO_LIMIT
>   					};
>   				};
>   			};
> -
> -			/* The tzts1 ~ tzts6 don't need to polling */
> -			/* The tzts1 ~ tzts6 don't need to thermal throttle */
> -
> -			tzts1: tzts1 {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 1>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
> -
> -			tzts2: tzts2 {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 2>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
> -
> -			tzts3: tzts3 {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 3>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
> -
> -			tzts4: tzts4 {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 4>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
> -
> -			tzts5: tzts5 {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 5>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
> -
> -			tztsABB: tztsABB {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 6>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
>   		};
>   
>   		pwm0: pwm@1100e000 {


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-11-09 14:34 Denis Arefev
@ 2022-11-09 14:44 ` Greg Kroah-Hartman
  0 siblings, 0 replies; 1546+ messages in thread
From: Greg Kroah-Hartman @ 2022-11-09 14:44 UTC (permalink / raw)
  To: Denis Arefev
  Cc: David Airlie, Daniel Vetter, stable, Alexey Khoroshilov,
	ldv-project, trufanov, vfh

On Wed, Nov 09, 2022 at 05:34:13PM +0300, Denis Arefev wrote:
> Date: Wed, 9 Nov 2022 16:52:17 +0300
> Subject: [PATCH 5.10] nbio_v7_4: Add pointer check
> 
> The return value of 'amdgpu_ras_find_obj' is dereferenced at nbio_v7_4.c:325 without a NULL check
> 
> Found by Linux Verification Center (linuxtesting.org) with SVACE.
> 
> Signed-off-by: Denis Arefev <arefev@swemel.ru>
> ---
>  drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
> index eadc9526d33f..d2627a610e48 100644
> --- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
> +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
> @@ -303,6 +303,9 @@ static void nbio_v7_4_handle_ras_controller_intr_no_bifring(struct amdgpu_device
>  	struct ras_manager *obj = amdgpu_ras_find_obj(adev, adev->nbio.ras_if);
>  	struct ras_err_data err_data = {0, 0, 0, NULL};
>  	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
> 
> +	if (!obj)
> +		return;
>  
>  	bif_doorbell_intr_cntl = RREG32_SOC15(NBIO, 0, mmBIF_DOORBELL_INT_CNTL);
>  	if (REG_GET_FIELD(bif_doorbell_intr_cntl,
> -- 
> 2.25.1
> 


<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-11-18  2:00 Jiamei Xie
@ 2022-11-18  7:47 ` Michal Orzel
  2022-11-18  9:02   ` Re: Julien Grall
  0 siblings, 1 reply; 1546+ messages in thread
From: Michal Orzel @ 2022-11-18  7:47 UTC (permalink / raw)
  To: Jiamei Xie, xen-devel
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Wei Chen

Hi Jiamei,

On 18/11/2022 03:00, Jiamei Xie wrote:
> 
> 
> Date: Thu, 17 Nov 2022 11:07:12 +0800
> Subject: [PATCH] xen/arm: vpl011: Make access to DMACR write-ignore
> 
> When the guest kernel enables DMA engine with "CONFIG_DMA_ENGINE=y",
> Linux SBSA PL011 driver will access PL011 DMACR register in some
> functions. As chapter "B Generic UART" in "ARM Server Base System
> Architecture"[1] documentation describes, SBSA UART doesn't support
> DMA. In current code, when the kernel tries to access DMACR register,
> Xen will inject a data abort:
> Unhandled fault at 0xffffffc00944d048
> Mem abort info:
>   ESR = 0x96000000
>   EC = 0x25: DABT (current EL), IL = 32 bits
>   SET = 0, FnV = 0
>   EA = 0, S1PTW = 0
>   FSC = 0x00: ttbr address size fault
> Data abort info:
>   ISV = 0, ISS = 0x00000000
>   CM = 0, WnR = 0
> swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000020e2e000
> [ffffffc00944d048] pgd=100000003ffff803, p4d=100000003ffff803, pud=100000003ffff803, pmd=100000003fffa803, pte=006800009c090f13
> Internal error: ttbr address size fault: 96000000 [#1] PREEMPT SMP
> ...
> Call trace:
>  pl011_stop_rx+0x70/0x80
>  tty_port_shutdown+0x7c/0xb4
>  tty_port_close+0x60/0xcc
>  uart_close+0x34/0x8c
>  tty_release+0x144/0x4c0
>  __fput+0x78/0x220
>  ____fput+0x1c/0x30
>  task_work_run+0x88/0xc0
>  do_notify_resume+0x8d0/0x123c
>  el0_svc+0xa8/0xc0
>  el0t_64_sync_handler+0xa4/0x130
>  el0t_64_sync+0x1a0/0x1a4
> Code: b9000083 b901f001 794038a0 8b000042 (b9000041)
> ---[ end trace 83dd93df15c3216f ]---
> note: bootlogd[132] exited with preempt_count 1
> /etc/rcS.d/S07bootlogd: line 47: 132 Segmentation fault start-stop-daemon
> 
> As discussed in [2], this commit makes the access to DMACR register
> write-ignore as an improvement.
As discussed earlier, if we decide to improve vpl011 (for now only Stefano shared his opinion),
then we need to mark *all* the PL011 registers that are not part of SBSA as RAZ/WI. So handling
only DMACR, and only for writes, is not beneficial (it only fixes the current Linux issue, but what we
really want is to improve the code in general).

> 
> [1] https://developer.arm.com/documentation/den0094/c/?lang=en
> [2] https://lore.kernel.org/xen-devel/alpine.DEB.2.22.394.2211161552420.4020@ubuntu-linux-20-04-desktop/
> 
> Signed-off-by: Jiamei Xie <jiamei.xie@arm.com>
> ---
>  xen/arch/arm/vpl011.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/xen/arch/arm/vpl011.c b/xen/arch/arm/vpl011.c
> index 43522d48fd..80d00b3052 100644
> --- a/xen/arch/arm/vpl011.c
> +++ b/xen/arch/arm/vpl011.c
> @@ -463,6 +463,7 @@ static int vpl011_mmio_write(struct vcpu *v,
>      case FR:
>      case RIS:
>      case MIS:
> +    case DMACR:
>          goto write_ignore;
> 
>      case IMSC:
> --
> 2.25.1
> 
> 

~Michal


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-11-18  7:47 ` Michal Orzel
@ 2022-11-18  9:02   ` Julien Grall
  0 siblings, 0 replies; 1546+ messages in thread
From: Julien Grall @ 2022-11-18  9:02 UTC (permalink / raw)
  To: Michal Orzel, Jiamei Xie, xen-devel
  Cc: Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk, Wei Chen



On 18/11/2022 07:47, Michal Orzel wrote:
> On 18/11/2022 03:00, Jiamei Xie wrote:
>>
>>
>> Date: Thu, 17 Nov 2022 11:07:12 +0800
>> Subject: [PATCH] xen/arm: vpl011: Make access to DMACR write-ignore
>>
>> When the guest kernel enables DMA engine with "CONFIG_DMA_ENGINE=y",
>> Linux SBSA PL011 driver will access PL011 DMACR register in some
>> functions. As chapter "B Generic UART" in "ARM Server Base System
>> Architecture"[1] documentation describes, SBSA UART doesn't support
>> DMA. In current code, when the kernel tries to access DMACR register,
>> Xen will inject a data abort:
>> Unhandled fault at 0xffffffc00944d048
>> Mem abort info:
>>    ESR = 0x96000000
>>    EC = 0x25: DABT (current EL), IL = 32 bits
>>    SET = 0, FnV = 0
>>    EA = 0, S1PTW = 0
>>    FSC = 0x00: ttbr address size fault
>> Data abort info:
>>    ISV = 0, ISS = 0x00000000
>>    CM = 0, WnR = 0
>> swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000020e2e000
>> [ffffffc00944d048] pgd=100000003ffff803, p4d=100000003ffff803, pud=100000003ffff803, pmd=100000003fffa803, pte=006800009c090f13
>> Internal error: ttbr address size fault: 96000000 [#1] PREEMPT SMP
>> ...
>> Call trace:
>>   pl011_stop_rx+0x70/0x80
>>   tty_port_shutdown+0x7c/0xb4
>>   tty_port_close+0x60/0xcc
>>   uart_close+0x34/0x8c
>>   tty_release+0x144/0x4c0
>>   __fput+0x78/0x220
>>   ____fput+0x1c/0x30
>>   task_work_run+0x88/0xc0
>>   do_notify_resume+0x8d0/0x123c
>>   el0_svc+0xa8/0xc0
>>   el0t_64_sync_handler+0xa4/0x130
>>   el0t_64_sync+0x1a0/0x1a4
>> Code: b9000083 b901f001 794038a0 8b000042 (b9000041)
>> ---[ end trace 83dd93df15c3216f ]---
>> note: bootlogd[132] exited with preempt_count 1
>> /etc/rcS.d/S07bootlogd: line 47: 132 Segmentation fault start-stop-daemon
>>
>> As discussed in [2], this commit makes the access to DMACR register
>> write-ignore as an improvement.
> As discussed earlier, if we decide to improve vpl011 (for now only Stefano shared his opinion),
> then we need to mark *all* the PL011 registers that are not part of SBSA as RAZ/WI.

I would be fine to that. But I would like us to print a message using 
XENLOG_G_DEBUG to catch any OS that would touch those registers.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2022-11-21 11:11 Denis Arefev
@ 2022-11-21 14:28 ` Jason Yan
  0 siblings, 0 replies; 1546+ messages in thread
From: Jason Yan @ 2022-11-21 14:28 UTC (permalink / raw)
  To: Denis Arefev, Anil Gurumurthy
  Cc: Sudarsana Kalluru, James E.J. Bottomley, Martin K. Petersen,
	linux-scsi, linux-kernel, trufanov, vfh

You need a real subject line, not a "Subject:" line placed inside the email body.

type "git help send-email" if you don't know how to use it.

On 2022/11/21 19:11, Denis Arefev wrote:
> Date: Mon, 21 Nov 2022 13:29:03 +0300
> Subject: [PATCH] scsi:bfa: Eliminated buffer overflow
> 
> Buffer 'cmd->adapter_hwpath' of size 32 accessed at
> bfad_bsg.c:101:103 can overflow, since its index 'i'
> can have value 32 that is out of range.
> 
> Signed-off-by: Denis Arefev <arefev@swemel.ru>
> ---
>   drivers/scsi/bfa/bfad_bsg.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/bfa/bfad_bsg.c b/drivers/scsi/bfa/bfad_bsg.c
> index be8dfbe13e90..78615ffc62ef 100644
> --- a/drivers/scsi/bfa/bfad_bsg.c
> +++ b/drivers/scsi/bfa/bfad_bsg.c
> @@ -98,9 +98,9 @@ bfad_iocmd_ioc_get_info(struct bfad_s *bfad, void *cmd)
>   
>   	/* set adapter hw path */
>   	strcpy(iocmd->adapter_hwpath, bfad->pci_name);
> -	for (i = 0; iocmd->adapter_hwpath[i] != ':' && i < BFA_STRING_32; i++)
> +	for (i = 0; iocmd->adapter_hwpath[i] != ':' && i < BFA_STRING_32-2; i++)
>   		;
> -	for (; iocmd->adapter_hwpath[++i] != ':' && i < BFA_STRING_32; )
> +	for (; iocmd->adapter_hwpath[++i] != ':' && i < BFA_STRING_32-1; )
>   		;
>   	iocmd->adapter_hwpath[i] = '\0';
>   	iocmd->status = BFA_STATUS_OK;
> 

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <20230122193117.GA28689@Debian-50-lenny-64-minimal>
@ 2023-01-22 21:42 ` Alejandro Colomar
  2023-01-24 20:01   ` Re: Helge Kreutzmann
  0 siblings, 1 reply; 1546+ messages in thread
From: Alejandro Colomar @ 2023-01-22 21:42 UTC (permalink / raw)
  To: Helge Kreutzmann; +Cc: mario.blaettermann, linux-man


[-- Attachment #1.1: Type: text/plain, Size: 205 bytes --]

Hi Helge,

On 1/22/23 20:31, Helge Kreutzmann wrote:
> Without further ado, the following was found:

Empty report.  An accident? :)

Cheers,

Alex
> 

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-01-22 21:42 ` Re: Alejandro Colomar
@ 2023-01-24 20:01   ` Helge Kreutzmann
  0 siblings, 0 replies; 1546+ messages in thread
From: Helge Kreutzmann @ 2023-01-24 20:01 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: mario.blaettermann, linux-man

[-- Attachment #1: Type: text/plain, Size: 661 bytes --]

Hello Alex,
On Sun, Jan 22, 2023 at 10:42:54PM +0100, Alejandro Colomar wrote:
> Hi Helge,
> 
> On 1/22/23 20:31, Helge Kreutzmann wrote:
> > Without further ado, the following was found:
> 
> Empty report.  An accident? :)

I tried to figure out what happened - but I don't know.

Sorry for the empty report, please disregard.

Greetings

         Helge

-- 
      Dr. Helge Kreutzmann                     debian@helgefjell.de
           Dipl.-Phys.                   http://www.helgefjell.de/debian.php
        64bit GNU powered                     gpg signed mail preferred
           Help keep free software "libre": http://www.ffii.de/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-01-27  1:59 ` Dan Williams
@ 2023-01-27 16:10   ` Alison Schofield
  2023-01-27 19:16     ` Re: Dan Williams
  0 siblings, 1 reply; 1546+ messages in thread
From: Alison Schofield @ 2023-01-27 16:10 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky, Steven Rostedt,
	linux-cxl, linux-kernel

On Thu, Jan 26, 2023 at 05:59:03PM -0800, Dan Williams wrote:
> alison.schofield@ wrote:
> > From: Alison Schofield <alison.schofield@intel.com>
> > 
> > Subject: [PATCH v5 0/5] CXL Poison List Retrieval & Tracing
> > 
> > Changes in v5:
> > - Rebase on cxl/next 
> > - Use struct_size() to calc mbox cmd payload .min_out
> > - s/INTERNAL/INJECTED mocked poison record source
> > - Added Jonathan Reviewed-by tag on Patch 3
> > 
> > Link to v4:
> > https://lore.kernel.org/linux-cxl/cover.1671135967.git.alison.schofield@intel.com/
> > 
> > Add support for retrieving device poison lists and store the returned
> > error records as kernel trace events.
> > 
> > The handling of the poison list is guided by the CXL 3.0 Specification
> > Section 8.2.9.8.4.1. [1] 
> > 
> > Example, triggered by memdev:
> > $ echo 1 > /sys/bus/cxl/devices/mem3/trigger_poison_list
> > cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> 
> I think the pcidev= field wants to be called something like "host" or
> "parent", because there is no strict requirement that a 'struct
> cxl_memdev' is related to a 'struct pci_dev'. In fact in that example
> "cxl_mem.3" is a 'struct platform_device'. Now that I think about it, I
> think all CXL device events should be emitting the PCIe serial number
> for the memdev.

Will do, 'host' and add PCIe serial no.

> 
> I will look in the implementation, but do region= and region_uuid= get
> populated when mem3 is a member of the region?

Not always.
In the case above, where the trigger was by memdev, no.
Region= and region_uuid= (and in the follow-on patch, hpa=) only get
populated if the poison was triggered by region, like the case below.

It could be looked up for the by-memdev cases. Is that wanted?

Thanks for the reviews Dan!
> 
> > 
> > Example, triggered by region:
> > $ echo 1 > /sys/bus/cxl/devices/region5/trigger_poison_list
> > cxl_poison: memdev=mem0 pcidev=cxl_mem.0 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > cxl_poison: memdev=mem1 pcidev=cxl_mem.1 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > 
> > [1]: https://www.computeexpresslink.org/download-the-specification
> > 
> > Alison Schofield (5):
> >   cxl/mbox: Add GET_POISON_LIST mailbox command
> >   cxl/trace: Add TRACE support for CXL media-error records
> >   cxl/memdev: Add trigger_poison_list sysfs attribute
> >   cxl/region: Add trigger_poison_list sysfs attribute
> >   tools/testing/cxl: Mock support for Get Poison List
> > 
> >  Documentation/ABI/testing/sysfs-bus-cxl | 28 +++++++++
> >  drivers/cxl/core/mbox.c                 | 78 +++++++++++++++++++++++
> >  drivers/cxl/core/memdev.c               | 45 ++++++++++++++
> >  drivers/cxl/core/region.c               | 33 ++++++++++
> >  drivers/cxl/core/trace.h                | 83 +++++++++++++++++++++++++
> >  drivers/cxl/cxlmem.h                    | 69 +++++++++++++++++++-
> >  drivers/cxl/pci.c                       |  4 ++
> >  tools/testing/cxl/test/mem.c            | 42 +++++++++++++
> >  8 files changed, 381 insertions(+), 1 deletion(-)
> > 
> > 
> > base-commit: 589c3357370a596ef7c99c00baca8ac799fce531
> > -- 
> > 2.37.3
> > 
> 
> 

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-01-27 16:10   ` Alison Schofield
@ 2023-01-27 19:16     ` Dan Williams
  2023-01-27 21:36       ` Re: Alison Schofield
  0 siblings, 1 reply; 1546+ messages in thread
From: Dan Williams @ 2023-01-27 19:16 UTC (permalink / raw)
  To: Alison Schofield, Dan Williams
  Cc: Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky, Steven Rostedt,
	linux-cxl, linux-kernel

Alison Schofield wrote:
> On Thu, Jan 26, 2023 at 05:59:03PM -0800, Dan Williams wrote:
> > alison.schofield@ wrote:
> > > From: Alison Schofield <alison.schofield@intel.com>
> > > 
> > > Subject: [PATCH v5 0/5] CXL Poison List Retrieval & Tracing
> > > 
> > > Changes in v5:
> > > - Rebase on cxl/next 
> > > - Use struct_size() to calc mbox cmd payload .min_out
> > > - s/INTERNAL/INJECTED mocked poison record source
> > > - Added Jonathan Reviewed-by tag on Patch 3
> > > 
> > > Link to v4:
> > > https://lore.kernel.org/linux-cxl/cover.1671135967.git.alison.schofield@intel.com/
> > > 
> > > Add support for retrieving device poison lists and store the returned
> > > error records as kernel trace events.
> > > 
> > > The handling of the poison list is guided by the CXL 3.0 Specification
> > > Section 8.2.9.8.4.1. [1] 
> > > 
> > > Example, triggered by memdev:
> > > $ echo 1 > /sys/bus/cxl/devices/mem3/trigger_poison_list
> > > cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > 
> > I think the pcidev= field wants to be called something like "host" or
> > "parent", because there is no strict requirement that a 'struct
> > cxl_memdev' is related to a 'struct pci_dev'. In fact in that example
> > "cxl_mem.3" is a 'struct platform_device'. Now that I think about it, I
> > think all CXL device events should be emitting the PCIe serial number
> > for the memdev.
> ]
> 
> Will do, 'host' and add PCIe serial no.
> 
> > 
> > I will look in the implementation, but do region= and region_uuid= get
> > populated when mem3 is a member of the region?
> 
> Not always.
> In the case above, where the trigger was by memdev, no.
> Region= and region_uuid= (and in the follow-on patch, hpa=) only get
> populated if the poison was triggered by region, like the case below.
> 
> It could be looked up for the by memdev cases. Is that wanted?

Just trying to understand the semantics. However, I do think it makes sense
for a memdev trigger to lookup information on all impacted regions
across all of the device's DPA and the region trigger makes sense to
lookup all memdevs, but bounded by the DPA that contributes to that
region. I just want to avoid someone having to trigger the region to get
extra information that was readily available from a memdev listing.

> 
> Thanks for the reviews Dan!
> > 
> > > 
> > > Example, triggered by region:
> > > $ echo 1 > /sys/bus/cxl/devices/region5/trigger_poison_list
> > > cxl_poison: memdev=mem0 pcidev=cxl_mem.0 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > > cxl_poison: memdev=mem1 pcidev=cxl_mem.1 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > > 
> > > [1]: https://www.computeexpresslink.org/download-the-specification
> > > 
> > > Alison Schofield (5):
> > >   cxl/mbox: Add GET_POISON_LIST mailbox command
> > >   cxl/trace: Add TRACE support for CXL media-error records
> > >   cxl/memdev: Add trigger_poison_list sysfs attribute
> > >   cxl/region: Add trigger_poison_list sysfs attribute
> > >   tools/testing/cxl: Mock support for Get Poison List
> > > 
> > >  Documentation/ABI/testing/sysfs-bus-cxl | 28 +++++++++
> > >  drivers/cxl/core/mbox.c                 | 78 +++++++++++++++++++++++
> > >  drivers/cxl/core/memdev.c               | 45 ++++++++++++++
> > >  drivers/cxl/core/region.c               | 33 ++++++++++
> > >  drivers/cxl/core/trace.h                | 83 +++++++++++++++++++++++++
> > >  drivers/cxl/cxlmem.h                    | 69 +++++++++++++++++++-
> > >  drivers/cxl/pci.c                       |  4 ++
> > >  tools/testing/cxl/test/mem.c            | 42 +++++++++++++
> > >  8 files changed, 381 insertions(+), 1 deletion(-)
> > > 
> > > 
> > > base-commit: 589c3357370a596ef7c99c00baca8ac799fce531
> > > -- 
> > > 2.37.3
> > > 
> > 
> > 



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-01-27 19:16     ` Re: Dan Williams
@ 2023-01-27 21:36       ` Alison Schofield
  2023-01-27 22:04         ` Re: Dan Williams
  0 siblings, 1 reply; 1546+ messages in thread
From: Alison Schofield @ 2023-01-27 21:36 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky, Steven Rostedt,
	linux-cxl, linux-kernel

On Fri, Jan 27, 2023 at 11:16:49AM -0800, Dan Williams wrote:
> Alison Schofield wrote:
> > On Thu, Jan 26, 2023 at 05:59:03PM -0800, Dan Williams wrote:
> > > alison.schofield@ wrote:
> > > > From: Alison Schofield <alison.schofield@intel.com>
> > > > 
> > > > Subject: [PATCH v5 0/5] CXL Poison List Retrieval & Tracing
> > > > 
> > > > Changes in v5:
> > > > - Rebase on cxl/next 
> > > > - Use struct_size() to calc mbox cmd payload .min_out
> > > > - s/INTERNAL/INJECTED mocked poison record source
> > > > - Added Jonathan Reviewed-by tag on Patch 3
> > > > 
> > > > Link to v4:
> > > > https://lore.kernel.org/linux-cxl/cover.1671135967.git.alison.schofield@intel.com/
> > > > 
> > > > Add support for retrieving device poison lists and store the returned
> > > > error records as kernel trace events.
> > > > 
> > > > The handling of the poison list is guided by the CXL 3.0 Specification
> > > > Section 8.2.9.8.4.1. [1] 
> > > > 
> > > > Example, triggered by memdev:
> > > > $ echo 1 > /sys/bus/cxl/devices/mem3/trigger_poison_list
> > > > cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > > 
> > > I think the pcidev= field wants to be called something like "host" or
> > > "parent", because there is no strict requirement that a 'struct
> > > cxl_memdev' is related to a 'struct pci_dev'. In fact in that example
> > > "cxl_mem.3" is a 'struct platform_device'. Now that I think about it, I
> > > think all CXL device events should be emitting the PCIe serial number
> > > for the memdev.
> > ]
> > 
> > Will do, 'host' and add PCIe serial no.
> > 
> > > 
> > > I will look in the implementation, but do region= and region_uuid= get
> > > populated when mem3 is a member of the region?
> > 
> > Not always.
> > In the case above, where the trigger was by memdev, no.
> > Region= and region_uuid= (and in the follow-on patch, hpa=) only get
> > populated if the poison was triggered by region, like the case below.
> > 
> > It could be looked up for the by memdev cases. Is that wanted?
> 
> Just trying to understand the semantics. However, I do think it makes sense
> for a memdev trigger to lookup information on all impacted regions
> across all of the device's DPA and the region trigger makes sense to
> lookup all memdevs, but bounded by the DPA that contributes to that
> region. I just want to avoid someone having to trigger the region to get
> extra information that was readily available from a memdev listing.
> 

Dan - 

Confirming my take-away from this email, and our chat:

Remove the by-region trigger_poison_list option entirely. User space
needs to trigger the poison list by-memdev for each memdev participating
in the region and filter those events by region.

Add the region info (region name, uuid) to the TRACE_EVENTs when the
poisoned DPA is part of any region.

Alison

> > 
> > Thanks for the reviews Dan!
> > > 
> > > > 
> > > > Example, triggered by region:
> > > > $ echo 1 > /sys/bus/cxl/devices/region5/trigger_poison_list
> > > > cxl_poison: memdev=mem0 pcidev=cxl_mem.0 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > > > cxl_poison: memdev=mem1 pcidev=cxl_mem.1 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > > > 
> > > > [1]: https://www.computeexpresslink.org/download-the-specification
> > > > 
> > > > Alison Schofield (5):
> > > >   cxl/mbox: Add GET_POISON_LIST mailbox command
> > > >   cxl/trace: Add TRACE support for CXL media-error records
> > > >   cxl/memdev: Add trigger_poison_list sysfs attribute
> > > >   cxl/region: Add trigger_poison_list sysfs attribute
> > > >   tools/testing/cxl: Mock support for Get Poison List
> > > > 
> > > >  Documentation/ABI/testing/sysfs-bus-cxl | 28 +++++++++
> > > >  drivers/cxl/core/mbox.c                 | 78 +++++++++++++++++++++++
> > > >  drivers/cxl/core/memdev.c               | 45 ++++++++++++++
> > > >  drivers/cxl/core/region.c               | 33 ++++++++++
> > > >  drivers/cxl/core/trace.h                | 83 +++++++++++++++++++++++++
> > > >  drivers/cxl/cxlmem.h                    | 69 +++++++++++++++++++-
> > > >  drivers/cxl/pci.c                       |  4 ++
> > > >  tools/testing/cxl/test/mem.c            | 42 +++++++++++++
> > > >  8 files changed, 381 insertions(+), 1 deletion(-)
> > > > 
> > > > 
> > > > base-commit: 589c3357370a596ef7c99c00baca8ac799fce531
> > > > -- 
> > > > 2.37.3
> > > > 
> > > 
> > > 
> 
> 

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-01-27 21:36       ` Re: Alison Schofield
@ 2023-01-27 22:04         ` Dan Williams
  0 siblings, 0 replies; 1546+ messages in thread
From: Dan Williams @ 2023-01-27 22:04 UTC (permalink / raw)
  To: Alison Schofield, Dan Williams
  Cc: Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky, Steven Rostedt,
	linux-cxl, linux-kernel

Alison Schofield wrote:
> On Fri, Jan 27, 2023 at 11:16:49AM -0800, Dan Williams wrote:
> > Alison Schofield wrote:
> > > On Thu, Jan 26, 2023 at 05:59:03PM -0800, Dan Williams wrote:
> > > > alison.schofield@ wrote:
> > > > > From: Alison Schofield <alison.schofield@intel.com>
> > > > > 
> > > > > Subject: [PATCH v5 0/5] CXL Poison List Retrieval & Tracing
> > > > > 
> > > > > Changes in v5:
> > > > > - Rebase on cxl/next 
> > > > > - Use struct_size() to calc mbox cmd payload .min_out
> > > > > - s/INTERNAL/INJECTED mocked poison record source
> > > > > - Added Jonathan Reviewed-by tag on Patch 3
> > > > > 
> > > > > Link to v4:
> > > > > https://lore.kernel.org/linux-cxl/cover.1671135967.git.alison.schofield@intel.com/
> > > > > 
> > > > > Add support for retrieving device poison lists and store the returned
> > > > > error records as kernel trace events.
> > > > > 
> > > > > The handling of the poison list is guided by the CXL 3.0 Specification
> > > > > Section 8.2.9.8.4.1. [1] 
> > > > > 
> > > > > Example, triggered by memdev:
> > > > > $ echo 1 > /sys/bus/cxl/devices/mem3/trigger_poison_list
> > > > > cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > > > 
> > > > I think the pcidev= field wants to be called something like "host" or
> > > > "parent", because there is no strict requirement that a 'struct
> > > > cxl_memdev' is related to a 'struct pci_dev'. In fact in that example
> > > > "cxl_mem.3" is a 'struct platform_device'. Now that I think about it, I
> > > > think all CXL device events should be emitting the PCIe serial number
> > > > for the memdev.
> > > ]
> > > 
> > > Will do, 'host' and add PCIe serial no.
> > > 
> > > > 
> > > > I will look in the implementation, but do region= and region_uuid= get
> > > > populated when mem3 is a member of the region?
> > > 
> > > Not always.
> > > In the case above, where the trigger was by memdev, no.
> > > Region= and region_uuid= (and in the follow-on patch, hpa=) only get
> > > populated if the poison was triggered by region, like the case below.
> > > 
> > > It could be looked up for the by memdev cases. Is that wanted?
> > 
> > Just trying to understand the semantics. However, I do think it makes sense
> > for a memdev trigger to lookup information on all impacted regions
> > across all of the device's DPA and the region trigger makes sense to
> > lookup all memdevs, but bounded by the DPA that contributes to that
> > region. I just want to avoid someone having to trigger the region to get
> > extra information that was readily available from a memdev listing.
> > 
> 
> Dan - 
> 
> Confirming my take-away from this email, and our chat:
> 
> Remove the by-region trigger_poison_list option entirely. User space
> needs to trigger by-memdev the memdevs participating in the region and
> filter those events by region.
> 
> Add the region info (region name, uuid) to the TRACE_EVENTs when the
> poisoned DPA is part of any region.

That's what I was thinking, yes. So the internals of
cxl_mem_get_poison() will take the cxl_region_rwsem for read and compare
the device's endpoint decoder settings against the media error records
to do the region (and later HPA) lookup.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-03-27 13:54 ` Yaroslav Furman
@ 2023-03-27 14:19   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 1546+ messages in thread
From: Greg Kroah-Hartman @ 2023-03-27 14:19 UTC (permalink / raw)
  To: Yaroslav Furman; +Cc: Alan Stern, linux-usb, usb-storage, linux-kernel

On Mon, Mar 27, 2023 at 04:54:22PM +0300, Yaroslav Furman wrote:
> 
> Will this patch get ported to LTS trees? It applies cleanly.
> Would love to see it in 6.1 and 5.15 trees.

What patch?

confused,

greg k-h

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-05-11 12:58 Ryan Roberts
@ 2023-05-11 13:13 ` Ryan Roberts
  0 siblings, 0 replies; 1546+ messages in thread
From: Ryan Roberts @ 2023-05-11 13:13 UTC (permalink / raw)
  To: Andrew Morton, Matthew Wilcox (Oracle), Kirill A. Shutemov,
	SeongJae Park
  Cc: linux-kernel, linux-mm, damon

My apologies for the noise: a blank line between Cc and Subject has broken the
subject and grouping in lore.

Please ignore this; I will resend.


On 11/05/2023 13:58, Ryan Roberts wrote:
> Date: Thu, 11 May 2023 11:38:28 +0100
> Subject: [PATCH v1 0/5] Encapsulate PTE contents from non-arch code
> 
> Hi All,
> 
> This series improves the encapsulation of pte entries by disallowing non-arch
> code from directly dereferencing pte_t pointers. Instead code must use a new
> helper, `pte_t ptep_deref(pte_t *ptep)`. By default, this helper does a direct
> dereference of the pointer, so generated code should be exactly the same. But
> its presence sets us up for arch code being able to override the default to
> "virtualize" the ptes without needing to maintain a shadow table.
> 
> I intend to take advantage of this for arm64 to enable use of its "contiguous
> bit" to coalesce multiple ptes into a single tlb entry, reducing pressure and
> improving performance. I have an RFC for the first part of this work at [1]. The
> cover letter there also explains the second part, which this series is enabling.
> 
> I intend to post an RFC for the contpte changes in due course, but it would be
> good to get the ball rolling on this enabler.
> 
> There are 2 reasons that I need the encapsulation:
> 
>   - Prevent leaking the arch-private PTE_CONT bit to the core code. If the core
>     code reads a pte that contains this bit, it could end up calling
>     set_pte_at() with the bit set which would confuse the implementation. So we
>     can always clear PTE_CONT in ptep_deref() (and ptep_get()) to avoid a leaky
>     abstraction.
>   - Contiguous ptes have a single access and dirty bit for the contiguous range.
>     So we need to "mix-in" those bits when the core is dereferencing a pte that
>     lies in the contig range. There is code that dereferences the pte then takes
>     different actions based on access/dirty (see e.g. write_protect_page()).
> 
> While ptep_get() and ptep_get_lockless() already exist, both of them are
> implemented using READ_ONCE() by default. While we could use ptep_get() instead
> of the new ptep_deref(), I didn't want to risk performance regression.
> Alternatively, all call sites that currently use ptep_get() that need the
> lockless behaviour could be upgraded to ptep_get_lockless() and ptep_get() could
> be downgraded to a simple dereference. That would be cleanest, but is a much
> bigger (and likely error prone) change because all the arch code would need to
> be updated for the new definitions of ptep_get().
> 
> The series is split up as follows:
> 
> patches 1-2: Fix bugs where code was _setting_ ptes directly, rather than using
>             set_pte_at() and friends.
> patch 3:    Fix highmem unmapping issue I spotted while doing the work.
> patch 4:    Introduce the new ptep_deref() helper with default implementation.
> patch 5:    Convert all direct dereferences to use ptep_deref().
> 
> [1] https://lore.kernel.org/linux-mm/20230414130303.2345383-1-ryan.roberts@arm.com/
> 
> Thanks,
> Ryan
> 
> 
> Ryan Roberts (5):
>   mm: vmalloc must set pte via arch code
>   mm: damon must atomically clear young on ptes and pmds
>   mm: Fix failure to unmap pte on highmem systems
>   mm: Add new ptep_deref() helper to fully encapsulate pte_t
>   mm: ptep_deref() conversion
> 
>  .../drm/i915/gem/selftests/i915_gem_mman.c    |   8 +-
>  drivers/misc/sgi-gru/grufault.c               |   2 +-
>  drivers/vfio/vfio_iommu_type1.c               |   7 +-
>  drivers/xen/privcmd.c                         |   2 +-
>  fs/proc/task_mmu.c                            |  33 +++---
>  fs/userfaultfd.c                              |   6 +-
>  include/linux/hugetlb.h                       |   2 +-
>  include/linux/mm_inline.h                     |   2 +-
>  include/linux/pgtable.h                       |  13 ++-
>  kernel/events/uprobes.c                       |   2 +-
>  mm/damon/ops-common.c                         |  18 ++-
>  mm/damon/ops-common.h                         |   4 +-
>  mm/damon/paddr.c                              |   6 +-
>  mm/damon/vaddr.c                              |  14 ++-
>  mm/filemap.c                                  |   2 +-
>  mm/gup.c                                      |  21 ++--
>  mm/highmem.c                                  |  12 +-
>  mm/hmm.c                                      |   2 +-
>  mm/huge_memory.c                              |   4 +-
>  mm/hugetlb.c                                  |   2 +-
>  mm/hugetlb_vmemmap.c                          |   6 +-
>  mm/kasan/init.c                               |   9 +-
>  mm/kasan/shadow.c                             |  10 +-
>  mm/khugepaged.c                               |  24 ++--
>  mm/ksm.c                                      |  22 ++--
>  mm/madvise.c                                  |   6 +-
>  mm/mapping_dirty_helpers.c                    |   4 +-
>  mm/memcontrol.c                               |   4 +-
>  mm/memory-failure.c                           |   6 +-
>  mm/memory.c                                   | 103 +++++++++---------
>  mm/mempolicy.c                                |   6 +-
>  mm/migrate.c                                  |  14 ++-
>  mm/migrate_device.c                           |  14 ++-
>  mm/mincore.c                                  |   2 +-
>  mm/mlock.c                                    |   6 +-
>  mm/mprotect.c                                 |   8 +-
>  mm/mremap.c                                   |   2 +-
>  mm/page_table_check.c                         |   4 +-
>  mm/page_vma_mapped.c                          |  26 +++--
>  mm/pgtable-generic.c                          |   2 +-
>  mm/rmap.c                                     |  32 +++---
>  mm/sparse-vmemmap.c                           |   8 +-
>  mm/swap_state.c                               |   4 +-
>  mm/swapfile.c                                 |  16 +--
>  mm/userfaultfd.c                              |   4 +-
>  mm/vmalloc.c                                  |  11 +-
>  mm/vmscan.c                                   |  14 ++-
>  virt/kvm/kvm_main.c                           |   9 +-
>  48 files changed, 302 insertions(+), 236 deletions(-)
> 
> --
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-05-20  9:47 ` Ze Gao
@ 2023-05-21  3:58   ` Yonghong Song
  2023-05-21 15:10     ` Re: Ze Gao
  2023-05-21  8:08   ` Re: Jiri Olsa
  1 sibling, 1 reply; 1546+ messages in thread
From: Yonghong Song @ 2023-05-21  3:58 UTC (permalink / raw)
  To: Ze Gao, jolsa
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
	John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
	Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
	linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
	songliubraving, Ze Gao



On 5/20/23 2:47 AM, Ze Gao wrote:
> 
> Hi Jiri,
> 
> Would you like to consider to add rcu_is_watching check in
> to solve this from the viewpoint of kprobe_multi_link_prog_run
> itself? And accounting of missed runs can be added as well
> to improve observability.
> 
> Regards,
> Ze
> 
> 
> -----------------
>  From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> From: Ze Gao <zegao@tencent.com>
> Date: Sat, 20 May 2023 17:32:05 +0800
> Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
> 
>  From the perspective of kprobe_multi_link_prog_run, any traceable
> functions can be attached while bpf progs need special care and
> ought to be under rcu protection. To solve the likely rcu lockdep
> warns once for good, when (future) functions in idle path were
> attached accidentally, we had better pay some cost to check at least
> on the kernel side, and return when rcu is not watching, which helps
> to avoid any unpredictable results.

kprobe_multi/fprobe share the same set of attachments with fentry.
Currently, fentry does not filter with !rcu_is_watching, maybe
because this is an extreme corner case. Not sure whether it is
worthwhile or not.

Maybe if you can give a concrete example (e.g., attachment point)
with current code base to show what the issue you encountered and
it will make it easier to judge whether adding !rcu_is_watching()
is necessary or not.

> 
> Signed-off-by: Ze Gao <zegao@tencent.com>
> ---
>   kernel/trace/bpf_trace.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 9a050e36dc6c..3e6ea7274765 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
>   	struct bpf_run_ctx *old_run_ctx;
>   	int err;
>   
> -	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> +	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
>   		err = 0;
>   		goto out;
>   	}

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-05-20  9:47 ` Ze Gao
  2023-05-21  3:58   ` Yonghong Song
@ 2023-05-21  8:08   ` Jiri Olsa
  2023-05-21 10:09     ` Re: Masami Hiramatsu
  1 sibling, 1 reply; 1546+ messages in thread
From: Jiri Olsa @ 2023-05-21  8:08 UTC (permalink / raw)
  To: Ze Gao
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
	John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
	Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
	linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
	songliubraving, Ze Gao

On Sat, May 20, 2023 at 05:47:24PM +0800, Ze Gao wrote:
> 
> Hi Jiri,
> 
> Would you like to consider to add rcu_is_watching check in
> to solve this from the viewpoint of kprobe_multi_link_prog_run

I think this was discussed in here:
  https://lore.kernel.org/bpf/20230321020103.13494-1-laoar.shao@gmail.com/

and was considered a bug, there's fix mentioned later in the thread

there's also this recent patchset:
  https://lore.kernel.org/bpf/20230517034510.15639-3-zegao@tencent.com/

that solves related problems

> itself? And accounting of missed runs can be added as well
> to improve observability.

right, we count fprobe->nmissed but it's not exposed, we should allow
to get 'missed' stats from both fprobe and kprobe_multi later, which
is missing now, will check

thanks,
jirka

> 
> Regards,
> Ze
> 
> 
> -----------------
> From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> From: Ze Gao <zegao@tencent.com>
> Date: Sat, 20 May 2023 17:32:05 +0800
> Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
> 
> From the perspective of kprobe_multi_link_prog_run, any traceable
> functions can be attached while bpf progs need special care and
> ought to be under rcu protection. To solve the likely rcu lockdep
> warns once for good, when (future) functions in idle path were
> attached accidentally, we had better pay some cost to check at least
> on the kernel side, and return when rcu is not watching, which helps
> to avoid any unpredictable results.
> 
> Signed-off-by: Ze Gao <zegao@tencent.com>
> ---
>  kernel/trace/bpf_trace.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 9a050e36dc6c..3e6ea7274765 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
>  	struct bpf_run_ctx *old_run_ctx;
>  	int err;
>  
> -	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> +	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
>  		err = 0;
>  		goto out;
>  	}
> -- 
> 2.40.1
> 

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-05-21  8:08   ` Re: Jiri Olsa
@ 2023-05-21 10:09     ` Masami Hiramatsu
  2023-05-21 14:19       ` Re: Ze Gao
  0 siblings, 1 reply; 1546+ messages in thread
From: Masami Hiramatsu @ 2023-05-21 10:09 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Ze Gao, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Hao Luo, John Fastabend, KP Singh, Martin KaFai Lau,
	Masami Hiramatsu, Song Liu, Stanislav Fomichev, Steven Rostedt,
	Yonghong Song, bpf, linux-kernel, linux-trace-kernel, kafai,
	kpsingh, netdev, paulmck, songliubraving, Ze Gao

On Sun, 21 May 2023 10:08:46 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:

> On Sat, May 20, 2023 at 05:47:24PM +0800, Ze Gao wrote:
> > 
> > Hi Jiri,
> > 
> > Would you like to consider to add rcu_is_watching check in
> > to solve this from the viewpoint of kprobe_multi_link_prog_run
> 
> I think this was discussed in here:
>   https://lore.kernel.org/bpf/20230321020103.13494-1-laoar.shao@gmail.com/
> 
> and was considered a bug, there's fix mentioned later in the thread
> 
> there's also this recent patchset:
>   https://lore.kernel.org/bpf/20230517034510.15639-3-zegao@tencent.com/
> 
> that solves related problems

I think this rcu_is_watching() is a bit different issue. This rcu_is_watching()
check is required if the kprobe_multi_link_prog_run() uses any RCU API.
E.g. rethook_try_get() is also checks rcu_is_watching() because it uses
call_rcu().

Thank you,

> 
> > itself? And accounting of missed runs can be added as well
> > to improve observability.
> 
> right, we count fprobe->nmissed but it's not exposed, we should allow
> to get 'missed' stats from both fprobe and kprobe_multi later, which
> is missing now, will check
> 
> thanks,
> jirka
> 
> > 
> > Regards,
> > Ze
> > 
> > 
> > -----------------
> > From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> > From: Ze Gao <zegao@tencent.com>
> > Date: Sat, 20 May 2023 17:32:05 +0800
> > Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
> > 
> > From the perspective of kprobe_multi_link_prog_run, any traceable
> > functions can be attached while bpf progs need special care and
> > ought to be under rcu protection. To solve the likely rcu lockdep
> > warns once for good, when (future) functions in idle path were
> > attached accidentally, we had better pay some cost to check at least
> > on the kernel side, and return when rcu is not watching, which helps
> > to avoid any unpredictable results.
> > 
> > Signed-off-by: Ze Gao <zegao@tencent.com>
> > ---
> >  kernel/trace/bpf_trace.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 9a050e36dc6c..3e6ea7274765 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
> >  	struct bpf_run_ctx *old_run_ctx;
> >  	int err;
> >  
> > -	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> > +	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
> >  		err = 0;
> >  		goto out;
> >  	}
> > -- 
> > 2.40.1
> > 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-05-21 10:09     ` Re: Masami Hiramatsu
@ 2023-05-21 14:19       ` Ze Gao
  0 siblings, 0 replies; 1546+ messages in thread
From: Ze Gao @ 2023-05-21 14:19 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Jiri Olsa, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Hao Luo, John Fastabend, KP Singh, Martin KaFai Lau, Song Liu,
	Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
	linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
	songliubraving, Ze Gao

On Sun, May 21, 2023 at 6:09 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Sun, 21 May 2023 10:08:46 +0200
> Jiri Olsa <olsajiri@gmail.com> wrote:
>
> > On Sat, May 20, 2023 at 05:47:24PM +0800, Ze Gao wrote:
> > >
> > > Hi Jiri,
> > >
> > > Would you like to consider to add rcu_is_watching check in
> > > to solve this from the viewpoint of kprobe_multi_link_prog_run
> >
> > I think this was discussed in here:
> >   https://lore.kernel.org/bpf/20230321020103.13494-1-laoar.shao@gmail.com/
> >
> > and was considered a bug, there's fix mentioned later in the thread
> >
> > there's also this recent patchset:
> >   https://lore.kernel.org/bpf/20230517034510.15639-3-zegao@tencent.com/
> >
> > that solves related problems
>
> I think this rcu_is_watching() is a bit different issue. This rcu_is_watching()
> check is required if the kprobe_multi_link_prog_run() uses any RCU API.
> E.g. rethook_try_get() is also checks rcu_is_watching() because it uses
> call_rcu().

Yes, that's my point!

Regards,
Ze

>
> >
> > > itself? And accounting of missed runs can be added as well
> > > to improve observability.
> >
> > right, we count fprobe->nmissed but it's not exposed, we should allow
> > to get 'missed' stats from both fprobe and kprobe_multi later, which
> > is missing now, will check
> >
> > thanks,
> > jirka
> >
> > >
> > > Regards,
> > > Ze
> > >
> > >
> > > -----------------
> > > From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> > > From: Ze Gao <zegao@tencent.com>
> > > Date: Sat, 20 May 2023 17:32:05 +0800
> > > Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
> > >
> > > From the perspective of kprobe_multi_link_prog_run, any traceable
> > > functions can be attached while bpf progs need special care and
> > > ought to be under rcu protection. To solve the likely rcu lockdep
> > > warns once for good, when (future) functions in idle path were
> > > attached accidentally, we had better pay some cost to check at least
> > > on the kernel side, and return when rcu is not watching, which helps
> > > to avoid any unpredictable results.
> > >
> > > Signed-off-by: Ze Gao <zegao@tencent.com>
> > > ---
> > >  kernel/trace/bpf_trace.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > > index 9a050e36dc6c..3e6ea7274765 100644
> > > --- a/kernel/trace/bpf_trace.c
> > > +++ b/kernel/trace/bpf_trace.c
> > > @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
> > >     struct bpf_run_ctx *old_run_ctx;
> > >     int err;
> > >
> > > -   if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> > > +   if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
> > >             err = 0;
> > >             goto out;
> > >     }
> > > --
> > > 2.40.1
> > >
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-05-21  3:58   ` Yonghong Song
@ 2023-05-21 15:10     ` Ze Gao
  2023-05-21 20:26       ` Re: Jiri Olsa
  0 siblings, 1 reply; 1546+ messages in thread
From: Ze Gao @ 2023-05-21 15:10 UTC (permalink / raw)
  To: Yonghong Song
  Cc: jolsa, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Hao Luo, John Fastabend, KP Singh, Martin KaFai Lau,
	Masami Hiramatsu, Song Liu, Stanislav Fomichev, Steven Rostedt,
	Yonghong Song, bpf, linux-kernel, linux-trace-kernel, kafai,
	kpsingh, netdev, paulmck, songliubraving, Ze Gao

> kprobe_multi/fprobe share the same set of attachments with fentry.
> Currently, fentry does not filter with !rcu_is_watching, maybe
> because this is an extreme corner case. Not sure whether it is
> worthwhile or not.

Agreed, it's rare, especially after Peter's patches, which narrow down
the rcu eqs regions in the idle path and reduce the chance of any
traceable functions happening in between.

However, from RCU's perspective, we theoretically ought to check
rcu_is_watching whenever there's a chance our code will run in the idle
path and we need rcu to be alive. And we cannot simply make assumptions
about any future changes in the idle path. You know, just like what was
hit in the thread.

> Maybe if you can give a concrete example (e.g., attachment point)
> with current code base to show what the issue you encountered and
> it will make it easier to judge whether adding !rcu_is_watching()
> is necessary or not.

I can reproduce the likely warnings on v6.1.18, where arch_cpu_idle is
traceable, but not on the latest version so far. But as I state above,
in theory we need it. So here is a gentle ping :) .

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-05-21 15:10     ` Re: Ze Gao
@ 2023-05-21 20:26       ` Jiri Olsa
  2023-05-22  1:36         ` Re: Masami Hiramatsu
  2023-05-22  2:07         ` Re: Ze Gao
  0 siblings, 2 replies; 1546+ messages in thread
From: Jiri Olsa @ 2023-05-21 20:26 UTC (permalink / raw)
  To: Ze Gao
  Cc: Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
	Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
	songliubraving, Ze Gao

On Sun, May 21, 2023 at 11:10:16PM +0800, Ze Gao wrote:
> > kprobe_multi/fprobe share the same set of attachments with fentry.
> > Currently, fentry does not filter with !rcu_is_watching, maybe
> > because this is an extreme corner case. Not sure whether it is
> > worthwhile or not.
> 
> Agreed, it's rare, especially after Peter's patches which push narrow
> down rcu eqs regions
> in the idle path and reduce the chance of any traceable functions
> happening in between.
> 
> However, from RCU's perspective, we ought to check if rcu_is_watching
> theoretically
> when there's a chance our code will run in the idle path and also we
> need rcu to be alive,
> And also we cannot simply make assumptions for any future changes in
> the idle path.
> You know, just like what was hit in the thread.
> 
> > Maybe if you can give a concrete example (e.g., attachment point)
> > with current code base to show what the issue you encountered and
> > it will make it easier to judge whether adding !rcu_is_watching()
> > is necessary or not.
> 
> I can reproduce likely warnings on v6.1.18 where arch_cpu_idle is
> traceable but not on the latest version
> so far. But as I state above, in theory we need it. So here is a
> gentle ping :) .

hum, this change [1] added rcu_is_watching check to ftrace_test_recursion_trylock,
which we use in fprobe_handler and is coming to fprobe_exit_handler in [2]

I might be missing something, but it seems like we don't need another
rcu_is_watching call on kprobe_multi level

jirka


[1] d099dbfd3306 cpuidle: tracing: Warn about !rcu_is_watching()
[2] https://lore.kernel.org/bpf/20230517034510.15639-4-zegao@tencent.com/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-05-21 20:26       ` Re: Jiri Olsa
@ 2023-05-22  1:36         ` Masami Hiramatsu
  2023-05-22  2:07         ` Re: Ze Gao
  1 sibling, 0 replies; 1546+ messages in thread
From: Masami Hiramatsu @ 2023-05-22  1:36 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Ze Gao, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
	Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
	songliubraving, Ze Gao

On Sun, 21 May 2023 22:26:37 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:

> On Sun, May 21, 2023 at 11:10:16PM +0800, Ze Gao wrote:
> > > kprobe_multi/fprobe share the same set of attachments with fentry.
> > > Currently, fentry does not filter with !rcu_is_watching, maybe
> > > because this is an extreme corner case. Not sure whether it is
> > > worthwhile or not.
> > 
> > Agreed, it's rare, especially after Peter's patches which push narrow
> > down rcu eqs regions
> > in the idle path and reduce the chance of any traceable functions
> > happening in between.
> > 
> > However, from RCU's perspective, we ought to check if rcu_is_watching
> > theoretically
> > when there's a chance our code will run in the idle path and also we
> > need rcu to be alive,
> > And also we cannot simply make assumptions for any future changes in
> > the idle path.
> > You know, just like what was hit in the thread.
> > 
> > > Maybe if you can give a concrete example (e.g., attachment point)
> > > with current code base to show what the issue you encountered and
> > > it will make it easier to judge whether adding !rcu_is_watching()
> > > is necessary or not.
> > 
> > I can reproduce likely warnings on v6.1.18 where arch_cpu_idle is
> > traceable but not on the latest version
> > so far. But as I state above, in theory we need it. So here is a
> > gentle ping :) .
> 
> hum, this change [1] added rcu_is_watching check to ftrace_test_recursion_trylock,
> which we use in fprobe_handler and is coming to fprobe_exit_handler in [2]
> 
> I might be missing something, but it seems like we don't need another
> rcu_is_watching call on kprobe_multi level

Good point! OK, then it seems we don't need it. The rethook continues to
use the rcu_is_watching() because it is also used from kprobes, but the
kprobe_multi doesn't need it.

Thank you,

> 
> jirka
> 
> 
> [1] d099dbfd3306 cpuidle: tracing: Warn about !rcu_is_watching()
> [2] https://lore.kernel.org/bpf/20230517034510.15639-4-zegao@tencent.com/


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-05-21 20:26       ` Re: Jiri Olsa
  2023-05-22  1:36         ` Re: Masami Hiramatsu
@ 2023-05-22  2:07         ` Ze Gao
  2023-05-23  4:38           ` Re: Yonghong Song
  2023-05-23  5:30           ` Re: Masami Hiramatsu
  1 sibling, 2 replies; 1546+ messages in thread
From: Ze Gao @ 2023-05-22  2:07 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
	Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
	songliubraving, Ze Gao

Oops, I missed that. Thanks for pointing that out, which I thought is
conditional use of rcu_is_watching before.

One last point, I think we should double check on this
     "fentry does not filter with !rcu_is_watching"
as quoted from Yonghong and argue whether it needs
the same check for fentry as well.

Regards,
Ze

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-05-22  2:07         ` Re: Ze Gao
@ 2023-05-23  4:38           ` Yonghong Song
  2023-05-23  5:30           ` Re: Masami Hiramatsu
  1 sibling, 0 replies; 1546+ messages in thread
From: Yonghong Song @ 2023-05-23  4:38 UTC (permalink / raw)
  To: Ze Gao, Jiri Olsa
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
	John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
	Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
	linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
	songliubraving, Ze Gao



On 5/21/23 7:07 PM, Ze Gao wrote:
> Oops, I missed that. Thanks for pointing that out, which I thought is
> conditional use of rcu_is_watching before.
> 
> One last point, I think we should double check on this
>       "fentry does not filter with !rcu_is_watching"
> as quoted from Yonghong and argue whether it needs
> the same check for fentry as well.

I would suggest that we address rcu_is_watching issue for fentry
only if we do have a reproducible case to show something goes wrong...

> 
> Regards,
> Ze

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-05-22  2:07         ` Re: Ze Gao
  2023-05-23  4:38           ` Re: Yonghong Song
@ 2023-05-23  5:30           ` Masami Hiramatsu
  2023-05-23  6:59             ` Re: Paul E. McKenney
  1 sibling, 1 reply; 1546+ messages in thread
From: Masami Hiramatsu @ 2023-05-23  5:30 UTC (permalink / raw)
  To: Ze Gao
  Cc: Jiri Olsa, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
	Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
	songliubraving, Ze Gao

On Mon, 22 May 2023 10:07:42 +0800
Ze Gao <zegao2021@gmail.com> wrote:

> Oops, I missed that. Thanks for pointing that out, which I thought is
> conditional use of rcu_is_watching before.
> 
> One last point, I think we should double check on this
>      "fentry does not filter with !rcu_is_watching"
> as quoted from Yonghong and argue whether it needs
> the same check for fentry as well.

rcu_is_watching() comment says;

 * if the current CPU is not in its idle loop or is in an interrupt or
 * NMI handler, return true.

Thus it returns *fault* if the current CPU is in the idle loop and not
any interrupt(including NMI) context. This means if any tracable function
is called from idle loop, it can be !rcu_is_watching(). I meant, this is
'context' based check, thus fentry can not filter out that some commonly
used functions is called from that context but it can be detected.

Thank you,

> 
> Regards,
> Ze


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-05-23  5:30           ` Re: Masami Hiramatsu
@ 2023-05-23  6:59             ` Paul E. McKenney
  2023-05-25  0:13               ` Re: Masami Hiramatsu
  0 siblings, 1 reply; 1546+ messages in thread
From: Paul E. McKenney @ 2023-05-23  6:59 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ze Gao, Jiri Olsa, Yonghong Song, Alexei Starovoitov,
	Andrii Nakryiko, Daniel Borkmann, Hao Luo, John Fastabend,
	KP Singh, Martin KaFai Lau, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, songliubraving,
	Ze Gao

On Tue, May 23, 2023 at 01:30:19PM +0800, Masami Hiramatsu wrote:
> On Mon, 22 May 2023 10:07:42 +0800
> Ze Gao <zegao2021@gmail.com> wrote:
> 
> > Oops, I missed that. Thanks for pointing that out, which I thought is
> > conditional use of rcu_is_watching before.
> > 
> > One last point, I think we should double check on this
> >      "fentry does not filter with !rcu_is_watching"
> > as quoted from Yonghong and argue whether it needs
> > the same check for fentry as well.
> 
> rcu_is_watching() comment says;
> 
>  * if the current CPU is not in its idle loop or is in an interrupt or
>  * NMI handler, return true.
> 
> Thus it returns *fault* if the current CPU is in the idle loop and not
> any interrupt(including NMI) context. This means if any tracable function
> is called from idle loop, it can be !rcu_is_watching(). I meant, this is
> 'context' based check, thus fentry can not filter out that some commonly
> used functions is called from that context but it can be detected.

It really does return false (rather than faulting?) if the current CPU
is deep within the idle loop.

In addition, the recent x86/entry rework (thank you Peter and
Thomas!) mean that the "idle loop" is quite restricted, as can be
seen by the invocations of ct_cpuidle_enter() and ct_cpuidle_exit().
For example, in default_idle_call(), these are immediately before and
after the call to arch_cpu_idle().

Would the following help?  Or am I missing your point?

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 1449cb69a0e0..fae9b4e29c93 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -679,10 +679,14 @@ static void rcu_disable_urgency_upon_qs(struct rcu_data *rdp)
 /**
  * rcu_is_watching - see if RCU thinks that the current CPU is not idle
  *
- * Return true if RCU is watching the running CPU, which means that this
- * CPU can safely enter RCU read-side critical sections.  In other words,
- * if the current CPU is not in its idle loop or is in an interrupt or
- * NMI handler, return true.
+ * Return @true if RCU is watching the running CPU and @false otherwise.
+ * An @true return means that this CPU can safely enter RCU read-side
+ * critical sections.
+ *
+ * More specifically, if the current CPU is not deep within its idle
+ * loop, return @true.  Note that rcu_is_watching() will return @true if
+ * invoked from an interrupt or NMI handler, even if that interrupt or
+ * NMI interrupted the CPU while it was deep within its idle loop.
  *
  * Make notrace because it can be called by the internal functions of
  * ftrace, and making this notrace removes unnecessary recursion calls.

^ permalink raw reply related	[flat|nested] 1546+ messages in thread

* Re:
  2023-05-23  6:59             ` Re: Paul E. McKenney
@ 2023-05-25  0:13               ` Masami Hiramatsu
  0 siblings, 0 replies; 1546+ messages in thread
From: Masami Hiramatsu @ 2023-05-25  0:13 UTC (permalink / raw)
  To: paulmck
  Cc: Ze Gao, Jiri Olsa, Yonghong Song, Alexei Starovoitov,
	Andrii Nakryiko, Daniel Borkmann, Hao Luo, John Fastabend,
	KP Singh, Martin KaFai Lau, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, songliubraving,
	Ze Gao

On Mon, 22 May 2023 23:59:28 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> On Tue, May 23, 2023 at 01:30:19PM +0800, Masami Hiramatsu wrote:
> > On Mon, 22 May 2023 10:07:42 +0800
> > Ze Gao <zegao2021@gmail.com> wrote:
> > 
> > > Oops, I missed that. Thanks for pointing that out, which I thought is
> > > conditional use of rcu_is_watching before.
> > > 
> > > One last point, I think we should double check on this
> > >      "fentry does not filter with !rcu_is_watching"
> > > as quoted from Yonghong and argue whether it needs
> > > the same check for fentry as well.
> > 
> > rcu_is_watching() comment says;
> > 
> >  * if the current CPU is not in its idle loop or is in an interrupt or
> >  * NMI handler, return true.
> > 
> > Thus it returns *fault* if the current CPU is in the idle loop and not
> > any interrupt(including NMI) context. This means if any tracable function
> > is called from idle loop, it can be !rcu_is_watching(). I meant, this is
> > 'context' based check, thus fentry can not filter out that some commonly
> > used functions is called from that context but it can be detected.
> 
> It really does return false (rather than faulting?) if the current CPU
> is deep within the idle loop.
> 
> In addition, the recent x86/entry rework (thank you Peter and
> Thomas!) mean that the "idle loop" is quite restricted, as can be
> seen by the invocations of ct_cpuidle_enter() and ct_cpuidle_exit().
> For example, in default_idle_call(), these are immediately before and
> after the call to arch_cpu_idle().

Thanks! I also found that default_idle_call() is small enough, and this
seems not to happen with fentry because there are no commonly used
functions on that path.

> 
> Would the following help?  Or am I missing your point?

Yes, thank you for the update!

> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 1449cb69a0e0..fae9b4e29c93 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -679,10 +679,14 @@ static void rcu_disable_urgency_upon_qs(struct rcu_data *rdp)
>  /**
>   * rcu_is_watching - see if RCU thinks that the current CPU is not idle
>   *
> - * Return true if RCU is watching the running CPU, which means that this
> - * CPU can safely enter RCU read-side critical sections.  In other words,
> - * if the current CPU is not in its idle loop or is in an interrupt or
> - * NMI handler, return true.
> + * Return @true if RCU is watching the running CPU and @false otherwise.
> + * An @true return means that this CPU can safely enter RCU read-side
> + * critical sections.
> + *
> + * More specifically, if the current CPU is not deep within its idle
> + * loop, return @true.  Note that rcu_is_watching() will return @true if
> + * invoked from an interrupt or NMI handler, even if that interrupt or
> + * NMI interrupted the CPU while it was deep within its idle loop.
>   *
>   * Make notrace because it can be called by the internal functions of
>   * ftrace, and making this notrace removes unnecessary recursion calls.


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE;
@ 2023-05-30  1:31 Olena Shevchenko
  0 siblings, 0 replies; 1546+ messages in thread
From: Olena Shevchenko @ 2023-05-30  1:31 UTC (permalink / raw)
  To: soc

Hello,

I have funds for investment. Can we partner if you have a good business idea? 


Thank you
Mrs. Olena 

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <CAKEZqKKdQ9EhRobSmq0sV76arfpk6m5XqA-=XQP_M3VRG=M-eg@mail.gmail.com>
@ 2023-06-08  8:13 ` chenlei0x
  0 siblings, 0 replies; 1546+ messages in thread
From: chenlei0x @ 2023-06-08  8:13 UTC (permalink / raw)
  To: linux-xfs

unsubscribe linux-xfs

On Thu, Jun 8, 2023 at 4:11 PM chenlei0x <losemyheaven@gmail.com> wrote:
>
> unsubscribe

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found]   ` <CAEhhANom-MGPCqEk5LXufMkxvnoY0YRUrr0r07s0_7F=eCQH5Q@mail.gmail.com>
@ 2023-06-08 10:51     ` Daniel Little
  0 siblings, 0 replies; 1546+ messages in thread
From: Daniel Little @ 2023-06-08 10:51 UTC (permalink / raw)
  To: linux-btrfs, support

[-- Attachment #1: Type: text/plain, Size: 2581 bytes --]

>>
>> Good Day,
>>
>> I’m sorry to message the developers this way. I’m sure this is not the purpose of being able to contact developers, but I am pretty desperate here.
>>
>> I’m desperately seeking some "hands-on" assistance with my broken Rocstor setup. I have a lot of photos and videos on my drives that I cannot reproduce and would really like to retrieve.
>>
>> With my limited knowledge and skill I have tried as best I can to follow the suggestions made by Philip on my forum post (Disk Pool mounted, shared missing. many errors - #2 by phillxnet), but I’m no closer to success than when I started. I’m sure it’s because I’m not doing things right. If someone smarter than me is willing to offer their precious time to assist I am happy to set up remote access to the system for them to work/diagnose/troubleshoot directly. I will fit into your schedule whenever and whatever that may be. I’m willing to put on the dunce hat and be tarred and feathered and publicly mocked, so long as some kind souls help me to recover the data.
>>
>> I eagerly await and appreciate any assistance offered. I respectfully understand too if this is not something anyone wants to take on.
>>
>> SITREP:
>>
>> Rockstor 4.1.0-0 installed on a ESXI vm. Tried to get vmware-tools installed. followed a guide blindly. vm rebooted, all hell broke loose.
>>
>> “Parent transid verify failed… wanted 32616 found 32441”
>> Pool remounts automatically as read-only.
>>
>> OUTPUTS:
>>
>>
>>
>> uname -a
>>
>> Linux RocStor 5.3.18-150300.59.106-default #1 SMP Mon Dec 12 13:16:24 UTC 2022 (774239c) x86_64 x86_64 x86_64 GNU/Linux
>>
>>
>>
>> btrfs --version
>>
>> btrfs-progs v4.19.1
>>
>>
>>
>> btrfs fi show
>>
>> Label: ‘ROOT’     uuid: 4ac1b0f-afeb-4946-aad1-975a2a26c941
>>
>>                              Total devices 1 FS bytes used 4.65GiB
>>
>>                              Devid 1 size 47.93GiB used 5.80GiB path /dev/sda4
>>
>>
>>
>> Label: ‘DATA’      uuid: 8d3ee597-bddc-4de8-8fc0-23fde00e27f1
>>
>>                              Total devices 1 FS bytes used 768.00KiB
>>
>>                              Devid 1 size 16.37TiB used 11.72TiB path /dev/sdb
>>
>>
>>
>> Inside DATA there are only two folders. DATASTORE and SyncThing. All the required data is in DATASTORE.
>>
>>
>>
>> Btrfs fi df /home
>>
>> Data, single: total=5.54GiB, used=4.55GiB
>>
>> System, single: total=32.00MiB, used=16.00KiB
>>
>> Metadata, single: total=232.00MiB, used=110.05MiB
>>
>> GlobalReserve, single: total=11.55MiB, used=0.00B

[-- Attachment #2: requested_logs.tgz --]
[-- Type: application/x-compressed, Size: 27311 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-06-27 11:10 Alvaro a-m
@ 2023-06-27 11:15 ` Michael Kjörling
  0 siblings, 0 replies; 1546+ messages in thread
From: Michael Kjörling @ 2023-06-27 11:15 UTC (permalink / raw)
  To: cryptsetup

On 27 Jun 2023 13:10 +0200, from alvaroam007@gmail.com (Alvaro a-m):
> Do you know any solution for this? Can I enable the touch screen
> before LUKS gets up?

That is a distribution issue; not a LUKS, dm-crypt or cryptsetup
issue. You should ask your question in a forum more geared toward
whatever distribution you are using.

I do hope that you will be able to find an answer, but the cryptsetup
mailing list is the wrong forum for your question.

-- 
Michael Kjörling                     🔗 https://michael.kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <64b09dbb.630a0220.e80b9.e2ed@mx.google.com>
@ 2023-07-14  8:05 ` Andy Shevchenko
  0 siblings, 0 replies; 1546+ messages in thread
From: Andy Shevchenko @ 2023-07-14  8:05 UTC (permalink / raw)
  To: luoruihong
  Cc: ilpo.jarvinen, gregkh, jirislaby, linux-kernel, linux-serial,
	luoruihong, weipengliang, wengjinfei

On Fri, Jul 14, 2023 at 08:58:29AM +0800, luoruihong wrote:
> On Thu, Jul 13, 2023 at 07:51:14PM +0300, Andy Shevchenko wrote:
> > On Thu, Jul 13, 2023 at 08:42:36AM +0800, Ruihong Luo wrote:
> > > Preserve the original value of the Divisor Latch Fraction (DLF) register.
> > > When the DLF register is modified without preservation, it can disrupt
> > > the baudrate settings established by firmware or bootloader, leading to
> > > data corruption and the generation of unreadable or distorted characters.
> >
> > You forgot to add my tag. Why? Do you think the name of variable warrants this?
> > Whatever,
> > Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> >
> > Next time if you don't pick up somebody's tag, care to explain in the changelog
> > why.
> >
> > > Fixes: 701c5e73b296 ("serial: 8250_dw: add fractional divisor support")
> > > Signed-off-by: Ruihong Luo <colorsu1922@gmail.com>
> 
> I'm sorry, I didn't know about this rule. Thank you for helping me add
> the missing tags back and for all your previous kind assistance.

For now no need to do anything, just wait for Ilpo's and/or Greg's answer(s),

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] <TXJgqLzlM6oCfTXKSqrSBk@txt.att.net>
@ 2023-08-09  5:12 ` Luna Jernberg
  0 siblings, 0 replies; 1546+ messages in thread
From: Luna Jernberg @ 2023-08-09  5:12 UTC (permalink / raw)
  To: 5598162950, Luna Jernberg; +Cc: git

What is the question?

Den ons 9 aug. 2023 kl 03:31 skrev <5598162950@mms.cricketwireless.net>:

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found]               ` <875y2u5s8g.ffs@tglx>
@ 2023-10-25 22:11                 ` Mario Limonciello
  2023-10-26  9:27                   ` Re: Thomas Gleixner
  0 siblings, 1 reply; 1546+ messages in thread
From: Mario Limonciello @ 2023-10-25 22:11 UTC (permalink / raw)
  To: Thomas Gleixner, David Lazar
  Cc: Hans de Goede, kys, hpa, x86, LKML, Borislav Petkov,
	Rafael J. Wysocki, Linux kernel regressions list

On 10/25/2023 16:04, Thomas Gleixner wrote:
> David and a few others reported that on certain newer systems some legacy
> interrupts fail to work correctly.
> 
> Debugging revealed that the BIOS of these systems leaves the legacy PIC in
> uninitialized state which makes the PIC detection fail and the kernel
> switches to a dummy implementation.
> 
> Unfortunately this fallback causes quite some code to fail as it depends on
> checks for the number of legacy PIC interrupts or the availability of the
> real PIC.
> 
> In theory there is no reason to use the PIC on any modern system when
> IO/APIC is available, but the dependencies on the related checks cannot be
> resolved trivially and on short notice. This needs lots of analysis and
> rework.
> 
> The PIC detection has been added to avoid quirky checks and force selection
> of the dummy implementation all over the place, especially in VM guest
> scenarios. So it's not an option to revert the relevant commit as that
> would break a lot of other scenarios.
> 
> One solution would be to try to initialize the PIC on detection fail and
> retry the detection, but that puts the burden on everything which does not
> have a PIC.
> 
> Fortunately the ACPI/MADT table header has a flag field, which advertises
> in bit 0 that the system is PCAT compatible, which means it has a legacy
> 8259 PIC.
> 
> Evaluate that bit and if set avoid the detection routine and keep the real
> PIC installed, which then gets initialized (for nothing) and makes the rest
> of the code with all the dependencies work again.
> 
> Fixes: e179f6914152 ("x86, irq, pic: Probe for legacy PIC and set legacy_pic appropriately")
> Reported-by: David Lazar <dlazar@gmail.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: David Lazar <dlazar@gmail.com>
> Cc: stable@vger.kernel.org
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218003

s/Link/Closes/

Presumably you will add a proper subject when this is committed?

With adding title and fixing that tag:

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>

> ---
> ---
>   arch/x86/include/asm/i8259.h |    2 ++
>   arch/x86/kernel/acpi/boot.c  |    3 +++
>   arch/x86/kernel/i8259.c      |   38 ++++++++++++++++++++++++++++++--------
>   3 files changed, 35 insertions(+), 8 deletions(-)
> 
> --- a/arch/x86/include/asm/i8259.h
> +++ b/arch/x86/include/asm/i8259.h
> @@ -69,6 +69,8 @@ struct legacy_pic {
>   	void (*make_irq)(unsigned int irq);
>   };
>   
> +void legacy_pic_pcat_compat(void);
> +
>   extern struct legacy_pic *legacy_pic;
>   extern struct legacy_pic null_legacy_pic;
>   
> --- a/arch/x86/kernel/acpi/boot.c
> +++ b/arch/x86/kernel/acpi/boot.c
> @@ -148,6 +148,9 @@ static int __init acpi_parse_madt(struct
>   		pr_debug("Local APIC address 0x%08x\n", madt->address);
>   	}
>   
> +	if (madt->flags & ACPI_MADT_PCAT_COMPAT)
> +		legacy_pic_pcat_compat();
> +
>   	/* ACPI 6.3 and newer support the online capable bit. */
>   	if (acpi_gbl_FADT.header.revision > 6 ||
>   	    (acpi_gbl_FADT.header.revision == 6 &&
> --- a/arch/x86/kernel/i8259.c
> +++ b/arch/x86/kernel/i8259.c
> @@ -32,6 +32,7 @@
>    */
>   static void init_8259A(int auto_eoi);
>   
> +static bool pcat_compat __ro_after_init;
>   static int i8259A_auto_eoi;
>   DEFINE_RAW_SPINLOCK(i8259A_lock);
>   
> @@ -299,15 +300,32 @@ static void unmask_8259A(void)
>   
>   static int probe_8259A(void)
>   {
> +	unsigned char new_val, probe_val = ~(1 << PIC_CASCADE_IR);
>   	unsigned long flags;
> -	unsigned char probe_val = ~(1 << PIC_CASCADE_IR);
> -	unsigned char new_val;
> +
> +	/*
> +	 * If MADT has the PCAT_COMPAT flag set, then do not bother probing
> +	 * for the PIC. Some BIOSes leave the PIC uninitialized and probing
> +	 * fails.
> +	 *
> +	 * Right now this causes problems as quite some code depends on
> +	 * nr_legacy_irqs() > 0 or has_legacy_pic() == true. This is silly
> +	 * when the system has an IO/APIC because then PIC is not required
> +	 * at all, except for really old machines where the timer interrupt
> +	 * must be routed through the PIC. So just pretend that the PIC is
> +	 * there and let legacy_pic->init() initialize it for nothing.
> +	 *
> +	 * Alternatively this could just try to initialize the PIC and
> +	 * repeat the probe, but for cases where there is no PIC that's
> +	 * just pointless.
> +	 */
> +	if (pcat_compat)
> +		return nr_legacy_irqs();
> +
>   	/*
> -	 * Check to see if we have a PIC.
> -	 * Mask all except the cascade and read
> -	 * back the value we just wrote. If we don't
> -	 * have a PIC, we will read 0xff as opposed to the
> -	 * value we wrote.
> +	 * Check to see if we have a PIC.  Mask all except the cascade and
> +	 * read back the value we just wrote. If we don't have a PIC, we
> +	 * will read 0xff as opposed to the value we wrote.
>   	 */
>   	raw_spin_lock_irqsave(&i8259A_lock, flags);
>   
> @@ -429,5 +447,9 @@ static int __init i8259A_init_ops(void)
>   
>   	return 0;
>   }
> -
>   device_initcall(i8259A_init_ops);
> +
> +void __init legacy_pic_pcat_compat(void)
> +{
> +	pcat_compat = true;
> +}


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-10-25 22:11                 ` Re: Mario Limonciello
@ 2023-10-26  9:27                   ` Thomas Gleixner
  0 siblings, 0 replies; 1546+ messages in thread
From: Thomas Gleixner @ 2023-10-26  9:27 UTC (permalink / raw)
  To: Mario Limonciello, David Lazar
  Cc: Hans de Goede, kys, hpa, x86, LKML, Borislav Petkov,
	Rafael J. Wysocki, Linux kernel regressions list

On Wed, Oct 25 2023 at 17:11, Mario Limonciello wrote:
> On 10/25/2023 16:04, Thomas Gleixner wrote:
>> Cc: stable@vger.kernel.org
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218003
>
> s/Link/Closes/

Sure.

> Presumably you will add a proper subject when this is committed?

Bah, yes. I stopped replacing the subject line right after clearing it :(

> With adding title and fixing that tag:
>
> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-12-07  4:40 Emma Tebibyte
@ 2023-12-07  5:00 ` Christoph Anton Mitterer
  2023-12-07  5:29   ` Re: Lawrence Velázquez
  0 siblings, 1 reply; 1546+ messages in thread
From: Christoph Anton Mitterer @ 2023-12-07  5:00 UTC (permalink / raw)
  To: Emma Tebibyte, dash

On Wed, 2023-12-06 at 21:40 -0700, Emma Tebibyte wrote:
> I found a bug in dash version 0.5.12 where when shifting more than
> $#,
> the shell exits before evaluating a logical OR operator.

AFAIU from POSIX this is perfectly valid behaviour:

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#shift

> EXIT STATUS
> If the n operand is invalid or is greater than "$#", this may be
> considered a syntax error and a non-interactive shell may exit; if
> the shell does not exit in this case, a non-zero exit status shall
> be returned. Otherwise, zero shall be returned.


Cheers,
Chris.
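
The POSIX text quoted above can be seen directly from the command line; this is a hedged illustration, since exactly what happens is shell-dependent. With no positional parameters left, `shift 1` is an error: a non-interactive shell is *allowed* to exit before the `||` is evaluated (dash 0.5.12 does so, per the report), while other shells just return a non-zero status.

```shell
set -- one two       # $# is 2
shift 2              # valid: $# is now 0
echo "after valid shift, \$# = $#"

# Run the invalid shift in a subshell so this script survives either way:
# in dash the subshell exits non-zero; in shells that do not exit,
# shift itself returns non-zero. The || branch runs in both cases.
( shift 1 ) || echo "invalid shift reported (or subshell exited)"
```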

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2023-12-07  5:00 ` Christoph Anton Mitterer
@ 2023-12-07  5:29   ` Lawrence Velázquez
  0 siblings, 0 replies; 1546+ messages in thread
From: Lawrence Velázquez @ 2023-12-07  5:29 UTC (permalink / raw)
  To: Christoph Anton Mitterer, Emma Tebibyte; +Cc: dash

On Thu, Dec 7, 2023, at 12:00 AM, Christoph Anton Mitterer wrote:
> On Wed, 2023-12-06 at 21:40 -0700, Emma Tebibyte wrote:
>> I found a bug in dash version 0.5.12 where when shifting more than
>> $#,
>> the shell exits before evaluating a logical OR operator.
>
> AFAIU from POSIX this is perfectly valid behaviour:
>
> https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#shift
>
>> EXIT STATUS
>> If the n operand is invalid or is greater than "$#", this may be
>> considered a syntax error and a non-interactive shell may exit; if
>> the shell does not exit in this case, a non-zero exit status shall
>> be returned. Otherwise, zero shall be returned.

See also Section 2.8.1 [*], which states that interactive shells
shall not exit on special built-in utility errors and that:

	In all of the cases shown in the table where an interactive
	shell is required not to exit, the shell shall not perform
	any further processing of the command in which the error
	occurred.

[*] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_08_01

-- 
vq

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2024-01-16  6:46 meir elisha
@ 2024-01-16  7:05 ` Dan Carpenter
  0 siblings, 0 replies; 1546+ messages in thread
From: Dan Carpenter @ 2024-01-16  7:05 UTC (permalink / raw)
  To: meir elisha; +Cc: linux-staging

You have to send an email to linux-staging+subscribe@lists.linux.dev

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2024-01-22 10:13 ` Andi Kleen
@ 2024-01-22 11:53   ` Dave Chinner
  0 siblings, 0 replies; 1546+ messages in thread
From: Dave Chinner @ 2024-01-22 11:53 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-xfs, linux-mm

On Mon, Jan 22, 2024 at 02:13:23AM -0800, Andi Kleen wrote:
> Dave Chinner <david@fromorbit.com> writes:
> 
> > Thoughts, comments, etc?
> 
> The interesting part is if it will cause additional tail latencies
> allocating under fragmentation with direct reclaim, compaction
> etc. being triggered before it falls back to the base page path.

It's not like I don't know these problems exist with memory
allocation. Go have a look at xlog_kvmalloc() which is an open coded
kvmalloc() that allows the high order kmalloc allocations to
fail-fast without triggering all the expensive and unnecessary
direct reclaim overhead (e.g. compaction!) because we can fall back
to vmalloc without huge concerns. When high order allocations start
to fail, then we fall back to vmalloc and then we hit the long
standing vmalloc scalability problems before anything else in XFS or
the IO path becomes a bottleneck.

IOWs, we already know that fail-fast high-order allocation is a more
efficient and effective fast path than using vmalloc/vmap_ram() all
the time. As this is an RFC, I haven't implemented stuff like this
yet - I haven't seen anything in the profiles indicating that high
order folio allocation is failing and causing lots of reclaim
overhead, so I simply haven't added fail-fast behaviour yet...

> In fact it is highly likely it will, the question is just how bad it is.
> 
> Unfortunately benchmarking for that isn't that easy, it needs artificial
> memory fragmentation and then some high stress workload, and then
> instrumenting the transactions for individual latencies. 

I stress test and measure XFS metadata performance under sustained
memory pressure all the time. This change has not caused any
obvious regressions in the short time I've been testing it.

I still need to do perf testing on large directory block sizes. That
is where high-order allocations will get stressed - that's where
xlog_kvmalloc() starts dominating the profiles as it trips over
vmalloc scalability issues...

> I would in any case add a tunable for it in case people run into this.

No tunables. It either works or it doesn't. If we can't make
it work reliably by default, we throw it in the dumpster, light it
on fire and walk away.

> Tail latencies are a common concern on many IO workloads.

Yes, for user data operations it's a common concern. For metadata,
not so much - there's so many far worse long tail latencies in
metadata operations (like waiting for journal space) that memory
allocation latencies in the metadata IO path are largely noise....

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found] ` <20240126173317.2779230-1-joshwash@google.com>
@ 2024-01-31 14:58   ` Ferruh Yigit
  0 siblings, 0 replies; 1546+ messages in thread
From: Ferruh Yigit @ 2024-01-31 14:58 UTC (permalink / raw)
  To: Joshua Washington; +Cc: dev, Rushil Gupta

On 1/26/2024 5:33 PM, Joshua Washington wrote:
> Subject: [PATCH v4 0/7] net/gve: RSS Support for GVE Driver
> 
> This patch series introduces RSS support for the GVE poll-mode driver.
> This series includes implementations of the following eth_dev_ops:
> 
> 1) rss_hash_update
> 2) rss_hash_conf_get
> 3) reta_query
> 4) reta_update
> 
> In rss_hash_update, the GVE driver supports the following RSS hash
> types:
> 
> * RTE_ETH_RSS_IPV4
> * RTE_ETH_RSS_NONFRAG_IPV4_TCP
> * RTE_ETH_RSS_NONFRAG_IPV4_UDP
> * RTE_ETH_RSS_IPV6
> * RTE_ETH_RSS_IPV6_EX
> * RTE_ETH_RSS_NONFRAG_IPV6_TCP
> * RTE_ETH_RSS_NONFRAG_IPV6_UDP
> * RTE_ETH_RSS_IPV6_TCP_EX
> * RTE_ETH_RSS_IPV6_UDP_EX
> 
> The hash key is 40B, and the lookup table has 128 entries. These values
> are not configurable in this implementation.
> 
> In general, the DPDK driver expects the RSS hash configuration to be set
> with a key before the redirection table is set up. When the RSS hash is
> configured, a default redirection table is generated based on the number
> of queues. When the device is re-configured, the redirection table is
> reset to the default value based on the queue count.
> 
> An important note is that the gVNIC device expects 32 bit integers for
> RSS redirection table entries, while the RTE API uses 16 bit integers.
> However, this is unlikely to be an issue, as these values represent
> receive queues, and the gVNIC device does not support anywhere near 64K
> queues.
> 
> This series also updates the corresponding feature matrix entries and
> documentation as it pertains to RSS support in the GVE driver.
> 
> v2:
> Add commit messages for patches that were missing them, and other
> checkpatch fixes.
> 
> Note: There is a warning about complex macros being parenthesized that
> does not seem to be well-founded.
> 
> v3:
> Fix build warnings that come up on certain distros.
> 
> v4:
> Fix formatting in gve_adminq.c
> 
> Joshua Washington (7):
>   net/gve: fully expose RSS offload support in dev_info
>   net/gve: RSS adminq command changes
>   net/gve: add gve_rss library for handling RSS-related behaviors
>   net/gve: RSS configuration update support
>   net/gve: RSS redirection table update support
>   net/gve: update gve.ini with RSS capabilities
>   net/gve: update GVE documentation with RSS support
> 
> 

'./devtools/check-git-log.sh' script is giving warnings [1]:

Expected patch title format is:
net/gve: <verb> <object>

<verb> should start with lowercase.


[1]
- check-git-log:

Wrong headline format:
        net/gve: fully expose RSS offload support in dev_info
        net/gve: add gve_rss library for handling RSS-related behaviors

Wrong headline uppercase:
        net/gve: RSS adminq command changes
        net/gve: RSS configuration update support
        net/gve: RSS redirection table update support

Headline too long:
        net/gve: add gve_rss library for handling RSS-related behaviors


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2024-03-07  6:07 KR Kim
@ 2024-03-07  8:01 ` Miquel Raynal
  2024-03-08  1:27   ` Re: Kyeongrho.Kim
       [not found]   ` <SE2P216MB210205B301549661575720CC833A2@SE2P216MB2102.KORP216.PROD.OUTLOOK.COM>
  0 siblings, 2 replies; 1546+ messages in thread
From: Miquel Raynal @ 2024-03-07  8:01 UTC (permalink / raw)
  To: KR Kim
  Cc: richard, vigneshr, mmkurbanov, ddrokosov, gch981213, michael,
	broonie, mika.westerberg, acelan.kao, linux-kernel, linux-mtd,
	moh.sardi, changsub.shim

Hi,

kr.kim@skyhighmemory.com wrote on Thu,  7 Mar 2024 15:07:29 +0900:

> Feat: Add SkyHigh Memory Patch code
> 
> Add SPI Nand Patch code of SkyHigh Memory
> - Add company dependent code with 'skyhigh.c'
> - Insert into 'core.c' so that 'always ECC on'

Patch formatting is still messed up.

> commit 6061b97a830af8cb5fd0917e833e779451f9046a (HEAD -> master)
> Author: KR Kim <kr.kim@skyhighmemory.com>
> Date:   Thu Mar 7 13:24:11 2024 +0900
> 
>     SPI Nand Patch code of SkyHigh Momory
> 
>     Signed-off-by: KR Kim <kr.kim@skyhighmemory.com>
> 
> From 6061b97a830af8cb5fd0917e833e779451f9046a Mon Sep 17 00:00:00 2001
> From: KR Kim <kr.kim@skyhighmemory.com>
> Date: Thu, 7 Mar 2024 13:24:11 +0900
> Subject: [PATCH] SPI Nand Patch code of SkyHigh Memory
> 
> ---
>  drivers/mtd/nand/spi/Makefile  |   2 +-
>  drivers/mtd/nand/spi/core.c    |   7 +-
>  drivers/mtd/nand/spi/skyhigh.c | 155 +++++++++++++++++++++++++++++++++
>  include/linux/mtd/spinand.h    |   3 +
>  4 files changed, 165 insertions(+), 2 deletions(-)
>  mode change 100644 => 100755 drivers/mtd/nand/spi/Makefile
>  mode change 100644 => 100755 drivers/mtd/nand/spi/core.c
>  create mode 100644 drivers/mtd/nand/spi/skyhigh.c
>  mode change 100644 => 100755 include/linux/mtd/spinand.h
> 
> diff --git a/drivers/mtd/nand/spi/Makefile b/drivers/mtd/nand/spi/Makefile
> old mode 100644
> new mode 100755
> index 19cc77288ebb..1e61ab21893a
> --- a/drivers/mtd/nand/spi/Makefile
> +++ b/drivers/mtd/nand/spi/Makefile
> @@ -1,4 +1,4 @@
>  # SPDX-License-Identifier: GPL-2.0
>  spinand-objs := core.o alliancememory.o ato.o esmt.o foresee.o gigadevice.o macronix.o
> -spinand-objs += micron.o paragon.o toshiba.o winbond.o xtx.o
> +spinand-objs += micron.o paragon.o skyhigh.o toshiba.o winbond.o xtx.o
>  obj-$(CONFIG_MTD_SPI_NAND) += spinand.o
> diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
> old mode 100644
> new mode 100755
> index e0b6715e5dfe..e3f0a7544ba4
> --- a/drivers/mtd/nand/spi/core.c
> +++ b/drivers/mtd/nand/spi/core.c
> @@ -34,7 +34,7 @@ static int spinand_read_reg_op(struct spinand_device *spinand, u8 reg, u8 *val)
>  	return 0;
>  }
>  
> -static int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8 val)
> +int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8 val)

Please do this in a separate commit.

>  {
>  	struct spi_mem_op op = SPINAND_SET_FEATURE_OP(reg,
>  						      spinand->scratchbuf);
> @@ -196,6 +196,10 @@ static int spinand_init_quad_enable(struct spinand_device *spinand)
>  static int spinand_ecc_enable(struct spinand_device *spinand,
>  			      bool enable)
>  {
> +	/* SHM : always ECC enable */
> +	if (spinand->flags & SPINAND_ON_DIE_ECC_MANDATORY)
> +		return 0;

Silently always enabling ECC is not possible. If you cannot disable the
on-die engine, then:
- you should prevent any other engine type to be used
- you should error out if a raw access is requested
- these chips are broken, IMO

> +
>  	return spinand_upd_cfg(spinand, CFG_ECC_ENABLE,
>  			       enable ? CFG_ECC_ENABLE : 0);
>  }
> @@ -945,6 +949,7 @@ static const struct spinand_manufacturer *spinand_manufacturers[] = {
>  	&macronix_spinand_manufacturer,
>  	&micron_spinand_manufacturer,
>  	&paragon_spinand_manufacturer,
> +	&skyhigh_spinand_manufacturer,
>  	&toshiba_spinand_manufacturer,
>  	&winbond_spinand_manufacturer,
>  	&xtx_spinand_manufacturer,
> diff --git a/drivers/mtd/nand/spi/skyhigh.c b/drivers/mtd/nand/spi/skyhigh.c
> new file mode 100644
> index 000000000000..92e7572094ff
> --- /dev/null
> +++ b/drivers/mtd/nand/spi/skyhigh.c
> @@ -0,0 +1,155 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2022 SkyHigh Memory Limited
> + *
> + * Author: Takahiro Kuwano <takahiro.kuwano@infineon.com>
> + */
> +
> +#include <linux/device.h>
> +#include <linux/kernel.h>
> +#include <linux/mtd/spinand.h>
> +
> +#define SPINAND_MFR_SKYHIGH		0x01
> +
> +#define SKYHIGH_STATUS_ECC_1TO2_BITFLIPS	(1 << 4)
> +#define SKYHIGH_STATUS_ECC_3TO6_BITFLIPS	(2 << 4)
> +#define SKYHIGH_STATUS_ECC_UNCOR_ERROR  	(3 << 4)
> +
> +#define SKYHIGH_CONFIG_PROTECT_EN	BIT(1)
> +
> +static SPINAND_OP_VARIANTS(read_cache_variants,
> +		SPINAND_PAGE_READ_FROM_CACHE_QUADIO_OP(0, 4, NULL, 0),
> +		SPINAND_PAGE_READ_FROM_CACHE_X4_OP(0, 1, NULL, 0),
> +		SPINAND_PAGE_READ_FROM_CACHE_DUALIO_OP(0, 2, NULL, 0),
> +		SPINAND_PAGE_READ_FROM_CACHE_X2_OP(0, 1, NULL, 0),
> +		SPINAND_PAGE_READ_FROM_CACHE_OP(true, 0, 1, NULL, 0),
> +		SPINAND_PAGE_READ_FROM_CACHE_OP(false, 0, 1, NULL, 0));
> +
> +static SPINAND_OP_VARIANTS(write_cache_variants,
> +		SPINAND_PROG_LOAD_X4(true, 0, NULL, 0),
> +		SPINAND_PROG_LOAD(true, 0, NULL, 0));
> +
> +static SPINAND_OP_VARIANTS(update_cache_variants,
> +		SPINAND_PROG_LOAD_X4(false, 0, NULL, 0),
> +		SPINAND_PROG_LOAD(false, 0, NULL, 0));
> +
> +static int skyhigh_spinand_ooblayout_ecc(struct mtd_info *mtd, int section,
> +					 struct mtd_oob_region *region)
> +{
> +	if (section)
> +		return -ERANGE;
> +
> +	/* SkyHigh's ecc parity is stored in the internal hidden area and is not needed for them. */

		     ECC		     an

"needed" is wrong here. Just stop after "area"


> +	region->length = 0;
> +	region->offset = mtd->oobsize;
> +
> +	return 0;
> +}
> +
> +static int skyhigh_spinand_ooblayout_free(struct mtd_info *mtd, int section,
> +					  struct mtd_oob_region *region)
> +{
> +	if (section)
> +		return -ERANGE;
> +
> +	region->length = mtd->oobsize - 2;
> +	region->offset = 2;
> +
> +	return 0;
> +}
> +
> +static const struct mtd_ooblayout_ops skyhigh_spinand_ooblayout = {
> +	.ecc = skyhigh_spinand_ooblayout_ecc,
> +	.free = skyhigh_spinand_ooblayout_free,
> +};
> +
> +static int skyhigh_spinand_ecc_get_status(struct spinand_device *spinand,
> +				  u8 status)
> +{
> +	/* SHM
> +	 * 00 : No bit-flip
> +	 * 01 : 1-2 errors corrected
> +	 * 10 : 3-6 errors corrected         
> +	 * 11 : uncorrectable
> +	 */

Thanks for the comment but the switch case looks rather
straightforward, it is self-sufficient in this case.

> +
> +	switch (status & STATUS_ECC_MASK) {
> +	case STATUS_ECC_NO_BITFLIPS:
> +		return 0;
> +
> +	case SKYHIGH_STATUS_ECC_1TO2_BITFLIPS:
> +		return 2;
> +
> + 	case SKYHIGH_STATUS_ECC_3TO6_BITFLIPS:
> +		return 6; 
> +
> + 	case SKYHIGH_STATUS_ECC_UNCOR_ERROR:
> +		return -EBADMSG;;
> +
> +	default:
> +		break;

I guess you can directly call return -EINVAL here?

> +	}
> +
> +	return -EINVAL;
> +}
> +
> +static const struct spinand_info skyhigh_spinand_table[] = {
> +	SPINAND_INFO("S35ML01G301",
> +		     SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x15),
> +		     NAND_MEMORG(1, 2048, 64, 64, 1024, 20, 1, 1, 1),
> +		     NAND_ECCREQ(6, 32),
> +		     SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> +					      &write_cache_variants,
> +					      &update_cache_variants),
> +		     SPINAND_ON_DIE_ECC_MANDATORY,
> +		     SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> +		     		     skyhigh_spinand_ecc_get_status)),
> +	SPINAND_INFO("S35ML01G300",
> +		     SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x14),
> +		     NAND_MEMORG(1, 2048, 128, 64, 1024, 20, 1, 1, 1),
> +		     NAND_ECCREQ(6, 32),
> +		     SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> +					      &write_cache_variants,
> +					      &update_cache_variants),
> +		     SPINAND_ON_DIE_ECC_MANDATORY,
> +		     SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> +		     		     skyhigh_spinand_ecc_get_status)),
> +	SPINAND_INFO("S35ML02G300",
> +		     SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x25),
> +		     NAND_MEMORG(1, 2048, 128, 64, 2048, 40, 2, 1, 1),
> +		     NAND_ECCREQ(6, 32),
> +		     SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> +					      &write_cache_variants,
> +					      &update_cache_variants),
> +		     SPINAND_ON_DIE_ECC_MANDATORY,
> +		     SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> +		     		     skyhigh_spinand_ecc_get_status)),
> +	SPINAND_INFO("S35ML04G300",
> +		     SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x35),
> +		     NAND_MEMORG(1, 2048, 128, 64, 4096, 80, 2, 1, 1),
> +		     NAND_ECCREQ(6, 32),
> +		     SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> +					      &write_cache_variants,
> +					      &update_cache_variants),
> +		     SPINAND_ON_DIE_ECC_MANDATORY,
> +		     SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> +		     		     skyhigh_spinand_ecc_get_status)),
> +};
> +
> +static int skyhigh_spinand_init(struct spinand_device *spinand)
> +{
> +	return spinand_write_reg_op(spinand, REG_BLOCK_LOCK,
> +				    SKYHIGH_CONFIG_PROTECT_EN);

Is this really relevant? Isn't there an API for the block lock
mechanism?

> +}
> +
> +static const struct spinand_manufacturer_ops skyhigh_spinand_manuf_ops = {
> +	.init = skyhigh_spinand_init,
> + };
> +
> +const struct spinand_manufacturer skyhigh_spinand_manufacturer = {
> +	.id = SPINAND_MFR_SKYHIGH,
> +	.name = "SkyHigh",
> +	.chips = skyhigh_spinand_table,
> +	.nchips = ARRAY_SIZE(skyhigh_spinand_table),
> +	.ops = &skyhigh_spinand_manuf_ops,
> +};
> diff --git a/include/linux/mtd/spinand.h b/include/linux/mtd/spinand.h
> old mode 100644
> new mode 100755
> index badb4c1ac079..0e135076df24
> --- a/include/linux/mtd/spinand.h
> +++ b/include/linux/mtd/spinand.h
> @@ -268,6 +268,7 @@ extern const struct spinand_manufacturer gigadevice_spinand_manufacturer;
>  extern const struct spinand_manufacturer macronix_spinand_manufacturer;
>  extern const struct spinand_manufacturer micron_spinand_manufacturer;
>  extern const struct spinand_manufacturer paragon_spinand_manufacturer;
> +extern const struct spinand_manufacturer skyhigh_spinand_manufacturer;
>  extern const struct spinand_manufacturer toshiba_spinand_manufacturer;
>  extern const struct spinand_manufacturer winbond_spinand_manufacturer;
>  extern const struct spinand_manufacturer xtx_spinand_manufacturer;
> @@ -312,6 +313,7 @@ struct spinand_ecc_info {
>  
>  #define SPINAND_HAS_QE_BIT		BIT(0)
>  #define SPINAND_HAS_CR_FEAT_BIT		BIT(1)
> +#define SPINAND_ON_DIE_ECC_MANDATORY	BIT(2)	/* SHM */

If we go this route, then "mandatory" is not relevant here, we shall
convey the fact that the on-die ECC engine cannot be disabled and as
mentioned above, there are other impacts.

>  
>  /**
>   * struct spinand_ondie_ecc_conf - private SPI-NAND on-die ECC engine structure
> @@ -518,5 +520,6 @@ int spinand_match_and_init(struct spinand_device *spinand,
>  
>  int spinand_upd_cfg(struct spinand_device *spinand, u8 mask, u8 val);
>  int spinand_select_target(struct spinand_device *spinand, unsigned int target);
> +int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8 val);
>  
>  #endif /* __LINUX_MTD_SPINAND_H */


Thanks,
Miquèl

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE: Re:
  2024-03-07  8:01 ` Miquel Raynal
@ 2024-03-08  1:27   ` Kyeongrho.Kim
       [not found]   ` <SE2P216MB210205B301549661575720CC833A2@SE2P216MB2102.KORP216.PROD.OUTLOOK.COM>
  1 sibling, 0 replies; 1546+ messages in thread
From: Kyeongrho.Kim @ 2024-03-08  1:27 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: richard@nod.at, vigneshr@ti.com, mmkurbanov@salutedevices.com,
	ddrokosov@sberdevices.ru, gch981213@gmail.com, michael@walle.cc,
	broonie@kernel.org, mika.westerberg@linux.intel.com,
	acelan.kao@canonical.com, linux-kernel@vger.kernel.org,
	linux-mtd@lists.infradead.org, Mohamed Sardi, Changsub.Shim

Hi Miquel,
Thank you for your comment.
I tried to match the patch format, but it seems it is still not right.
Could you send me a good example of the expected patch format?
Thanks,
KR
-----Original Message-----
From: Miquel Raynal <miquel.raynal@bootlin.com> 
Sent: Thursday, March 7, 2024 5:01 PM
To: Kyeongrho.Kim <kr.kim@skyhighmemory.com>
Cc: richard@nod.at; vigneshr@ti.com; mmkurbanov@salutedevices.com; ddrokosov@sberdevices.ru; gch981213@gmail.com; michael@walle.cc; broonie@kernel.org; mika.westerberg@linux.intel.com; acelan.kao@canonical.com; linux-kernel@vger.kernel.org; linux-mtd@lists.infradead.org; Mohamed Sardi <moh.sardi@skyhighmemory.com>; Changsub.Shim <changsub.shim@skyhighmemory.com>
Subject: Re:

Hi,

kr.kim@skyhighmemory.com wrote on Thu,  7 Mar 2024 15:07:29 +0900:

> Feat: Add SkyHigh Memory Patch code
> 
> Add SPI Nand Patch code of SkyHigh Memory
> - Add company dependent code with 'skyhigh.c'
> - Insert into 'core.c' so that 'always ECC on'

Patch formatting is still messed up.

> commit 6061b97a830af8cb5fd0917e833e779451f9046a (HEAD -> master)
> Author: KR Kim <kr.kim@skyhighmemory.com>
> Date:   Thu Mar 7 13:24:11 2024 +0900
> 
>     SPI Nand Patch code of SkyHigh Momory
> 
>     Signed-off-by: KR Kim <kr.kim@skyhighmemory.com>
> 
> From 6061b97a830af8cb5fd0917e833e779451f9046a Mon Sep 17 00:00:00 2001
> From: KR Kim <kr.kim@skyhighmemory.com>
> Date: Thu, 7 Mar 2024 13:24:11 +0900
> Subject: [PATCH] SPI Nand Patch code of SkyHigh Memory
> 
> ---
>  drivers/mtd/nand/spi/Makefile  |   2 +-
>  drivers/mtd/nand/spi/core.c    |   7 +-
>  drivers/mtd/nand/spi/skyhigh.c | 155 +++++++++++++++++++++++++++++++++
>  include/linux/mtd/spinand.h    |   3 +
>  4 files changed, 165 insertions(+), 2 deletions(-)  mode change 
> 100644 => 100755 drivers/mtd/nand/spi/Makefile  mode change 100644 => 
> 100755 drivers/mtd/nand/spi/core.c  create mode 100644 
> drivers/mtd/nand/spi/skyhigh.c  mode change 100644 => 100755 
> include/linux/mtd/spinand.h
> 
> diff --git a/drivers/mtd/nand/spi/Makefile 
> b/drivers/mtd/nand/spi/Makefile old mode 100644 new mode 100755 index 
> 19cc77288ebb..1e61ab21893a
> --- a/drivers/mtd/nand/spi/Makefile
> +++ b/drivers/mtd/nand/spi/Makefile
> @@ -1,4 +1,4 @@
>  # SPDX-License-Identifier: GPL-2.0
>  spinand-objs := core.o alliancememory.o ato.o esmt.o foresee.o 
> gigadevice.o macronix.o -spinand-objs += micron.o paragon.o toshiba.o 
> winbond.o xtx.o
> +spinand-objs += micron.o paragon.o skyhigh.o toshiba.o winbond.o 
> +xtx.o
>  obj-$(CONFIG_MTD_SPI_NAND) += spinand.o diff --git 
> a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c old mode 
> 100644 new mode 100755 index e0b6715e5dfe..e3f0a7544ba4
> --- a/drivers/mtd/nand/spi/core.c
> +++ b/drivers/mtd/nand/spi/core.c
> @@ -34,7 +34,7 @@ static int spinand_read_reg_op(struct spinand_device *spinand, u8 reg, u8 *val)
>  	return 0;
>  }
>  
> -static int spinand_write_reg_op(struct spinand_device *spinand, u8 
> reg, u8 val)
> +int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8 
> +val)

Please do this in a separate commit.

>  {
>  	struct spi_mem_op op = SPINAND_SET_FEATURE_OP(reg,
>  						      spinand->scratchbuf);
> @@ -196,6 +196,10 @@ static int spinand_init_quad_enable(struct 
> spinand_device *spinand)  static int spinand_ecc_enable(struct spinand_device *spinand,
>  			      bool enable)
>  {
> +	/* SHM : always ECC enable */
> +	if (spinand->flags & SPINAND_ON_DIE_ECC_MANDATORY)
> +		return 0;

Silently always enabling ECC is not possible. If you cannot disable the on-die engine, then:
- you should prevent any other engine type to be used
- you should error out if a raw access is requested
- these chips are broken, IMO

> +
>  	return spinand_upd_cfg(spinand, CFG_ECC_ENABLE,
>  			       enable ? CFG_ECC_ENABLE : 0);  } @@ -945,6 +949,7 @@ static 
> const struct spinand_manufacturer *spinand_manufacturers[] = {
>  	&macronix_spinand_manufacturer,
>  	&micron_spinand_manufacturer,
>  	&paragon_spinand_manufacturer,
> +	&skyhigh_spinand_manufacturer,
>  	&toshiba_spinand_manufacturer,
>  	&winbond_spinand_manufacturer,
>  	&xtx_spinand_manufacturer,
> diff --git a/drivers/mtd/nand/spi/skyhigh.c 
> b/drivers/mtd/nand/spi/skyhigh.c new file mode 100644 index 
> 000000000000..92e7572094ff
> --- /dev/null
> +++ b/drivers/mtd/nand/spi/skyhigh.c
> @@ -0,0 +1,155 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2022 SkyHigh Memory Limited
> + *
> + * Author: Takahiro Kuwano <takahiro.kuwano@infineon.com>  */
> +
> +#include <linux/device.h>
> +#include <linux/kernel.h>
> +#include <linux/mtd/spinand.h>
> +
> +#define SPINAND_MFR_SKYHIGH		0x01
> +
> +#define SKYHIGH_STATUS_ECC_1TO2_BITFLIPS	(1 << 4)
> +#define SKYHIGH_STATUS_ECC_3TO6_BITFLIPS	(2 << 4)
> +#define SKYHIGH_STATUS_ECC_UNCOR_ERROR  	(3 << 4)
> +
> +#define SKYHIGH_CONFIG_PROTECT_EN	BIT(1)
> +
> +static SPINAND_OP_VARIANTS(read_cache_variants,
> +		SPINAND_PAGE_READ_FROM_CACHE_QUADIO_OP(0, 4, NULL, 0),
> +		SPINAND_PAGE_READ_FROM_CACHE_X4_OP(0, 1, NULL, 0),
> +		SPINAND_PAGE_READ_FROM_CACHE_DUALIO_OP(0, 2, NULL, 0),
> +		SPINAND_PAGE_READ_FROM_CACHE_X2_OP(0, 1, NULL, 0),
> +		SPINAND_PAGE_READ_FROM_CACHE_OP(true, 0, 1, NULL, 0),
> +		SPINAND_PAGE_READ_FROM_CACHE_OP(false, 0, 1, NULL, 0));
> +
> +static SPINAND_OP_VARIANTS(write_cache_variants,
> +		SPINAND_PROG_LOAD_X4(true, 0, NULL, 0),
> +		SPINAND_PROG_LOAD(true, 0, NULL, 0));
> +
> +static SPINAND_OP_VARIANTS(update_cache_variants,
> +		SPINAND_PROG_LOAD_X4(false, 0, NULL, 0),
> +		SPINAND_PROG_LOAD(false, 0, NULL, 0));
> +
> +static int skyhigh_spinand_ooblayout_ecc(struct mtd_info *mtd, int section,
> +					 struct mtd_oob_region *region)
> +{
> +	if (section)
> +		return -ERANGE;
> +
> +	/* SkyHigh's ecc parity is stored in the internal hidden area and is 
> +not needed for them. */

		     ECC		     an

"needed" is wrong here. Just stop after "area"


> +	region->length = 0;
> +	region->offset = mtd->oobsize;
> +
> +	return 0;
> +}
> +
> +static int skyhigh_spinand_ooblayout_free(struct mtd_info *mtd, int section,
> +					  struct mtd_oob_region *region) {
> +	if (section)
> +		return -ERANGE;
> +
> +	region->length = mtd->oobsize - 2;
> +	region->offset = 2;
> +
> +	return 0;
> +}
> +
> +static const struct mtd_ooblayout_ops skyhigh_spinand_ooblayout = {
> +	.ecc = skyhigh_spinand_ooblayout_ecc,
> +	.free = skyhigh_spinand_ooblayout_free, };
> +
> +static int skyhigh_spinand_ecc_get_status(struct spinand_device *spinand,
> +				  u8 status)
> +{
> +	/* SHM
> +	 * 00 : No bit-flip
> +	 * 01 : 1-2 errors corrected
> +	 * 10 : 3-6 errors corrected         
> +	 * 11 : uncorrectable
> +	 */

Thanks for the comment but the switch case looks rather straightforward, it is self-sufficient in this case.

> +
> +	switch (status & STATUS_ECC_MASK) {
> +	case STATUS_ECC_NO_BITFLIPS:
> +		return 0;
> +
> +	case SKYHIGH_STATUS_ECC_1TO2_BITFLIPS:
> +		return 2;
> +
> + 	case SKYHIGH_STATUS_ECC_3TO6_BITFLIPS:
> +		return 6;
> +
> + 	case SKYHIGH_STATUS_ECC_UNCOR_ERROR:
> +		return -EBADMSG;;
> +
> +	default:
> +		break;

I guess you can directly call return -EINVAL here?

> +	}
> +
> +	return -EINVAL;
> +}
> +
> +static const struct spinand_info skyhigh_spinand_table[] = {
> +	SPINAND_INFO("S35ML01G301",
> +		     SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x15),
> +		     NAND_MEMORG(1, 2048, 64, 64, 1024, 20, 1, 1, 1),
> +		     NAND_ECCREQ(6, 32),
> +		     SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> +					      &write_cache_variants,
> +					      &update_cache_variants),
> +		     SPINAND_ON_DIE_ECC_MANDATORY,
> +		     SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> +		     		     skyhigh_spinand_ecc_get_status)),
> +	SPINAND_INFO("S35ML01G300",
> +		     SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x14),
> +		     NAND_MEMORG(1, 2048, 128, 64, 1024, 20, 1, 1, 1),
> +		     NAND_ECCREQ(6, 32),
> +		     SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> +					      &write_cache_variants,
> +					      &update_cache_variants),
> +		     SPINAND_ON_DIE_ECC_MANDATORY,
> +		     SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> +		     		     skyhigh_spinand_ecc_get_status)),
> +	SPINAND_INFO("S35ML02G300",
> +		     SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x25),
> +		     NAND_MEMORG(1, 2048, 128, 64, 2048, 40, 2, 1, 1),
> +		     NAND_ECCREQ(6, 32),
> +		     SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> +					      &write_cache_variants,
> +					      &update_cache_variants),
> +		     SPINAND_ON_DIE_ECC_MANDATORY,
> +		     SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> +		     		     skyhigh_spinand_ecc_get_status)),
> +	SPINAND_INFO("S35ML04G300",
> +		     SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x35),
> +		     NAND_MEMORG(1, 2048, 128, 64, 4096, 80, 2, 1, 1),
> +		     NAND_ECCREQ(6, 32),
> +		     SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> +					      &write_cache_variants,
> +					      &update_cache_variants),
> +		     SPINAND_ON_DIE_ECC_MANDATORY,
> +		     SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> +		     		     skyhigh_spinand_ecc_get_status)), };
> +
> +static int skyhigh_spinand_init(struct spinand_device *spinand) {
> +	return spinand_write_reg_op(spinand, REG_BLOCK_LOCK,
> +				    SKYHIGH_CONFIG_PROTECT_EN);

Is this really relevant? Isn't there an API for the block lock mechanism?

> +}
> +
> +static const struct spinand_manufacturer_ops skyhigh_spinand_manuf_ops = {
> +	.init = skyhigh_spinand_init,
> + };
> +
> +const struct spinand_manufacturer skyhigh_spinand_manufacturer = {
> +	.id = SPINAND_MFR_SKYHIGH,
> +	.name = "SkyHigh",
> +	.chips = skyhigh_spinand_table,
> +	.nchips = ARRAY_SIZE(skyhigh_spinand_table),
> +	.ops = &skyhigh_spinand_manuf_ops,
> +};
> diff --git a/include/linux/mtd/spinand.h b/include/linux/mtd/spinand.h 
> old mode 100644 new mode 100755 index badb4c1ac079..0e135076df24
> --- a/include/linux/mtd/spinand.h
> +++ b/include/linux/mtd/spinand.h
> @@ -268,6 +268,7 @@ extern const struct spinand_manufacturer 
> gigadevice_spinand_manufacturer;  extern const struct 
> spinand_manufacturer macronix_spinand_manufacturer;  extern const 
> struct spinand_manufacturer micron_spinand_manufacturer;  extern const 
> struct spinand_manufacturer paragon_spinand_manufacturer;
> +extern const struct spinand_manufacturer 
> +skyhigh_spinand_manufacturer;
>  extern const struct spinand_manufacturer 
> toshiba_spinand_manufacturer;  extern const struct 
> spinand_manufacturer winbond_spinand_manufacturer;  extern const 
> struct spinand_manufacturer xtx_spinand_manufacturer; @@ -312,6 +313,7 
> @@ struct spinand_ecc_info {
>  
>  #define SPINAND_HAS_QE_BIT		BIT(0)
>  #define SPINAND_HAS_CR_FEAT_BIT		BIT(1)
> +#define SPINAND_ON_DIE_ECC_MANDATORY	BIT(2)	/* SHM */

If we go this route, then "mandatory" is not relevant here, we shall convey the fact that the on-die ECC engine cannot be disabled and as mentioned above, there are other impacts.

>  
>  /**
>   * struct spinand_ondie_ecc_conf - private SPI-NAND on-die ECC engine 
> structure @@ -518,5 +520,6 @@ int spinand_match_and_init(struct 
> spinand_device *spinand,
>  
>  int spinand_upd_cfg(struct spinand_device *spinand, u8 mask, u8 val);  
> int spinand_select_target(struct spinand_device *spinand, unsigned int 
> target);
> +int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8 
> +val);
>  
>  #endif /* __LINUX_MTD_SPINAND_H */


Thanks,
Miquèl
______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE: Re:
       [not found]   ` <SE2P216MB210205B301549661575720CC833A2@SE2P216MB2102.KORP216.PROD.OUTLOOK.COM>
@ 2024-03-29  4:41     ` Kyeongrho.Kim
  0 siblings, 0 replies; 1546+ messages in thread
From: Kyeongrho.Kim @ 2024-03-29  4:41 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: richard@nod.at, vigneshr@ti.com, mmkurbanov@salutedevices.com,
	ddrokosov@sberdevices.ru, gch981213@gmail.com, michael@walle.cc,
	broonie@kernel.org, mika.westerberg@linux.intel.com,
	acelan.kao@canonical.com, linux-kernel@vger.kernel.org,
	linux-mtd@lists.infradead.org, Mohamed Sardi, Changsub.Shim

(I am sending this mail again as plain text, not HTML.)

Dear Miquel,
Please see my reply in below email.
And please comment if you have any others.
Thanks,
KR

-----Original Message-----
From: Miquel Raynal <mailto:miquel.raynal@bootlin.com> 
Sent: Thursday, March 7, 2024 5:01 PM
To: Kyeongrho.Kim <mailto:kr.kim@skyhighmemory.com>
Cc: mailto:richard@nod.at; mailto:vigneshr@ti.com; mailto:mmkurbanov@salutedevices.com; mailto:ddrokosov@sberdevices.ru; mailto:gch981213@gmail.com; mailto:michael@walle.cc; mailto:broonie@kernel.org; mailto:mika.westerberg@linux.intel.com; mailto:acelan.kao@canonical.com; mailto:linux-kernel@vger.kernel.org; mailto:linux-mtd@lists.infradead.org; Mohamed Sardi <mailto:moh.sardi@skyhighmemory.com>; Changsub.Shim <mailto:changsub.shim@skyhighmemory.com>
Subject: Re:

Hi,

mailto:kr.kim@skyhighmemory.com wrote on Thu,  7 Mar 2024 15:07:29 +0900:

> Feat: Add SkyHigh Memory Patch code
> 
> Add SPI Nand Patch code of SkyHigh Memory
> - Add company dependent code with 'skyhigh.c'
> - Insert into 'core.c' so that 'always ECC on'

Patch formatting is still messed up.

> commit 6061b97a830af8cb5fd0917e833e779451f9046a (HEAD -> master)
> Author: KR Kim <mailto:kr.kim@skyhighmemory.com>
> Date:   Thu Mar 7 13:24:11 2024 +0900
> 
>     SPI Nand Patch code of SkyHigh Momory
> 
>     Signed-off-by: KR Kim <mailto:kr.kim@skyhighmemory.com>
> 
> From 6061b97a830af8cb5fd0917e833e779451f9046a Mon Sep 17 00:00:00 2001
> From: KR Kim <mailto:kr.kim@skyhighmemory.com>
> Date: Thu, 7 Mar 2024 13:24:11 +0900
> Subject: [PATCH] SPI Nand Patch code of SkyHigh Memory
> 
> ---
>  drivers/mtd/nand/spi/Makefile  |   2 +-
>  drivers/mtd/nand/spi/core.c    |   7 +-
>  drivers/mtd/nand/spi/skyhigh.c | 155 +++++++++++++++++++++++++++++++++
>  include/linux/mtd/spinand.h    |   3 +
>  4 files changed, 165 insertions(+), 2 deletions(-)  mode change 
> 100644 => 100755 drivers/mtd/nand/spi/Makefile  mode change 100644 => 
> 100755 drivers/mtd/nand/spi/core.c  create mode 100644 
> drivers/mtd/nand/spi/skyhigh.c  mode change 100644 => 100755 
> include/linux/mtd/spinand.h
> 
> diff --git a/drivers/mtd/nand/spi/Makefile 
> b/drivers/mtd/nand/spi/Makefile old mode 100644 new mode 100755 index 
> 19cc77288ebb..1e61ab21893a
> --- a/drivers/mtd/nand/spi/Makefile
> +++ b/drivers/mtd/nand/spi/Makefile
> @@ -1,4 +1,4 @@
>  # SPDX-License-Identifier: GPL-2.0
>  spinand-objs := core.o alliancememory.o ato.o esmt.o foresee.o 
> gigadevice.o macronix.o -spinand-objs += micron.o paragon.o toshiba.o 
> winbond.o xtx.o
> +spinand-objs += micron.o paragon.o skyhigh.o toshiba.o winbond.o 
> +xtx.o
>  obj-$(CONFIG_MTD_SPI_NAND) += spinand.o diff --git 
> a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c old mode 
> 100644 new mode 100755 index e0b6715e5dfe..e3f0a7544ba4
> --- a/drivers/mtd/nand/spi/core.c
> +++ b/drivers/mtd/nand/spi/core.c
> @@ -34,7 +34,7 @@ static int spinand_read_reg_op(struct spinand_device *spinand, u8 reg, u8 *val)
>     return 0;
>  }
>  
> -static int spinand_write_reg_op(struct spinand_device *spinand, u8 
> reg, u8 val)
> +int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8 
> +val)

Please do this in a separate commit.
[SHM] May I know why this needs to be a separate commit?
Please elaborate on the reason.
>  {
>     struct spi_mem_op op = SPINAND_SET_FEATURE_OP(reg,
>                                         spinand->scratchbuf);
> @@ -196,6 +196,10 @@ static int spinand_init_quad_enable(struct 
> spinand_device *spinand)  static int spinand_ecc_enable(struct spinand_device *spinand,
>                       bool enable)
>  {
> +   /* SHM : always ECC enable */
> +   if (spinand->flags & SPINAND_ON_DIE_ECC_MANDATORY)
> +         return 0;

Silently always enabling ECC is not possible. If you cannot disable the on-die engine, then:
- you should prevent any other engine type to be used
- you should error out if a raw access is requested
- these chips are broken, IMO
[SHM] I understand your concern.
We have already reviewed 'always ECC on' from many aspects to see if there was any problem, and confirmed that there was none.

> +
>     return spinand_upd_cfg(spinand, CFG_ECC_ENABLE,
>                        enable ? CFG_ECC_ENABLE : 0);  } @@ -945,6 +949,7 @@ static 
> const struct spinand_manufacturer *spinand_manufacturers[] = {
>     &macronix_spinand_manufacturer,
>     &micron_spinand_manufacturer,
>     &paragon_spinand_manufacturer,
> +   &skyhigh_spinand_manufacturer,
>     &toshiba_spinand_manufacturer,
>     &winbond_spinand_manufacturer,
>     &xtx_spinand_manufacturer,
> diff --git a/drivers/mtd/nand/spi/skyhigh.c 
> b/drivers/mtd/nand/spi/skyhigh.c new file mode 100644 index 
> 000000000000..92e7572094ff
> --- /dev/null
> +++ b/drivers/mtd/nand/spi/skyhigh.c
> @@ -0,0 +1,155 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2022 SkyHigh Memory Limited
> + *
> + * Author: Takahiro Kuwano <mailto:takahiro.kuwano@infineon.com>  */
> +
> +#include <linux/device.h>
> +#include <linux/kernel.h>
> +#include <linux/mtd/spinand.h>
> +
> +#define SPINAND_MFR_SKYHIGH      0x01
> +
> +#define SKYHIGH_STATUS_ECC_1TO2_BITFLIPS     (1 << 4)
> +#define SKYHIGH_STATUS_ECC_3TO6_BITFLIPS     (2 << 4)
> +#define SKYHIGH_STATUS_ECC_UNCOR_ERROR        (3 << 4)
> +
> +#define SKYHIGH_CONFIG_PROTECT_EN BIT(1)
> +
> +static SPINAND_OP_VARIANTS(read_cache_variants,
> +         SPINAND_PAGE_READ_FROM_CACHE_QUADIO_OP(0, 4, NULL, 0),
> +         SPINAND_PAGE_READ_FROM_CACHE_X4_OP(0, 1, NULL, 0),
> +         SPINAND_PAGE_READ_FROM_CACHE_DUALIO_OP(0, 2, NULL, 0),
> +         SPINAND_PAGE_READ_FROM_CACHE_X2_OP(0, 1, NULL, 0),
> +         SPINAND_PAGE_READ_FROM_CACHE_OP(true, 0, 1, NULL, 0),
> +         SPINAND_PAGE_READ_FROM_CACHE_OP(false, 0, 1, NULL, 0));
> +
> +static SPINAND_OP_VARIANTS(write_cache_variants,
> +         SPINAND_PROG_LOAD_X4(true, 0, NULL, 0),
> +         SPINAND_PROG_LOAD(true, 0, NULL, 0));
> +
> +static SPINAND_OP_VARIANTS(update_cache_variants,
> +         SPINAND_PROG_LOAD_X4(false, 0, NULL, 0),
> +         SPINAND_PROG_LOAD(false, 0, NULL, 0));
> +
> +static int skyhigh_spinand_ooblayout_ecc(struct mtd_info *mtd, int section,
> +                           struct mtd_oob_region *region)
> +{
> +   if (section)
> +         return -ERANGE;
> +
> +   /* SkyHigh's ecc parity is stored in the internal hidden area and is 
> +not needed for them. */

s/ecc/ECC/, s/the internal/an internal/

"needed" is wrong here. Just stop after "area"


> +   region->length = 0;
> +   region->offset = mtd->oobsize;
> +
> +   return 0;
> +}
> +
> +static int skyhigh_spinand_ooblayout_free(struct mtd_info *mtd, int section,
> +                             struct mtd_oob_region *region) {
> +   if (section)
> +         return -ERANGE;
> +
> +   region->length = mtd->oobsize - 2;
> +   region->offset = 2;
> +
> +   return 0;
> +}
> +
> +static const struct mtd_ooblayout_ops skyhigh_spinand_ooblayout = {
> +   .ecc = skyhigh_spinand_ooblayout_ecc,
> +   .free = skyhigh_spinand_ooblayout_free, };
> +
> +static int skyhigh_spinand_ecc_get_status(struct spinand_device *spinand,
> +                       u8 status)
> +{
> +   /* SHM
> +   * 00 : No bit-flip
> +   * 01 : 1-2 errors corrected
> +   * 10 : 3-6 errors corrected         
> +   * 11 : uncorrectable
> +   */

Thanks for the comment but the switch case looks rather straightforward, it is self-sufficient in this case.

> +
> +   switch (status & STATUS_ECC_MASK) {
> +   case STATUS_ECC_NO_BITFLIPS:
> +         return 0;
> +
> +   case SKYHIGH_STATUS_ECC_1TO2_BITFLIPS:
> +         return 2;
> +
> +   case SKYHIGH_STATUS_ECC_3TO6_BITFLIPS:
> +         return 6;
> +
> +   case SKYHIGH_STATUS_ECC_UNCOR_ERROR:
> +         return -EBADMSG;
> +
> +   default:
> +         break;

I guess you can directly call return -EINVAL here?

> +   }
> +
> +   return -EINVAL;
> +}
> +
> +static const struct spinand_info skyhigh_spinand_table[] = {
> +   SPINAND_INFO("S35ML01G301",
> +              SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x15),
> +              NAND_MEMORG(1, 2048, 64, 64, 1024, 20, 1, 1, 1),
> +              NAND_ECCREQ(6, 32),
> +              SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> +                                 &write_cache_variants,
> +                                 &update_cache_variants),
> +              SPINAND_ON_DIE_ECC_MANDATORY,
> +              SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> +                           skyhigh_spinand_ecc_get_status)),
> +   SPINAND_INFO("S35ML01G300",
> +              SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x14),
> +              NAND_MEMORG(1, 2048, 128, 64, 1024, 20, 1, 1, 1),
> +              NAND_ECCREQ(6, 32),
> +              SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> +                                 &write_cache_variants,
> +                                 &update_cache_variants),
> +              SPINAND_ON_DIE_ECC_MANDATORY,
> +              SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> +                           skyhigh_spinand_ecc_get_status)),
> +   SPINAND_INFO("S35ML02G300",
> +              SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x25),
> +              NAND_MEMORG(1, 2048, 128, 64, 2048, 40, 2, 1, 1),
> +              NAND_ECCREQ(6, 32),
> +              SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> +                                 &write_cache_variants,
> +                                 &update_cache_variants),
> +              SPINAND_ON_DIE_ECC_MANDATORY,
> +              SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> +                           skyhigh_spinand_ecc_get_status)),
> +   SPINAND_INFO("S35ML04G300",
> +              SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x35),
> +              NAND_MEMORG(1, 2048, 128, 64, 4096, 80, 2, 1, 1),
> +              NAND_ECCREQ(6, 32),
> +              SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> +                                 &write_cache_variants,
> +                                 &update_cache_variants),
> +              SPINAND_ON_DIE_ECC_MANDATORY,
> +              SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> +                           skyhigh_spinand_ecc_get_status)), };
> +
> +static int skyhigh_spinand_init(struct spinand_device *spinand) {
> +   return spinand_write_reg_op(spinand, REG_BLOCK_LOCK,
> +                         SKYHIGH_CONFIG_PROTECT_EN);

Is this really relevant? Isn't there an API for the block lock mechanism?
[SHM] The SHM device requires 'Config Protect Enable' to be set first in order to unlock.
I changed it to use the 'spinand_lock_block' function instead of the 'spinand_write_reg_op' function.

> +}
> +
> +static const struct spinand_manufacturer_ops skyhigh_spinand_manuf_ops = {
> +   .init = skyhigh_spinand_init,
> + };
> +
> +const struct spinand_manufacturer skyhigh_spinand_manufacturer = {
> +   .id = SPINAND_MFR_SKYHIGH,
> +   .name = "SkyHigh",
> +   .chips = skyhigh_spinand_table,
> +   .nchips = ARRAY_SIZE(skyhigh_spinand_table),
> +   .ops = &skyhigh_spinand_manuf_ops,
> +};
> diff --git a/include/linux/mtd/spinand.h b/include/linux/mtd/spinand.h 
> old mode 100644 new mode 100755 index badb4c1ac079..0e135076df24
> --- a/include/linux/mtd/spinand.h
> +++ b/include/linux/mtd/spinand.h
> @@ -268,6 +268,7 @@ extern const struct spinand_manufacturer 
> gigadevice_spinand_manufacturer;  extern const struct 
> spinand_manufacturer macronix_spinand_manufacturer;  extern const 
> struct spinand_manufacturer micron_spinand_manufacturer;  extern const 
> struct spinand_manufacturer paragon_spinand_manufacturer;
> +extern const struct spinand_manufacturer 
> +skyhigh_spinand_manufacturer;
>  extern const struct spinand_manufacturer 
> toshiba_spinand_manufacturer;  extern const struct 
> spinand_manufacturer winbond_spinand_manufacturer;  extern const 
> struct spinand_manufacturer xtx_spinand_manufacturer; @@ -312,6 +313,7 
> @@ struct spinand_ecc_info {
>  
>  #define SPINAND_HAS_QE_BIT        BIT(0)
>  #define SPINAND_HAS_CR_FEAT_BIT         BIT(1)
> +#define SPINAND_ON_DIE_ECC_MANDATORY   BIT(2) /* SHM */

If we go this route, then "mandatory" is not relevant here, we shall convey the fact that the on-die ECC engine cannot be disabled and as mentioned above, there are other impacts.
[SHM] Please elaborate more specifically on what I should do.
>  
>  /**
>   * struct spinand_ondie_ecc_conf - private SPI-NAND on-die ECC engine 
> structure @@ -518,5 +520,6 @@ int spinand_match_and_init(struct 
> spinand_device *spinand,
>  
>  int spinand_upd_cfg(struct spinand_device *spinand, u8 mask, u8 val);  
> int spinand_select_target(struct spinand_device *spinand, unsigned int 
> target);
> +int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8 
> +val);
>  
>  #endif /* __LINUX_MTD_SPINAND_H */


Thanks,
Miquèl
______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2024-04-17  6:46 ` Yao Xingtao
@ 2024-04-17 18:14   ` Verma, Vishal L
  2024-04-22  7:26     ` Re: Xingtao Yao (Fujitsu)
  0 siblings, 1 reply; 1546+ messages in thread
From: Verma, Vishal L @ 2024-04-17 18:14 UTC (permalink / raw)
  To: Jiang, Dave, yaoxt.fnst@fujitsu.com
  Cc: caoqq@fujitsu.com, linux-cxl@vger.kernel.org,
	nvdimm@lists.linux.dev

On Wed, 2024-04-17 at 02:46 -0400, Yao Xingtao wrote:
> 
> Hi Dave,
>   I have applied this patch in my env, and done a lot of testing,
> this
> feature is currently working fine. 
>   But it is not merged into master branch yet, are there any updates
> on this feature?

Hi Xingtao,

Turns out that I had applied this to a branch but forgot to merge and
push it. Thanks for the ping - done now, and pushed to pending.

> 
> Associated patches:
> https://lore.kernel.org/linux-cxl/170112921107.2687457.2741231995154639197.stgit@djiang5-mobl3/
> https://lore.kernel.org/linux-cxl/170120423159.2725915.14670830315829916850.stgit@djiang5-mobl3/
> 
> Thanks
> Xingtao


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE: Re:
  2024-04-17 18:14   ` Verma, Vishal L
@ 2024-04-22  7:26     ` Xingtao Yao (Fujitsu)
  0 siblings, 0 replies; 1546+ messages in thread
From: Xingtao Yao (Fujitsu) @ 2024-04-22  7:26 UTC (permalink / raw)
  To: Verma, Vishal L, Jiang, Dave
  Cc: Quanquan Cao (Fujitsu), linux-cxl@vger.kernel.org,
	nvdimm@lists.linux.dev


> -----Original Message-----
> From: Verma, Vishal L <vishal.l.verma@intel.com>
> Sent: Thursday, April 18, 2024 2:14 AM
> To: Jiang, Dave <dave.jiang@intel.com>; Yao, Xingtao/姚 幸涛
> <yaoxt.fnst@fujitsu.com>
> Cc: Cao, Quanquan/曹 全全 <caoqq@fujitsu.com>; linux-cxl@vger.kernel.org;
> nvdimm@lists.linux.dev
> Subject: Re:
> 
> On Wed, 2024-04-17 at 02:46 -0400, Yao Xingtao wrote:
> >
> > Hi Dave,
> >   I have applied this patch in my env, and done a lot of testing,
> > this
> > feature is currently working fine.
> >   But it is not merged into master branch yet, are there any updates
> > on this feature?
> 
> Hi Xingtao,
> 
> Turns out that I had applied this to a branch but forgot to merge and
> push it. Thanks for the ping - done now, and pushed to pending.
Awesome, many thanks!!!

> 
> >
> > Associated patches:
> >
> https://lore.kernel.org/linux-cxl/170112921107.2687457.2741231995154639197.st
> git@djiang5-mobl3/
> >
> https://lore.kernel.org/linux-cxl/170120423159.2725915.14670830315829916850.s
> tgit@djiang5-mobl3/
> >
> > Thanks
> > Xingtao


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2024-04-19 15:46 George Guo
@ 2024-04-23 16:48 ` Greg KH
  0 siblings, 0 replies; 1546+ messages in thread
From: Greg KH @ 2024-04-23 16:48 UTC (permalink / raw)
  To: George Guo; +Cc: tom.zanussi, stable

On Fri, Apr 19, 2024 at 11:46:56PM +0800, George Guo wrote:
> Subject: [PATCH 4.19.y v6 0/2] Double-free bug discovery on testing trigger-field-variable-support.tc
> 
> 1) About v4-0001-tracing-Remove-hist-trigger-synth_var_refs.patch:
> 
> The reason I am backporting this patch is that no one found the double-free bug
> at that time, then later the code was removed upstream, but
> 4.19-stable has the bug.

Both now queued up, thanks

greg k-h

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2024-04-24  8:58 ` Fabian Scheler
@ 2024-04-24  9:02   ` Scheler, Fabian
  0 siblings, 0 replies; 1546+ messages in thread
From: Scheler, Fabian @ 2024-04-24  9:02 UTC (permalink / raw)
  To: xenomai

Am 24.04.2024 um 10:58 schrieb Fabian Scheler:
> 
> As suggested by Florian I revised the patch so that the correct author is stated in the commit.
> 
> Ciao
> Fabian

OK, something went wrong here - this simply is additional information 
for the revised patch.

Ciao
Fabian

-- 
With best regards,
Dr. Fabian Scheler

Siemens AG
T CED EDC-DE
Hertha-Sponer-Weg 3
91058 Erlangen, Germany
Phone: +49 (1522) 1702973
Mobile: +49 (1522) 1702973
mailto:fabian.scheler@siemens.com
www.siemens.com

Siemens Aktiengesellschaft: Chairman of the Supervisory Board: Jim 
Hagemann Snabe; Managing Board: Roland Busch, Chairman, President and 
Chief Executive Officer; Cedrik Neike, Matthias Rebellius, Ralf P. 
Thomas, Judith Wiese; Registered offices: Berlin and Munich, Germany; 
Commercial registries: Berlin-Charlottenburg, HRB 12300, Munich, HRB 
6684; WEEE-Reg.-No. DE 23691322


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2024-05-20 10:09 ` Minwoo Im
@ 2024-05-20 13:34   ` Vincent Fu
  2024-05-21  0:00     ` Re: Minwoo Im
  0 siblings, 1 reply; 1546+ messages in thread
From: Vincent Fu @ 2024-05-20 13:34 UTC (permalink / raw)
  To: Minwoo Im, fio

On 5/20/24 06:09, Minwoo Im wrote:
> subscribe fio
> 
> 

Minwoo, here are instructions for subscribing to this list:

http://vger.kernel.org/vger-lists.html#fio

Vincent

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2024-05-20 13:34   ` Vincent Fu
@ 2024-05-21  0:00     ` Minwoo Im
  0 siblings, 0 replies; 1546+ messages in thread
From: Minwoo Im @ 2024-05-21  0:00 UTC (permalink / raw)
  To: Vincent Fu; +Cc: fio, Minwoo Im


On 24-05-20 09:34:15, Vincent Fu wrote:
> On 5/20/24 06:09, Minwoo Im wrote:
> > subscribe fio
> > 
> > 
> 
> Minwoo, here are instructions for subscribing to this list:

Ah, my bad.  I should have sent to majordomo@vger.kernel.org.

Thanks!

> 
> http://vger.kernel.org/vger-lists.html#fio
> 
> Vincent
> 




^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2024-06-11 16:54 Jacob Pan
@ 2024-06-12  2:04 ` Sean Christopherson
  2024-06-12  2:55   ` Re: Xin Li
  0 siblings, 1 reply; 1546+ messages in thread
From: Sean Christopherson @ 2024-06-12  2:04 UTC (permalink / raw)
  To: Jacob Pan
  Cc: X86 Kernel, LKML, Thomas Gleixner, Dave Hansen, H. Peter Anvin,
	Ingo Molnar, Borislav Petkov, linux-perf-users, Peter Zijlstra,
	Andi Kleen, Xin Li

On Tue, Jun 11, 2024, Jacob Pan wrote:
> To tackle these challenges, Intel introduced NMI source reporting as a part
> of the FRED specification (detailed in Chapter 9). 

Chapter 9 of the linked spec is "VMX Interactions with FRED Transitions".  I
spent a minute or so poking around the spec and didn't find anything that describes
how "NMI source reporting" works.

> 1.	Performance monitoring.
> 2.	Inter-Processor Interrupts (IPIs) for functions like CPU backtrace,
> 	machine check, Kernel GNU Debugger (KGDB), reboot, panic stop, and
> 	self-test.
> 
> Other NMI sources will continue to be handled as previously when the NMI
> source is not utilized or remains unidentified.
> 
> Next steps:
> 1. KVM support

I can't tell for sure since I can't find the relevant spec info, but doesn't KVM
support need to land before this gets enabled?  Otherwise the source would get
lost if the NMI arrived while the CPU was in non-root mode, no?  E.g. I don't
see any changes to fred_entry_from_kvm() in this series.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2024-06-12  2:04 ` Sean Christopherson
@ 2024-06-12  2:55   ` Xin Li
  0 siblings, 0 replies; 1546+ messages in thread
From: Xin Li @ 2024-06-12  2:55 UTC (permalink / raw)
  To: Sean Christopherson, Jacob Pan
  Cc: X86 Kernel, LKML, Thomas Gleixner, Dave Hansen, H. Peter Anvin,
	Ingo Molnar, Borislav Petkov, linux-perf-users, Peter Zijlstra,
	Andi Kleen, Xin Li

On 6/11/2024 7:04 PM, Sean Christopherson wrote:
> On Tue, Jun 11, 2024, Jacob Pan wrote:
>> To tackle these challenges, Intel introduced NMI source reporting as a part
>> of the FRED specification (detailed in Chapter 9).
> 
> Chapter 9 of the linked spec is "VMX Interactions with FRED Transitions".  I
> spent a minute or so poking around the spec and didn't find anything that describes
> how "NMI source reporting" works.

I did the same thing when I saw NMI source was added to the spec :)

> 
>> 1.	Performance monitoring.
>> 2.	Inter-Processor Interrupts (IPIs) for functions like CPU backtrace,
>> 	machine check, Kernel GNU Debugger (KGDB), reboot, panic stop, and
>> 	self-test.
>>
>> Other NMI sources will continue to be handled as previously when the NMI
>> source is not utilized or remains unidentified.
>>
>> Next steps:
>> 1. KVM support
> 
> I can't tell for sure since I can't find the relevant spec info, but doesn't KVM
> support need to land before this gets enabled?  Otherwise the source would get
> lost if the NMI arrived while the CPU was in non-root mode, no?  E.g. I don't
> see any changes to fred_entry_from_kvm() in this series.

You're absolutely right!

There is a patch for this in the NMI source KVM patches, but as you
mentioned, it has to be in this native NMI source series instead.

Thanks!
     Xin

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2024-06-26  6:11 Totoro W
@ 2024-06-26  7:09 ` Eduard Zingerman
  0 siblings, 0 replies; 1546+ messages in thread
From: Eduard Zingerman @ 2024-06-26  7:09 UTC (permalink / raw)
  To: Totoro W, bpf

On Wed, 2024-06-26 at 14:11 +0800, Totoro W wrote:
> Hi folks,
> 
> This is my first time asking a question on this mailing list. I'm the
> author of https://github.com/tw4452852/zbpf which is a framework to
> write BPF programs with the Zig toolchain.
> During development, as the BTF is generated entirely by the Zig
> toolchain, some of the generated names make the BTF verifier refuse
> to load it.
> Right now I have to patch the libbpf to do some fixup before loading
> into the kernel
> (https://github.com/tw4452852/libbpf_zig/blob/main/0001-temporary-WA-for-invalid-BTF-info-generated-by-Zig.patch).

> +		// https://github.com/tw4452852/zbpf/issues/3
> +		else if (btf_is_ptr(t)) {
> +			t->name_off = 0;

As far as I understand, you control BTF generation, why generate names
for pointers in a first place?

> Even though this just works around the issue, I'm still curious about
> the current name sanitization; I'd like to know some background about
> it.

Doing some git digging shows that name check was first introduced by
the following commit:

2667a2626f4d ("bpf: btf: Add BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO")

And lived like that afterwards.

My guess is that kernel BTF is used to work with kernel functions and
data structures. All of which follow C naming convention.

> If possible, could we relax this to accept more languages (like Zig)
> to write BPF programs? Thanks in advance.

Could you please elaborate a bit?
Citation from [1]:

  Identifiers must start with an alphabetic character or underscore
  and may be followed by any number of alphanumeric characters or
  underscores. They must not overlap with any keywords.

  If a name that does not fit these requirements is needed, such as
  for linking with external libraries, the @"" syntax may be used.
  
Paragraph 1 matches C naming convention and should be accepted by
kernel/bpf/btf.c:btf_name_valid_identifier().
Paragraph 2 is basically any string.
Which one do you want?

[1] https://ziglang.org/documentation/master/#Identifiers

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2024-07-10  6:41 ` [PATCH v3] " Wolfram Sang
@ 2024-07-10 17:51   ` Bence Csókás
  0 siblings, 0 replies; 1546+ messages in thread
From: Bence Csókás @ 2024-07-10 17:51 UTC (permalink / raw)
  To: Wolfram Sang, linux-i2c; +Cc: Bence Csókás, Andi Shyti, linux-kernel

Hi!

On 1970. 01. 01. 1:00, Wolfram Sang wrote:
> Change the wording of this driver wrt. the newest I2C v7 and SMBus 3.2
> specifications and replace "master/slave" with more appropriate terms.
> 
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
> ---
> 
> Change since v2:
> * reworded comment about NAK for consistency as well (Thanks, Bence!)
> 
>   drivers/i2c/busses/i2c-cp2615.c | 10 +++++-----
>   1 file changed, 5 insertions(+), 5 deletions(-)

Acked-by: Bence Csókás <bence98@sch.bme.hu>

Bence

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2024-07-14 19:59 raschupkin.ri
@ 2024-07-15 20:20 ` Joe Lawrence
  2024-07-15 22:45   ` Re: Roman Rashchupkin
                     ` (2 more replies)
  2024-07-16 17:33 ` Re: Song Liu
  1 sibling, 3 replies; 1546+ messages in thread
From: Joe Lawrence @ 2024-07-15 20:20 UTC (permalink / raw)
  To: raschupkin.ri; +Cc: live-patching, pmladek, mbenes, jikos, jpoimboe

On Sun, Jul 14, 2024 at 09:59:32PM +0200, raschupkin.ri@gmail.com wrote:
> 
> [PATCH] livepatch: support of modifying refcount_t without underflow after unpatch
> 
> CVE fixes sometimes add refcount_inc/dec() pairs to the code with existing refcount_t.
> Two problems arise when applying live-patch in this case:
> 1) After refcount_t is being inc() during system is live-patched, after unpatch the counter value will not be valid, as corresponding dec() would never be called.
> 2) Underflows are possible in runtime in case dec() is called before corresponding inc() in the live-patched code.
> 
> Proposed kprefcount_t functions are using following approach to solve these two problems:
> 1) In addition to original refcount_t, temporary refcount_t is allocated, and after unpatch it is just removed. This way system is safe with correct refcounting while patch is applied, and no underflow would happen after unpatch.
> 2) For inc/dec() added by live-patch code, one bit in reference-holder structure is used (unsigned char *ref_holder, kprefholder_flag). In case dec() is called first, it is just ignored as ref_holder bit would still not be initialized.
> 
> 
> API is defined include/linux/livepatch_refcount.h:
> 
> typedef struct kprefcount_struct {
> 	refcount_t *refcount;
> 	refcount_t kprefcount;
> 	spinlock_t lock;
> } kprefcount_t;
> 
> kprefcount_t *kprefcount_alloc(refcount_t *refcount, gfp_t flags);
> void kprefcount_free(kprefcount_t *kp_ref);
> int kprefcount_read(kprefcount_t *kp_ref);
> void kprefcount_inc(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
> void kprefcount_dec(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
> bool kprefcount_dec_and_test(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
> 

Hi Roman,

Can you point to a specific upstream commit that this API facilitated a
livepatch conversion?  That would make a good addition to the
Documentation/livepatch/ side of a potential v2.

But first, let me see if I understand the problem correctly.  Let's say
points A and A' below represent the original kernel code reference
get/put pairing in task execution flow.  A livepatch adds a new get/put
pair, B and B' in the middle like so:

  ---  execution flow  --->
  -- A  B       B'  A'  -->

There are potential issues if the livepatch is (de)activated
mid-sequence, between the new pairings:

  problem 1:
  -- A      .   B'  A'  -->                   B', but no B =  extra put!
            ^ livepatch is activated here

  problem 2:
  -- A  B   .       A'  -->                   B, but no B' =  extra get!
            ^ livepatch is deactivated here


The first thing that comes to mind is that this might be solved using
the existing shadow variable API.  When the livepatch takes the new
reference (B), it could create a new <struct, NEW_REF> shadow variable
instance.  The livepatch code to return the reference (B') would then
check on the shadow variable existence before doing so.  This would
solve problem 1.

The second problem is a little trickier.  Perhaps the shadow variable
approach still works as long as a pre-unpatch hook* were to iterate
through all the <*, NEW_REF> shadow variable instances and returned
their reference before freeing the shadow variable and declaring the
livepatch inactive.  I believe that would align the reference counts
with original kernel code expectations.

* note this approach probably requires atomic-replace livepatches, so
  only a single pre-unpatch hook is ever executed.


Also, the proposed patchset looks like it creates a parallel reference
counting structure... does this mean that the livepatch will need to
update *all* reference counting calls for the API to work (so points A,
B, B', and A' in my ascii-art above)?  This question loops back to my
first point about a real-world example that can be added to
Documentation/livepatch/, much like the ones found in the
shadow-vars.rst file.

Thanks,

--
Joe


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2024-07-15 20:20 ` Joe Lawrence
@ 2024-07-15 22:45   ` Roman Rashchupkin
  2024-07-16  9:28   ` Re: Nicolai Stange
       [not found]   ` <66963d60.170a0220.70a9a.8866SMTPIN_ADDED_BROKEN@mx.google.com>
  2 siblings, 0 replies; 1546+ messages in thread
From: Roman Rashchupkin @ 2024-07-15 22:45 UTC (permalink / raw)
  To: Joe Lawrence
  Cc: live-patching, pmladek, mbenes, jikos, jpoimboe, quic_jjohnson

Hello.
As for upstream commits whose livepatch conversion this API would
facilitate, I can reference several CVEs:
- CVE-2023-5633
     "drm/vmwgfx: Keep a gem reference to user bos in surfaces"
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=91398b413d03660fd5828f7b4abc64e884b98069

  drm_gem_object_get(&vbo->tbo.base);/drm_gem_object_put(&tmp_buf->tbo.base);

- CVE-2023-6932
     "ipv4: igmp: fix refcnt uaf issue when receiving igmp query packet"
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e2b706c69190

     refcount_inc_not_zero(&im->refcnt)/ip_ma_put(im);

- CVE-2022-20566
     "Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put"
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d0be8347c623e0ac4202a1d4e0373882821f56b0

     kref_get_unless_zero(&c->kref)/l2cap_chan_put(chan)

In all three of these cases, though, refcount_t is mostly used inside 
wrapper functions, and off the top of my head I don't remember CVEs 
that plainly add refcount_inc()/dec().
If the proposed patch is merged, CVE-2023-5633 would perhaps be best 
suited for the documentation, or the git history could be searched for 
a better example.

The two types of problems you classify are exactly what I am attempting 
to solve for the refcount_inc()/dec() calls added by live-patch code. 
Let's continue with your numbering, (1) and (2), for simplicity of 
discussion.

Concerning problem (1), shadow variables could certainly be used 
instead of my refholder bit in the reference-holder structures. Using 
one bit in an existing structure instead of hash-table based shadow 
variables is only my preference for the simplicity that live-patch 
code so often lacks. Shadow variables are also a good approach, and 
could be used instead of my (unsigned char *ref_holder, int 
kprefholder_flag) pair in the kprefcount_t API.

About problem (2), iterating through all shadow-variable/refholder 
instances would also work, but it is unnecessary processing during 
unpatch.
In my approach, a second kprefcount variable is used whose lifetime 
matches that of the applied live-patch. It provides correct refcounting 
while the live-patch is applied, and the main idea is that it can 
simply be removed safely at unpatch.
The only complication could be the values of the refholder bits, which 
must be reset when the live-patch is applied; it is probably simpler 
to implement this at unpatch, as all kprefcount_t structs are 
allocated by the patch code.
---
Roman Rashchupkin

On 7/15/24 22:20, Joe Lawrence wrote:
> On Sun, Jul 14, 2024 at 09:59:32PM +0200, raschupkin.ri@gmail.com wrote:
>> [PATCH] livepatch: support of modifying refcount_t without underflow after unpatch
>>
>> CVE fixes sometimes add refcount_inc/dec() pairs to the code with existing refcount_t.
>> Two problems arise when applying live-patch in this case:
>> 1) After refcount_t is being inc() during system is live-patched, after unpatch the counter value will not be valid, as corresponding dec() would never be called.
>> 2) Underflows are possible in runtime in case dec() is called before corresponding inc() in the live-patched code.
>>
>> Proposed kprefcount_t functions are using following approach to solve these two problems:
>> 1) In addition to original refcount_t, temporary refcount_t is allocated, and after unpatch it is just removed. This way system is safe with correct refcounting while patch is applied, and no underflow would happen after unpatch.
>> 2) For inc/dec() added by live-patch code, one bit in reference-holder structure is used (unsigned char *ref_holder, kprefholder_flag). In case dec() is called first, it is just ignored as ref_holder bit would still not be initialized.
>>
>>
>> API is defined include/linux/livepatch_refcount.h:
>>
>> typedef struct kprefcount_struct {
>> 	refcount_t *refcount;
>> 	refcount_t kprefcount;
>> 	spinlock_t lock;
>> } kprefcount_t;
>>
>> kprefcount_t *kprefcount_alloc(refcount_t *refcount, gfp_t flags);
>> void kprefcount_free(kprefcount_t *kp_ref);
>> int kprefcount_read(kprefcount_t *kp_ref);
>> void kprefcount_inc(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
>> void kprefcount_dec(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
>> bool kprefcount_dec_and_test(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
>>
> Hi Roman,
>
> Can you point to a specific upstream commit that this API facilitated a
> livepatch conversion?  That would make a good addition to the
> Documentation/livepatch/ side of a potential v2.
>
> But first, let me see if I understand the problem correctly.  Let's say
> points A and A' below represent the original kernel code reference
> get/put pairing in task execution flow.  A livepatch adds a new get/put
> pair, B and B' in the middle like so:
>
>    ---  execution flow  --->
>    -- A  B       B'  A'  -->
>
> There are potential issues if the livepatch is (de)activated
> mid-sequence, between the new pairings:
>
>    problem 1:
>    -- A      .   B'  A'  -->                   B', but no B =  extra put!
>              ^ livepatch is activated here
>
>    problem 2:
>    -- A  B   .       A'  -->                   B, but no B' =  extra get!
>              ^ livepatch is deactivated here
>
>
> The first thing that comes to mind is that this might be solved using
> the existing shadow variable API.  When the livepatch takes the new
> reference (B), it could create a new <struct, NEW_REF> shadow variable
> instance.  The livepatch code to return the reference (B') would then
> check on the shadow variable existence before doing so.  This would
> solve problem 1.
>
> The second problem is a little trickier.  Perhaps the shadow variable
> approach still works as long as a pre-unpatch hook* were to iterate
> through all the <*, NEW_REF> shadow variable instances and returned
> their reference before freeing the shadow variable and declaring the
> livepatch inactive.  I believe that would align the reference counts
> with original kernel code expectations.
>
> * note this approach probably requires atomic-replace livepatches, so
>    only a single pre-unpatch hook is ever executed.
>
>
> Also, the proposed patchset looks like it creates a parallel reference
> counting structure... does this mean that the livepatch will need to
> update *all* reference counting calls for the API to work (so points A,
> B, B', and A' in my ascii-art above)?  This question loops back to my
> first point about a real-world example that can be added to
> Documentation/livepatch/, much like the ones found in the
> shadow-vars.rst file.
>
> Thanks,
>
> --
> Joe
>


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2024-07-15 21:06 Phil Dennis-Jordan
@ 2024-07-16  6:07 ` Akihiko Odaki
  2024-07-17 11:16   ` Re: Phil Dennis-Jordan
  0 siblings, 1 reply; 1546+ messages in thread
From: Akihiko Odaki @ 2024-07-16  6:07 UTC (permalink / raw)
  To: Phil Dennis-Jordan, qemu-devel, pbonzini, agraf, graf,
	marcandre.lureau, berrange, thuth, philmd, peter.maydell, lists

On 2024/07/16 6:06, Phil Dennis-Jordan wrote:
> Date: Mon, 15 Jul 2024 21:07:12 +0200
> Subject: [PATCH 00/26] hw/display/apple-gfx: New macOS PV Graphics device
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> This sequence of patches integrates the paravirtualised graphics device
> implemented by macOS's ParavirtualizedGraphics.Framework into Qemu.
> Combined with the guest drivers which ship with macOS versions 11 and up,
> this allows the guest OS to use the host's GPU for hardware accelerated
> 3D graphics, GPGPU compute (both using the 'Metal' graphics API), and
> window compositing.
> 
> Some background:
> ----------------
> 
> The device exposed by the ParavirtualizedGraphics.Framework's (henceforth
> PVG) public API consists of a PCI device with a single memory-mapped BAR;
> the VMM is expected to pass reads and writes through to the framework, and
> to forward interrupts emanating from it to the guest VM.
> 
> The bulk of data exchange between host and guest occurs via shared memory,
> however. For this purpose, PVG makes callbacks to VMM code for allocating,
> mapping, unmapping, and deallocating "task" memory ranges. Each task
> represents a contiguous host virtual address range, and PVG expects the
> VMM to map specific guest system memory ranges to these host addresses via
> subsequent map callbacks. Multiple tasks can exist at a time, each with
> many mappings.
> 
> Data is exchanged via an undocumented, Apple-proprietary protocol. The
> PVG API only acts as a facilitator for establishing the communication
> mechanism. This is perhaps not ideal, and among other things means it
> only works on macOS hosts, but it's the only serious option we've got for
> good performance and quality graphics with macOS guests at this time.
> 
> The first iterations of this PVG integration into Qemu were developed
> by Alexander Graf as part of his "vmapple" machine patch series for
> supporting aarch64 macOS guests, and posted to qemu-devel in June and
> August 2023:
> 
> https://lore.kernel.org/all/20230830161425.91946-1-graf@amazon.com/T/
> 
> This integration mimics the "vmapple"/"apple-gfx" variant of the PVG device
> used by Apple's own VMM, Virtualization.framework. This variant does not use
> PCI but acts as a direct MMIO system device; there are two MMIO ranges, one
> behaving identically to the PCI BAR, while the other's functionality is
> exposed by private APIs in the PVG framework. It is only available on aarch64
> macOS hosts.
> 
> I had prior to this simultaneously and independently developed my own PVG
> integration for Qemu using the public PCI device APIs, with x86-64 and
> corresponding macOS guests and hosts as the target. After some months of
> use in production, I was slowly reviewing the code and readying it for
> upstreaming around the time Alexander posted his vmapple patches.
> 
> I ended up reviewing the vmapple PVG code in detail; I identified a number
> of issues with it (mainly thanks to my prior trial-and-error working with
> the framework) but overall I thought it a better basis for refinement
> than my own version:
> 
>   - It implemented the vmapple variant of the device. I thought it better to
>     port the part I understood well (PCI variant) to this than trying to port
>     the part I didn't understand well (MMIO vmapple variant) to my own code.
>   - The code was already tidier than my own.
> 
> It also became clear in out-of-band communication that Alexander would
> probably not end up having the time to see the patch through to inclusion,
> and was happy for me to start making changes and to integrate my PCI code.

Hi,

Thanks for continuing his effort.

Please submit a patch series that includes his patches. Please also 
merge fixes for his patches into them. This saves the effort to review 
the obsolete code and keeps git bisect working.

Regards,
Akihiko Odaki



* Re:
  2024-07-15 20:20 ` Joe Lawrence
  2024-07-15 22:45   ` Re: Roman Rashchupkin
@ 2024-07-16  9:28   ` Nicolai Stange
       [not found]   ` <66963d60.170a0220.70a9a.8866SMTPIN_ADDED_BROKEN@mx.google.com>
  2 siblings, 0 replies; 1546+ messages in thread
From: Nicolai Stange @ 2024-07-16  9:28 UTC (permalink / raw)
  To: Joe Lawrence
  Cc: raschupkin.ri, live-patching, pmladek, mbenes, jikos, jpoimboe

Hi all,

Joe Lawrence <joe.lawrence@redhat.com> writes:

> On Sun, Jul 14, 2024 at 09:59:32PM +0200, raschupkin.ri@gmail.com wrote:
>> 
> But first, let me see if I understand the problem correctly.  Let's say
> points A and A' below represent the original kernel code reference
> get/put pairing in task execution flow.  A livepatch adds a new get/put
> pair, B and B' in the middle like so:
>
>   ---  execution flow  --->
>   -- A  B       B'  A'  -->
>
> There are potential issues if the livepatch is (de)activated
> mid-sequence, between the new pairings:
>
>   problem 1:
>   -- A      .   B'  A'  -->                   B', but no B =  extra put!
>             ^ livepatch is activated here
>
>   problem 2:
>   -- A  B   .       A'  -->                   B, but no B' =  extra get!
>             ^ livepatch is deactivated here

I can confirm that this scenario happens quite often with real world CVE
fixes and there's currently no way to implement such changes safely from
a livepatch. But I also believe this is an instance of a broader problem
class we attempted to solve with that "enhanced" states API proposed and
discussed at LPC ([1], there's a link to a recording at the bottom). For
reference, see Petr's POC from [2].


> The first thing that comes to mind is that this might be solved using
> the existing shadow variable API.

Same.


> When the livepatch takes the new
> reference (B), it could create a new <struct, NEW_REF> shadow variable
> instance.  The livepatch code to return the reference (B') would then
> check on the shadow variable existence before doing so.  This would
> solve problem 1.
>
> The second problem is a little trickier.  Perhaps the shadow variable
> approach still works as long as a pre-unpatch hook* were to iterate
> through all the <*, NEW_REF> shadow variable instances and returned
> their reference before freeing the shadow variable and declaring the
> livepatch inactive.

I think the problem of consistently maintaining shadowed reference
counts (or anything shadowed for that matter) could be solved with the
help of aforementioned states API enhancements, so I would propose to
revive Petr's IMO more generic patchset as an alternative.

Thoughts?

Thanks,

Nicolai

[1] https://lpc.events/event/17/contributions/1541/
[2] https://lore.kernel.org/r/20231110170428.6664-1-pmladek@suse.com

-- 
SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461 Nürnberg, Germany
GF: Ivo Totev, Andrew McDonald, Werner Knoblich
(HRB 36809, AG Nürnberg)


* Re:
       [not found]   ` <66963d60.170a0220.70a9a.8866SMTPIN_ADDED_BROKEN@mx.google.com>
@ 2024-07-16  9:53     ` Roman Rashchupkin
  2024-07-25 14:52       ` Re: Joe Lawrence
  0 siblings, 1 reply; 1546+ messages in thread
From: Roman Rashchupkin @ 2024-07-16  9:53 UTC (permalink / raw)
  To: Nicolai Stange, Joe Lawrence
  Cc: live-patching, pmladek, mbenes, jikos, jpoimboe

 >> The first thing that comes to mind is that this might be solved using
 >> the existing shadow variable API.

 > Same.

I just don't have enough experience using live-patch shadow-variables, 
so I agree that it's probably a better general solution for problem 
(1), the refcount underflow, than my refholder flags.

 > I can confirm that this scenario happens quite often with real world CVE
 > fixes and there's currently no way to implement such changes safely from
 > a livepatch. But I also believe this is an instance of a broader problem
 > class we attempted to solve with that "enhanced" states API proposed and
 > discussed at LPC ([1], there's a link to a recording at the bottom). For
 > reference, see Petr's POC from [2].

As for (2), the incorrect refcount_t value left after unpatch, it seems 
like a somewhat different and more counter-specific problem than general 
live-patch state support, and one that can be solved for refcount_t in a 
simpler way.

IMHO, adding a temporary kprefcount_t variable for the time the 
live-patch is applied, and having all inc()/dec() calls added by the 
livepatch modify only this variable, provides correct refcounting while 
the patch is applied. At unpatch this temporary variable can then just 
be safely discarded. This way the only additional code in the live-patch 
would be the functions containing the original refcount_dec_and_test() calls.
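To make the idea concrete, a minimal userspace sketch of that scheme might look as follows. The kprefcount_* names mirror the proposed API but are illustrative only: plain ints stand in for refcount_t, and the spinlock is omitted.

```c
#include <assert.h>

/*
 * Userspace sketch of the temporary-counter idea above.  The
 * livepatch's added inc()/dec() calls touch only the temporary
 * counter, and a per-holder flag makes a dec() that arrives before
 * its inc() a no-op.  Illustrative names, not a merged kernel API.
 */
#define KPREF_FLAG 0x01

typedef struct {
	int *refcount;		/* original counter, never touched here */
	int kprefcount;		/* temporary counter, dropped at unpatch */
} kprefcount_t;

static void kprefcount_inc(kprefcount_t *kp, unsigned char *ref_holder)
{
	*ref_holder |= KPREF_FLAG;
	kp->kprefcount++;
}

static void kprefcount_dec(kprefcount_t *kp, unsigned char *ref_holder)
{
	if (!(*ref_holder & KPREF_FLAG))
		return;		/* dec() before inc(): ignored, no underflow */
	*ref_holder &= ~KPREF_FLAG;
	kp->kprefcount--;
}
```

At unpatch the kprefcount_t instance is simply freed, leaving the original refcount_t exactly as the unpatched code expects.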

---

Roman Rashchupkin

On 7/16/24 11:28, Nicolai Stange wrote:
> Hi all,
>
> Joe Lawrence <joe.lawrence@redhat.com> writes:
>
>> On Sun, Jul 14, 2024 at 09:59:32PM +0200, raschupkin.ri@gmail.com wrote:
>> But first, let me see if I understand the problem correctly.  Let's say
>> points A and A' below represent the original kernel code reference
>> get/put pairing in task execution flow.  A livepatch adds a new get/put
>> pair, B and B' in the middle like so:
>>
>>    ---  execution flow  --->
>>    -- A  B       B'  A'  -->
>>
>> There are potential issues if the livepatch is (de)activated
>> mid-sequence, between the new pairings:
>>
>>    problem 1:
>>    -- A      .   B'  A'  -->                   B', but no B =  extra put!
>>              ^ livepatch is activated here
>>
>>    problem 2:
>>    -- A  B   .       A'  -->                   B, but no B' =  extra get!
>>              ^ livepatch is deactivated here
> I can confirm that this scenario happens quite often with real world CVE
> fixes and there's currently no way to implement such changes safely from
> a livepatch. But I also believe this is an instance of a broader problem
> class we attempted to solve with that "enhanced" states API proposed and
> discussed at LPC ([1], there's a link to a recording at the bottom). For
> reference, see Petr's POC from [2].
>
>
>> The first thing that comes to mind is that this might be solved using
>> the existing shadow variable API.
> Same.
>
>
>> When the livepatch takes the new
>> reference (B), it could create a new <struct, NEW_REF> shadow variable
>> instance.  The livepatch code to return the reference (B') would then
>> check on the shadow variable existence before doing so.  This would
>> solve problem 1.
>>
>> The second problem is a little trickier.  Perhaps the shadow variable
>> approach still works as long as a pre-unpatch hook* were to iterate
>> through all the <*, NEW_REF> shadow variable instances and returned
>> their reference before freeing the shadow variable and declaring the
>> livepatch inactive.
> I think the problem of consistently maintaining shadowed reference
> counts (or anything shadowed for that matter) could be solved with the
> help of aforementioned states API enhancements, so I would propose to
> revive Petr's IMO more generic patchset as an alternative.
>
> Thoughts?
>
> Thanks,
>
> Nicolai
>
> [1] https://lpc.events/event/17/contributions/1541/
> [2] https://lore.kernel.org/r/20231110170428.6664-1-pmladek@suse.com
>


* Re:
  2024-07-14 19:59 raschupkin.ri
  2024-07-15 20:20 ` Joe Lawrence
@ 2024-07-16 17:33 ` Song Liu
  1 sibling, 0 replies; 1546+ messages in thread
From: Song Liu @ 2024-07-16 17:33 UTC (permalink / raw)
  To: raschupkin.ri
  Cc: live-patching, joe.lawrence, pmladek, mbenes, jikos, jpoimboe

On Mon, Jul 15, 2024 at 4:00 AM <raschupkin.ri@gmail.com> wrote:
>
>
> [PATCH] livepatch: support of modifying refcount_t without underflow after unpatch
>
> CVE fixes sometimes add refcount_inc/dec() pairs to the code with existing refcount_t.
> Two problems arise when applying live-patch in this case:
> 1) If refcount_t is inc()'d while the system is live-patched, the counter value will not be valid after unpatch, as the corresponding dec() would never be called.
> 2) Underflows are possible at runtime if dec() is called before the corresponding inc() in the live-patched code.
>
> Proposed kprefcount_t functions are using following approach to solve these two problems:
> 1) In addition to the original refcount_t, a temporary refcount_t is allocated, and after unpatch it is simply removed. This way the system has correct refcounting while the patch is applied, and no underflow would happen after unpatch.
> 2) For inc()/dec() calls added by live-patch code, one bit in the reference-holder structure is used (unsigned char *ref_holder, kprefholder_flag). If dec() is called first, it is simply ignored, as the ref_holder bit would not yet be set.
>
>
> API is defined include/linux/livepatch_refcount.h:
>
> typedef struct kprefcount_struct {
>         refcount_t *refcount;
>         refcount_t kprefcount;
>         spinlock_t lock;
> } kprefcount_t;
>
> kprefcount_t *kprefcount_alloc(refcount_t *refcount, gfp_t flags);
> void kprefcount_free(kprefcount_t *kp_ref);
> int kprefcount_read(kprefcount_t *kp_ref);
> void kprefcount_inc(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
> void kprefcount_dec(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
> bool kprefcount_dec_and_test(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);

IIUC, kprefcount alone is not enough to solve the two issues. We still
need some mechanism to manage the "ref_holder". Shadow variable
is probably the best option here.

The primary idea here is to enhance the refcount with a map. This
may be too expensive in terms of memory consumption in some use
cases.

Overall, I don't think this change adds much value on top of
shadow variables.

Thanks,
Song


* Re:
  2024-07-16  6:07 ` Akihiko Odaki
@ 2024-07-17 11:16   ` Phil Dennis-Jordan
  0 siblings, 0 replies; 1546+ messages in thread
From: Phil Dennis-Jordan @ 2024-07-17 11:16 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: qemu-devel, pbonzini, agraf, graf, marcandre.lureau, berrange,
	thuth, philmd, peter.maydell, lists


On Tue, 16 Jul 2024 at 08:07, Akihiko Odaki <akihiko.odaki@daynix.com>
wrote:

> Hi,
>
> Thanks for continuing his effort.
>
> Please submit a patch series that includes his patches. Please also
> merge fixes for his patches into them. This saves the effort to review
> the obsolete code and keeps git bisect working.
>
>
Sorry about that - it looks like (a) my edits to the cover letter messed
something up and (b) patch 1 got email-filtered somewhere along the way for
having the "wrong" From: address. I've submitted v2 with most patches
squashed into patch 1, whose authorship I've also changed to myself (with
Co-authored-by tag for the original code) so hopefully this time around it
shows up OK.

Thanks,
Phil



* Re:
  2024-07-16  9:53     ` Re: Roman Rashchupkin
@ 2024-07-25 14:52       ` Joe Lawrence
  0 siblings, 0 replies; 1546+ messages in thread
From: Joe Lawrence @ 2024-07-25 14:52 UTC (permalink / raw)
  To: Roman Rashchupkin, Nicolai Stange
  Cc: live-patching, pmladek, mbenes, jikos, jpoimboe

On 7/16/24 05:53, Roman Rashchupkin wrote:
>>> The first thing that comes to mind is that this might be solved using
>>> the existing shadow variable API.
> 
>> Same.
> 
> I just don't have enough experience using live-patch shadow-variables,
> so I agree that probably that's a better general solution for problem
> (1) of refcount underflow, than mine refholder flags.
> 

Yes, a general solution could cover the same problem but for different
datatypes, including locks, mutex, etc.

>> I can confirm that this scenario happens quite often with real world CVE
>> fixes and there's currently no way to implement such changes safely from
>> a livepatch. But I also believe this is an instance of a broader problem
>> class we attempted to solve with that "enhanced" states API proposed and
>> discussed at LPC ([1], there's a link to a recording at the bottom). For
>> reference, see Petr's POC from [2].

Thanks for the link -- I thought of that grand-unified
shadow/callback/states patch but couldn't find the latest version.  (I
see that Miroslav has just resurrected it with a fresh review, too.)

>> I think the problem of consistently maintaining shadowed reference
>> counts (or anything shadowed for that matter) could be solved with the
>> help of aforementioned states API enhancements, so I would propose to
>> revive Petr's IMO more generic patchset as an alternative.
>>
>> Thoughts?
>>

I definitely think the states API enhancement could be used to handle
the cases here via shadow variables.
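As a concrete illustration, the NEW_REF guard could look roughly like the userspace sketch below. A real livepatch would key klp_shadow_alloc()/klp_shadow_get()/klp_shadow_free() on the refcounted object; here a small pointer table stands in for the shadow store, and every name is illustrative rather than kernel API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace sketch of the NEW_REF guard discussed in this thread:
 * the patched get (point B) records that it took a reference, and
 * the patched put (point B') only drops one if B actually ran.
 */
#define MAX_OBJS 16

static void *new_ref[MAX_OBJS];		/* objects with a NEW_REF "shadow" */

static int shadow_attach(void *obj)	/* point B: remember we took a ref */
{
	for (int i = 0; i < MAX_OBJS; i++) {
		if (!new_ref[i]) {
			new_ref[i] = obj;
			return 1;
		}
	}
	return 0;
}

static int shadow_detach(void *obj)	/* point B': did B actually run? */
{
	for (int i = 0; i < MAX_OBJS; i++) {
		if (new_ref[i] == obj) {
			new_ref[i] = NULL;
			return 1;	/* yes: the extra put is safe */
		}
	}
	return 0;			/* no: patched mid-sequence, skip it */
}

struct obj {
	int refs;
};

static void patched_get(struct obj *o)	/* B */
{
	o->refs++;
	shadow_attach(o);
}

static void patched_put(struct obj *o)	/* B' */
{
	if (shadow_detach(o))		/* guards against problem 1 */
		o->refs--;
}
```

A pre-unpatch hook would similarly walk the remaining entries and drop their references, covering problem 2.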

In the meantime, are you using the kprefcount_t API currently via a
livepatch support module?  i.e. we don't need this in the kernel asap to
solve these problems, right?

-- 
Joe



* Re:
  2024-08-14  8:03 howard_wang
@ 2024-08-14 15:04 ` Stephen Hemminger
  0 siblings, 0 replies; 1546+ messages in thread
From: Stephen Hemminger @ 2024-08-14 15:04 UTC (permalink / raw)
  To: howard_wang; +Cc: dev

On Wed, 14 Aug 2024 16:03:41 +0800
<howard_wang@realsil.com.cn> wrote:

> diff --git a/drivers/net/r8169/r8169_base.h b/drivers/net/r8169/r8169_base.h
> new file mode 100644
> index 0000000000..5d219a7966
> --- /dev/null
> +++ b/drivers/net/r8169/r8169_base.h
> @@ -0,0 +1,15 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 Realtek Corporation. All rights reserved
> + */
> +
> +#ifndef _R8169_BASE_H_
> +#define _R8169_BASE_H_
> +
> +typedef uint8_t   u8;
> +typedef uint16_t  u16;
> +typedef uint32_t  u32;
> +typedef uint64_t  u64;
> +
> +#define PCI_VENDOR_ID_REALTEK 0x10EC
> +
> +#endif
> \ No newline at end of file

Fix your editor setup; all files should end with a newline.


* Re:
  2024-08-16 11:07 Xi Ruoyao
@ 2024-08-19 12:40 ` Huacai Chen
  2024-08-19 13:01   ` Re: Jason A. Donenfeld
  2024-08-19 15:22   ` Re: Xi Ruoyao
  2024-08-27  9:45 ` Re: Jason A. Donenfeld
  1 sibling, 2 replies; 1546+ messages in thread
From: Huacai Chen @ 2024-08-19 12:40 UTC (permalink / raw)
  To: Xi Ruoyao
  Cc: Jason A . Donenfeld, WANG Xuerui, linux-crypto, loongarch,
	Jinyang He, Tiezhu Yang, Arnd Bergmann

Hi, Ruoyao,

Why no subject?

On Fri, Aug 16, 2024 at 7:07 PM Xi Ruoyao <xry111@xry111.site> wrote:
>
> Subject: [PATCH v3 0/2] LoongArch: Implement getrandom() in vDSO
>
> For the rationale to implement getrandom() in vDSO see [1].
>
> The vDSO getrandom() needs a stack-less ChaCha20 implementation, so we
> need to add architecture-specific code and wire it up with the generic
> code.  Both generic LoongArch implementation and Loongson SIMD eXtension
> based implementation are added.  To dispatch them at runtime without
> invoking cpucfg on each call, the alternative runtime patching mechanism
> is extended to cover the vDSO.
>
> The implementation is tested with the kernel selftests added by the last
> patch in [1].  I had to make some adjustments to make it work on
> LoongArch (see [2], I've not submitted the changes as at now because I'm
> unsure about the KHDR_INCLUDES addition).  The vdso_test_getrandom
> bench-single result:
>
>        vdso: 25000000 times in 0.647855257 seconds (generic)
>        vdso: 25000000 times in 0.601068605 seconds (LSX)
>        libc: 25000000 times in 6.948168864 seconds
>     syscall: 25000000 times in 6.990265548 seconds
>
> The vdso_test_getrandom bench-multi result:
>
>        vdso: 25000000 x 256 times in 35.322187834 seconds (generic)
>        vdso: 25000000 x 256 times in 29.183885426 seconds (LSX)
>        libc: 25000000 x 256 times in 356.628428409 seconds
>        syscall: 25000000 x 256 times in 334.764602866 seconds
I don't see significant improvements about LSX here, so I prefer to
just use the generic version to avoid complexity (I remember Linus
said the whole of __vdso_getrandom is not very useful).


Huacai

>
> [1]:https://lore.kernel.org/all/20240712014009.281406-1-Jason@zx2c4.com/
> [2]:https://github.com/xry111/linux/commits/xry111/la-vdso-v3/
>
> [v2]->v3:
> - Add a generic LoongArch implementation for which LSX isn't needed.
>
> v1->v2:
> - Properly send the series to the list.
>
> [v2]:https://lore.kernel.org/all/20240815133357.35829-1-xry111@xry111.site/
>
> Xi Ruoyao (3):
>   LoongArch: vDSO: Wire up getrandom() vDSO implementation
>   LoongArch: Perform alternative runtime patching on vDSO
>   LoongArch: vDSO: Add LSX implementation of vDSO getrandom()
>
>  arch/loongarch/Kconfig                      |   1 +
>  arch/loongarch/include/asm/vdso/getrandom.h |  47 ++++
>  arch/loongarch/include/asm/vdso/vdso.h      |   8 +
>  arch/loongarch/kernel/asm-offsets.c         |  10 +
>  arch/loongarch/kernel/vdso.c                |  14 +-
>  arch/loongarch/vdso/Makefile                |   6 +
>  arch/loongarch/vdso/memset.S                |  24 ++
>  arch/loongarch/vdso/vdso.lds.S              |   7 +
>  arch/loongarch/vdso/vgetrandom-chacha-lsx.S | 162 +++++++++++++
>  arch/loongarch/vdso/vgetrandom-chacha.S     | 252 ++++++++++++++++++++
>  arch/loongarch/vdso/vgetrandom.c            |  19 ++
>  11 files changed, 549 insertions(+), 1 deletion(-)
>  create mode 100644 arch/loongarch/include/asm/vdso/getrandom.h
>  create mode 100644 arch/loongarch/vdso/memset.S
>  create mode 100644 arch/loongarch/vdso/vgetrandom-chacha-lsx.S
>  create mode 100644 arch/loongarch/vdso/vgetrandom-chacha.S
>  create mode 100644 arch/loongarch/vdso/vgetrandom.c
>
> --
> 2.46.0
>


* Re:
  2024-08-19 12:40 ` Huacai Chen
@ 2024-08-19 13:01   ` Jason A. Donenfeld
  2024-08-19 15:22     ` Re: Xi Ruoyao
  2024-08-19 15:22   ` Re: Xi Ruoyao
  1 sibling, 1 reply; 1546+ messages in thread
From: Jason A. Donenfeld @ 2024-08-19 13:01 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Xi Ruoyao, WANG Xuerui, linux-crypto, loongarch, Jinyang He,
	Tiezhu Yang, Arnd Bergmann

> I don't see significant improvements about LSX here, so I prefer to
> just use the generic version to avoid complexity (I remember Linus
> said the whole of __vdso_getrandom is not very useful).

I'm inclined to feel the same way, at least for now. Let's just go with
one implementation -- the generic one -- and then we can see if
optimization really makes sense later. I suspect the large speedup we're
already getting from being in the vDSO is already sufficient for
purposes.

Regards,
Jason


* Re:
  2024-08-19 13:01   ` Re: Jason A. Donenfeld
@ 2024-08-19 15:22     ` Xi Ruoyao
  2024-08-19 15:54       ` Re: Xi Ruoyao
  0 siblings, 1 reply; 1546+ messages in thread
From: Xi Ruoyao @ 2024-08-19 15:22 UTC (permalink / raw)
  To: Jason A. Donenfeld, Huacai Chen
  Cc: WANG Xuerui, linux-crypto, loongarch, Jinyang He, Tiezhu Yang,
	Arnd Bergmann

On Mon, 2024-08-19 at 13:01 +0000, Jason A. Donenfeld wrote:
> > I don't see significant improvements about LSX here, so I prefer to
> > just use the generic version to avoid complexity (I remember Linus
> > said the whole of __vdso_getrandom is not very useful).
> 
> I'm inclined to feel the same way, at least for now. Let's just go with
> one implementation -- the generic one -- and then we can see if
> optimization really makes sense later. I suspect the large speedup we're
> already getting from being in the vDSO is already sufficient for
> purposes.

Ok I'll drop the 2nd and 3rd patches in the next version.  But I'm
puzzled why the LSX implementation isn't much faster, maybe I made some
mistake in it?

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University


* Re:
  2024-08-19 12:40 ` Huacai Chen
  2024-08-19 13:01   ` Re: Jason A. Donenfeld
@ 2024-08-19 15:22   ` Xi Ruoyao
  1 sibling, 0 replies; 1546+ messages in thread
From: Xi Ruoyao @ 2024-08-19 15:22 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Jason A . Donenfeld, WANG Xuerui, linux-crypto, loongarch,
	Jinyang He, Tiezhu Yang, Arnd Bergmann

On Mon, 2024-08-19 at 20:40 +0800, Huacai Chen wrote:
> Hi, Ruoyao,
> 
> Why no subject?

Because I misused git send-email (again) :(.


-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University


* Re:
  2024-08-19 15:22     ` Re: Xi Ruoyao
@ 2024-08-19 15:54       ` Xi Ruoyao
  0 siblings, 0 replies; 1546+ messages in thread
From: Xi Ruoyao @ 2024-08-19 15:54 UTC (permalink / raw)
  To: Jason A. Donenfeld, Huacai Chen
  Cc: WANG Xuerui, linux-crypto, loongarch, Jinyang He, Tiezhu Yang,
	Arnd Bergmann

On Mon, 2024-08-19 at 23:22 +0800, Xi Ruoyao wrote:
> On Mon, 2024-08-19 at 13:01 +0000, Jason A. Donenfeld wrote:
> > > I don't see significant improvements about LSX here, so I prefer to
> > > just use the generic version to avoid complexity (I remember Linus
> > > said the whole of __vdso_getrandom is not very useful).
> > 
> > I'm inclined to feel the same way, at least for now. Let's just go with
> > one implementation -- the generic one -- and then we can see if
> > optimization really makes sense later. I suspect the large speedup we're
> > already getting from being in the vDSO is already sufficient for
> > purposes.
> 
> Ok I'll drop the 2nd and 3rd patches in the next version.  But I'm
> puzzled why the LSX implementation isn't much faster, maybe I made some
> mistake in it?

After some thinking this seems to make sense: the LoongArch desktop
processors have 4 ALUs able to perform the scalar add/rot/xor
operations, and throughput is already maximized for ChaCha20 due to
the data dependencies.  The advantage of LSX seems to be just avoiding
reloads of the key from memory (because the vector register file is
large enough to hold a copy of it).

Perhaps LSX will be much better on those embedded processors with 2 ALUs
and 1 SIMD unit (if they don't downclock with heavy SIMD load), but I
don't have one for testing.
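For reference, the serial dependency is visible in the ChaCha20 quarter-round itself: each operation consumes the result of the one before it, so extra scalar ALUs cannot shorten the chain. This is an illustrative userspace sketch, not the vDSO code from the series:

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit left rotate, as used throughout ChaCha20 */
static uint32_t rotl32(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

/*
 * The ChaCha20 quarter-round (RFC 8439): every line depends on the
 * result of the previous one, so the add/rotate/xor chain is serial.
 */
static void quarter_round(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
	*a += *b; *d ^= *a; *d = rotl32(*d, 16);
	*c += *d; *b ^= *c; *b = rotl32(*b, 12);
	*a += *b; *d ^= *a; *d = rotl32(*d, 8);
	*c += *d; *b ^= *c; *b = rotl32(*b, 7);
}
```

The test vector from RFC 8439 section 2.1.1 can be used to check such an implementation.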

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University


* Re:
  2024-08-24  3:03                   ` Manivannan Sadhasivam
@ 2024-08-26  6:48                     ` Can Guo
  0 siblings, 0 replies; 1546+ messages in thread
From: Can Guo @ 2024-08-26  6:48 UTC (permalink / raw)
  To: Manivannan Sadhasivam, Bart Van Assche
  Cc: Bao D. Nguyen, Martin K . Petersen, linux-scsi,
	James E.J. Bottomley, Peter Wang, Avri Altman, Andrew Halaney,
	Bean Huo, Alim Akhtar, Eric Biggers, Minwoo Im, Maramaina Naresh

On 8/24/2024, Manivannan Sadhasivam wrote:
> On Fri, Aug 23, 2024 at 07:48:50PM -0700, Bart Van Assche wrote:
>> On 8/23/24 7:29 PM, Manivannan Sadhasivam wrote:
>>> What if other vendors start adding the workaround in the core driver citing GKI
>>> requirement (provided it also removes some code as you justified)? Will it be
>>> acceptable? NO.
>> It's not up to you to define new rules for upstream kernel development.
> I'm not framing new rules, but just pointing out the common practice.
>
>> Anyone is allowed to publish patches that rework kernel code, whether
>> or not the purpose of such a patch is to work around a SoC bug.
>>
> Yes, at the same time if that code deviates from the norm, then anyone can
> complain. We are all working towards making the code better.
>
>> Additionally, it has already happened that one of your colleagues
>> submitted a workaround for a SoC bug to the UFS core driver.
>>  From the description of commit 0f52fcb99ea2 ("scsi: ufs: Try to save
>> power mode change and UIC cmd completion timeout"): "This is to deal
>> with the scenario in which completion has been raised but the one
>> waiting for the completion cannot be awaken in time due to kernel
>> scheduling problem." That description makes zero sense to me. My
>> conclusion from commit 0f52fcb99ea2 is that it is a workaround for a
>> bug in a UFS host controller, namely that a particular UFS host
>> controller not always generates a UIC completion interrupt when it
>> should.
>>
> 0f52fcb99ea2 was submitted in 2020 before I started contributing to UFS driver
> seriously. But the description of that commit never mentioned any issue with the
> controller. It vaguely mentions 'kernel scheduling problem' which I don't know
> how to interpret. If I were looking into the code at that time, I would've
> definitely asked for clarity during the review phase.

0f52fcb99ea2 is my commit; I apologize for the confusion caused by the poor
commit message. What we were trying to fix was not a SoC bug. More background
on this change: on our customer side, we used to hit corner cases where the
UIC command is sent, the UFS host controller generates the UIC command
completion interrupt fine, the UIC completion IRQ handler fires and calls
complete(), and yet the completion timeout error still happens. In this case
UFS, the UFS host and the UFS driver are the victims. Whatever causes this
scheduling problem should be fixed properly by the right PoC, but we thought
making the UFS driver robust in this spot would be good for all users who may
face a similar issue, hence the change.

Thanks,
Can Guo.

>
> But there is no need to take it as an example. I can only assert the fact that
> working around the controller defect in core code when we already have quirks
> for the same purpose defeats the purpose of quirks. And it will encourage other
> people to start changing the core code in the future thus bypassing the quirks.
>
> But I'm not a maintainer of this part of the code. So I cannot definitely stop
> you from getting this patch merged. I'll leave it up to Martin to decide.
>
> - Mani
>


* Re:
  2024-08-16 11:07 Xi Ruoyao
  2024-08-19 12:40 ` Huacai Chen
@ 2024-08-27  9:45 ` Jason A. Donenfeld
  1 sibling, 0 replies; 1546+ messages in thread
From: Jason A. Donenfeld @ 2024-08-27  9:45 UTC (permalink / raw)
  To: Xi Ruoyao
  Cc: Huacai Chen, WANG Xuerui, linux-crypto, loongarch, Jinyang He,
	Tiezhu Yang, Arnd Bergmann

Hey,

Per https://lore.kernel.org/all/Zs2c_9Z6sFMNJs1O@zx2c4.com/ , you may
want to rebase on random.git and send a v4 series. Hopefully now it's
just a single patch.

Jason


* Re:
  2024-09-13 17:11 David Hunter
@ 2024-09-13 20:39 ` Shuah Khan
  0 siblings, 0 replies; 1546+ messages in thread
From: Shuah Khan @ 2024-09-13 20:39 UTC (permalink / raw)
  To: David Hunter, Masahiro Yamada
  Cc: linux-kbuild, linux-kernel, shuah, javier.carrasco.cruz,
	Shuah Khan

On 9/13/24 11:11, David Hunter wrote:

Missing subject line for the cover-letter?
> 	
> Date: Fri, 13 Sep 2024 11:52:16 -0400
> Subject: [PATCH 0/7] linux-kbuild: fix: process configs set to "y"
> 
> An assumption made in this script is that the config options do not need
> to be processed because they will simply be in the new config file. This
> assumption is incorrect.
> 
> Process the config entries set to "y" because those config entries might
> have dependencies set to "m". If a config entry is set to "m" and is not
> loaded directly into the machine, the script will currently turn off
> that config entry; however, if that turned-off config entry is a
> dependency for a "y" option, then the config entry set to "y"
> will also be turned off later when the conf executable is called.
> 
> Here is a model of the problem (arrows show dependency):
> 
> Original config file
> Config_1 (m) <-- Config_2 (y)
> 
> Config_1 is not loaded in this example, so it is turned off.
> After scripts/kconfig/streamline_config.pl, but before scripts/kconfig/conf
> Config_1 (n) <-- Config_2 (y)
> 
> After  scripts/kconfig/conf
> Config_1 (n) <-- Config_2 (n)
> 
> 
> It should also be noted that any module in the dependency chain will
> also be turned off, even if that module is loaded directly onto the
> computer. Here is an example:
> 
> Original config file
> Config_1 (m) <-- Config_2 (y) <-- Config_3 (m)
> 
> Config_3 will be loaded in this example.
> After scripts/kconfig/streamline_config.pl, but before scripts/kconfig/conf
> Config_1 (n) <-- Config_2 (y) <-- Config_3 (m)
> 
> After scripts/kconfig/conf
> Config_1 (n) <-- Config_2 (n) <-- Config_3 (n)
> 
> 
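[Editor's note: the two cascades modeled above can be sketched as a tiny two-pass resolver. This is a hypothetical Python simplification of the behavior being described, not the actual streamline_config.pl/conf implementation.]

```python
def streamline(configs, deps, loaded):
    """Toy model of the two passes described in the cover letter.

    configs: {name: "y" | "m" | "n"}
    deps:    {name: name_it_depends_on}
    loaded:  set of module names actually loaded on the machine
    """
    out = dict(configs)
    # Pass 1 (streamline_config.pl today): only "m" entries are processed;
    # modules that are not loaded get turned off, "y" entries are untouched.
    for name, value in configs.items():
        if value == "m" and name not in loaded:
            out[name] = "n"
    # Pass 2 (conf): any entry whose dependency ended up "n" is dropped too,
    # which is what silently disables the "y" options.
    changed = True
    while changed:
        changed = False
        for name, dep in deps.items():
            if out.get(name) != "n" and out.get(dep) == "n":
                out[name] = "n"
                changed = True
    return out

# Config_1 (m) <-- Config_2 (y), with Config_1 not loaded:
result = streamline({"Config_1": "m", "Config_2": "y"},
                    {"Config_2": "Config_1"}, loaded=set())
# After both passes, Config_2 has been dragged down to "n" as well.
```

Running the three-entry example from the letter (with Config_3 loaded) shows the same cascade taking out a loaded module.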
> I discovered this problem when I ran "make localmodconfig" on a generic
> Ubuntu config file. Many hardware devices were not recognized once the
> kernel was installed and booted. Another way to reproduce the error I
> had is to run "make localmodconfig" twice. The standard error might display
> warnings that certain modules should be selected even though no config
> options that select those modules are turned on.
> 
> With the changes in this patch series, all modules are loaded properly
> and all of the hardware is loaded when the kernel is installed and
> booted.
> 
> 
> David Hunter (7):
>    linux-kbuild: fix: config option can be bool
>    linux-kbuild: fix: missing variable operator
>    linux-kbuild: fix: ensure all defaults are tracked
>    linux-kbuild: fix: ensure selected configs were turned on in original
>    linux-kbuild: fix: implement choice for kconfigs
>    linux-kbuild: fix: configs with defaults do not need a prompt
>    linux-kbuild: fix: process config options set to "y"
> 
>   scripts/kconfig/streamline_config.pl | 77 ++++++++++++++++++++++++----
>   1 file changed, 66 insertions(+), 11 deletions(-)
> 



* Re:
  2024-09-17  7:10 Akhil P Oommen
@ 2024-09-17  7:24 ` Dmitry Baryshkov
  0 siblings, 0 replies; 1546+ messages in thread
From: Dmitry Baryshkov @ 2024-09-17  7:24 UTC (permalink / raw)
  To: Akhil P Oommen; +Cc: GPUfirmwareforSA8775Pchipset, linux-firmware, quic_rajeshk

On Tue, 17 Sept 2024 at 09:10, Akhil P Oommen <quic_akhilpo@quicinc.com> wrote:
>
> The following changes since commit 6c88d9b8253b8ec6df701a551a56438ea2e5bacf:
>
>   Merge branch 'amd-staging' into 'main' (2024-09-13 20:28:50 +0000)
>
> are available in the Git repository at:
>
>   https://git.codelinaro.org/clo/linux-kernel/linux-firmware.git gpu-fw-SA8775p
>
> for you to fetch changes up to 43c971bcd74d9793140a8fbbcc805204cb797f96:
>
>   qcom: add gpu firmwares for sa8775p chipset (2024-09-17 11:57:51 +0530)
>
> ----------------------------------------------------------------
> Akhil P Oommen (1):
>       qcom: add gpu firmwares for sa8775p chipset
>
>  WHENCE                    |   2 ++
>  qcom/a663_gmu.bin         | Bin 0 -> 55892 bytes
>  qcom/sa8775p/a663_zap.mbn | Bin 0 -> 1054680 bytes
>  3 files changed, 2 insertions(+)
>  create mode 100644 qcom/a663_gmu.bin
>  create mode 100644 qcom/sa8775p/a663_zap.mbn

Acked-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

Thank you!

-- 
With best wishes
Dmitry


* Re:
@ 2024-10-10 22:44 PRIVATE
  0 siblings, 0 replies; 1546+ messages in thread
From: PRIVATE @ 2024-10-10 22:44 UTC (permalink / raw)
  To: kexec

I have a viable proposal for you, If You want to know more, get back to me.

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


* Re:
  2024-10-15 22:48 Daniel Yang
@ 2024-10-16  1:27 ` Jakub Kicinski
  0 siblings, 0 replies; 1546+ messages in thread
From: Jakub Kicinski @ 2024-10-16  1:27 UTC (permalink / raw)
  To: Daniel Yang
  Cc: Wenjia Zhang, Jan Karcher, D. Wythe, Tony Lu, Wen Gu,
	David S. Miller, Eric Dumazet, Paolo Abeni, linux-s390, netdev,
	linux-kernel

On Tue, 15 Oct 2024 15:48:03 -0700 Daniel Yang wrote:
> Subject: 
> Date: Tue, 15 Oct 2024 15:48:03 -0700
> X-Mailer: git-send-email 2.39.2
> 
> Date: Tue, 15 Oct 2024 15:31:12 -0700
> Subject: [PATCH v3 0/2 RESEND] resolve gtp possible deadlock warning

This is garbled as well. 

Before you repost please make sure you take a look at:
https://www.kernel.org/doc/html/next/process/maintainer-netdev.html#tl-dr
-- 
pw-bot: cr


* Re:
  2024-10-17  9:09 Paulo Miguel Almeida
@ 2024-10-17  9:12 ` Paulo Miguel Almeida
  0 siblings, 0 replies; 1546+ messages in thread
From: Paulo Miguel Almeida @ 2024-10-17  9:12 UTC (permalink / raw)
  To: tsbogend, bvanassche, gregkh, ricardo, zhanggenjian, linux-mips,
	linux-kernel

On Thu, Oct 17, 2024 at 10:09:26PM +1300, Paulo Miguel Almeida wrote:
> linux-hardening@vger.kernel.org
> Bcc: 
> Subject: [PATCH v2][next] mips: sgi-ip22: Replace "s[n]?printf" with
>  sysfs_emit in sysfs callbacks
> Reply-To: 
> 
> Replace open-coded pieces with sysfs_emit() helper in sysfs .show()
> callbacks.
> 
> Signed-off-by: Paulo Miguel Almeida <paulo.miguel.almeida.rodenas@gmail.com>
> ---
> Changelog:
> - v2: amend commit message (Req: Maciej W. Rozycki)
> - v1: https://lore.kernel.org/lkml/Zw2GRQkbx8Z8DlcS@mail.google.com/
> ---
> 

Apologies to you all. Fat finger from my part (and a little of mutt's fault too)

Will submit the patch shortly

- Paulo A.


* Re:
  2024-11-23  1:39 the Hide
@ 2024-11-23  7:32 ` Christoph Biedl
  0 siblings, 0 replies; 1546+ messages in thread
From: Christoph Biedl @ 2024-11-23  7:32 UTC (permalink / raw)
  To: the Hide; +Cc: stable

[-- Attachment #1: Type: text/plain, Size: 828 bytes --]

the Hide wrote...

> Who should I contact regarding the following error
> 
> 
> E: Malformed entry 5 in list file
> /etc/apt/sources.list.d/additional-repositories.list (Component)
> E: The list of sources could not be read.
> E: _cache->open() failed, please report.


Assuming you're using Debian and not some derivative: a Debian users
mailing list, such as <https://lists.debian.org/debian-user/>

From the above error message I assume there's a format error in
/etc/apt/sources.list.d/additional-repositories.list - so it would be wise to
include the content of that file in a message to that list.

If it's actually a bug in apt, the Debian bug tracker would be the place to
go. This list, however, is about development of the stable releases of the
Linux kernel, so it is not quite the right place.

    Christoph

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* Re:
@ 2024-11-25 19:23 Robert Harewood
  0 siblings, 0 replies; 1546+ messages in thread
From: Robert Harewood @ 2024-11-25 19:23 UTC (permalink / raw)
  To: v9fs

Dear CEO,

I hope this message finds you safe and well.

I have a team of investors who are interested in providing capital to your
business operations and projects without any upfront costs from you. I'd
love to have the chance to discuss the details with you further.

Please let me know if this is something that you would be interested in, and
we can schedule a call to further evaluate the details.

Thank you for your time and consideration.

Warm regards,

Robert Harewood,
Robert Harewood Advisory
The Broadgate Tower
20 Primrose St,
London, United Kingdom, EC2A 2EW.


* Re:
@ 2024-11-25 20:13 Robert Harewood
  0 siblings, 0 replies; 1546+ messages in thread
From: Robert Harewood @ 2024-11-25 20:13 UTC (permalink / raw)
  To: llvm

Dear CEO,

I hope this message finds you safe and well.

I have a team of investors who are interested in providing capital to your
business operations and projects without any upfront costs from you. I'd
love to have the chance to discuss the details with you further.

Please let me know if this is something that you would be interested in, and
we can schedule a call to further evaluate the details.

Thank you for your time and consideration.

Warm regards,

Robert Harewood,
Robert Harewood Advisory
The Broadgate Tower
20 Primrose St,
London, United Kingdom, EC2A 2EW.


* Re:
  2025-01-08 13:59 Jiang Liu
@ 2025-01-08 14:10 ` Christian König
  2025-01-08 16:33 ` Re: Mario Limonciello
  1 sibling, 0 replies; 1546+ messages in thread
From: Christian König @ 2025-01-08 14:10 UTC (permalink / raw)
  To: Jiang Liu, alexander.deucher, Xinhui.Pan, airlied, simona,
	sunil.khatri, lijo.lazar, Hawking.Zhang, mario.limonciello,
	Jun.Ma2, xiaogang.chen, Kent.Russell, shuox.liu, amd-gfx

Am 08.01.25 um 14:59 schrieb Jiang Liu:
> Subject: [RFC PATCH 00/13] Enhance device state machine to better support suspend/resume
>
> Recently we were testing suspend/resume functionality with AMD GPUs,
> and we encountered several resource tracking related bugs, such as
> double buffer free, use after free and unbalanced irq reference count.
>
> We have tried to solve these issues case by case, but found that may
> not be the right way. Especially about the unbalanced irq reference
> count, new issues appear once we fix the currently known
> issues. After analyzing the related source code, we found that there may be
> some fundamental implementaion flaws behind these resource tracking
> issues.

In general please run your patches through checkpatch.pl. There are 
quite a number of style issues with those code changes.

>
> The amdgpu driver has two major state machines to drive the device
> management flow, one is for ip blocks, the other is for ras blocks.
> The hook points defined in struct amd_ip_funcs for device setup/teardown
> are symmetric, but the implementation is asymmetric, sometimes even
> ambiguous. The most obvious two issues we noticed are:
> 1) amdgpu_irq_get() are called from .late_init() but amdgpu_irq_put()
>     are called from .hw_fini() instead of .early_fini().

Yes and if I remember correctly that is absolutely intentional.

IRQs can't be enabled unless all IP blocks are up and running because 
otherwise the IRQ handler sometimes doesn't have the necessary 
functionality at hand.

But for HW fini we only disable IRQs right before we actually tear down the HW
state, because we need them for operation feedback, e.g. ring
buffer completion interrupts for tear-down commands.

Regards,
Christian.
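
[Editor's note: the balancing rule Christian describes — amdgpu_irq_get() from .late_init(), amdgpu_irq_put() from .hw_fini() — comes down to the IRQ source refcount returning to zero across the block lifetime even though the hooks are asymmetric. A toy refcount model, hypothetical and not the driver's code:]

```python
# Toy model of an IRQ source whose get/put calls must balance across the
# late_init -> ... -> hw_fini lifetime of an IP block.
class IrqSource:
    def __init__(self):
        self.refcount = 0
        self.enabled = False

    def get(self):
        # Enable on the first reference.
        self.refcount += 1
        self.enabled = True

    def put(self):
        # An extra put is exactly the "unbalanced irq reference count"
        # class of bug discussed in this thread.
        assert self.refcount > 0, "unbalanced irq_put()"
        self.refcount -= 1
        if self.refcount == 0:
            self.enabled = False

irq = IrqSource()
irq.get()   # .late_init(): all IP blocks are up, safe to enable the IRQ
# ... device runs; suspend path begins ...
irq.put()   # .hw_fini(): disable before the HW state is torn down
assert irq.refcount == 0 and not irq.enabled
```

The point of the asymmetry is visible here: the get must wait until every block can service the interrupt, while the put must come before the HW state the handler relies on disappears.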

> 2) the way to reset ip_block.status.valid/sw/hw/late_initialized doesn't
>     match the way to set those flags.
>
> When taking device suspend/resume into account, in addition to device
> probe/remove, things get much more complex. Some issues arise because
> many suspend/resume implementations directly reuse .hw_init/.hw_fini/
> .late_init hook points.
>
> So we try to fix those issues by two enhancements/refinements to current
> device management state machines.
>
> The first change is to make the ip block state machine and associated
> status flags work in stack-like way as below:
> Callback        Status Flags
> early_init:     valid = true
> sw_init:        sw = true
> hw_init:        hw = true
> late_init:      late_initialized = true
> early_fini:     late_initialized = false
> hw_fini:        hw = false
> sw_fini:        sw = false
> late_fini:      valid = false
>
> Also do the same thing for ras block state machine, though it's much
> more simpler.
>
> The second change is fine tune the overall device management work
> flow as below:
> 1. amdgpu_driver_load_kms()
> 	amdgpu_device_init()
> 		amdgpu_device_ip_early_init()
> 			ip_blocks[i].early_init()
> 			ip_blocks[i].status.valid = true
> 		amdgpu_device_ip_init()
> 			amdgpu_ras_init()
> 			ip_blocks[i].sw_init()
> 			ip_blocks[i].status.sw = true
> 			ip_blocks[i].hw_init()
> 			ip_blocks[i].status.hw = true
> 		amdgpu_device_ip_late_init()
> 			ip_blocks[i].late_init()
> 			ip_blocks[i].status.late_initialized = true
> 			amdgpu_ras_late_init()
> 				ras_blocks[i].ras_late_init()
> 					amdgpu_ras_feature_enable_on_boot()
>
> 2. amdgpu_pmops_suspend()/amdgpu_pmops_freeze()/amdgpu_pmops_poweroff()
> 	amdgpu_device_suspend()
> 		amdgpu_ras_early_fini()
> 			ras_blocks[i].ras_early_fini()
> 				amdgpu_ras_feature_disable()
> 		amdgpu_ras_suspend()
> 			amdgpu_ras_disable_all_features()
> +++		ip_blocks[i].early_fini()
> +++		ip_blocks[i].status.late_initialized = false
> 		ip_blocks[i].suspend()
>
> 3. amdgpu_pmops_resume()/amdgpu_pmops_thaw()/amdgpu_pmops_restore()
> 	amdgpu_device_resume()
> 		amdgpu_device_ip_resume()
> 			ip_blocks[i].resume()
> 		amdgpu_device_ip_late_init()
> 			ip_blocks[i].late_init()
> 			ip_blocks[i].status.late_initialized = true
> 			amdgpu_ras_late_init()
> 				ras_blocks[i].ras_late_init()
> 					amdgpu_ras_feature_enable_on_boot()
> 		amdgpu_ras_resume()
> 			amdgpu_ras_enable_all_features()
>
> 4. amdgpu_driver_unload_kms()
> 	amdgpu_device_fini_hw()
> 		amdgpu_ras_early_fini()
> 			ras_blocks[i].ras_early_fini()
> +++		ip_blocks[i].early_fini()
> +++		ip_blocks[i].status.late_initialized = false
> 		ip_blocks[i].hw_fini()
> 		ip_blocks[i].status.hw = false
>
> 5. amdgpu_driver_release_kms()
> 	amdgpu_device_fini_sw()
> 		amdgpu_device_ip_fini()
> 			ip_blocks[i].sw_fini()
> 			ip_blocks[i].status.sw = false
> ---			ip_blocks[i].status.valid = false
> +++			amdgpu_ras_fini()
> 			ip_blocks[i].late_fini()
> +++			ip_blocks[i].status.valid = false
> ---			ip_blocks[i].status.late_initialized = false
> ---			amdgpu_ras_fini()
>
> The main changes include:
> 1) invoke ip_blocks[i].early_fini in amdgpu_pmops_suspend().
>     Currently there's only one ip block which provides `early_fini`
>     callback. We have added a check of `in_s3` to keep current behavior in
>     function amdgpu_dm_early_fini(). So there should be no functional
>     changes.
> 2) set ip_blocks[i].status.late_initialized to false after calling
>     callback `early_fini`. We have audited all usages of the
>     late_initialized flag and found no functional changes.
> 3) only set ip_blocks[i].status.valid = false after calling the
>     `late_fini` callback.
> 4) call amdgpu_ras_fini() before invoking ip_blocks[i].late_fini.
>
> Then we try to refine each subsystem, such as nbio, asic, gfx, gmc,
> ras etc, to follow the new design. Currently we have only taken the
> nbio and asic as examples to show the proposed changes. Once we have
> confirmed that's the right way to go, we will handle the remaining
> subsystems.
>
> This is at an early stage and we are requesting comments; any comments and
> suggestions are welcome!
> Jiang Liu (13):
>    amdgpu: wrong array index to get ip block for PSP
>    drm/admgpu: add helper functions to track status for ras manager
>    drm/amdgpu: add a flag to track ras debugfs creation status
>    drm/amdgpu: free all resources on error recovery path of
>      amdgpu_ras_init()
>    drm/amdgpu: introduce a flag to track refcount held for features
>    drm/amdgpu: enhance amdgpu_ras_block_late_fini()
>    drm/amdgpu: enhance amdgpu_ras_pre_fini() to better support SR
>    drm/admgpu: rename amdgpu_ras_pre_fini() to amdgpu_ras_early_fini()
>    drm/amdgpu: make IP block state machine works in stack like way
>    drm/admgpu: make device state machine work in stack like way
>    drm/amdgpu/sdma: improve the way to manage irq reference count
>    drm/amdgpu/nbio: improve the way to manage irq reference count
>    drm/amdgpu/asic: make ip block operations symmetric by .early_fini()
>
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  40 +++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  37 ++++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c       |   2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c      |   2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c      |  16 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h      |   1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       |   8 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       | 144 +++++++++++++-----
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h       |  16 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c      |  26 +++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h      |   2 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c       |   2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       |   2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c      |   2 +-
>   drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c       |   2 +-
>   drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c       |   2 +-
>   drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c        |   1 +
>   drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c        |   1 +
>   drivers/gpu/drm/amd/amdgpu/nv.c               |  14 +-
>   drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c        |   8 -
>   drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c      |  23 +--
>   drivers/gpu/drm/amd/amdgpu/soc15.c            |  38 ++---
>   drivers/gpu/drm/amd/amdgpu/soc21.c            |  35 +++--
>   drivers/gpu/drm/amd/amdgpu/soc24.c            |  17 ++-
>   .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |   3 +
>   25 files changed, 326 insertions(+), 118 deletions(-)
>



* Re:
  2025-01-08 13:59 Jiang Liu
  2025-01-08 14:10 ` Christian König
@ 2025-01-08 16:33 ` Mario Limonciello
  2025-01-09  5:34   ` Re: Gerry Liu
  1 sibling, 1 reply; 1546+ messages in thread
From: Mario Limonciello @ 2025-01-08 16:33 UTC (permalink / raw)
  To: Jiang Liu, alexander.deucher, christian.koenig, Xinhui.Pan,
	airlied, simona, sunil.khatri, lijo.lazar, Hawking.Zhang, Jun.Ma2,
	xiaogang.chen, Kent.Russell, shuox.liu, amd-gfx

On 1/8/2025 07:59, Jiang Liu wrote:
> Subject: [RFC PATCH 00/13] Enhance device state machine to better support suspend/resume

I'm not sure how this happened, but your subject didn't end up in the 
subject of the thread on patch 0 so the thread just looks like an 
unsubjected thread.

> 
> Recently we were testing suspend/resume functionality with AMD GPUs,
> and we encountered several resource tracking related bugs, such as
> double buffer free, use after free and unbalanced irq reference count.

Can you share more about how you were hitting these issues?  Are they 
specific to S3 or to s2idle flows?  dGPU or APU?
Are they only with SRIOV?

Is there anything to do with the host influencing the failures to 
happen, or are you contriving the failures to find the bugs?

I know we've had some reports about resource tracking warnings on the 
reset flows, but I haven't heard much about suspend/resume.

> 
> We have tried to solve these issues case by case, but found that may
> not be the right way. Especially about the unbalanced irq reference
> count, new issues appear once we fix the currently known
> issues. After analyzing the related source code, we found that there may be
> some fundamental implementaion flaws behind these resource tracking

implementation

> issues.
> 
> The amdgpu driver has two major state machines to drive the device
> management flow, one is for ip blocks, the other is for ras blocks.
> The hook points defined in struct amd_ip_funcs for device setup/teardown
> are symmetric, but the implementation is asymmetric, sometimes even
> ambiguous. The most obvious two issues we noticed are:
> 1) amdgpu_irq_get() are called from .late_init() but amdgpu_irq_put()
>     are called from .hw_fini() instead of .early_fini().
> 2) the way to reset ip_block.status.valid/sw/hw/late_initialized doesn't
>     match the way to set those flags.
> 
> When taking device suspend/resume into account, in addition to device
> probe/remove, things get much more complex. Some issues arise because
> many suspend/resume implementations directly reuse .hw_init/.hw_fini/
> .late_init hook points.
>
> So we try to fix those issues by two enhancements/refinements to current
> device management state machines.
> 
> The first change is to make the ip block state machine and associated
> status flags work in stack-like way as below:
> Callback        Status Flags
> early_init:     valid = true
> sw_init:        sw = true
> hw_init:        hw = true
> late_init:      late_initialized = true
> early_fini:     late_initialized = false
> hw_fini:        hw = false
> sw_fini:        sw = false
> late_fini:      valid = false

At a high level this makes sense to me, but I'd just call 'late' or 
'late_init'.

Another idea if you make it stack like is to do it as a true enum for 
the state machine and store it all in one variable.

> 
> Also do the same thing for ras block state machine, though it's much
> more simpler.
> 
> The second change is fine tune the overall device management work
> flow as below:
> 1. amdgpu_driver_load_kms()
> 	amdgpu_device_init()
> 		amdgpu_device_ip_early_init()
> 			ip_blocks[i].early_init()
> 			ip_blocks[i].status.valid = true
> 		amdgpu_device_ip_init()
> 			amdgpu_ras_init()
> 			ip_blocks[i].sw_init()
> 			ip_blocks[i].status.sw = true
> 			ip_blocks[i].hw_init()
> 			ip_blocks[i].status.hw = true
> 		amdgpu_device_ip_late_init()
> 			ip_blocks[i].late_init()
> 			ip_blocks[i].status.late_initialized = true
> 			amdgpu_ras_late_init()
> 				ras_blocks[i].ras_late_init()
> 					amdgpu_ras_feature_enable_on_boot()
> 
> 2. amdgpu_pmops_suspend()/amdgpu_pmops_freeze()/amdgpu_pmops_poweroff()
> 	amdgpu_device_suspend()
> 		amdgpu_ras_early_fini()
> 			ras_blocks[i].ras_early_fini()
> 				amdgpu_ras_feature_disable()
> 		amdgpu_ras_suspend()
> 			amdgpu_ras_disable_all_features()
> +++		ip_blocks[i].early_fini()
> +++		ip_blocks[i].status.late_initialized = false
> 		ip_blocks[i].suspend()
> 
> 3. amdgpu_pmops_resume()/amdgpu_pmops_thaw()/amdgpu_pmops_restore()
> 	amdgpu_device_resume()
> 		amdgpu_device_ip_resume()
> 			ip_blocks[i].resume()
> 		amdgpu_device_ip_late_init()
> 			ip_blocks[i].late_init()
> 			ip_blocks[i].status.late_initialized = true
> 			amdgpu_ras_late_init()
> 				ras_blocks[i].ras_late_init()
> 					amdgpu_ras_feature_enable_on_boot()
> 		amdgpu_ras_resume()
> 			amdgpu_ras_enable_all_features()
> 
> 4. amdgpu_driver_unload_kms()
> 	amdgpu_device_fini_hw()
> 		amdgpu_ras_early_fini()
> 			ras_blocks[i].ras_early_fini()
> +++		ip_blocks[i].early_fini()
> +++		ip_blocks[i].status.late_initialized = false
> 		ip_blocks[i].hw_fini()
> 		ip_blocks[i].status.hw = false
> 
> 5. amdgpu_driver_release_kms()
> 	amdgpu_device_fini_sw()
> 		amdgpu_device_ip_fini()
> 			ip_blocks[i].sw_fini()
> 			ip_blocks[i].status.sw = false
> ---			ip_blocks[i].status.valid = false
> +++			amdgpu_ras_fini()
> 			ip_blocks[i].late_fini()
> +++			ip_blocks[i].status.valid = false
> ---			ip_blocks[i].status.late_initialized = false
> ---			amdgpu_ras_fini()
> 
> The main changes include:
> 1) invoke ip_blocks[i].early_fini in amdgpu_pmops_suspend().
>     Currently there's only one ip block which provides `early_fini`
>     callback. We have added a check of `in_s3` to keep current behavior in
>     function amdgpu_dm_early_fini(). So there should be no functional
>     changes.
> 2) set ip_blocks[i].status.late_initialized to false after calling
>     callback `early_fini`. We have audited all usages of the
>     late_initialized flag and found no functional changes.
> 3) only set ip_blocks[i].status.valid = false after calling the
>     `late_fini` callback.
> 4) call amdgpu_ras_fini() before invoking ip_blocks[i].late_fini.
> 
> Then we try to refine each subsystem, such as nbio, asic, gfx, gmc,
> ras etc, to follow the new design. Currently we have only taken the
> nbio and asic as examples to show the proposed changes. Once we have
> confirmed that's the right way to go, we will handle the remaining
> subsystems.
> 
> This is at an early stage and we are requesting comments; any comments and
> suggestions are welcome!
> Jiang Liu (13):
>    amdgpu: wrong array index to get ip block for PSP
>    drm/admgpu: add helper functions to track status for ras manager
>    drm/amdgpu: add a flag to track ras debugfs creation status
>    drm/amdgpu: free all resources on error recovery path of
>      amdgpu_ras_init()
>    drm/amdgpu: introduce a flag to track refcount held for features
>    drm/amdgpu: enhance amdgpu_ras_block_late_fini()
>    drm/amdgpu: enhance amdgpu_ras_pre_fini() to better support SR
>    drm/admgpu: rename amdgpu_ras_pre_fini() to amdgpu_ras_early_fini()
>    drm/amdgpu: make IP block state machine works in stack like way
>    drm/admgpu: make device state machine work in stack like way
>    drm/amdgpu/sdma: improve the way to manage irq reference count
>    drm/amdgpu/nbio: improve the way to manage irq reference count
>    drm/amdgpu/asic: make ip block operations symmetric by .early_fini()
> 
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  40 +++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  37 ++++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c       |   2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c      |   2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c      |  16 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h      |   1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       |   8 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       | 144 +++++++++++++-----
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h       |  16 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c      |  26 +++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h      |   2 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c       |   2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       |   2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c      |   2 +-
>   drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c       |   2 +-
>   drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c       |   2 +-
>   drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c        |   1 +
>   drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c        |   1 +
>   drivers/gpu/drm/amd/amdgpu/nv.c               |  14 +-
>   drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c        |   8 -
>   drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c      |  23 +--
>   drivers/gpu/drm/amd/amdgpu/soc15.c            |  38 ++---
>   drivers/gpu/drm/amd/amdgpu/soc21.c            |  35 +++--
>   drivers/gpu/drm/amd/amdgpu/soc24.c            |  17 ++-
>   .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |   3 +
>   25 files changed, 326 insertions(+), 118 deletions(-)
> 



* Re:
  2025-01-08 16:33 ` Re: Mario Limonciello
@ 2025-01-09  5:34   ` Gerry Liu
  2025-01-09 17:10     ` Re: Mario Limonciello
  0 siblings, 1 reply; 1546+ messages in thread
From: Gerry Liu @ 2025-01-09  5:34 UTC (permalink / raw)
  To: Mario Limonciello
  Cc: alexander.deucher, christian.koenig, Xinhui.Pan, airlied, simona,
	sunil.khatri, Lazar, Lijo, Hawking.Zhang, Chen, Xiaogang,
	Kent.Russell, Shuo Liu, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 10505 bytes --]



> On Jan 9, 2025, at 00:33, Mario Limonciello <mario.limonciello@amd.com> wrote:
> 
> On 1/8/2025 07:59, Jiang Liu wrote:
>> Subject: [RFC PATCH 00/13] Enhance device state machine to better support suspend/resume
> 
> I'm not sure how this happened, but your subject didn't end up in the subject of the thread on patch 0 so the thread just looks like an unsubjected thread.
Maybe it’s caused by an extra blank line in the header.

> 
>> Recently we were testing suspend/resume functionality with AMD GPUs,
>> and we encountered several resource tracking related bugs, such as
>> double buffer free, use after free and unbalanced irq reference count.
> 
> Can you share more about how you were hitting these issues?  Are they specific to S3 or to s2idle flows?  dGPU or APU?
> Are they only with SRIOV?
> 
> Is there anything to do with the host influencing the failures to happen, or are you contriving the failures to find the bugs?
> 
> I know we've had some reports about resource tracking warnings on the reset flows, but I haven't heard much about suspend/resume.
We are investigating how to develop some advanced product features based on amdgpu suspend/resume.
So we started by testing the suspend/resume functionality of AMD 308x GPUs with the following simple script:
```
echo platform > /sys/power/pm_test
i=0
while true; do
        echo mem > /sys/power/state
        let i=i+1
        echo $i
        sleep 1
done
```

It succeeds on the first and second iterations but always fails on the following iterations on a bare-metal server with eight MI308X GPUs.
With some investigation we found that the gpu asic should be reset during the test, so we submitted a patch to fix the failure (https://github.com/ROCm/ROCK-Kernel-Driver/pull/181)

While analyzing and root-causing the failure, we encountered several crashes, resource leaks and false alarms.
So I have worked out patch sets to solve the issues we encountered. The other patch set is https://lists.freedesktop.org/archives/amd-gfx/2025-January/118484.html

With sriov in single VF mode, resume always fails. It seems some contexts/vram buffers get lost during suspend and haven’t been restored on resume, which causes the failure.
We haven’t tested sriov in multiple VFs mode yet. We need more help from the AMD side to make SR work for SRIOV :)

> 
>> We have tried to solve these issues case by case, but found that may
>> not be the right way. Especially about the unbalanced irq reference
>> count, new issues appear once we fix the currently known
>> issues. After analyzing the related source code, we found that there may be
>> some fundamental implementaion flaws behind these resource tracking
> 
> implementation
> 
>> issues.
>> The amdgpu driver has two major state machines to drive the device
>> management flow, one is for ip blocks, the other is for ras blocks.
>> The hook points defined in struct amd_ip_funcs for device setup/teardown
>> are symmetric, but the implementation is asymmetric, sometimes even
>> ambiguous. The most obvious two issues we noticed are:
>> 1) amdgpu_irq_get() are called from .late_init() but amdgpu_irq_put()
>>    are called from .hw_fini() instead of .early_fini().
>> 2) the way to reset ip_block.status.valid/sw/hw/late_initialized doesn't
>>    match the way to set those flags.
>> When taking device suspend/resume into account, in addition to device
>> probe/remove, things get much more complex. Some issues arise because
>> many suspend/resume implementations directly reuse .hw_init/.hw_fini/
>> .late_init hook points.
>> 
>> So we try to fix those issues by two enhancements/refinements to current
>> device management state machines.
>> The first change is to make the ip block state machine and associated
>> status flags work in stack-like way as below:
>> Callback        Status Flags
>> early_init:     valid = true
>> sw_init:        sw = true
>> hw_init:        hw = true
>> late_init:      late_initialized = true
>> early_fini:     late_initialized = false
>> hw_fini:        hw = false
>> sw_fini:        sw = false
>> late_fini:      valid = false
> 
> At a high level this makes sense to me, but I'd just call 'late' or 'late_init'.
> 
> Another idea if you make it stack like is to do it as a true enum for the state machine and store it all in one variable.
I will add a patch to convert those bool flags into an enum.
Thanks,
Gerry

> 
>> Also do the same thing for ras block state machine, though it's much
>> more simpler.
>> The second change is fine tune the overall device management work
>> flow as below:
>> 1. amdgpu_driver_load_kms()
>> 	amdgpu_device_init()
>> 		amdgpu_device_ip_early_init()
>> 			ip_blocks[i].early_init()
>> 			ip_blocks[i].status.valid = true
>> 		amdgpu_device_ip_init()
>> 			amdgpu_ras_init()
>> 			ip_blocks[i].sw_init()
>> 			ip_blocks[i].status.sw = true
>> 			ip_blocks[i].hw_init()
>> 			ip_blocks[i].status.hw = true
>> 		amdgpu_device_ip_late_init()
>> 			ip_blocks[i].late_init()
>> 			ip_blocks[i].status.late_initialized = true
>> 			amdgpu_ras_late_init()
>> 				ras_blocks[i].ras_late_init()
>> 					amdgpu_ras_feature_enable_on_boot()
>> 2. amdgpu_pmops_suspend()/amdgpu_pmops_freeze()/amdgpu_pmops_poweroff()
>> 	amdgpu_device_suspend()
>> 		amdgpu_ras_early_fini()
>> 			ras_blocks[i].ras_early_fini()
>> 				amdgpu_ras_feature_disable()
>> 		amdgpu_ras_suspend()
>> 			amdgpu_ras_disable_all_features()
>> +++		ip_blocks[i].early_fini()
>> +++		ip_blocks[i].status.late_initialized = false
>> 		ip_blocks[i].suspend()
>> 3. amdgpu_pmops_resume()/amdgpu_pmops_thaw()/amdgpu_pmops_restore()
>> 	amdgpu_device_resume()
>> 		amdgpu_device_ip_resume()
>> 			ip_blocks[i].resume()
>> 		amdgpu_device_ip_late_init()
>> 			ip_blocks[i].late_init()
>> 			ip_blocks[i].status.late_initialized = true
>> 			amdgpu_ras_late_init()
>> 				ras_blocks[i].ras_late_init()
>> 					amdgpu_ras_feature_enable_on_boot()
>> 		amdgpu_ras_resume()
>> 			amdgpu_ras_enable_all_features()
>> 4. amdgpu_driver_unload_kms()
>> 	amdgpu_device_fini_hw()
>> 		amdgpu_ras_early_fini()
>> 			ras_blocks[i].ras_early_fini()
>> +++		ip_blocks[i].early_fini()
>> +++		ip_blocks[i].status.late_initialized = false
>> 		ip_blocks[i].hw_fini()
>> 		ip_blocks[i].status.hw = false
>> 5. amdgpu_driver_release_kms()
>> 	amdgpu_device_fini_sw()
>> 		amdgpu_device_ip_fini()
>> 			ip_blocks[i].sw_fini()
>> 			ip_blocks[i].status.sw = false
>> ---			ip_blocks[i].status.valid = false
>> +++			amdgpu_ras_fini()
>> 			ip_blocks[i].late_fini()
>> +++			ip_blocks[i].status.valid = false
>> ---			ip_blocks[i].status.late_initialized = false
>> ---			amdgpu_ras_fini()
>> The main changes include:
>> 1) invoke ip_blocks[i].early_fini in amdgpu_pmops_suspend().
>>    Currently there's only one ip block which provides `early_fini`
>>    callback. We have add a check of `in_s3` to keep current behavior in
>>    function amdgpu_dm_early_fini(). So there should be no functional
>>    changes.
>> 2) set ip_blocks[i].status.late_initialized to false after calling
>>    callback `early_fini`. We have auditted all usages of the
>>    late_initialized flag and no functional changes found.
>> 3) only set ip_blocks[i].status.valid = false after calling the
>>    `late_fini` callback.
>> 4) call amdgpu_ras_fini() before invoking ip_blocks[i].late_fini.
>> Then we try to refine each subsystem, such as nbio, asic, gfx, gmc,
>> ras etc, to follow the new design. Currently we have only taken the
>> nbio and asic as examples to show the proposed changes. Once we have
>> confirmed that's the right way to go, we will handle the lefting
>> subsystems.
>> This is in early stage and requesting for comments, any comments and
>> suggestions are welcomed!
>> Jiang Liu (13):
>>   amdgpu: wrong array index to get ip block for PSP
>>   drm/admgpu: add helper functions to track status for ras manager
>>   drm/amdgpu: add a flag to track ras debugfs creation status
>>   drm/amdgpu: free all resources on error recovery path of
>>     amdgpu_ras_init()
>>   drm/amdgpu: introduce a flag to track refcount held for features
>>   drm/amdgpu: enhance amdgpu_ras_block_late_fini()
>>   drm/amdgpu: enhance amdgpu_ras_pre_fini() to better support SR
>>   drm/admgpu: rename amdgpu_ras_pre_fini() to amdgpu_ras_early_fini()
>>   drm/amdgpu: make IP block state machine works in stack like way
>>   drm/admgpu: make device state machine work in stack like way
>>   drm/amdgpu/sdma: improve the way to manage irq reference count
>>   drm/amdgpu/nbio: improve the way to manage irq reference count
>>   drm/amdgpu/asic: make ip block operations symmetric by .early_fini()
>>  drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  40 +++++
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  37 ++++-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c       |   2 +-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c      |   2 +-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c      |  16 +-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h      |   1 +
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       |   8 +-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       | 144 +++++++++++++-----
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h       |  16 +-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c      |  26 +++-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h      |   2 +
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c       |   2 +-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       |   2 +-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c      |   2 +-
>>  drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c       |   2 +-
>>  drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c       |   2 +-
>>  drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c        |   1 +
>>  drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c        |   1 +
>>  drivers/gpu/drm/amd/amdgpu/nv.c               |  14 +-
>>  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c        |   8 -
>>  drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c      |  23 +--
>>  drivers/gpu/drm/amd/amdgpu/soc15.c            |  38 ++---
>>  drivers/gpu/drm/amd/amdgpu/soc21.c            |  35 +++--
>>  drivers/gpu/drm/amd/amdgpu/soc24.c            |  17 ++-
>>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |   3 +
>>  25 files changed, 326 insertions(+), 118 deletions(-)


[-- Attachment #2: Type: text/html, Size: 38668 bytes --]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-01-09  5:34   ` Re: Gerry Liu
@ 2025-01-09 17:10     ` Mario Limonciello
  2025-01-13  1:19       ` Re: Gerry Liu
  0 siblings, 1 reply; 1546+ messages in thread
From: Mario Limonciello @ 2025-01-09 17:10 UTC (permalink / raw)
  To: Gerry Liu
  Cc: alexander.deucher, christian.koenig, Xinhui.Pan, airlied, simona,
	sunil.khatri, Lazar, Lijo, Hawking.Zhang, Chen, Xiaogang,
	Kent.Russell, Shuo Liu, amd-gfx

General note - don't use HTML for mailing list communication.

I'm not sure if Apple Mail lets you switch this around.

If not, you might try using Thunderbird instead.  You can pick to reply 
in plain text or HTML by holding shift when you hit "reply all"

For my reply I'll convert my reply to plain text, please see inline below.

On 1/8/2025 23:34, Gerry Liu wrote:
> 
> 
>> 2025年1月9日 00:33,Mario Limonciello <mario.limonciello@amd.com 
>> <mailto:mario.limonciello@amd.com>> 写道:
>>
>> On 1/8/2025 07:59, Jiang Liu wrote:
>>> Subject: [RFC PATCH 00/13] Enhance device state machine to better 
>>> support suspend/resume
>>
>> I'm not sure how this happened, but your subject didn't end up in the 
>> subject of the thread on patch 0 so the thread just looks like an 
>> unsubjected thread.
> Maybe it’s caused by one extra blank line at the header.

Yeah that might be it.  Hopefully it doesn't happen on v2.

> 
>>
>>> Recently we were testing suspend/resume functionality with AMD GPUs,
>>> we have encountered several resource tracking related bugs, such as
>>> double buffer free, use after free and unbalanced irq reference count.
>>
>> Can you share more about how you were hitting these issues?  Are they 
>> specific to S3 or to s2idle flows?  dGPU or APU?
>> Are they only with SRIOV?
>>
>> Is there anything to do with the host influencing the failures to 
>> happen, or are you contriving the failures to find the bugs?
>>
>> I know we've had some reports about resource tracking warnings on the 
>> reset flows, but I haven't heard much about suspend/resume.
> We are investigating developing some advanced product features based on 
> amdgpu suspend/resume.
> So we started by testing the suspend/resume functionality of AMD 308x 
> GPUs with the following simple script:
> ```
> echo platform > /sys/power/pm_test
> i=0
> while true; do
>     echo mem > /sys/power/state
>     let i=i+1
>     echo $i
>     sleep 1
> done
> ```
> 
> It succeeds on the first and second iterations but always fails on 
> following iterations on a bare-metal server with eight MI308X GPUs.

Can you share more about this server?  Does it support suspend to ram or 
a hardware backed suspend to idle?  If you don't know, you can check 
like this:

❯ cat /sys/power/mem_sleep
s2idle [deep]

If it's suspend to idle, what does the FACP indicate?  You can do this 
check to find out if you don't know.

❯ sudo cp /sys/firmware/acpi/tables/FACP /tmp
❯ sudo iasl -d /tmp/FACP
❯ grep "idle" -i /tmp/FACP.dsl
                       Low Power S0 Idle (V5) : 0


> With some investigation we found that the gpu asic should be reset 
> during the test, 

Yeah; but this comes back to my above questions.  Typically there is an 
assumption that the power rails are going to be cut in system suspend.

If that doesn't hold true, then you're doing a pure software suspend and 
have found a series of issues in the driver with how that's handled.

> so we submitted a patch to fix the failure (https://github.com/ROCm/ROCK-Kernel-Driver/pull/181)

Typically kernel patches don't go through that repo, they're discussed 
on the mailing lists. Can you bring this patch for discussion on amd-gfx?

> 
> While analyzing and root-causing the failure, we encountered several 
> crashes, resource leakages and false alarms.

Yeah; I think you found some real issues.

> So I have worked out patch sets to solve issues we encountered. The 
> other patch set is https://lists.freedesktop.org/archives/amd-gfx/2025-January/118484.html

Thanks!

> 
> With SR-IOV in single-VF mode, resume always fails. Seems some contexts/
> vram buffers get lost during suspend and haven’t been restored on resume, 
> which causes the failure.
> We haven’t tested SR-IOV in multiple-VF mode yet. We need more help from 
> the AMD side to make suspend/resume work for SR-IOV :)
> 
>>
>>> We have tried to solve these issues case by case, but found that may
>>> not be the right way. Especially about the unbalanced irq reference
>>> count, there will be new issues appear once we fixed the current known
>>> issues. After analyzing related source code, we found that there may be
>>> some fundamental implementaion flaws behind these resource tracking
>>
>> implementation
>>
>>> issues.
>>> The amdgpu driver has two major state machines to driver the device
>>> management flow, one is for ip blocks, the other is for ras blocks.
>>> The hook points defined in struct amd_ip_funcs for device setup/teardown
>>> are symmetric, but the implementation is asymmetric, sometime even
>>> ambiguous. The most obvious two issues we noticed are:
>>> 1) amdgpu_irq_get() are called from .late_init() but amdgpu_irq_put()
>>>    are called from .hw_fini() instead of .early_fini().
>>> 2) the way to reset ip_bloc.status.valid/sw/hw/late_initialized doesn't
>>>    match the way to set those flags.
>>> When taking device suspend/resume into account, in addition to device
>>> probe/remove, things get much more complex. Some issues arise because
>>> many suspend/resume implementations directly reuse .hw_init/.hw_fini/
>>> .late_init hook points.
>>>
>>> So we try to fix those issues by two enhancements/refinements to current
>>> device management state machines.
>>> The first change is to make the ip block state machine and associated
>>> status flags work in stack-like way as below:
>>> Callback        Status Flags
>>> early_init:     valid = true
>>> sw_init:        sw = true
>>> hw_init:        hw = true
>>> late_init:      late_initialized = true
>>> early_fini:     late_initialized = false
>>> hw_fini:        hw = false
>>> sw_fini:        sw = false
>>> late_fini:      valid = false
>>
>> At a high level this makes sense to me, but I'd just call 'late' or 
>> 'late_init'.
>>
>> Another idea if you make it stack like is to do it as a true enum for 
>> the state machine and store it all in one variable.
> I will add a patch to convert those bool flags into an enum.

Thanks!


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-01-09 17:10     ` Re: Mario Limonciello
@ 2025-01-13  1:19       ` Gerry Liu
  2025-01-13 21:59         ` Re: Mario Limonciello
  0 siblings, 1 reply; 1546+ messages in thread
From: Gerry Liu @ 2025-01-13  1:19 UTC (permalink / raw)
  To: Mario Limonciello
  Cc: alexander.deucher, christian.koenig, Xinhui.Pan, airlied, simona,
	sunil.khatri, Lazar, Lijo, Hawking.Zhang, Chen, Xiaogang,
	Kent.Russell, Shuo Liu, amd-gfx



> 2025年1月10日 01:10,Mario Limonciello <mario.limonciello@amd.com> 写道:
> 
> General note - don't use HTML for mailing list communication.
> 
> I'm not sure if Apple Mail lets you switch this around.
> 
> If not, you might try using Thunderbird instead.  You can pick to reply in plain text or HTML by holding shift when you hit "reply all"
> 
> For my reply I'll convert my reply to plain text, please see inline below.
> 
> On 1/8/2025 23:34, Gerry Liu wrote:
>>> 2025年1月9日 00:33,Mario Limonciello <mario.limonciello@amd.com <mailto:mario.limonciello@amd.com>> 写道:
>>> 
>>> On 1/8/2025 07:59, Jiang Liu wrote:
>>>> Subject: [RFC PATCH 00/13] Enhance device state machine to better support suspend/resume
>>> 
>>> I'm not sure how this happened, but your subject didn't end up in the subject of the thread on patch 0 so the thread just looks like an unsubjected thread.
>> Maybe it’s caused by one extra blank line at the header.
> 
> Yeah that might be it.  Hopefully it doesn't happen on v2.
> 
>>> 
>>>> Recently we were testing suspend/resume functionality with AMD GPUs,
>>>> we have encountered several resource tracking related bugs, such as
>>>> double buffer free, use after free and unbalanced irq reference count.
>>> 
>>> Can you share more about how you were hitting these issues?  Are they specific to S3 or to s2idle flows?  dGPU or APU?
>>> Are they only with SRIOV?
>>> 
>>> Is there anything to do with the host influencing the failures to happen, or are you contriving the failures to find the bugs?
>>> 
>>> I know we've had some reports about resource tracking warnings on the reset flows, but I haven't heard much about suspend/resume.
>> We are investigating developing some advanced product features based on amdgpu suspend/resume.
>> So we started by testing the suspend/resume functionality of AMD 308x GPUs with the following simple script:
>> ```
>> echo platform > /sys/power/pm_test
>> i=0
>> while true; do
>>     echo mem > /sys/power/state
>>     let i=i+1
>>     echo $i
>>     sleep 1
>> done
>> ```
>> It succeeds on the first and second iterations but always fails on following iterations on a bare-metal server with eight MI308X GPUs.
> 
> Can you share more about this server?  Does it support suspend to ram or a hardware backed suspend to idle?  If you don't know, you can check like this:
> 
> ❯ cat /sys/power/mem_sleep
> s2idle [deep]
# cat /sys/power/mem_sleep 
[s2idle]

> 
> If it's suspend to idle, what does the FACP indicate?  You can do this check to find out if you don't know.
> 
> ❯ sudo cp /sys/firmware/acpi/tables/FACP /tmp
> ❯ sudo iasl -d /tmp/FACP
> ❯ grep "idle" -i /tmp/FACP.dsl
>                      Low Power S0 Idle (V5) : 0
> 
With acpidump and `iasl -d facp.data`, we got:
[070h 0112   4]        Flags (decoded below) : 000084A5
      WBINVD instruction is operational (V1) : 1
              WBINVD flushes all caches (V1) : 0
                    All CPUs support C1 (V1) : 1
                  C2 works on MP system (V1) : 0
            Control Method Power Button (V1) : 0
            Control Method Sleep Button (V1) : 1
        RTC wake not in fixed reg space (V1) : 0
            RTC can wake system from S4 (V1) : 1
                        32-bit PM Timer (V1) : 0
                      Docking Supported (V1) : 0
               Reset Register Supported (V2) : 1
                            Sealed Case (V3) : 0
                    Headless - No Video (V3) : 0
        Use native instr after SLP_TYPx (V3) : 0
              PCIEXP_WAK Bits Supported (V4) : 0
                     Use Platform Timer (V4) : 1
               RTC_STS valid on S4 wake (V4) : 0
                Remote Power-on capable (V4) : 0
                 Use APIC Cluster Model (V4) : 0
     Use APIC Physical Destination Mode (V4) : 0
                       Hardware Reduced (V5) : 0
                      Low Power S0 Idle (V5) : 0

>> With some investigation we found that the gpu asic should be reset during the test, 
> 
> Yeah; but this comes back to my above questions.  Typically there is an assumption that the power rails are going to be cut in system suspend.
> 
> If that doesn't hold true, then you're doing a pure software suspend and have found a series of issues in the driver with how that's handled.
Yeah, we are trying to do a `pure software suspend`, letting the hypervisor save/restore system images instead of the guest OS.
And during the suspend process, we hope we can cancel the suspend request at any late stage.
When we cancel suspend at a late stage, it does behave like a pure software suspend.

> 
>> so we submitted a patch to fix the failure (https://github.com/ROCm/ROCK-Kernel-Driver/pull/181)
> 
> Typically kernel patches don't go through that repo, they're discussed on the mailing lists. Can you bring this patch for discussion on amd-gfx?
Will post to amd-gfx after resolving the conflicts.

Regards,
Gerry

> 
>> While analyzing and root-causing the failure, we encountered several crashes, resource leakages and false alarms.
> 
> Yeah; I think you found some real issues.
> 
>> So I have worked out patch sets to solve issues we encountered. The other patch set is https://lists.freedesktop.org/archives/amd-gfx/2025-January/118484.html
> 
> Thanks!
> 
>> With SR-IOV in single-VF mode, resume always fails. Seems some contexts/vram buffers get lost during suspend and haven’t been restored on resume, which causes the failure.
>> We haven’t tested SR-IOV in multiple-VF mode yet. We need more help from the AMD side to make suspend/resume work for SR-IOV :)
>>> 
>>>> We have tried to solve these issues case by case, but found that may
>>>> not be the right way. Especially about the unbalanced irq reference
>>>> count, there will be new issues appear once we fixed the current known
>>>> issues. After analyzing related source code, we found that there may be
>>>> some fundamental implementaion flaws behind these resource tracking
>>> 
>>> implementation
>>> 
>>>> issues.
>>>> The amdgpu driver has two major state machines to driver the device
>>>> management flow, one is for ip blocks, the other is for ras blocks.
>>>> The hook points defined in struct amd_ip_funcs for device setup/teardown
>>>> are symmetric, but the implementation is asymmetric, sometime even
>>>> ambiguous. The most obvious two issues we noticed are:
>>>> 1) amdgpu_irq_get() are called from .late_init() but amdgpu_irq_put()
>>>>    are called from .hw_fini() instead of .early_fini().
>>>> 2) the way to reset ip_bloc.status.valid/sw/hw/late_initialized doesn't
>>>>    match the way to set those flags.
>>>> When taking device suspend/resume into account, in addition to device
>>>> probe/remove, things get much more complex. Some issues arise because
>>>> many suspend/resume implementations directly reuse .hw_init/.hw_fini/
>>>> .late_init hook points.
>>>> 
>>>> So we try to fix those issues by two enhancements/refinements to current
>>>> device management state machines.
>>>> The first change is to make the ip block state machine and associated
>>>> status flags work in stack-like way as below:
>>>> Callback        Status Flags
>>>> early_init:     valid = true
>>>> sw_init:        sw = true
>>>> hw_init:        hw = true
>>>> late_init:      late_initialized = true
>>>> early_fini:     late_initialized = false
>>>> hw_fini:        hw = false
>>>> sw_fini:        sw = false
>>>> late_fini:      valid = false
>>> 
>>> At a high level this makes sense to me, but I'd just call 'late' or 'late_init'.
>>> 
>>> Another idea if you make it stack like is to do it as a true enum for the state machine and store it all in one variable.
>> I will add a patch to convert those bool flags into an enum.
> 
> Thanks!


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-01-13  1:19       ` Re: Gerry Liu
@ 2025-01-13 21:59         ` Mario Limonciello
  0 siblings, 0 replies; 1546+ messages in thread
From: Mario Limonciello @ 2025-01-13 21:59 UTC (permalink / raw)
  To: Gerry Liu
  Cc: alexander.deucher, christian.koenig, Xinhui.Pan, airlied, simona,
	sunil.khatri, Lazar, Lijo, Hawking.Zhang, Chen, Xiaogang,
	Kent.Russell, Shuo Liu, amd-gfx

On 1/12/2025 19:19, Gerry Liu wrote:
> 
> 
>> 2025年1月10日 01:10,Mario Limonciello <mario.limonciello@amd.com> 写道:
>>
>> General note - don't use HTML for mailing list communication.
>>
>> I'm not sure if Apple Mail lets you switch this around.
>>
>> If not, you might try using Thunderbird instead.  You can pick to reply in plain text or HTML by holding shift when you hit "reply all"
>>
>> For my reply I'll convert my reply to plain text, please see inline below.
>>
>> On 1/8/2025 23:34, Gerry Liu wrote:
>>>> 2025年1月9日 00:33,Mario Limonciello <mario.limonciello@amd.com <mailto:mario.limonciello@amd.com>> 写道:
>>>>
>>>> On 1/8/2025 07:59, Jiang Liu wrote:
>>>>> Subject: [RFC PATCH 00/13] Enhance device state machine to better support suspend/resume
>>>>
>>>> I'm not sure how this happened, but your subject didn't end up in the subject of the thread on patch 0 so the thread just looks like an unsubjected thread.
>>> Maybe it’s caused by one extra blank line at the header.
>>
>> Yeah that might be it.  Hopefully it doesn't happen on v2.
>>
>>>>
>>>>> Recently we were testing suspend/resume functionality with AMD GPUs,
>>>>> we have encountered several resource tracking related bugs, such as
>>>>> double buffer free, use after free and unbalanced irq reference count.
>>>>
>>>> Can you share more about how you were hitting these issues?  Are they specific to S3 or to s2idle flows?  dGPU or APU?
>>>> Are they only with SRIOV?
>>>>
>>>> Is there anything to do with the host influencing the failures to happen, or are you contriving the failures to find the bugs?
>>>>
>>>> I know we've had some reports about resource tracking warnings on the reset flows, but I haven't heard much about suspend/resume.
>>> We are investigating developing some advanced product features based on amdgpu suspend/resume.
>>> So we started by testing the suspend/resume functionality of AMD 308x GPUs with the following simple script:
>>> ```
>>> echo platform > /sys/power/pm_test
>>> i=0
>>> while true; do
>>>     echo mem > /sys/power/state
>>>     let i=i+1
>>>     echo $i
>>>     sleep 1
>>> done
>>> ```
>>> It succeeds on the first and second iterations but always fails on following iterations on a bare-metal server with eight MI308X GPUs.
>>
>> Can you share more about this server?  Does it support suspend to ram or a hardware backed suspend to idle?  If you don't know, you can check like this:
>>
>> ❯ cat /sys/power/mem_sleep
>> s2idle [deep]
> # cat /sys/power/mem_sleep
> [s2idle]
> 
>>
>> If it's suspend to idle, what does the FACP indicate?  You can do this check to find out if you don't know.
>>
>> ❯ sudo cp /sys/firmware/acpi/tables/FACP /tmp
>> ❯ sudo iasl -d /tmp/FACP
>> ❯ grep "idle" -i /tmp/FACP.dsl
>>                       Low Power S0 Idle (V5) : 0
>>
> With acpidump and `iasl -d facp.data`, we got:
> [070h 0112   4]        Flags (decoded below) : 000084A5
>        WBINVD instruction is operational (V1) : 1
>                WBINVD flushes all caches (V1) : 0
>                      All CPUs support C1 (V1) : 1
>                    C2 works on MP system (V1) : 0
>              Control Method Power Button (V1) : 0
>              Control Method Sleep Button (V1) : 1
>          RTC wake not in fixed reg space (V1) : 0
>              RTC can wake system from S4 (V1) : 1
>                          32-bit PM Timer (V1) : 0
>                        Docking Supported (V1) : 0
>                 Reset Register Supported (V2) : 1
>                              Sealed Case (V3) : 0
>                      Headless - No Video (V3) : 0
>          Use native instr after SLP_TYPx (V3) : 0
>                PCIEXP_WAK Bits Supported (V4) : 0
>                       Use Platform Timer (V4) : 1
>                 RTC_STS valid on S4 wake (V4) : 0
>                  Remote Power-on capable (V4) : 0
>                   Use APIC Cluster Model (V4) : 0
>       Use APIC Physical Destination Mode (V4) : 0
>                         Hardware Reduced (V5) : 0
>                        Low Power S0 Idle (V5) : 0
> 
>>> With some investigation we found that the gpu asic should be reset during the test,
>>
>> Yeah; but this comes back to my above questions.  Typically there is an assumption that the power rails are going to be cut in system suspend.
>>
>> If that doesn't hold true, then you're doing a pure software suspend and have found a series of issues in the driver with how that's handled.
> Yeah, we are trying to do a `pure software suspend`, letting the hypervisor save/restore system images instead of the guest OS.
> And during the suspend process, we hope we can cancel the suspend request at any late stage.
> When we cancel suspend at a late stage, it does behave like a pure software suspend.
> 

Thanks; this all makes a lot more sense now.  This isn't an area that 
has a lot of coverage right now.  Most suspend testing happens with the 
power being cut and coming back fresh.

Will keep this in mind when reviewing future iterations of your patches.

>>
>>> so we submitted a patch to fix the failure (https://github.com/ROCm/ROCK-Kernel-Driver/pull/181)
>>
>> Typically kernel patches don't go through that repo, they're discussed on the mailing lists. Can you bring this patch for discussion on amd-gfx?
> Will post to amd-gfx after solving the conflicts.

Thx!

> 
> Regards,
> Gerry
> 
>>
>>> While analyzing and root-causing the failure, we encountered several crashes, resource leakages and false alarms.
>>
>> Yeah; I think you found some real issues.
>>
>>> So I have worked out patch sets to solve issues we encountered. The other patch set is https://lists.freedesktop.org/archives/amd-gfx/2025-January/118484.html
>>
>> Thanks!
>>
>>> With SR-IOV in single-VF mode, resume always fails. Seems some contexts/vram buffers get lost during suspend and haven’t been restored on resume, which causes the failure.
>>> We haven’t tested SR-IOV in multiple-VF mode yet. We need more help from the AMD side to make suspend/resume work for SR-IOV :)
>>>>
>>>>> We have tried to solve these issues case by case, but found that may
>>>>> not be the right way. Especially about the unbalanced irq reference
>>>>> count, there will be new issues appear once we fixed the current known
>>>>> issues. After analyzing related source code, we found that there may be
>>>>> some fundamental implementaion flaws behind these resource tracking
>>>>
>>>> implementation
>>>>
>>>>> issues.
>>>>> The amdgpu driver has two major state machines to driver the device
>>>>> management flow, one is for ip blocks, the other is for ras blocks.
>>>>> The hook points defined in struct amd_ip_funcs for device setup/teardown
>>>>> are symmetric, but the implementation is asymmetric, sometime even
>>>>> ambiguous. The most obvious two issues we noticed are:
>>>>> 1) amdgpu_irq_get() are called from .late_init() but amdgpu_irq_put()
>>>>>     are called from .hw_fini() instead of .early_fini().
>>>>> 2) the way to reset ip_bloc.status.valid/sw/hw/late_initialized doesn't
>>>>>     match the way to set those flags.
>>>>> When taking device suspend/resume into account, in addition to device
>>>>> probe/remove, things get much more complex. Some issues arise because
>>>>> many suspend/resume implementations directly reuse .hw_init/.hw_fini/
>>>>> .late_init hook points.
>>>>>
>>>>> So we try to fix those issues by two enhancements/refinements to current
>>>>> device management state machines.
>>>>> The first change is to make the ip block state machine and associated
>>>>> status flags work in stack-like way as below:
>>>>> Callback        Status Flags
>>>>> early_init:     valid = true
>>>>> sw_init:        sw = true
>>>>> hw_init:        hw = true
>>>>> late_init:      late_initialized = true
>>>>> early_fini:     late_initialized = false
>>>>> hw_fini:        hw = false
>>>>> sw_fini:        sw = false
>>>>> late_fini:      valid = false
>>>>
>>>> At a high level this makes sense to me, but I'd just call 'late' or 'late_init'.
>>>>
>>>> Another idea if you make it stack like is to do it as a true enum for the state machine and store it all in one variable.
>>> I will add a patch to convert those bool flags into an enum.
>>
>> Thanks!
> 


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-04-18  7:46 Shung-Hsi Yu
@ 2025-04-18  7:49 ` Shung-Hsi Yu
  2025-04-23 17:30 ` Re: patchwork-bot+netdevbpf
  1 sibling, 0 replies; 1546+ messages in thread
From: Shung-Hsi Yu @ 2025-04-18  7:49 UTC (permalink / raw)
  To: bpf
  Cc: Martin KaFai Lau, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Kumar Kartikeya Dwivedi, Dan Carpenter

On Fri, Apr 18, 2025 at 3:46 PM Shung-Hsi Yu <shung-hsi.yu@suse.com> wrote:
> From bda8bb8011d865cebf066350c8625e8be1625656 Mon Sep 17 00:00:00 2001
> From: Shung-Hsi Yu <shung-hsi.yu@suse.com>
> Date: Fri, 18 Apr 2025 15:22:00 +0800
> Subject: [PATCH bpf-next 1/1] bpf: use proper type to calculate
>  bpf_raw_tp_null_args.mask index
...

Email headers are off, hence no subject. Will resend.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-04-22  8:04 ` Feng Yang
@ 2025-04-22 14:37   ` Alexei Starovoitov
  0 siblings, 0 replies; 1546+ messages in thread
From: Alexei Starovoitov @ 2025-04-22 14:37 UTC (permalink / raw)
  To: Feng Yang
  Cc: Andrii Nakryiko, Alexei Starovoitov, bpf, Daniel Borkmann, Eduard,
	LKML, linux-trace-kernel, Martin KaFai Lau, Network Development,
	Song Liu, Feng Yang, Yonghong Song

On Tue, Apr 22, 2025 at 1:04 AM Feng Yang <yangfeng59949@163.com> wrote:
>
> Subject: Re: [PATCH bpf-next] bpf: Remove bpf_get_smp_processor_id_proto
>
> On Mon, 21 Apr 2025 18:53:07 -0700 Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> > On Thu, Apr 17, 2025 at 8:41 PM Feng Yang <yangfeng59949@163.com> wrote:
> > >
> > > From: Feng Yang <yangfeng@kylinos.cn>
> > >
> > > All BPF programs either disable CPU preemption or CPU migration,
> > > so the bpf_get_smp_processor_id_proto can be safely removed,
> > > and the bpf_get_raw_smp_processor_id_proto in bpf_base_func_proto works perfectly.
> > >
> > > Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > > Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
> > > ---
> > >  include/linux/bpf.h      |  1 -
> > >  kernel/bpf/core.c        |  1 -
> > >  kernel/bpf/helpers.c     | 12 ------------
> > >  kernel/trace/bpf_trace.c |  2 --
> > >  net/core/filter.c        |  6 ------
> > >  5 files changed, 22 deletions(-)
> > >
> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > index 3f0cc89c0622..36e525141556 100644
> > > --- a/include/linux/bpf.h
> > > +++ b/include/linux/bpf.h
> > > @@ -3316,7 +3316,6 @@ extern const struct bpf_func_proto bpf_map_peek_elem_proto;
> > >  extern const struct bpf_func_proto bpf_map_lookup_percpu_elem_proto;
> > >
> > >  extern const struct bpf_func_proto bpf_get_prandom_u32_proto;
> > > -extern const struct bpf_func_proto bpf_get_smp_processor_id_proto;
> > >  extern const struct bpf_func_proto bpf_get_numa_node_id_proto;
> > >  extern const struct bpf_func_proto bpf_tail_call_proto;
> > >  extern const struct bpf_func_proto bpf_ktime_get_ns_proto;
> > > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > > index ba6b6118cf50..1ad41a16b86e 100644
> > > --- a/kernel/bpf/core.c
> > > +++ b/kernel/bpf/core.c
> > > @@ -2943,7 +2943,6 @@ const struct bpf_func_proto bpf_spin_unlock_proto __weak;
> > >  const struct bpf_func_proto bpf_jiffies64_proto __weak;
> > >
> > >  const struct bpf_func_proto bpf_get_prandom_u32_proto __weak;
> > > -const struct bpf_func_proto bpf_get_smp_processor_id_proto __weak;
> > >  const struct bpf_func_proto bpf_get_numa_node_id_proto __weak;
> > >  const struct bpf_func_proto bpf_ktime_get_ns_proto __weak;
> > >  const struct bpf_func_proto bpf_ktime_get_boot_ns_proto __weak;
> > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > > index e3a2662f4e33..2d2bfb2911f8 100644
> > > --- a/kernel/bpf/helpers.c
> > > +++ b/kernel/bpf/helpers.c
> > > @@ -149,18 +149,6 @@ const struct bpf_func_proto bpf_get_prandom_u32_proto = {
> > >         .ret_type       = RET_INTEGER,
> > >  };
> > >
> > > -BPF_CALL_0(bpf_get_smp_processor_id)
> > > -{
> > > -       return smp_processor_id();
> > > -}
> > > -
> > > -const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
> > > -       .func           = bpf_get_smp_processor_id,
> > > -       .gpl_only       = false,
> > > -       .ret_type       = RET_INTEGER,
> > > -       .allow_fastcall = true,
> > > -};
> > > -
> >
> > bpf_get_raw_smp_processor_id_proto doesn't have
> > allow_fastcall = true
> >
> > so this breaks tests.
> >
> > Instead of removing BPF_CALL_0(bpf_get_smp_processor_id)
> > we should probably remove BPF_CALL_0(bpf_get_raw_cpu_id)
> > and adjust SKF_AD_OFF + SKF_AD_CPU case.
> > I don't recall why raw_ version was used back in 2014.
> >
>
> The following two seem to explain the reason:
> https://lore.kernel.org/all/7103e2085afa29c006cd5b94a6e4a2ac83efc30d.1467106475.git.daniel@iogearbox.net/
> https://lore.kernel.org/all/02fa71ebe1c560cad489967aa29c653b48932596.1474586162.git.daniel@iogearbox.net/
>

Ahh. socket filters run in RCU CS. They don't disable preemption or migration.
Then let's keep things as-is.
We still want the debugging provided by smp_processor_id().
If we switch everything to raw_ we may miss things, like this example
with socket filters.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-04-18  7:46 Shung-Hsi Yu
  2025-04-18  7:49 ` Shung-Hsi Yu
@ 2025-04-23 17:30 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 1546+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-04-23 17:30 UTC (permalink / raw)
  To: Shung-Hsi Yu
  Cc: bpf, martin.lau, ast, daniel, andrii, eddyz87, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
	memxor, dan.carpenter

Hello:

This patch was applied to bpf/bpf-next.git (master)
by Andrii Nakryiko <andrii@kernel.org>:

On Fri, 18 Apr 2025 15:46:31 +0800 you wrote:
> >From bda8bb8011d865cebf066350c8625e8be1625656 Mon Sep 17 00:00:00 2001
> From: Shung-Hsi Yu <shung-hsi.yu@suse.com>
> Date: Fri, 18 Apr 2025 15:22:00 +0800
> Subject: [PATCH bpf-next 1/1] bpf: use proper type to calculate
>  bpf_raw_tp_null_args.mask index
> 
> The calculation of the index used to access the mask field in 'struct
> bpf_raw_tp_null_args' is done with 'int' type, which could overflow when
> the tracepoint being attached has more than 8 arguments.
> 
> [...]

Here is the summary with links:
  - 
    https://git.kernel.org/bpf/bpf-next/c/53ebef53a657

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-04-24  0:40 Cong Wang
@ 2025-04-24  0:59 ` Jiayuan Chen
  2025-04-24  9:19   ` Re: Jiayuan Chen
  0 siblings, 1 reply; 1546+ messages in thread
From: Jiayuan Chen @ 2025-04-24  0:59 UTC (permalink / raw)
  To: Cong Wang; +Cc: john.fastabend, jakub, netdev, bpf

April 24, 2025 at 08:40, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:



> 
> netdev@vger.kernel.org, bpf@vger.kernel.org
> 
> Bcc: 
> 
> Subject: test_sockmap failures on the latest bpf-next
> 
> Reply-To: 
> 
> Hi all,
> 
> The latest bpf-next failed on test_sockmap tests, I got the following
> 
> failures (including 1 kernel warning). It is 100% reproducible here.
> 
> I don't have time to look into them, a quick glance at the changelog
> 
> shows quite some changes from Jiayuan. So please take a look, Jiayuan.
> 
> Meanwhile, please let me know if you need more information from me.
> 
> Thanks!
> 
> --------------->

Thanks, I'm working on it.

> 
> [root@localhost bpf]# ./test_sockmap 
> 
> # 1/ 6 sockmap::txmsg test passthrough:OK
> 
> # 2/ 6 sockmap::txmsg test redirect:OK
> 
> # 3/ 2 sockmap::txmsg test redirect wait send mem:OK
> 
> # 4/ 6 sockmap::txmsg test drop:OK
> 
> [ 182.498017] perf: interrupt took too long (3406 > 3238), lowering kernel.perf_event_max_sample_rate to 58500
> 
> # 5/ 6 sockmap::txmsg test ingress redirect:OK
> 
> # 6/ 7 sockmap::txmsg test skb:OK
> 
> # 7/12 sockmap::txmsg test apply:OK
> 
> # 8/12 sockmap::txmsg test cork:OK
> 
> # 9/ 3 sockmap::txmsg test hanging corks:OK
> 
> #10/11 sockmap::txmsg test push_data:OK
> 
> #11/17 sockmap::txmsg test pull-data:OK
> 
> #12/ 9 sockmap::txmsg test pop-data:OK
> 
> #13/ 6 sockmap::txmsg test push/pop data:OK
> 
> #14/ 1 sockmap::txmsg test ingress parser:OK
> 
> #15/ 1 sockmap::txmsg test ingress parser2:OK
> 
> #16/ 6 sockhash::txmsg test passthrough:OK
> 
> #17/ 6 sockhash::txmsg test redirect:OK
> 
> #18/ 2 sockhash::txmsg test redirect wait send mem:OK
> 
> #19/ 6 sockhash::txmsg test drop:OK
> 
> #20/ 6 sockhash::txmsg test ingress redirect:OK
> 
> #21/ 7 sockhash::txmsg test skb:OK
> 
> #22/12 sockhash::txmsg test apply:OK
> 
> #23/12 sockhash::txmsg test cork:OK
> 
> #24/ 3 sockhash::txmsg test hanging corks:OK
> 
> #25/11 sockhash::txmsg test push_data:OK
> 
> #26/17 sockhash::txmsg test pull-data:OK
> 
> #27/ 9 sockhash::txmsg test pop-data:OK
> 
> #28/ 6 sockhash::txmsg test push/pop data:OK
> 
> #29/ 1 sockhash::txmsg test ingress parser:OK
> 
> #30/ 1 sockhash::txmsg test ingress parser2:OK
> 
> #31/ 6 sockhash:ktls:txmsg test passthrough:OK
> 
> #32/ 6 sockhash:ktls:txmsg test redirect:OK
> 
> #33/ 2 sockhash:ktls:txmsg test redirect wait send mem:OK
> 
> [ 263.509707] ------------[ cut here ]------------
> 
> [ 263.510439] WARNING: CPU: 1 PID: 40 at net/ipv4/af_inet.c:156 inet_sock_destruct+0x173/0x1d5
> 
> [ 263.511450] CPU: 1 UID: 0 PID: 40 Comm: kworker/1:1 Tainted: G W 6.15.0-rc3+ #238 PREEMPT(voluntary) 
> 
> [ 263.512683] Tainted: [W]=WARN
> 
> [ 263.513062] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> 
> [ 263.514763] Workqueue: events sk_psock_destroy
> 
> [ 263.515332] RIP: 0010:inet_sock_destruct+0x173/0x1d5
> 
> [ 263.515916] Code: e8 dc dc 3f ff 41 83 bc 24 c0 02 00 00 00 74 02 0f 0b 49 8d bc 24 ac 02 00 00 e8 c2 dc 3f ff 41 83 bc 24 ac 02 00 00 00 74 02 <0f> 0b e8 c7 95 3d 00 49 8d bc 24 b0 05 00 00 e8 c0 dd 3f ff 49 8b
> 
> [ 263.518899] RSP: 0018:ffff8880085cfc18 EFLAGS: 00010202
> 
> [ 263.519596] RAX: 1ffff11003dbfc00 RBX: ffff88801edfe3e8 RCX: ffffffff822f5af4
> 
> [ 263.520502] RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff88801edfe16c
> 
> [ 263.522128] RBP: ffff88801edfe184 R08: ffffed1003dbfc31 R09: 0000000000000000
> 
> [ 263.523008] R10: ffffffff822f5ab7 R11: ffff88801edfe187 R12: ffff88801edfdec0
> 
> [ 263.523822] R13: ffff888020376ac0 R14: ffff888020376ac0 R15: ffff888020376a60
> 
> [ 263.524682] FS: 0000000000000000(0000) GS:ffff8880b0e88000(0000) knlGS:0000000000000000
> 
> [ 263.525999] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 
> [ 263.526765] CR2: 0000556365155830 CR3: 000000001d6aa000 CR4: 0000000000350ef0
> 
> [ 263.527700] Call Trace:
> 
> [ 263.528037] <TASK>
> 
> [ 263.528339] __sk_destruct+0x46/0x222
> 
> [ 263.528856] sk_psock_destroy+0x22f/0x242
> 
> [ 263.529471] process_one_work+0x504/0x8a8
> 
> [ 263.530029] ? process_one_work+0x39d/0x8a8
> 
> [ 263.530587] ? __pfx_process_one_work+0x10/0x10
> 
> [ 263.531195] ? worker_thread+0x44/0x2ae
> 
> [ 263.531721] ? __list_add_valid_or_report+0x83/0xea
> 
> [ 263.532395] ? srso_return_thunk+0x5/0x5f
> 
> [ 263.532929] ? __list_add+0x45/0x52
> 
> [ 263.533482] process_scheduled_works+0x73/0x82
> 
> [ 263.534079] worker_thread+0x1ce/0x2ae
> 
> [ 263.534582] ? _raw_spin_unlock_irqrestore+0x2e/0x44
> 
> [ 263.535243] ? __pfx_worker_thread+0x10/0x10
> 
> [ 263.535822] kthread+0x32a/0x33c
> 
> [ 263.536278] ? kthread+0x13c/0x33c
> 
> [ 263.536724] ? __pfx_kthread+0x10/0x10
> 
> [ 263.537225] ? srso_return_thunk+0x5/0x5f
> 
> [ 263.537869] ? find_held_lock+0x2b/0x75
> 
> [ 263.538388] ? __pfx_kthread+0x10/0x10
> 
> [ 263.538866] ? srso_return_thunk+0x5/0x5f
> 
> [ 263.539523] ? local_clock_noinstr+0x32/0x9c
> 
> [ 263.540128] ? srso_return_thunk+0x5/0x5f
> 
> [ 263.540677] ? srso_return_thunk+0x5/0x5f
> 
> [ 263.541228] ? __lock_release+0xd3/0x1ad
> 
> [ 263.541890] ? srso_return_thunk+0x5/0x5f
> 
> [ 263.542442] ? tracer_hardirqs_on+0x17/0x149
> 
> [ 263.543047] ? _raw_spin_unlock_irq+0x24/0x39
> 
> [ 263.543589] ? __pfx_kthread+0x10/0x10
> 
> [ 263.544069] ? __pfx_kthread+0x10/0x10
> 
> [ 263.544543] ret_from_fork+0x21/0x41
> 
> [ 263.545000] ? __pfx_kthread+0x10/0x10
> 
> [ 263.545557] ret_from_fork_asm+0x1a/0x30
> 
> [ 263.546095] </TASK>
> 
> [ 263.546374] irq event stamp: 1094079
> 
> [ 263.546798] hardirqs last enabled at (1094089): [<ffffffff813be0f6>] __up_console_sem+0x47/0x4e
> 
> [ 263.547762] hardirqs last disabled at (1094098): [<ffffffff813be0d6>] __up_console_sem+0x27/0x4e
> 
> [ 263.548817] softirqs last enabled at (1093692): [<ffffffff812f2906>] handle_softirqs+0x48c/0x4de
> 
> [ 263.550127] softirqs last disabled at (1094117): [<ffffffff812f29b3>] __irq_exit_rcu+0x4b/0xc3
> 
> [ 263.551104] ---[ end trace 0000000000000000 ]---
> 
> #34/ 6 sockhash:ktls:txmsg test drop:OK
> 
> #35/ 6 sockhash:ktls:txmsg test ingress redirect:OK
> 
> #36/ 7 sockhash:ktls:txmsg test skb:OK
> 
> #37/12 sockhash:ktls:txmsg test apply:OK
> 
> [ 278.915147] perf: interrupt took too long (4331 > 4257), lowering kernel.perf_event_max_sample_rate to 46000
> 
> [ 282.974989] test_sockmap (1077) used greatest stack depth: 25072 bytes left
> 
> #38/12 sockhash:ktls:txmsg test cork:OK
> 
> #39/ 3 sockhash:ktls:txmsg test hanging corks:OK
> 
> #40/11 sockhash:ktls:txmsg test push_data:OK
> 
> #41/17 sockhash:ktls:txmsg test pull-data:OK
> 
> recv failed(): Invalid argument
> 
> rx thread exited with err 1.
> 
> recv failed(): Invalid argument
> 
> rx thread exited with err 1.
> 
> recv failed(): Bad message
> 
> rx thread exited with err 1.
> 
> #42/ 9 sockhash:ktls:txmsg test pop-data:FAIL
> 
> recv failed(): Bad message
> 
> rx thread exited with err 1.
> 
> recv failed(): Message too long
> 
> rx thread exited with err 1.
> 
> #43/ 6 sockhash:ktls:txmsg test push/pop data:FAIL
> 
> #44/ 1 sockhash:ktls:txmsg test ingress parser:OK
> 
> #45/ 0 sockhash:ktls:txmsg test ingress parser2:OK
> 
> Pass: 43 Fail: 5
>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-04-24  0:59 ` Jiayuan Chen
@ 2025-04-24  9:19   ` Jiayuan Chen
  0 siblings, 0 replies; 1546+ messages in thread
From: Jiayuan Chen @ 2025-04-24  9:19 UTC (permalink / raw)
  To: Cong Wang; +Cc: john.fastabend, jakub, netdev, bpf

April 24, 2025 at 08:59, "Jiayuan Chen" <jiayuan.chen@linux.dev> wrote:

> 
> April 24, 2025 at 08:40, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
> 
> > 
> > netdev@vger.kernel.org, bpf@vger.kernel.org
> > 
> >  Bcc: 
> > 
> > 
> >  Subject: test_sockmap failures on the latest bpf-next
> > 
> >  Reply-To: 
> > 
> >  
> > 
> >  Hi all,
> > 
> >  
> > 
> >  The latest bpf-next failed on test_sockmap tests, I got the following
> > 
> >  failures (including 1 kernel warning). It is 100% reproducible here.
> > 
> >  I don't have time to look into them, a quick glance at the changelog
> >  
> >  shows quite some changes from Jiayuan. So please take a look, Jiayuan.
> > 
> >  Meanwhile, please let me know if you need more information from me.
> > 
> >  Thanks!
> > 
> >  
> > 
> >  --------------->
> > 
> 
> Thanks, I'm working on it.
> 

After resetting my tree to commit 0bb2f7a1ad1f, which predates my changes, the warning still exists.

The warning originates from test_txmsg_redir_wait_sndmem(), which performs
'KTLS + sockmap with redir EGRESS and limited receive buffer'.

The memory charge/uncharge logic is problematic; I need some time to investigate and fix it.

Thanks.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-05-09 17:38 Shawn Anastasio
@ 2025-05-10 19:50 ` Trilok Soni
  0 siblings, 0 replies; 1546+ messages in thread
From: Trilok Soni @ 2025-05-10 19:50 UTC (permalink / raw)
  To: Shawn Anastasio, linux-pci, Lukas Wunner,
	Krishna Chaitanya Chundru
  Cc: Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, chaitanya chundru, Bjorn Andersson, Konrad Dybcio,
	cros-qcom-dts-watchers, Jingoo Han, Bartosz Golaszewski,
	quic_vbadigan, amitk, devicetree, linux-kernel, linux-arm-msm,
	jorge.ramirez, Dmitry Baryshkov, Timothy Pearson

On 5/9/2025 10:38 AM, Shawn Anastasio wrote:
> From: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
> 
> Date: Sat, 12 Apr 2025 07:19:56 +0530
> Subject: [PATCH v6] PCI: PCI: Add pcie_link_is_active() to determine if the
>  PCIe link is active

I don't understand this patch, and it has no subject in the email. Please fix.

> 
> Introduce a common API to check if the PCIe link is active, replacing
> duplicate code in multiple locations.
> 
> Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
> Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
> ---
> This is an updated patch pulled from Krishna's v5 series:
> https://patchwork.kernel.org/project/linux-pci/list/?series=952665
> 
> The following changes to Krishna's v5 were made by me:
>   - Revert pcie_link_is_active return type back to int per Lukas' review
>     comments
>   - Handle non-zero error returns at call site of the new function in
>     pci.c/pci_bridge_wait_for_secondary_bus
> 
>  drivers/pci/hotplug/pciehp.h      |  1 -
>  drivers/pci/hotplug/pciehp_ctrl.c |  2 +-
>  drivers/pci/hotplug/pciehp_hpc.c  | 33 +++----------------------------
>  drivers/pci/pci.c                 | 26 +++++++++++++++++++++---
>  include/linux/pci.h               |  4 ++++
>  5 files changed, 31 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
> index 273dd8c66f4e..acef728530e3 100644
> --- a/drivers/pci/hotplug/pciehp.h
> +++ b/drivers/pci/hotplug/pciehp.h
> @@ -186,7 +186,6 @@ int pciehp_query_power_fault(struct controller *ctrl);
>  int pciehp_card_present(struct controller *ctrl);
>  int pciehp_card_present_or_link_active(struct controller *ctrl);
>  int pciehp_check_link_status(struct controller *ctrl);
> -int pciehp_check_link_active(struct controller *ctrl);
>  void pciehp_release_ctrl(struct controller *ctrl);
> 
>  int pciehp_sysfs_enable_slot(struct hotplug_slot *hotplug_slot);
> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
> index d603a7aa7483..4bb58ba1c766 100644
> --- a/drivers/pci/hotplug/pciehp_ctrl.c
> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> @@ -260,7 +260,7 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  	/* Turn the slot on if it's occupied or link is up */
>  	mutex_lock(&ctrl->state_lock);
>  	present = pciehp_card_present(ctrl);
> -	link_active = pciehp_check_link_active(ctrl);
> +	link_active = pcie_link_is_active(ctrl->pcie->port);
>  	if (present <= 0 && link_active <= 0) {
>  		if (ctrl->state == BLINKINGON_STATE) {
>  			ctrl->state = OFF_STATE;
> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
> index 8a09fb6083e2..278bc21d531d 100644
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
> @@ -221,33 +221,6 @@ static void pcie_write_cmd_nowait(struct controller *ctrl, u16 cmd, u16 mask)
>  	pcie_do_write_cmd(ctrl, cmd, mask, false);
>  }
> 
> -/**
> - * pciehp_check_link_active() - Is the link active
> - * @ctrl: PCIe hotplug controller
> - *
> - * Check whether the downstream link is currently active. Note it is
> - * possible that the card is removed immediately after this so the
> - * caller may need to take it into account.
> - *
> - * If the hotplug controller itself is not available anymore returns
> - * %-ENODEV.
> - */
> -int pciehp_check_link_active(struct controller *ctrl)
> -{
> -	struct pci_dev *pdev = ctrl_dev(ctrl);
> -	u16 lnk_status;
> -	int ret;
> -
> -	ret = pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
> -	if (ret == PCIBIOS_DEVICE_NOT_FOUND || PCI_POSSIBLE_ERROR(lnk_status))
> -		return -ENODEV;
> -
> -	ret = !!(lnk_status & PCI_EXP_LNKSTA_DLLLA);
> -	ctrl_dbg(ctrl, "%s: lnk_status = %x\n", __func__, lnk_status);
> -
> -	return ret;
> -}
> -
>  static bool pci_bus_check_dev(struct pci_bus *bus, int devfn)
>  {
>  	u32 l;
> @@ -467,7 +440,7 @@ int pciehp_card_present_or_link_active(struct controller *ctrl)
>  	if (ret)
>  		return ret;
> 
> -	return pciehp_check_link_active(ctrl);
> +	return pcie_link_is_active(ctrl_dev(ctrl));
>  }
> 
>  int pciehp_query_power_fault(struct controller *ctrl)
> @@ -584,7 +557,7 @@ static void pciehp_ignore_dpc_link_change(struct controller *ctrl,
>  	 * Synthesize it to ensure that it is acted on.
>  	 */
>  	down_read_nested(&ctrl->reset_lock, ctrl->depth);
> -	if (!pciehp_check_link_active(ctrl))
> +	if (!pcie_link_is_active(ctrl_dev(ctrl)))
>  		pciehp_request(ctrl, PCI_EXP_SLTSTA_DLLSC);
>  	up_read(&ctrl->reset_lock);
>  }
> @@ -884,7 +857,7 @@ int pciehp_slot_reset(struct pcie_device *dev)
>  	pcie_capability_write_word(dev->port, PCI_EXP_SLTSTA,
>  				   PCI_EXP_SLTSTA_DLLSC);
> 
> -	if (!pciehp_check_link_active(ctrl))
> +	if (!pcie_link_is_active(ctrl_dev(ctrl)))
>  		pciehp_request(ctrl, PCI_EXP_SLTSTA_DLLSC);
> 
>  	return 0;
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index e77d5b53c0ce..3bb8354b14bf 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4926,7 +4926,6 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
>  		return 0;
> 
>  	if (pcie_get_speed_cap(dev) <= PCIE_SPEED_5_0GT) {
> -		u16 status;
> 
>  		pci_dbg(dev, "waiting %d ms for downstream link\n", delay);
>  		msleep(delay);
> @@ -4942,8 +4941,7 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
>  		if (!dev->link_active_reporting)
>  			return -ENOTTY;
> 
> -		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &status);
> -		if (!(status & PCI_EXP_LNKSTA_DLLLA))
> +		if (pcie_link_is_active(dev) <= 0)
>  			return -ENOTTY;
> 
>  		return pci_dev_wait(child, reset_type,
> @@ -6247,6 +6245,28 @@ void pcie_print_link_status(struct pci_dev *dev)
>  }
>  EXPORT_SYMBOL(pcie_print_link_status);
> 
> +/**
> + * pcie_link_is_active() - Checks if the link is active or not
> + * @pdev: PCI device to query
> + *
> + * Check whether the link is active or not.
> + *
> + * Return: link state, or -ENODEV if the config read failes.
> + */
> +int pcie_link_is_active(struct pci_dev *pdev)
> +{
> +	u16 lnk_status;
> +	int ret;
> +
> +	ret = pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
> +	if (ret == PCIBIOS_DEVICE_NOT_FOUND || PCI_POSSIBLE_ERROR(lnk_status))
> +		return -ENODEV;
> +
> +	pci_dbg(pdev, "lnk_status = %x\n", lnk_status);
> +	return !!(lnk_status & PCI_EXP_LNKSTA_DLLLA);
> +}
> +EXPORT_SYMBOL(pcie_link_is_active);
> +
>  /**
>   * pci_select_bars - Make BAR mask from the type of resource
>   * @dev: the PCI device for which BAR mask is made
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 51e2bd6405cd..a79a9919320c 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1945,6 +1945,7 @@ pci_release_mem_regions(struct pci_dev *pdev)
>  			    pci_select_bars(pdev, IORESOURCE_MEM));
>  }
> 
> +int pcie_link_is_active(struct pci_dev *dev);
>  #else /* CONFIG_PCI is not enabled */
> 
>  static inline void pci_set_flags(int flags) { }
> @@ -2093,6 +2094,9 @@ pci_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
>  {
>  	return -ENOSPC;
>  }
> +
> +static inline bool pcie_link_is_active(struct pci_dev *dev)
> +{ return false; }
>  #endif /* CONFIG_PCI */
> 
>  /* Include architecture-dependent settings and functions */
> --
> 2.30.2
> 
> 


-- 
---Trilok Soni

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-05-14 20:21 Nicolas Pitre
@ 2025-05-15  8:33 ` Jiri Slaby
  0 siblings, 0 replies; 1546+ messages in thread
From: Jiri Slaby @ 2025-05-15  8:33 UTC (permalink / raw)
  To: Nicolas Pitre, Greg Kroah-Hartman; +Cc: npitre, linux-serial, linux-kernel

On 14. 05. 25, 22:21, Nicolas Pitre wrote:
>  From 28043dec8352fd857c6878c2ee568620a124b855 Mon Sep 17 00:00:00 2001
> From: Nicolas Pitre <nico@fluxnic.net>
> Date: Wed, 14 May 2025 15:58:22 -0400
> Subject: [PATCH] vt: remove VT_RESIZE and VT_RESIZEX from vt_compat_ioctl()
> From: Nicolas Pitre <npitre@baylibre.com>
> 
> They are listed among those cmd values that "treat 'arg' as an integer",
> which is wrong. They should instead fall into the default case. Probably
> nobody ever exercised that code since 2009, but still.

AFAICS in the Debian code search, exactly no one (except sanitizers, 
strace, fuzzers, valgrind, ...) uses VT_RESIZEX.

VT_RESIZE is used by kbd's resizecons -- and there, calling this ioctl 
is its sole purpose. I wonder how come no one running 32-bit resizecons 
on a 64-bit kernel noticed?

Thinking...

Actually, on x86 it doesn't matter whether it takes the arg (case 
VT_RESIZE) or the compat_ptr() (default label) path, as both are given 
the same user pointer...

It matters on s390x, but apparently no one cares about the 32/64-bit 
mix there.

> Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
> Fixes: e92166517e3c ("tty: handle VT specific compat ioctls in vt driver")

FWIW, the e-mail's Subject is empty.

Reviewed-by: Jiri Slaby <jirislaby@kernel.org>

> diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c
> index 83a3d49535e5..61342e06970a 100644
> --- a/drivers/tty/vt/vt_ioctl.c
> +++ b/drivers/tty/vt/vt_ioctl.c
> @@ -1119,8 +1119,6 @@ long vt_compat_ioctl(struct tty_struct *tty,
>   	case VT_WAITACTIVE:
>   	case VT_RELDISP:
>   	case VT_DISALLOCATE:
> -	case VT_RESIZE:
> -	case VT_RESIZEX:
>   		return vt_ioctl(tty, cmd, arg);
>   
>   	/*


-- 
js
suse labs

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-07-01 13:44 Emanuele Ghidoli
@ 2025-07-11  2:21 ` Fabio Estevam
  0 siblings, 0 replies; 1546+ messages in thread
From: Fabio Estevam @ 2025-07-11  2:21 UTC (permalink / raw)
  To: Emanuele Ghidoli; +Cc: Francesco Dolcini, Tom Rini, Emanuele Ghidoli, u-boot

On Tue, Jul 1, 2025 at 10:45 AM Emanuele Ghidoli
<ghidoliemanuele@gmail.com> wrote:
>
> From: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
>
> Subject: [PATCH v1 0/5] Enable RNG support for KASLR on Toradex arm64 i.MX SoMs
>
> This patch series enables RNG support to automatically populate /chosen/kaslr-seed on the following Toradex arm64 i.MX System on Modules (SoMs):
> - Verdin iMX8MM
> - Verdin iMX8MP
> - Toradex SMARC iMX8MP
> - Apalis iMX8
> - Colibri iMX8X
>
> This improves kernel security by supporting Kernel Address Space Layout Randomization (KASLR) using a runtime-provided seed from the hardware RNG.
>
> Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>

Applied the series, thanks.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-07-20 18:26 ` >
@ 2025-07-20 19:30     ` David Lechner
  2025-07-21  7:52     ` Re: Andy Shevchenko
  1 sibling, 0 replies; 1546+ messages in thread
From: David Lechner @ 2025-07-20 19:30 UTC (permalink / raw)
  To: >, linux-kernel, devicetree, linux-iio, netdev,
	linux-arm-kernel, linux-amlogic
  Cc: ribalda, jic23, nuno.sa, andy, robh, krzk+dt, conor+dt,
	andrew+netdev, davem, edumazet, kuba, pabeni, neil.armstrong,
	khilman, jbrunet, martin.blumenstingl

On 7/20/25 1:26 PM, > wrote:
> Changes in v2:
> - Fixed commit message grammar
> - Fixed subject line style as per DT convention
> - Added missing reviewers/maintainers in CC
> 

Placing this before the headers makes our email clients think the
message has no subject. It should go after the ---.

> From 5c00524cbb47e30ee04223fe9502af2eb003ddf1 Mon Sep 17 00:00:00 2001
> From: sanjay suthar <sanjaysuthar661996@gmail.com>
> Date: Sun, 20 Jul 2025 01:11:00 +0530
> Subject: [PATCH v2] dt-bindings: cleanup: fix duplicated 'is is' in YAML docs
> 
> Fix minor grammatical issues by removing duplicated "is" in two devicetree
> binding documents:
> 
> - net/amlogic,meson-dwmac.yaml
> - iio/dac/ti,dac7612.yaml
> 
> Signed-off-by: sanjay suthar <sanjaysuthar661996@gmail.com>
> ---

This is where the changelog belongs.

>  Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml      | 2 +-
>  Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml b/Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml
> index 20dd1370660d..624c640be4c8 100644
> --- a/Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml
> +++ b/Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml
> @@ -9,7 +9,7 @@ title: Texas Instruments DAC7612 family of DACs
>  description:
>    The DAC7612 is a dual, 12-bit digital-to-analog converter (DAC) with
>    guaranteed 12-bit monotonicity performance over the industrial temperature
> -  range. Is is programmable through an SPI interface.
> +  range. It is programmable through an SPI interface.
>  
>  maintainers:
>    - Ricardo Ribalda Delgado <ricardo@ribalda.com>
> diff --git a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
> index 0cd78d71768c..5c91716d1f21 100644
> --- a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
> +++ b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
> @@ -149,7 +149,7 @@ properties:
>        - description:
>            The first register range should be the one of the DWMAC controller
>        - description:
> -          The second range is is for the Amlogic specific configuration
> +          The second range is for the Amlogic specific configuration
>            (for example the PRG_ETHERNET register range on Meson8b and newer)
>  
>    interrupts:

I would be tempted to split this into two patches. It's a bit odd to have
a single patch touching two unrelated bindings.
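
For reference, the conventional layout being described -- version notes
after the "---" cut line, so that git am drops them -- looks roughly
like this (field values are placeholders):

```text
Subject: [PATCH v2] subsystem: one-line summary

Commit message body (what changed and why).

Signed-off-by: Author Name <author@example.com>
---
Changes in v2:
- Fixed commit message grammar
- Fixed subject line style

 path/to/file.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/path/to/file.yaml b/path/to/file.yaml
...
```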

_______________________________________________
linux-amlogic mailing list
linux-amlogic@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-amlogic

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-07-20 19:30     ` Re: David Lechner
@ 2025-07-21  6:52       ` Krzysztof Kozlowski
  -1 siblings, 0 replies; 1546+ messages in thread
From: Krzysztof Kozlowski @ 2025-07-21  6:52 UTC (permalink / raw)
  To: David Lechner, >, linux-kernel, devicetree, linux-iio, netdev,
	linux-arm-kernel, linux-amlogic
  Cc: ribalda, jic23, nuno.sa, andy, robh, krzk+dt, conor+dt,
	andrew+netdev, davem, edumazet, kuba, pabeni, neil.armstrong,
	khilman, jbrunet, martin.blumenstingl

On 20/07/2025 21:30, David Lechner wrote:
>>    - Ricardo Ribalda Delgado <ricardo@ribalda.com>
>> diff --git a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> index 0cd78d71768c..5c91716d1f21 100644
>> --- a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> +++ b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> @@ -149,7 +149,7 @@ properties:
>>        - description:
>>            The first register range should be the one of the DWMAC controller
>>        - description:
>> -          The second range is is for the Amlogic specific configuration
>> +          The second range is for the Amlogic specific configuration
>>            (for example the PRG_ETHERNET register range on Meson8b and newer)
>>  
>>    interrupts:
> 
> I would be tempted to split this into two patches. It's a bit odd to have


No, it's churn to split this into more than one patch.

> a single patch touching two unrelated bindings.




Best regards,
Krzysztof

_______________________________________________
linux-amlogic mailing list
linux-amlogic@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-amlogic

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-07-20 18:26 ` >
@ 2025-07-21  7:52     ` Andy Shevchenko
  2025-07-21  7:52     ` Re: Andy Shevchenko
  1 sibling, 0 replies; 1546+ messages in thread
From: Andy Shevchenko @ 2025-07-21  7:52 UTC (permalink / raw)
  To: >
  Cc: linux-kernel, devicetree, linux-iio, netdev, linux-arm-kernel,
	linux-amlogic, ribalda, jic23, dlechner, nuno.sa, andy, robh,
	krzk+dt, conor+dt, andrew+netdev, davem, edumazet, kuba, pabeni,
	neil.armstrong, khilman, jbrunet, martin.blumenstingl

On Sun, Jul 20, 2025 at 11:56:27PM +0530, > wrote:
> Changes in v2:
> - Fixed commit message grammar
> - Fixed subject line style as per DT convention
> - Added missing reviewers/maintainers in CC
> 
> From 5c00524cbb47e30ee04223fe9502af2eb003ddf1 Mon Sep 17 00:00:00 2001
> From: sanjay suthar <sanjaysuthar661996@gmail.com>
> Date: Sun, 20 Jul 2025 01:11:00 +0530
> Subject: [PATCH v2] dt-bindings: cleanup: fix duplicated 'is is' in YAML docs
> 
> Fix minor grammatical issues by removing duplicated "is" in two devicetree
> binding documents:
> 
> - net/amlogic,meson-dwmac.yaml
> - iio/dac/ti,dac7612.yaml

This mail is b0rken.

-- 
With Best Regards,
Andy Shevchenko




^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
       [not found]       ` <CADU64hDZeyaCpHXBmSG1rtHjpxmjejT7asK9oGBUMF55eYeh4w@mail.gmail.com>
@ 2025-07-21 14:09           ` David Lechner
  0 siblings, 0 replies; 1546+ messages in thread
From: David Lechner @ 2025-07-21 14:09 UTC (permalink / raw)
  To: Sanjay Suthar, Krzysztof Kozlowski
  Cc: linux-kernel, devicetree, linux-iio, netdev, linux-arm-kernel,
	linux-amlogic, ribalda, jic23, nuno.sa, andy, robh, krzk+dt,
	conor+dt, andrew+netdev, davem, edumazet, kuba, pabeni,
	neil.armstrong, khilman, jbrunet, martin.blumenstingl

On 7/21/25 5:15 AM, Sanjay Suthar wrote:
> On Mon, Jul 21, 2025 at 12:22 PM Krzysztof Kozlowski <krzk@kernel.org <mailto:krzk@kernel.org>> wrote:
>>
>> On 20/07/2025 21:30, David Lechner wrote:
>> >>    - Ricardo Ribalda Delgado <ricardo@ribalda.com <mailto:ricardo@ribalda.com>>
>> >> diff --git a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> >> index 0cd78d71768c..5c91716d1f21 100644
>> >> --- a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> >> +++ b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> >> @@ -149,7 +149,7 @@ properties:
>> >>        - description:
>> >>            The first register range should be the one of the DWMAC controller
>> >>        - description:
>> >> -          The second range is is for the Amlogic specific configuration
>> >> +          The second range is for the Amlogic specific configuration
>> >>            (for example the PRG_ETHERNET register range on Meson8b and newer)
>> >>
>> >>    interrupts:
>> >
>> > I would be tempted to split this into two patches. It's a bit odd to have
>>
>>
>> No, it's a churn to split this into more than one patch.
>>
> 
> Thanks for the reply. Since there are differing suggestions on splitting the patch because it touches different subsystems, it is still not clear whether I should split it or whether a single patch is fine. I would appreciate guidance on the next steps.
> 
> Best Regards,
> Sanjay Suthar

Krzysztof is one of the devicetree maintainers and I am not, so you
should do what Krzysztof says - leave it as one patch. :-)


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-06  3:34 Sang-Heon Jeon
@ 2025-08-06  3:44 ` Sang-Heon Jeon
  0 siblings, 0 replies; 1546+ messages in thread
From: Sang-Heon Jeon @ 2025-08-06  3:44 UTC (permalink / raw)
  To: damon

Because of my lack of knowledge, the above mail was sent by mistake.
Please just ignore it.
Sorry for the noise.

PS) Unfortunately, I found this thread [1] only after I had sent the
unnecessary mail to others.

[1] https://lore.kernel.org/damon/20240926213942.17022-1-sj@kernel.org/

On Wed, Aug 6, 2025 at 12:34 PM Sang-Heon Jeon <ekffu200098@gmail.com> wrote:
>
> subscribe damon mailing list

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-12 13:34 Baoquan He
@ 2025-08-12 13:49 ` Baoquan He
  0 siblings, 0 replies; 1546+ messages in thread
From: Baoquan He @ 2025-08-12 13:49 UTC (permalink / raw)
  To: linux-mm, christophe.leroy

On 08/12/25 at 09:34pm, Baoquan He wrote:
> alexghiti@rivosinc.com, agordeev@linux.ibm.com, linux@armlinux.org.uk,
> linux-arm-kernel@lists.infradead.org, loongarch@lists.linux.dev,
> linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
> x86@kernel.org, chris@zankel.net, jcmvbkbc@gmail.com, linux-um@lists.infradead.org
> Cc: ryabinin.a.a@gmail.com, glider@google.com, andreyknvl@gmail.com,
> 	dvyukov@google.com, vincenzo.frascino@arm.com,
> 	akpm@linux-foundation.org, kasan-dev@googlegroups.com,
> 	linux-kernel@vger.kernel.org, kexec@lists.infradead.org,
> 	sj@kernel.org, lorenzo.stoakes@oracle.com, elver@google.com,
> 	snovitoll@gmail.com
> Bcc: bhe@redhat.com
> Subject: Re: [PATCH v2 00/12] mm/kasan: make kasan=on|off work for all three
>  modes
> Reply-To: 
> In-Reply-To: <20250812124941.69508-1-bhe@redhat.com>
> 
> Forgot adding related ARCH mailing list or people to CC, add them.

Sorry for the noise, I made a mistake in the mail format when adding
people to CC.

> 
> On 08/12/25 at 08:49pm, Baoquan He wrote:
> > Currently only hw_tags mode of kasan can be enabled or disabled with
> > kernel parameter kasan=on|off for built kernel. For kasan generic and
> > sw_tags mode, there's no way to disable them once kernel is built.
> > This is not convenient sometimes, e.g. in a system where kdump is configured.
> > When the 1st kernel has KASAN enabled and crash triggered to switch to
> > kdump kernel, the generic or sw_tags mode will cost much extra memory
> > for kasan shadow while in fact it's meaningless to have kasan in kdump
> > kernel.
> > 
> > So this patchset moves the kasan=on|off out of hw_tags scope and into
> > common code to make it visible in generic and sw_tags mode too. Then we
> > can add kasan=off in kdump kernel to reduce the unneeded memory cost for
> > kasan.
> > 
> > Changelog:
> > ====
> > v1->v2:
> > - Add __ro_after_init for __ro_after_init, and remove redundant blank
> >   lines in mm/kasan/common.c. Thanks to Marco.
> > - Fix a code bug in <linux/kasan-enabled.h> when CONFIG_KASAN is unset,
> >   this is found out by SeongJae and Lorenzo, and also reported by LKP
> >   report, thanks to them.
> > - Add a missing kasan_enabled() checking in kasan_report(). This will
> >   cause below KASAN report info even though kasan=off is set:
> >      ==================================================================
> >      BUG: KASAN: stack-out-of-bounds in tick_program_event+0x130/0x150
> >      Read of size 4 at addr ffff00005f747778 by task swapper/0/1
> >      
> >      CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.16.0+ #8 PREEMPT(voluntary) 
> >      Hardware name: GIGABYTE R272-P30-JG/MP32-AR0-JG, BIOS F31n (SCP: 2.10.20220810) 09/30/2022
> >      Call trace:
> >       show_stack+0x30/0x90 (C)
> >       dump_stack_lvl+0x7c/0xa0
> >       print_address_description.constprop.0+0x90/0x310
> >       print_report+0x104/0x1f0
> >       kasan_report+0xc8/0x110
> >       __asan_report_load4_noabort+0x20/0x30
> >       tick_program_event+0x130/0x150
> >       ......snip...
> >      ==================================================================
> > 
> > - Add jump_label_init() calling before kasan_init() in setup_arch() in these
> >   architectures: xtensa, arm. Because they currently rely on
> >   jump_label_init() in main() which is a little late. Then the early static
> >   key kasan_flag_enabled in kasan_init() won't work.
> > 
> > - In UML architecture, change to enable kasan_flag_enabled in arch_mm_preinit()
> >   because kasan_init() is enabled before main(), there's no chance to operate
> >   on static key in kasan_init().
> > 
> > Test:
> > =====
> > In v1, I took test on x86_64 for generic mode, and on arm64 for
> > generic, sw_tags and hw_tags mode. All of them works well.
> > 
> > In v2, I only tested on arm64 for generic, sw_tags and hw_tags mode; it
> > works. For powerpc, I got a BOOK3S/64 machine, but it says
> > 'KASAN not enabled as it requires radix' and KASAN is disabled. Will
> > look for another POWER machine to test this.
> > ====
> > 
> > Baoquan He (12):
> >   mm/kasan: add conditional checks in functions to return directly if
> >     kasan is disabled
> >   mm/kasan: move kasan= code to common place
> >   mm/kasan/sw_tags: don't initialize kasan if it's disabled
> >   arch/arm: don't initialize kasan if it's disabled
> >   arch/arm64: don't initialize kasan if it's disabled
> >   arch/loongarch: don't initialize kasan if it's disabled
> >   arch/powerpc: don't initialize kasan if it's disabled
> >   arch/riscv: don't initialize kasan if it's disabled
> >   arch/x86: don't initialize kasan if it's disabled
> >   arch/xtensa: don't initialize kasan if it's disabled
> >   arch/um: don't initialize kasan if it's disabled
> >   mm/kasan: make kasan=on|off take effect for all three modes
> > 
> >  arch/arm/kernel/setup.c                |  6 +++++
> >  arch/arm/mm/kasan_init.c               |  6 +++++
> >  arch/arm64/mm/kasan_init.c             |  7 ++++++
> >  arch/loongarch/mm/kasan_init.c         |  5 ++++
> >  arch/powerpc/mm/kasan/init_32.c        |  8 +++++-
> >  arch/powerpc/mm/kasan/init_book3e_64.c |  6 +++++
> >  arch/powerpc/mm/kasan/init_book3s_64.c |  6 +++++
> >  arch/riscv/mm/kasan_init.c             |  6 +++++
> >  arch/um/kernel/mem.c                   |  6 +++++
> >  arch/x86/mm/kasan_init_64.c            |  6 +++++
> >  arch/xtensa/kernel/setup.c             |  1 +
> >  arch/xtensa/mm/kasan_init.c            |  6 +++++
> >  include/linux/kasan-enabled.h          | 18 ++++++-------
> >  mm/kasan/common.c                      | 25 ++++++++++++++++++
> >  mm/kasan/generic.c                     | 20 +++++++++++++--
> >  mm/kasan/hw_tags.c                     | 35 ++------------------------
> >  mm/kasan/init.c                        |  6 +++++
> >  mm/kasan/quarantine.c                  |  3 +++
> >  mm/kasan/report.c                      |  4 ++-
> >  mm/kasan/shadow.c                      | 23 ++++++++++++++++-
> >  mm/kasan/sw_tags.c                     |  9 +++++++
> >  21 files changed, 165 insertions(+), 47 deletions(-)
> > 
> > -- 
> > 2.41.0
> > 
> 
> 



^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-13 17:25 ` Jon Hunter
@ 2025-08-14 15:36   ` Greg KH
  2025-08-15 16:20     ` Re: Jon Hunter
  0 siblings, 1 reply; 1546+ messages in thread
From: Greg KH @ 2025-08-14 15:36 UTC (permalink / raw)
  To: Jon Hunter
  Cc: achill, akpm, broonie, conor, f.fainelli, hargar, linux-kernel,
	linux-tegra, linux, lkft-triage, patches, patches, pavel, rwarsow,
	shuah, srw, stable, sudipm.mukherjee, torvalds

On Wed, Aug 13, 2025 at 06:25:32PM +0100, Jon Hunter wrote:
> On Wed, Aug 13, 2025 at 08:48:28AM -0700, Jon Hunter wrote:
> > On Tue, 12 Aug 2025 19:43:28 +0200, Greg Kroah-Hartman wrote:
> > > This is the start of the stable review cycle for the 6.15.10 release.
> > > There are 480 patches in this series, all will be posted as a response
> > > to this one.  If anyone has any issues with these being applied, please
> > > let me know.
> > > 
> > > Responses should be made by Thu, 14 Aug 2025 17:42:20 +0000.
> > > Anything received after that time might be too late.
> > > 
> > > The whole patch series can be found in one patch at:
> > > 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.15.10-rc1.gz
> > > or in the git tree and branch at:
> > > 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.15.y
> > > and the diffstat can be found below.
> > > 
> > > thanks,
> > > 
> > > greg k-h
> > 
> > Failures detected for Tegra ...
> > 
> > Test results for stable-v6.15:
> >     10 builds:	10 pass, 0 fail
> >     28 boots:	28 pass, 0 fail
> >     120 tests:	119 pass, 1 fail
> > 
> > Linux version:	6.15.10-rc1-g2510f67e2e34
> > Boards tested:	tegra124-jetson-tk1, tegra186-p2771-0000,
> >                 tegra186-p3509-0000+p3636-0001, tegra194-p2972-0000,
> >                 tegra194-p3509-0000+p3668-0000, tegra20-ventana,
> >                 tegra210-p2371-2180, tegra210-p3450-0000,
> >                 tegra30-cardhu-a04
> > 
> > Test failures:	tegra194-p2972-0000: boot.py
> 
> I am seeing the following kernel warning for both linux-6.15.y and linux-6.16.y …
> 
>  WARNING KERN sched: DL replenish lagged too much
> 
> I believe that this is introduced by …
> 
> Peter Zijlstra <peterz@infradead.org>
>     sched/deadline: Less agressive dl_server handling
> 
> This has been reported here: https://lore.kernel.org/all/CAMuHMdXn4z1pioTtBGMfQM0jsLviqS2jwysaWXpoLxWYoGa82w@mail.gmail.com/

I've now dropped this.

Is that causing the test failure for you?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-14 15:36   ` Greg KH
@ 2025-08-15 16:20     ` Jon Hunter
  2025-08-15 16:53       ` Re: Greg KH
  0 siblings, 1 reply; 1546+ messages in thread
From: Jon Hunter @ 2025-08-15 16:20 UTC (permalink / raw)
  To: Greg KH
  Cc: achill, akpm, broonie, conor, f.fainelli, hargar, linux-kernel,
	linux-tegra, linux, lkft-triage, patches, patches, pavel, rwarsow,
	shuah, srw, stable, sudipm.mukherjee, torvalds

On 14/08/2025 16:36, Greg KH wrote:
> On Wed, Aug 13, 2025 at 06:25:32PM +0100, Jon Hunter wrote:
>> On Wed, Aug 13, 2025 at 08:48:28AM -0700, Jon Hunter wrote:
>>> On Tue, 12 Aug 2025 19:43:28 +0200, Greg Kroah-Hartman wrote:
>>>> This is the start of the stable review cycle for the 6.15.10 release.
>>>> There are 480 patches in this series, all will be posted as a response
>>>> to this one.  If anyone has any issues with these being applied, please
>>>> let me know.
>>>>
>>>> Responses should be made by Thu, 14 Aug 2025 17:42:20 +0000.
>>>> Anything received after that time might be too late.
>>>>
>>>> The whole patch series can be found in one patch at:
>>>> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.15.10-rc1.gz
>>>> or in the git tree and branch at:
>>>> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.15.y
>>>> and the diffstat can be found below.
>>>>
>>>> thanks,
>>>>
>>>> greg k-h
>>>
>>> Failures detected for Tegra ...
>>>
>>> Test results for stable-v6.15:
>>>      10 builds:	10 pass, 0 fail
>>>      28 boots:	28 pass, 0 fail
>>>      120 tests:	119 pass, 1 fail
>>>
>>> Linux version:	6.15.10-rc1-g2510f67e2e34
>>> Boards tested:	tegra124-jetson-tk1, tegra186-p2771-0000,
>>>                  tegra186-p3509-0000+p3636-0001, tegra194-p2972-0000,
>>>                  tegra194-p3509-0000+p3668-0000, tegra20-ventana,
>>>                  tegra210-p2371-2180, tegra210-p3450-0000,
>>>                  tegra30-cardhu-a04
>>>
>>> Test failures:	tegra194-p2972-0000: boot.py
>>
>> I am seeing the following kernel warning for both linux-6.15.y and linux-6.16.y …
>>
>>   WARNING KERN sched: DL replenish lagged too much
>>
>> I believe that this is introduced by …
>>
>> Peter Zijlstra <peterz@infradead.org>
>>      sched/deadline: Less agressive dl_server handling
>>
>> This has been reported here: https://lore.kernel.org/all/CAMuHMdXn4z1pioTtBGMfQM0jsLviqS2jwysaWXpoLxWYoGa82w@mail.gmail.com/
> 
> I've now dropped this.
> 
> Is that causing the test failure for you?

Yes that is causing the test failure. Thanks!

Jon

-- 
nvpublic


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-15 16:20     ` Re: Jon Hunter
@ 2025-08-15 16:53       ` Greg KH
  0 siblings, 0 replies; 1546+ messages in thread
From: Greg KH @ 2025-08-15 16:53 UTC (permalink / raw)
  To: Jon Hunter
  Cc: achill, akpm, broonie, conor, f.fainelli, hargar, linux-kernel,
	linux-tegra, linux, lkft-triage, patches, patches, pavel, rwarsow,
	shuah, srw, stable, sudipm.mukherjee, torvalds

On Fri, Aug 15, 2025 at 05:20:34PM +0100, Jon Hunter wrote:
> On 14/08/2025 16:36, Greg KH wrote:
> > On Wed, Aug 13, 2025 at 06:25:32PM +0100, Jon Hunter wrote:
> > > On Wed, Aug 13, 2025 at 08:48:28AM -0700, Jon Hunter wrote:
> > > > On Tue, 12 Aug 2025 19:43:28 +0200, Greg Kroah-Hartman wrote:
> > > > > This is the start of the stable review cycle for the 6.15.10 release.
> > > > > There are 480 patches in this series, all will be posted as a response
> > > > > to this one.  If anyone has any issues with these being applied, please
> > > > > let me know.
> > > > > 
> > > > > Responses should be made by Thu, 14 Aug 2025 17:42:20 +0000.
> > > > > Anything received after that time might be too late.
> > > > > 
> > > > > The whole patch series can be found in one patch at:
> > > > > 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.15.10-rc1.gz
> > > > > or in the git tree and branch at:
> > > > > 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.15.y
> > > > > and the diffstat can be found below.
> > > > > 
> > > > > thanks,
> > > > > 
> > > > > greg k-h
> > > > 
> > > > Failures detected for Tegra ...
> > > > 
> > > > Test results for stable-v6.15:
> > > >      10 builds:	10 pass, 0 fail
> > > >      28 boots:	28 pass, 0 fail
> > > >      120 tests:	119 pass, 1 fail
> > > > 
> > > > Linux version:	6.15.10-rc1-g2510f67e2e34
> > > > Boards tested:	tegra124-jetson-tk1, tegra186-p2771-0000,
> > > >                  tegra186-p3509-0000+p3636-0001, tegra194-p2972-0000,
> > > >                  tegra194-p3509-0000+p3668-0000, tegra20-ventana,
> > > >                  tegra210-p2371-2180, tegra210-p3450-0000,
> > > >                  tegra30-cardhu-a04
> > > > 
> > > > Test failures:	tegra194-p2972-0000: boot.py
> > > 
> > > I am seeing the following kernel warning for both linux-6.15.y and linux-6.16.y …
> > > 
> > >   WARNING KERN sched: DL replenish lagged too much
> > > 
> > > I believe that this is introduced by …
> > > 
> > > Peter Zijlstra <peterz@infradead.org>
> > >      sched/deadline: Less agressive dl_server handling
> > > 
> > > This has been reported here: https://lore.kernel.org/all/CAMuHMdXn4z1pioTtBGMfQM0jsLviqS2jwysaWXpoLxWYoGa82w@mail.gmail.com/
> > 
> > I've now dropped this.
> > 
> > Is that causing the test failure for you?
> 
> Yes that is causing the test failure. Thanks!

Is the test just noticing the warning message?  Or is it a functional
failure?  Does it also fail on Linus's tree?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-20 14:33 Christian König
@ 2025-08-20 15:23 ` David Hildenbrand
  2025-08-21  8:10   ` Re: Christian König
  0 siblings, 1 reply; 1546+ messages in thread
From: David Hildenbrand @ 2025-08-20 15:23 UTC (permalink / raw)
  To: Christian König, intel-xe, intel-gfx, dri-devel, amd-gfx,
	x86
  Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
	peterz, Lorenzo Stoakes

CCing Lorenzo

On 20.08.25 16:33, Christian König wrote:
> Hi everyone,
> 
> sorry for CCing so many people, but that rabbit hole turned out to be
> deeper than originally thought.
> 
> TTM always had problems with UC/WC mappings on 32bit systems and drivers
> often had to revert to hacks like using GFP_DMA32 to get things working
> while having no rational explanation why that helped (see the TTM AGP,
> radeon and nouveau driver code for that).
> 
> It turned out that the PAT implementation we use on x86 not only enforces
> the same caching attributes for pages in the linear kernel mapping, but
> also for highmem pages through a separate R/B tree.
> 
> That was unexpected and TTM never updated that R/B tree for highmem pages,
> so the function pgprot_set_cachemode() just overwrote the caching
> attributes drivers passed in to vmf_insert_pfn_prot() and that essentially
> caused all kinds of random trouble.
> 
> An R/B tree is potentially not a good data structure to hold thousands if
> not millions of different attributes for each page, so updating that is
> probably not the way to solve this issue.
> 
> Thomas pointed out that the i915 driver is using apply_page_range()
> instead of vmf_insert_pfn_prot() to circumvent the PAT implementation and
> just fill in the page tables with what the driver things is the right
> caching attribute.

I assume you mean apply_to_page_range() -- same issue in patch subjects.

Oh this sounds horrible. Why oh why do we have these hacks in core-mm 
and have drivers abuse them :(

Honestly, apply_to_pte_range() is just the entry point for doing all 
kinds of weird crap to page tables because "you know better".

All the sanity checks from vmf_insert_pfn(), gone.

Can we please fix the underlying issue properly?

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-20 15:23 ` David Hildenbrand
@ 2025-08-21  8:10   ` Christian König
  2025-08-25 19:10     ` Re: David Hildenbrand
  0 siblings, 1 reply; 1546+ messages in thread
From: Christian König @ 2025-08-21  8:10 UTC (permalink / raw)
  To: David Hildenbrand, intel-xe, intel-gfx, dri-devel, amd-gfx, x86
  Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
	peterz, Lorenzo Stoakes

On 20.08.25 17:23, David Hildenbrand wrote:
> CCing Lorenzo
> 
> On 20.08.25 16:33, Christian König wrote:
>> Hi everyone,
>>
>> sorry for CCing so many people, but that rabbit hole turned out to be
>> deeper than originally thought.
>>
>> TTM always had problems with UC/WC mappings on 32bit systems and drivers
>> often had to revert to hacks like using GFP_DMA32 to get things working
>> while having no rational explanation why that helped (see the TTM AGP,
>> radeon and nouveau driver code for that).
>>
>> It turned out that the PAT implementation we use on x86 not only enforces
>> the same caching attributes for pages in the linear kernel mapping, but
>> also for highmem pages through a separate R/B tree.
>>
>> That was unexpected and TTM never updated that R/B tree for highmem pages,
>> so the function pgprot_set_cachemode() just overwrote the caching
>> attributes drivers passed in to vmf_insert_pfn_prot() and that essentially
>> caused all kinds of random trouble.
>>
>> An R/B tree is potentially not a good data structure to hold thousands if
>> not millions of different attributes for each page, so updating that is
>> probably not the way to solve this issue.
>>
>> Thomas pointed out that the i915 driver is using apply_page_range()
>> instead of vmf_insert_pfn_prot() to circumvent the PAT implementation and
>> just fill in the page tables with what the driver things is the right
>> caching attribute.
> 
> I assume you mean apply_to_page_range() -- same issue in patch subjects.

Oh yes, of course. Sorry.

> Oh this sounds horrible. Why oh why do we have these hacks in core-mm and have drivers abuse them :(

Yeah I was also a bit hesitant to use that, but the performance advantage is so high that we probably can't avoid the general approach.

> Honestly, apply_to_pte_range() is just the entry point for doing all kinds of weird crap to page tables because "you know better".

Exactly, that's the problem I'm pointing out: drivers *do* know it better. The core memory management applied incorrect values, which caused all kinds of trouble.

The problem is not a bug in PAT nor TTM/drivers but rather how they interact with each other.

What I don't understand is why do we have the PAT in the first place? No other architecture does it this way.

Is that because of the x86 CPUs which have problems when different page tables contain different caching attributes for the same physical memory?

> All the sanity checks from vmf_insert_pfn(), gone.
> 
> Can we please fix the underlying issue properly?

I'm happy to implement anything advised; my question is how should we solve this issue?

Regards,
Christian.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-21  8:10   ` Re: Christian König
@ 2025-08-25 19:10     ` David Hildenbrand
  2025-08-26  8:38       ` Re: Christian König
  0 siblings, 1 reply; 1546+ messages in thread
From: David Hildenbrand @ 2025-08-25 19:10 UTC (permalink / raw)
  To: Christian König, intel-xe, intel-gfx, dri-devel, amd-gfx,
	x86
  Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
	peterz, Lorenzo Stoakes

On 21.08.25 10:10, Christian König wrote:
> On 20.08.25 17:23, David Hildenbrand wrote:
>> CCing Lorenzo
>>
>> On 20.08.25 16:33, Christian König wrote:
>>> Hi everyone,
>>>
>>> sorry for CCing so many people, but that rabbit hole turned out to be
>>> deeper than originally thought.
>>>
>>> TTM always had problems with UC/WC mappings on 32bit systems and drivers
>>> often had to revert to hacks like using GFP_DMA32 to get things working
>>> while having no rational explanation why that helped (see the TTM AGP,
>>> radeon and nouveau driver code for that).
>>>
>>> It turned out that the PAT implementation we use on x86 not only enforces
>>> the same caching attributes for pages in the linear kernel mapping, but
>>> also for highmem pages through a separate R/B tree.
>>>
>>> That was unexpected and TTM never updated that R/B tree for highmem pages,
>>> so the function pgprot_set_cachemode() just overwrote the caching
>>> attributes drivers passed in to vmf_insert_pfn_prot() and that essentially
>>> caused all kinds of random trouble.
>>>
>>> An R/B tree is potentially not a good data structure to hold thousands if
>>> not millions of different attributes for each page, so updating that is
>>> probably not the way to solve this issue.
>>>
>>> Thomas pointed out that the i915 driver is using apply_page_range()
>>> instead of vmf_insert_pfn_prot() to circumvent the PAT implementation and
>>> just fill in the page tables with what the driver thinks is the right
>>> caching attribute.
>>
>> I assume you mean apply_to_page_range() -- same issue in patch subjects.
> 
> Oh yes, of course. Sorry.
> 
>> Oh this sounds horrible. Why oh why do we have these hacks in core-mm and have drivers abuse them :(
> 
> Yeah I was also a bit hesitant to use that, but the performance advantage is so high that we probably can't avoid the general approach.
> 
>> Honestly, apply_to_pte_range() is just the entry point for doing all kinds of weird crap to page tables because "you know better".
> 
> Exactly, that's the problem I'm pointing out: drivers *do* know it better. The core memory management applied incorrect values, which caused all kinds of trouble.
> 
> The problem is not a bug in PAT nor TTM/drivers but rather how they interact with each other.
> 
> What I don't understand is why do we have the PAT in the first place? No other architecture does it this way.

Probably because no other architecture has these weird glitches, I 
assume ... skimming over memtype_reserve() and friends, there are quite 
a few corner cases the code is handling (BIOS, ACPI, low ISA, system 
RAM, ...)


I did a lot of work on the higher PAT level functions, but I am no 
expert on the lower level management functions, and in particular all 
the special cases with different memory types.

IIRC, the goal of the PAT subsystem is to make sure that no two page 
tables map the same PFN with different caching attributes.

It treats ordinary system RAM (IORESOURCE_SYSTEM_RAM) usually in a 
special way: no special caching mode.

For everything else, it expects that someone first reserves a memory 
range for a specific caching mode.

For example, remap_pfn_range()...->pfnmap_track()->memtype_reserve() 
will make sure that there are no conflicts, and then call 
memtype_kernel_map_sync() to make sure the identity mapping is updated 
to the new type.

In case someone ends up calling pfnmap_setup_cachemode(), the 
expectation is that there was a previous call to memtype_reserve_io() or 
similar, such that pfnmap_setup_cachemode() will find that caching mode.


So my assumption would be that that is missing for the drivers here?

Last time I asked where this reservation is done, Peter Xu explained [1] 
it at least for VFIO:

vfio_pci_core_mmap
   pci_iomap
     pci_iomap_range
       ...
         __ioremap_caller
           memtype_reserve


Now, could it be that something like that is missing in these drivers 
(ioremap etc)?



[1] https://lkml.kernel.org/r/aBDXr-Qp4z0tS50P@x1.local


> 
> Is that because of the x86 CPUs which have problems when different page tables contain different caching attributes for the same physical memory?

Yes, but I don't think x86 is special here.

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-25 19:10     ` Re: David Hildenbrand
@ 2025-08-26  8:38       ` Christian König
  2025-08-26  8:46         ` Re: David Hildenbrand
  2025-08-26 12:37         ` Re: David Hildenbrand
  0 siblings, 2 replies; 1546+ messages in thread
From: Christian König @ 2025-08-26  8:38 UTC (permalink / raw)
  To: David Hildenbrand, intel-xe, intel-gfx, dri-devel, amd-gfx, x86
  Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
	peterz, Lorenzo Stoakes

On 25.08.25 21:10, David Hildenbrand wrote:
> On 21.08.25 10:10, Christian König wrote:
>> On 20.08.25 17:23, David Hildenbrand wrote:
>>> CCing Lorenzo
>>>
>>> On 20.08.25 16:33, Christian König wrote:
>>>> Hi everyone,
>>>>
>>>> sorry for CCing so many people, but that rabbit hole turned out to be
>>>> deeper than originally thought.
>>>>
>>>> TTM always had problems with UC/WC mappings on 32bit systems and drivers
>>>> often had to revert to hacks like using GFP_DMA32 to get things working
>>>> while having no rational explanation why that helped (see the TTM AGP,
>>>> radeon and nouveau driver code for that).
>>>>
>>>> It turned out that the PAT implementation we use on x86 not only enforces
>>>> the same caching attributes for pages in the linear kernel mapping, but
>>>> also for highmem pages through a separate R/B tree.
>>>>
>>>> That was unexpected and TTM never updated that R/B tree for highmem pages,
>>>> so the function pgprot_set_cachemode() just overwrote the caching
>>>> attributes drivers passed in to vmf_insert_pfn_prot() and that essentially
>>>> caused all kind of random trouble.
>>>>
>>>> An R/B tree is potentially not a good data structure to hold thousands if
>>>> not millions of different attributes for each page, so updating that is
>>>> probably not the way to solve this issue.
>>>>
>>>> Thomas pointed out that the i915 driver is using apply_page_range()
>>>> instead of vmf_insert_pfn_prot() to circumvent the PAT implementation and
>>>> just fill in the page tables with what the driver things is the right
>>>> caching attribute.
>>>
>>> I assume you mean apply_to_page_range() -- same issue in patch subjects.
>>
>> Oh yes, of course. Sorry.
>>
>>> Oh this sounds horrible. Why oh why do we have these hacks in core-mm and have drivers abuse them :(
>>
>> Yeah I was also a bit hesitated to use that, but the performance advantage is so high that we probably can't avoid the general approach.
>>
>>> Honestly, apply_to_pte_range() is just the entry in doing all kinds of weird crap to page tables because "you know better".
>>
>> Exactly that's the problem I'm pointing out, drivers *do* know it better. The core memory management has applied incorrect values which caused all kind of the trouble.
>>
>> The problem is not a bug in PAT nor TTM/drivers but rather how they interact with each other.
>>
>> What I don't understand is why do we have the PAT in the first place? No other architecture does it this way.
> 
> Probably because no other architecture has these weird glitches I assume ... skimming over memtype_reserve() and friends there are quite some corner cases the code is handling (BIOS, ACPI, low ISA, system RAM, ...)
> 
> 
> I did a lot of work on the higher PAT level functions, but I am no expert on the lower level management functions, and in particular all the special cases with different memory types.
> 
> IIRC, the goal of the PAT subsystem is to make sure that no two page tables map the same PFN with different caching attributes.

Yeah, that actually makes sense. Thomas from Intel recently explained the technical background to me:

Some x86 CPUs write back cache lines even if they aren't dirty. What can then happen is that, via the linear mapping, the CPU speculatively loads a cache line which is mapped uncached elsewhere.

So the end result is that the writeback of non-dirty cache lines potentially corrupts the data in the otherwise uncached system memory.

But that a) only applies to memory in the linear mapping and b) only to a handful of x86 CPU types (e.g. recently Intel's Lunar Lake, AMD Athlons produced before 2004, maybe others).

> It treats ordinary system RAM (IORESOURCE_SYSTEM_RAM) usually in a special way: no special caching mode.
> 
> For everything else, it expects that someone first reserves a memory range for a specific caching mode.
> 
> For example, remap_pfn_range()...->pfnmap_track()->memtype_reserve() will make sure that there are no conflicts, to the call memtype_kernel_map_sync() to make sure the identity mapping is updated to the new type.
> 
> In case someone ends up calling pfnmap_setup_cachemode(), the expectation is that there was a previous call to memtype_reserve_io() or similar, such that pfnmap_setup_cachemode() will find that caching mode.
> 
> 
> So my assumption would be that that is missing for the drivers here?

Well yes and no.

See, the PAT is optimized for applying specific caching attributes to ranges [A..B] (e.g. it uses an R/B tree). But what drivers do here is that they have single pages (usually from get_free_page() or similar) and want to apply a certain caching attribute to each of them.

So what would happen is that we completely clutter the R/B tree used by the PAT with thousands if not millions of entries.

> 
> Last time I asked where this reservation is done, Peter Xu explained [1] it at least for VFIO:
> 
> vfio_pci_core_mmap
>   pci_iomap
>     pci_iomap_range
>       ...
>         __ioremap_caller
>           memtype_reserve
> 
> 
> Now, could it be that something like that is missing in these drivers (ioremap etc)?

Well, that would solve the issue temporarily, but I'm pretty sure that will just go boom at a different place then :(

One possibility would be to say that the PAT only overrides the attributes if they aren't normal cached and leaves everything else alone.

What do you think?

Thanks,
Christian.

> 
> 
> 
> [1] https://lkml.kernel.org/r/aBDXr-Qp4z0tS50P@x1.local
> 
> 
>>
>> Is that because of the of x86 CPUs which have problems when different page tables contain different caching attributes for the same physical memory?
> 
> Yes, but I don't think x86 is special here.
> 


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-26  8:38       ` Re: Christian König
@ 2025-08-26  8:46         ` David Hildenbrand
  2025-08-26  9:00           ` Re: Christian König
  2025-08-26 12:37         ` Re: David Hildenbrand
  1 sibling, 1 reply; 1546+ messages in thread
From: David Hildenbrand @ 2025-08-26  8:46 UTC (permalink / raw)
  To: Christian König, intel-xe, intel-gfx, dri-devel, amd-gfx,
	x86
  Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
	peterz, Lorenzo Stoakes

On 26.08.25 10:38, Christian König wrote:
> On 25.08.25 21:10, David Hildenbrand wrote:
>> On 21.08.25 10:10, Christian König wrote:
>>> On 20.08.25 17:23, David Hildenbrand wrote:
>>>> CCing Lorenzo
>>>>
>>>> On 20.08.25 16:33, Christian König wrote:
>>>>> Hi everyone,
>>>>>
>>>>> sorry for CCing so many people, but that rabbit hole turned out to be
>>>>> deeper than originally thought.
>>>>>
>>>>> TTM always had problems with UC/WC mappings on 32bit systems and drivers
>>>>> often had to revert to hacks like using GFP_DMA32 to get things working
>>>>> while having no rational explanation why that helped (see the TTM AGP,
>>>>> radeon and nouveau driver code for that).
>>>>>
>>>>> It turned out that the PAT implementation we use on x86 not only enforces
>>>>> the same caching attributes for pages in the linear kernel mapping, but
>>>>> also for highmem pages through a separate R/B tree.
>>>>>
>>>>> That was unexpected and TTM never updated that R/B tree for highmem pages,
>>>>> so the function pgprot_set_cachemode() just overwrote the caching
>>>>> attributes drivers passed in to vmf_insert_pfn_prot() and that essentially
>>>>> caused all kind of random trouble.
>>>>>
>>>>> An R/B tree is potentially not a good data structure to hold thousands if
>>>>> not millions of different attributes for each page, so updating that is
>>>>> probably not the way to solve this issue.
>>>>>
>>>>> Thomas pointed out that the i915 driver is using apply_page_range()
>>>>> instead of vmf_insert_pfn_prot() to circumvent the PAT implementation and
>>>>> just fill in the page tables with what the driver things is the right
>>>>> caching attribute.
>>>>
>>>> I assume you mean apply_to_page_range() -- same issue in patch subjects.
>>>
>>> Oh yes, of course. Sorry.
>>>
>>>> Oh this sounds horrible. Why oh why do we have these hacks in core-mm and have drivers abuse them :(
>>>
>>> Yeah I was also a bit hesitated to use that, but the performance advantage is so high that we probably can't avoid the general approach.
>>>
>>>> Honestly, apply_to_pte_range() is just the entry in doing all kinds of weird crap to page tables because "you know better".
>>>
>>> Exactly that's the problem I'm pointing out, drivers *do* know it better. The core memory management has applied incorrect values which caused all kind of the trouble.
>>>
>>> The problem is not a bug in PAT nor TTM/drivers but rather how they interact with each other.
>>>
>>> What I don't understand is why do we have the PAT in the first place? No other architecture does it this way.
>>
>> Probably because no other architecture has these weird glitches I assume ... skimming over memtype_reserve() and friends there are quite some corner cases the code is handling (BIOS, ACPI, low ISA, system RAM, ...)
>>
>>
>> I did a lot of work on the higher PAT level functions, but I am no expert on the lower level management functions, and in particular all the special cases with different memory types.
>>
>> IIRC, the goal of the PAT subsystem is to make sure that no two page tables map the same PFN with different caching attributes.
> 
> Yeah, that actually makes sense. Thomas from Intel recently explained the technical background to me:
> 
> Some x86 CPUs write back cache lines even if they aren't dirty and what can happen is that because of the linear mapping the CPU speculatively loads a cache line which is elsewhere mapped uncached.
> 
> So the end result is that the writeback of not dirty cache lines potentially corrupts the data in the otherwise uncached system memory.
> 
> But that a) only applies to memory in the linear mapping and b) only to a handful of x86 CPU types (e.g. recently Intels Luna Lake, AMD Athlons produced before 2004, maybe others).
> 
>> It treats ordinary system RAM (IORESOURCE_SYSTEM_RAM) usually in a special way: no special caching mode.
>>
>> For everything else, it expects that someone first reserves a memory range for a specific caching mode.
>>
>> For example, remap_pfn_range()...->pfnmap_track()->memtype_reserve() will make sure that there are no conflicts, to the call memtype_kernel_map_sync() to make sure the identity mapping is updated to the new type.
>>
>> In case someone ends up calling pfnmap_setup_cachemode(), the expectation is that there was a previous call to memtype_reserve_io() or similar, such that pfnmap_setup_cachemode() will find that caching mode.
>>
>>
>> So my assumption would be that that is missing for the drivers here?
> 
> Well yes and no.
> 
> See the PAT is optimized for applying specific caching attributes to ranges [A..B] (e.g. it uses an R/B tree). But what drivers do here is that they have single pages (usually for get_free_page or similar) and want to apply a certain caching attribute to it.
> 
> So what would happen is that we completely clutter the R/B tree used by the PAT with thousands if not millions of entries.
> 

Hm, above you're saying that there is no direct map, but now you are 
saying that the pages were obtained through get_free_page()?

I agree that what you describe here sounds suboptimal. But if the pages 
were obtained from the buddy, there surely is a direct map -- unless we 
explicitly remove it :(

If we're talking about individual pages without a direct map, I would 
wonder if they are actually part of a bigger memory region that can just 
be reserved in one go (similar to how remap_pfn_range() would handle it).

Can you briefly describe how your use case obtains these PFNs, and how 
scattered they + their caching attributes might be?

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-26  8:46         ` Re: David Hildenbrand
@ 2025-08-26  9:00           ` Christian König
  2025-08-26  9:17             ` Re: David Hildenbrand
  0 siblings, 1 reply; 1546+ messages in thread
From: Christian König @ 2025-08-26  9:00 UTC (permalink / raw)
  To: David Hildenbrand, intel-xe, intel-gfx, dri-devel, amd-gfx, x86
  Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
	peterz, Lorenzo Stoakes

On 26.08.25 10:46, David Hildenbrand wrote:
>>> So my assumption would be that that is missing for the drivers here?
>>
>> Well yes and no.
>>
>> See the PAT is optimized for applying specific caching attributes to ranges [A..B] (e.g. it uses an R/B tree). But what drivers do here is that they have single pages (usually for get_free_page or similar) and want to apply a certain caching attribute to it.
>>
>> So what would happen is that we completely clutter the R/B tree used by the PAT with thousands if not millions of entries.
>>
> 
> Hm, above you're saying that there is no direct map, but now you are saying that the pages were obtained through get_free_page()?

The problem only happens with highmem pages on 32bit kernels. Those pages are not in the linear mapping.

> I agree that what you describe here sounds suboptimal. But if the pages where obtained from the buddy, there surely is a direct map -- unless we explicitly remove it :(
> 
> If we're talking about individual pages without a directmap, I would wonder if they are actually part of a bigger memory region that can just be reserved in one go (similar to how remap_pfn_range()) would handle it.
> 
> Can you briefly describe how your use case obtains these PFNs, and how scattered tehy + their caching attributes might be?

What drivers do is to call get_free_page() or alloc_pages_node() with the GFP_HIGHUSER flag set.

For non-highmem pages, drivers then call set_pages_wc/uc(), which changes the caching of the linear mapping; but for highmem pages there is no linear mapping, so set_pages_wc() or set_pages_uc() doesn't work and drivers avoid calling it.

Those are basically just random system memory pages. So they are potentially scattered over the whole memory address space.

Regards,
Christian.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-26  9:00           ` Re: Christian König
@ 2025-08-26  9:17             ` David Hildenbrand
  2025-08-26  9:56               ` Re: Christian König
  0 siblings, 1 reply; 1546+ messages in thread
From: David Hildenbrand @ 2025-08-26  9:17 UTC (permalink / raw)
  To: Christian König, intel-xe, intel-gfx, dri-devel, amd-gfx,
	x86
  Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
	peterz, Lorenzo Stoakes

On 26.08.25 11:00, Christian König wrote:
> On 26.08.25 10:46, David Hildenbrand wrote:
>>>> So my assumption would be that that is missing for the drivers here?
>>>
>>> Well yes and no.
>>>
>>> See the PAT is optimized for applying specific caching attributes to ranges [A..B] (e.g. it uses an R/B tree). But what drivers do here is that they have single pages (usually for get_free_page or similar) and want to apply a certain caching attribute to it.
>>>
>>> So what would happen is that we completely clutter the R/B tree used by the PAT with thousands if not millions of entries.
>>>
>>
>> Hm, above you're saying that there is no direct map, but now you are saying that the pages were obtained through get_free_page()?
> 
> The problem only happens with highmem pages on 32bit kernels. Those pages are not in the linear mapping.

Right, in the common case there is a direct map.

> 
>> I agree that what you describe here sounds suboptimal. But if the pages where obtained from the buddy, there surely is a direct map -- unless we explicitly remove it :(
>>
>> If we're talking about individual pages without a directmap, I would wonder if they are actually part of a bigger memory region that can just be reserved in one go (similar to how remap_pfn_range()) would handle it.
>>
>> Can you briefly describe how your use case obtains these PFNs, and how scattered tehy + their caching attributes might be?
> 
> What drivers do is to call get_free_page() or alloc_pages_node() with the GFP_HIGHUSER flag set.
> 
> For non highmem pages drivers then calls set_pages_wc/uc() which changes the caching of the linear mapping, but for highmem pages there is no linear mapping so set_pages_wc() or set_pages_uc() doesn't work and drivers avoid calling it.
> 
> Those are basically just random system memory pages. So they are potentially scattered over the whole memory address space.

Thanks, that's valuable information.

So essentially these drivers maintain their own consistency and PAT is 
not aware of that.

And the real problem is ordinary system RAM.

There are various ways forward.

1) We use another interface that consumes pages instead of PFNs, like a
    vm_insert_pages_pgprot() we would be adding.

    Is there any strong requirement for inserting non-refcounted PFNs?

2) We add another interface that consumes PFNs, but explicitly states
    that it is only for ordinary system RAM, and that the user is
    required to update the direct map.

    We could sanity-check the direct map in debug kernels.

3) We teach PAT code in pfnmap_setup_cachemode_pfn() about treating this
    system RAM differently.


There is also the option for a mixture between 1 and 2, where we get 
pages, but we map them non-refcounted in a VM_PFNMAP.

In general, having pages makes it easier to assert that they are likely 
ordinary system RAM pages, and that the interface is not getting abused 
for something else.

We could also perform the set_pages_wc/uc() from inside that function, 
but maybe it depends on the use case whether we want to do that whenever 
we map them into a process?

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-26  9:17             ` Re: David Hildenbrand
@ 2025-08-26  9:56               ` Christian König
  2025-08-26 12:07                 ` Re: David Hildenbrand
  2025-08-26 14:27                 ` Re: Thomas Hellström
  0 siblings, 2 replies; 1546+ messages in thread
From: Christian König @ 2025-08-26  9:56 UTC (permalink / raw)
  To: David Hildenbrand, intel-xe, intel-gfx, dri-devel, amd-gfx, x86
  Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
	peterz, Lorenzo Stoakes

On 26.08.25 11:17, David Hildenbrand wrote:
> On 26.08.25 11:00, Christian König wrote:
>> On 26.08.25 10:46, David Hildenbrand wrote:
>>>>> So my assumption would be that that is missing for the drivers here?
>>>>
>>>> Well yes and no.
>>>>
>>>> See the PAT is optimized for applying specific caching attributes to ranges [A..B] (e.g. it uses an R/B tree). But what drivers do here is that they have single pages (usually for get_free_page or similar) and want to apply a certain caching attribute to it.
>>>>
>>>> So what would happen is that we completely clutter the R/B tree used by the PAT with thousands if not millions of entries.
>>>>
>>>
>>> Hm, above you're saying that there is no direct map, but now you are saying that the pages were obtained through get_free_page()?
>>
>> The problem only happens with highmem pages on 32bit kernels. Those pages are not in the linear mapping.
> 
> Right, in the common case there is a direct map.
> 
>>
>>> I agree that what you describe here sounds suboptimal. But if the pages where obtained from the buddy, there surely is a direct map -- unless we explicitly remove it :(
>>>
>>> If we're talking about individual pages without a directmap, I would wonder if they are actually part of a bigger memory region that can just be reserved in one go (similar to how remap_pfn_range()) would handle it.
>>>
>>> Can you briefly describe how your use case obtains these PFNs, and how scattered tehy + their caching attributes might be?
>>
>> What drivers do is to call get_free_page() or alloc_pages_node() with the GFP_HIGHUSER flag set.
>>
>> For non highmem pages drivers then calls set_pages_wc/uc() which changes the caching of the linear mapping, but for highmem pages there is no linear mapping so set_pages_wc() or set_pages_uc() doesn't work and drivers avoid calling it.
>>
>> Those are basically just random system memory pages. So they are potentially scattered over the whole memory address space.
> 
> Thanks, that's valuable information.
> 
> So essentially these drivers maintain their own consistency and PAT is not aware of that.
> 
> And the real problem is ordinary system RAM.
> 
> There are various ways forward.
> 
> 1) We use another interface that consumes pages instead of PFNs, like a
>    vm_insert_pages_pgprot() we would be adding.
> 
>    Is there any strong requirement for inserting non-refcounted PFNs?

Yes, there is a strong requirement to insert non-refcounted PFNs.

We had a lot of trouble with KVM people trying to grab a reference to those pages even if the VMA had the VM_PFNMAP flag set.

> 2) We add another interface that consumes PFNs, but explicitly states
>    that it is only for ordinary system RAM, and that the user is
>    required for updating the direct map.
> 
>    We could sanity-check the direct map in debug kernels.

I would rather like to see vmf_insert_pfn_prot() fixed instead.

That function was explicitly added to insert the PFN with the given attributes and as far as I can see all users of that function expect exactly that.

> 
> 3) We teach PAT code in pfnmap_setup_cachemode_pfn() about treating this
>    system RAM differently.
> 
> 
> There is also the option for a mixture between 1 and 2, where we get pages, but we map them non-refcounted in a VM_PFNMAP.
> 
> In general, having pages makes it easier to assert that they are likely ordinary system ram pages, and that the interface is not getting abused for something else.

Well, exactly that's the use case here and that is not abusive at all as far as I can see.

What drivers want is to insert a PFN with a certain set of caching attributes regardless of whether it's system memory or iomem. That's why vmf_insert_pfn_prot() was created in the first place.

That drivers need to call set_pages_wc/uc() for the linear mapping on x86 manually is correct and checking that is clearly a good idea for debug kernels.

> We could also perform the set_pages_wc/uc() from inside that function, but maybe it depends on the use case whether we want to do that whenever we map them into a process?

It sounds like a good idea in theory, but I think it is potentially too much overhead to be applicable.

Thanks,
Christian.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-26  9:56               ` Re: Christian König
@ 2025-08-26 12:07                 ` David Hildenbrand
  2025-08-26 16:09                   ` Re: Christian König
  2025-08-26 14:27                 ` Re: Thomas Hellström
  1 sibling, 1 reply; 1546+ messages in thread
From: David Hildenbrand @ 2025-08-26 12:07 UTC (permalink / raw)
  To: Christian König, intel-xe, intel-gfx, dri-devel, amd-gfx,
	x86
  Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
	peterz, Lorenzo Stoakes

>>
>> 1) We use another interface that consumes pages instead of PFNs, like a
>>     vm_insert_pages_pgprot() we would be adding.
>>
>>     Is there any strong requirement for inserting non-refcounted PFNs?
> 
> Yes, there is a strong requirement to insert non-refcounted PFNs.
> 
> We had a lot of trouble with KVM people trying to grab a reference to those pages even if the VMA had the VM_PFNMAP flag set.

Yes, KVM ignored (and maybe still does) VM_PFNMAP to some degree, which 
is rather nasty.

> 
>> 2) We add another interface that consumes PFNs, but explicitly states
>>     that it is only for ordinary system RAM, and that the user is
>>     required for updating the direct map.
>>
>>     We could sanity-check the direct map in debug kernels.
> 
> I would rather like to see vmf_insert_pfn_prot() fixed instead.
> 
> That function was explicitly added to insert the PFN with the given attributes and as far as I can see all users of that function expect exactly that.

It's all a bit tricky :(

> 
>>
>> 3) We teach PAT code in pfnmap_setup_cachemode_pfn() about treating this
>>     system RAM differently.
>>
>>
>> There is also the option for a mixture between 1 and 2, where we get pages, but we map them non-refcounted in a VM_PFNMAP.
>>
>> In general, having pages makes it easier to assert that they are likely ordinary system ram pages, and that the interface is not getting abused for something else.
> 
> Well, exactly that's the use case here and that is not abusive at all as far as I can see.
> 
> What drivers want is to insert a PFN with a certain set of caching attributes regardless if it's system memory or iomem. That's why vmf_insert_pfn_prot() was created in the first place.

I mean, the use case of "allocate pages from the buddy and fixup the 
linear map" sounds perfectly reasonable to me. Absolutely no reason to 
get PAT involved. Nobody else should be messing with that memory after all.

As soon as we are talking about other memory ranges (iomem) that are not 
from the buddy, it gets weird to bypass PAT, and the question I am 
asking myself is, when is it okay, and when not.

> 
> That drivers need to call set_pages_wc/uc() for the linear mapping on x86 manually is correct and checking that is clearly a good idea for debug kernels.

I'll have to think about this a bit: assuming only vmf_insert_pfn() 
calls pfnmap_setup_cachemode_pfn() but vmf_insert_pfn_prot() doesn't, 
how could we sanity-check that somebody is doing something against the 
will of PAT?

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-26  8:38       ` Re: Christian König
  2025-08-26  8:46         ` Re: David Hildenbrand
@ 2025-08-26 12:37         ` David Hildenbrand
  1 sibling, 0 replies; 1546+ messages in thread
From: David Hildenbrand @ 2025-08-26 12:37 UTC (permalink / raw)
  To: Christian König, intel-xe, intel-gfx, dri-devel, amd-gfx,
	x86
  Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
	peterz, Lorenzo Stoakes

On 26.08.25 10:38, Christian König wrote:
> On 25.08.25 21:10, David Hildenbrand wrote:
>> On 21.08.25 10:10, Christian König wrote:
>>> On 20.08.25 17:23, David Hildenbrand wrote:
>>>> CCing Lorenzo
>>>>
>>>> On 20.08.25 16:33, Christian König wrote:
>>>>> Hi everyone,
>>>>>
>>>>> sorry for CCing so many people, but that rabbit hole turned out to be
>>>>> deeper than originally thought.
>>>>>
>>>>> TTM always had problems with UC/WC mappings on 32bit systems and drivers
>>>>> often had to revert to hacks like using GFP_DMA32 to get things working
>>>>> while having no rational explanation why that helped (see the TTM AGP,
>>>>> radeon and nouveau driver code for that).
>>>>>
>>>>> It turned out that the PAT implementation we use on x86 not only enforces
>>>>> the same caching attributes for pages in the linear kernel mapping, but
>>>>> also for highmem pages through a separate R/B tree.
>>>>>
>>>>> That was unexpected and TTM never updated that R/B tree for highmem pages,
>>>>> so the function pgprot_set_cachemode() just overwrote the caching
>>>>> attributes drivers passed in to vmf_insert_pfn_prot() and that essentially
>>>>> caused all kind of random trouble.
>>>>>
>>>>> An R/B tree is potentially not a good data structure to hold thousands if
>>>>> not millions of different attributes for each page, so updating that is
>>>>> probably not the way to solve this issue.
>>>>>
>>>>> Thomas pointed out that the i915 driver is using apply_page_range()
>>>>> instead of vmf_insert_pfn_prot() to circumvent the PAT implementation and
>>>>> just fill in the page tables with what the driver things is the right
>>>>> caching attribute.
>>>>
>>>> I assume you mean apply_to_page_range() -- same issue in patch subjects.
>>>
>>> Oh yes, of course. Sorry.
>>>
>>>> Oh this sounds horrible. Why oh why do we have these hacks in core-mm and have drivers abuse them :(
>>>
>>> Yeah I was also a bit hesitated to use that, but the performance advantage is so high that we probably can't avoid the general approach.
>>>
>>>> Honestly, apply_to_pte_range() is just the entry in doing all kinds of weird crap to page tables because "you know better".
>>>
>>> Exactly that's the problem I'm pointing out, drivers *do* know it better. The core memory management has applied incorrect values which caused all kind of the trouble.
>>>
>>> The problem is not a bug in PAT nor TTM/drivers but rather how they interact with each other.
>>>
>>> What I don't understand is why do we have the PAT in the first place? No other architecture does it this way.
>>
>> Probably because no other architecture has these weird glitches I assume ... skimming over memtype_reserve() and friends there are quite some corner cases the code is handling (BIOS, ACPI, low ISA, system RAM, ...)
>>
>>
>> I did a lot of work on the higher PAT level functions, but I am no expert on the lower level management functions, and in particular all the special cases with different memory types.
>>
>> IIRC, the goal of the PAT subsystem is to make sure that no two page tables map the same PFN with different caching attributes.
> 
> Yeah, that actually makes sense. Thomas from Intel recently explained the technical background to me:
> 
> Some x86 CPUs write back cache lines even if they aren't dirty and what can happen is that because of the linear mapping the CPU speculatively loads a cache line which is elsewhere mapped uncached.
> 
> So the end result is that the writeback of non-dirty cache lines potentially corrupts the data in the otherwise uncached system memory.
> 
> But that a) only applies to memory in the linear mapping and b) only to a handful of x86 CPU types (e.g. recently Intel's Lunar Lake, AMD Athlons produced before 2004, maybe others).
> 
>> It treats ordinary system RAM (IORESOURCE_SYSTEM_RAM) usually in a special way: no special caching mode.
>>
>> For everything else, it expects that someone first reserves a memory range for a specific caching mode.
>>
>> For example, remap_pfn_range()...->pfnmap_track()->memtype_reserve() will make sure that there are no conflicts, and then call memtype_kernel_map_sync() to make sure the identity mapping is updated to the new type.
>>
>> In case someone ends up calling pfnmap_setup_cachemode(), the expectation is that there was a previous call to memtype_reserve_io() or similar, such that pfnmap_setup_cachemode() will find that caching mode.
>>
>>
>> So my assumption would be that that is missing for the drivers here?
> 
> Well yes and no.
> 
> See the PAT is optimized for applying specific caching attributes to ranges [A..B] (e.g. it uses an R/B tree). But what drivers do here is that they have single pages (usually for get_free_page or similar) and want to apply a certain caching attribute to it.

One clarification after staring at PAT code once again: for pages (RAM), 
the caching attribute is stored in the page flags, not in the R/B tree.

If nothing was set, it defaults to _PAGE_CACHE_MODE_WB AFAIK.

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-26  9:56               ` Re: Christian König
  2025-08-26 12:07                 ` Re: David Hildenbrand
@ 2025-08-26 14:27                 ` Thomas Hellström
  1 sibling, 0 replies; 1546+ messages in thread
From: Thomas Hellström @ 2025-08-26 14:27 UTC (permalink / raw)
  To: Christian König, David Hildenbrand, intel-xe, intel-gfx,
	dri-devel, amd-gfx, x86
  Cc: airlied, matthew.brost, dave.hansen, luto, peterz,
	Lorenzo Stoakes

Hi, Christian,

On Tue, 2025-08-26 at 11:56 +0200, Christian König wrote:
> On 26.08.25 11:17, David Hildenbrand wrote:
> > On 26.08.25 11:00, Christian König wrote:
> > > On 26.08.25 10:46, David Hildenbrand wrote:
> > > > > > So my assumption would be that that is missing for the
> > > > > > drivers here?
> > > > > 
> > > > > Well yes and no.
> > > > > 
> > > > > See the PAT is optimized for applying specific caching
> > > > > attributes to ranges [A..B] (e.g. it uses an R/B tree). But
> > > > > what drivers do here is that they have single pages (usually
> > > > > for get_free_page or similar) and want to apply a certain
> > > > > caching attribute to it.
> > > > > 
> > > > > So what would happen is that we completely clutter the R/B
> > > > > tree used by the PAT with thousands if not millions of
> > > > > entries.
> > > > > 
> > > > 
> > > > Hm, above you're saying that there is no direct map, but now
> > > > you are saying that the pages were obtained through
> > > > get_free_page()?
> > > 
> > > The problem only happens with highmem pages on 32bit kernels.
> > > Those pages are not in the linear mapping.
> > 
> > Right, in the common case there is a direct map.
> > 
> > > 
> > > > I agree that what you describe here sounds suboptimal. But if
> > > > the pages where obtained from the buddy, there surely is a
> > > > direct map -- unless we explicitly remove it :(
> > > > 
> > > > If we're talking about individual pages without a directmap, I
> > > > would wonder if they are actually part of a bigger memory
> > > > region that can just be reserved in one go (similar to how
> > > > remap_pfn_range()) would handle it.
> > > > 
> > > > Can you briefly describe how your use case obtains these PFNs,
> > > > and how scattered they and their caching attributes might be?
> > > 
> > > What drivers do is to call get_free_page() or alloc_pages_node()
> > > with the GFP_HIGHUSER flag set.
> > > 
> > > For non-highmem pages, drivers then call set_pages_wc/uc() which
> > > changes the caching of the linear mapping, but for highmem pages
> > > there is no linear mapping so set_pages_wc() or set_pages_uc()
> > > doesn't work and drivers avoid calling it.
> > > 
> > > Those are basically just random system memory pages. So they are
> > > potentially scattered over the whole memory address space.
> > 
> > Thanks, that's valuable information.
> > 
> > So essentially these drivers maintain their own consistency and PAT
> > is not aware of that.
> > 
> > And the real problem is ordinary system RAM.
> > 
> > There are various ways forward.
> > 
> > 1) We use another interface that consumes pages instead of PFNs,
> > like a
> >    vm_insert_pages_pgprot() we would be adding.
> > 
> >    Is there any strong requirement for inserting non-refcounted
> > PFNs?
> 
> Yes, there is a strong requirement to insert non-refcounted PFNs.
> 
> We had a lot of trouble with KVM people trying to grab a reference to
> those pages even if the VMA had the VM_PFNMAP flag set.
> 
> > 2) We add another interface that consumes PFNs, but explicitly
> > states
> >    that it is only for ordinary system RAM, and that the user is
> >    required for updating the direct map.
> > 
> >    We could sanity-check the direct map in debug kernels.
> 
> I would rather like to see vmf_insert_pfn_prot() fixed instead.
> 
> That function was explicitly added to insert the PFN with the given
> attributes and as far as I can see all users of that function expect
> exactly that.
> 
> > 
> > 3) We teach PAT code in pfnmap_setup_cachemode_pfn() about treating
> > this
> >    system RAM differently.
> > 
> > 
> > There is also the option for a mixture between 1 and 2, where we
> > get pages, but we map them non-refcounted in a VM_PFNMAP.
> > 
> > In general, having pages makes it easier to assert that they are
> > likely ordinary system ram pages, and that the interface is not
> > getting abused for something else.
> 
> Well, exactly that's the use case here and that is not abusive at all
> as far as I can see.
> 
> What drivers want is to insert a PFN with a certain set of caching
> attributes regardless if it's system memory or iomem. That's why
> vmf_insert_pfn_prot() was created in the first place.
> 
> That drivers need to call set_pages_wc/uc() for the linear mapping on
> x86 manually is correct and checking that is clearly a good idea for
> debug kernels.

So where is this trending? Is the current suggestion to continue
disallowing aliased mappings with conflicting caching modes and enforce
checks in debug kernels?

/Thomas


> 
> > We could also perform the set_pages_wc/uc() from inside that
> > function, but maybe it depends on the use case whether we want to
> > do that whenever we map them into a process?
> 
> It sounds like a good idea in theory, but I think it is potentially
> too much overhead to be applicable.
> 
> Thanks,
> Christian.


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-26 12:07                 ` Re: David Hildenbrand
@ 2025-08-26 16:09                   ` Christian König
  0 siblings, 0 replies; 1546+ messages in thread
From: Christian König @ 2025-08-26 16:09 UTC (permalink / raw)
  To: David Hildenbrand, intel-xe, intel-gfx, dri-devel, amd-gfx, x86
  Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
	peterz, Lorenzo Stoakes

On 26.08.25 14:07, David Hildenbrand wrote: 
>>
>>> 2) We add another interface that consumes PFNs, but explicitly states
>>>     that it is only for ordinary system RAM, and that the user is
>>>     required to update the direct map.
>>>
>>>     We could sanity-check the direct map in debug kernels.
>>
>> I would rather like to see vmf_insert_pfn_prot() fixed instead.
>>
>> That function was explicitly added to insert the PFN with the given attributes and as far as I can see all users of that function expect exactly that.
> 
> It's all a bit tricky :(

I would rather say horribly complicated :(

>>>
>>> 3) We teach PAT code in pfnmap_setup_cachemode_pfn() about treating this
>>>     system RAM differently.
>>>
>>>
>>> There is also the option for a mixture between 1 and 2, where we get pages, but we map them non-refcounted in a VM_PFNMAP.
>>>
>>> In general, having pages makes it easier to assert that they are likely ordinary system ram pages, and that the interface is not getting abused for something else.
>>
>> Well, exactly that's the use case here and that is not abusive at all as far as I can see.
>>
>> What drivers want is to insert a PFN with a certain set of caching attributes regardless if it's system memory or iomem. That's why vmf_insert_pfn_prot() was created in the first place.
> 
> I mean, the use case of "allocate pages from the buddy and fixup the linear map" sounds perfectly reasonable to me. Absolutely no reason to get PAT involved. Nobody else should be messing with that memory after all.
> 
> As soon as we are talking about other memory ranges (iomem) that are not from the buddy, it gets weird to bypass PAT, and the question I am asking myself is, when is it okay, and when not.

Ok let me try to explain parts of the history and the big picture for at least the graphics use case on x86.

In 1996/97 Intel came up with the idea of AGP: https://en.wikipedia.org/wiki/Accelerated_Graphics_Port

At that time the CPUs, PCI bus and system memory were all connected together through the north bridge: https://en.wikipedia.org/wiki/Northbridge_(computing)

The problem was that AGP also introduced the concept of putting large amounts of data for the video controller (PCI device) into system memory when you don't have enough local device memory (VRAM).

But that meant that, when that memory was cached, the north bridge always had to snoop the CPU cache over the front side bus for every access the video controller made. That was a huge performance bottleneck, so the idea was born to access that data uncached.


Well, that was nearly 30 years ago; PCI, AGP and the front side bus are long gone, but the concept of putting video controller (GPU) data into uncached system memory has prevailed.

So for example even modern AMD CPU based laptops need uncached system memory if their local memory is not large enough to contain the picture to display on the monitor. And with modern 8k monitors that can actually happen quite fast...

What drivers do today is to call vmf_insert_pfn_prot() either with the PFN of their local memory (iomem) or uncached/wc system memory.


To summarize: having an interface to fill in the page tables with either iomem or system memory is actually part of the design. That's how the HW driver is expected to work.

>> That drivers need to call set_pages_wc/uc() for the linear mapping on x86 manually is correct and checking that is clearly a good idea for debug kernels.
> 
> I'll have to think about this a bit: assuming only vmf_insert_pfn() calls pfnmap_setup_cachemode_pfn() but vmf_insert_pfn_prot() doesn't, how could we sanity check that somebody is doing something against the will of PAT.

I think the most defensive approach for a quick fix is this change here:

 static inline void pgprot_set_cachemode(pgprot_t *prot, enum page_cache_mode pcm)
 {
-       *prot = __pgprot((pgprot_val(*prot) & ~_PAGE_CACHE_MASK) |
-                        cachemode2protval(pcm));
+       if (pcm != _PAGE_CACHE_MODE_WB)
+               *prot = __pgprot((pgprot_val(*prot) & ~_PAGE_CACHE_MASK) |
+                                cachemode2protval(pcm));
 }

This applies the PAT value if it's anything other than _PAGE_CACHE_MODE_WB, but still allows callers to use something different on normal WB system memory.

What do you think?

Regards,
Christian

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-08-27 14:43 ` Zhang Tengfei
@ 2025-08-27 21:37   ` Pablo Neira Ayuso
  0 siblings, 0 replies; 1546+ messages in thread
From: Pablo Neira Ayuso @ 2025-08-27 21:37 UTC (permalink / raw)
  To: Zhang Tengfei
  Cc: ja, coreteam, davem, dsahern, edumazet, fw, horms, kadlec, kuba,
	lvs-devel, netfilter-devel, pabeni, syzbot+1651b5234028c294c339

On Wed, Aug 27, 2025 at 10:43:42PM +0800, Zhang Tengfei wrote:
> Hi everyone,
> 
> Here is the v2 patch that incorporates the feedback.

Patch without subject will not fly too far, I'm afraid you will have
to resubmit. One more comment below.

> Many thanks to Julian for his thorough review and for providing 
> the detailed plan for this new version, and thanks to Florian 
> and Eric for suggestions.
> 
> Subject: [PATCH v2] net/netfilter/ipvs: Use READ_ONCE/WRITE_ONCE for
>  ipvs->enable
> 
> KCSAN reported a data-race on the `ipvs->enable` flag, which is
> written in the control path and read concurrently from many other
> contexts.
> 
> Following a suggestion by Julian, this patch fixes the race by
> converting all accesses to use `WRITE_ONCE()/READ_ONCE()`.
> This lightweight approach ensures atomic access and acts as a
> compiler barrier, preventing unsafe optimizations where the flag
> is checked in loops (e.g., in ip_vs_est.c).
> 
> Additionally, the now-obsolete `enable` checks in the fast path
> hooks (`ip_vs_in_hook`, `ip_vs_out_hook`, `ip_vs_forward_icmp`)
> are removed. These are unnecessary since commit 857ca89711de
> ("ipvs: register hooks only with services").
> 
> Reported-by: syzbot+1651b5234028c294c339@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=1651b5234028c294c339
> Suggested-by: Julian Anastasov <ja@ssi.bg>
> Link: https://lore.kernel.org/lvs-devel/2189fc62-e51e-78c9-d1de-d35b8e3657e3@ssi.bg/
> Signed-off-by: Zhang Tengfei <zhtfdev@gmail.com>
> 
> ---
> v2:
> - Switched from atomic_t to the suggested READ_ONCE()/WRITE_ONCE().
> - Removed obsolete checks from the packet processing hooks.
> - Polished commit message based on feedback.
> ---
>  net/netfilter/ipvs/ip_vs_conn.c |  4 ++--
>  net/netfilter/ipvs/ip_vs_core.c | 11 ++++-------
>  net/netfilter/ipvs/ip_vs_ctl.c  |  6 +++---
>  net/netfilter/ipvs/ip_vs_est.c  | 16 ++++++++--------
>  4 files changed, 17 insertions(+), 20 deletions(-)
[...]
> diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
> index c7a8a08b7..5ea7ab8bf 100644
> --- a/net/netfilter/ipvs/ip_vs_core.c
> +++ b/net/netfilter/ipvs/ip_vs_core.c
> @@ -1353,9 +1353,6 @@ ip_vs_out_hook(void *priv, struct sk_buff *skb, const struct nf_hook_state *stat
>  	if (unlikely(!skb_dst(skb)))
>  		return NF_ACCEPT;
>  
> -	if (!ipvs->enable)
> -		return NF_ACCEPT;

The patch does not say why this is going away. If you think this check is not
necessary, then make a separate patch and explain why removing it is needed.

Thanks

>  	ip_vs_fill_iph_skb(af, skb, false, &iph);
>  #ifdef CONFIG_IP_VS_IPV6
>  	if (af == AF_INET6) {

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
  2025-08-29  2:01 xinpeng.wang
@ 2025-08-29  2:42 ` bluez.test.bot
  0 siblings, 0 replies; 1546+ messages in thread
From: bluez.test.bot @ 2025-08-29  2:42 UTC (permalink / raw)
  To: linux-bluetooth, wangxinpeng

[-- Attachment #1: Type: text/plain, Size: 382 bytes --]

This is an automated email and please do not reply to this email.

Dear Submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
While preparing the CI tests, the patches you submitted couldn't be applied to the current HEAD of the repository.

----- Output -----


Please resolve the issue and submit the patches again.


---
Regards,
Linux Bluetooth


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-09-01  4:05 ` Kaiwan N Billimoria
@ 2025-09-01  5:57   ` Kaiwan N Billimoria
  0 siblings, 0 replies; 1546+ messages in thread
From: Kaiwan N Billimoria @ 2025-09-01  5:57 UTC (permalink / raw)
  To: tglx
  Cc: Llillian, agordeev, akpm, alexander.shishkin, anna-maria, bigeasy,
	catalin.marinas, chenhuacai, francesco, frederic,
	guoweikang.kernel, jstultz, kpsingh, linux-arm-kernel,
	linux-kernel, mark.rutland, maz, mingo, pmladek, rrangel, sboyd,
	urezki, v-singh1, will

Apologies, subject missing; it's:
time: introduce BOOT_TIME_TRACKER and minimal boot timestamp


On Mon, Sep 1, 2025 at 9:36 AM Kaiwan N Billimoria
<kaiwan.billimoria@gmail.com> wrote:
>
> > What the heck is BOOT SIG Initiative?
> Very, very briefly: it's an initiative that plans to measure the complete or
> unified boot time, i.e., the time it takes to boot the system completely. This
> includes (or plans to) track the time taken for:
> - Boot from CPU power-on, ROM code execution
> - 1st, 2nd, (and possibly) 3rd stage bootloader(s)
> - Linux kernel upto running the PID 1 process
> - Include time taken for onboard MCUs (and their apps to come up).
>
> The plan is to be able to show the cumulative and individual times taken across
> all of these. Then report it via ASCII text and a GUI system (as of now, a HTML
> file).
> For anyone interested, here's the PDF of a super presentation on this topic by
> Vishnu P Singh (OP) this August at the OSS EU:
> https://static.sched.com/hosted_files/osseu2025/a2/EOSS_2025_Unified%20Boot%20Time%20log%20based%20measurement.pdf
> As mentioned by Vishnu, the work is in the very early dev stages.
>
> > -     pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
> > -             bits, r, r_unit, res, wrap);
> > +     pr_info("sched_clock: %pS: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns hwcnt: %llu\n",
> > +             read, bits, r, r_unit, res, wrap, read());
> --snip--
> > So let's assume this give you
> >
> > [    0.000008] sched_clock: 56 bits at 19MHz, resolution 52ns, wraps
> >                             every 3579139424256ns hwcnt: 19000000
> >
> > Which means that the counter accumulated 19000000 increments since the
> > hardware was powered up, no?
> I agree with your approach Thomas (tglx)! (eye-roll)... So, following this
> approach, here's the resulting tiny patch:
>
> From 1e687ab12269f5f129b17eb7e9c3c5c2cec441b7 Mon Sep 17 00:00:00 2001
> From: Kaiwan N Billimoria <kaiwan.billimoria@gmail.com>
> Date: Mon, 1 Sep 2025 09:17:57 +0530
> Subject: [PATCH] [sched-clock] Extend printk to show h/w counter in a
>  parseable manner
>
> Signed-off-by: Kaiwan N Billimoria <kaiwan.billimoria@gmail.com>
> ---
>  kernel/time/sched_clock.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
> index cc15fe293719..e4fe900d6b60 100644
> --- a/kernel/time/sched_clock.c
> +++ b/kernel/time/sched_clock.c
> @@ -236,16 +236,14 @@ sched_clock_register(u64 (*read)(void), int bits, unsigned long rate)
>         /* Calculate the ns resolution of this counter */
>         res = cyc_to_ns(1ULL, new_mult, new_shift);
>
> -       pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
> -               bits, r, r_unit, res, wrap);
> +       pr_info("sched_clock: %pS: bits=%u,freq=%lu %cHz,resolution=%llu ns,wraps every=%llu ns,hwcnt=%llu\n",
> +                read, bits, r, r_unit, res, wrap, read());
>
>         /* Enable IRQ time accounting if we have a fast enough sched_clock() */
>         if (irqtime > 0 || (irqtime == -1 && rate >= 1000000))
>                 enable_sched_clock_irqtime();
>
>         local_irq_restore(flags);
> -
> -       pr_debug("Registered %pS as sched_clock source\n", read);
>  }
>
>  void __init generic_sched_clock_init(void)
> --
> 2.43.0
>
> Of course, this is almost identical to what Thomas has already shown. I've
> added some formatting to make for easier parsing. A sample output obtained with
> this code on a patched kernel for the BeaglePlay k3 am625 board:
> [    0.000001] sched_clock: arch_counter_get_cntpct+0x0/0x18: bits=58,freq=200 MHz,resolution=5 ns,wraps every=4398046511102 ns,hwcnt=1409443611
>
> This printk format allows us to easily parse it; f.e. to obtain the hwcnt value:
> debian@BeagleBone:~$ dmesg |grep sched_clock |awk -F, '{print $5}'
> hwcnt=1409443611
>
> So, just confirming: here 1409443611 divided by 200 MHz gives us 7.047218055s
> since boot, and thus the actual timestamp here is that plus 0.000001s yes?
> (Over 7s here? yes, it's just that I haven't yet setup U-Boot properly for uSD
> card boot, thus am manually loading commands in U-Boot to boot up, that's all).
>
> A question (perhaps very stupid): will the hwcnt - the output of the read() -
> be guaranteed to be (close to) the number of increments since processor
> power-up (or reset)? Meaning, it's simply a hardware feature and agnostic to
> what code the core was executing (ROM/BL/kernel), yes?
> If so, I guess we can move forward with this approach... Else, or otherwise,
> suggestions are welcome,
>
> Regards,
> Kaiwan.


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-09-08  9:54 ` hariconscious
@ 2025-09-08 13:23   ` Jonathan Corbet
  0 siblings, 0 replies; 1546+ messages in thread
From: Jonathan Corbet @ 2025-09-08 13:23 UTC (permalink / raw)
  To: hariconscious
  Cc: catalin.marinas, hariconscious, linux-arm-kernel, linux-doc,
	linux-kernel, shuah, will

hariconscious@gmail.com writes:

> Thanks for the suggestion, will correct and send the patch again.
> And my real name is "HariKrishna" and see that it is mentioned in Signed-off-by tag.
> Do I need to add surname as well ? Please let me know.

Yes, signoffs should give your full name.

Thanks,

jon


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-09-15 19:52 Yury Norov (NVIDIA)
@ 2025-09-16 14:48 ` Simon Horman
  2025-09-16 15:22   ` Re: Yury Norov
  0 siblings, 1 reply; 1546+ messages in thread
From: Simon Horman @ 2025-09-16 14:48 UTC (permalink / raw)
  To: Yury Norov (NVIDIA)
  Cc: Yoshihiro Shimoda, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Nikita Yushchenko,
	Michal Swiatkowski, Geert Uytterhoeven, Uwe Kleine-König,
	netdev, linux-renesas-soc, linux-kernel

On Mon, Sep 15, 2025 at 03:52:31PM -0400, Yury Norov (NVIDIA) wrote:
> Subject: [PATCH net-next v2] net: renesas: rswitch: simplify rswitch_stop()
> 
> rswitch_stop() opencodes for_each_set_bit().
> 
> CC: Simon Horman <horms@kernel.org>
> Reviewed-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
> ---
> v1: https://lore.kernel.org/all/20250913181345.204344-1-yury.norov@gmail.com/
> v2: Rebase on top of net-next/main
> 
>  drivers/net/ethernet/renesas/rswitch_main.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)

Hi Yury,

I see this marked as Changes Requested in Patchwork.
But no response on the netdev ML. So I'll provide one.

Unfortunately it seems that the posting is slightly mangled,
there was no Subject in the header (or an empty one), and what
was supposed to be the Subject ended up at the top of the body.

I'm wondering if you could repost with that addressed,
being sure to observe the 24h delay between postings.

Thanks!

...

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-09-16 14:48 ` Simon Horman
@ 2025-09-16 15:22   ` Yury Norov
  0 siblings, 0 replies; 1546+ messages in thread
From: Yury Norov @ 2025-09-16 15:22 UTC (permalink / raw)
  To: Simon Horman
  Cc: Yoshihiro Shimoda, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Nikita Yushchenko,
	Michal Swiatkowski, Geert Uytterhoeven, Uwe Kleine-König,
	netdev, linux-renesas-soc, linux-kernel

On Tue, Sep 16, 2025 at 03:48:13PM +0100, Simon Horman wrote:
> On Mon, Sep 15, 2025 at 03:52:31PM -0400, Yury Norov (NVIDIA) wrote:
> > Subject: [PATCH net-next v2] net: renesas: rswitch: simplify rswitch_stop()
> > 
> > rswitch_stop() opencodes for_each_set_bit().
> > 
> > CC: Simon Horman <horms@kernel.org>
> > Reviewed-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
> > Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
> > ---
> > v1: https://lore.kernel.org/all/20250913181345.204344-1-yury.norov@gmail.com/
> > v2: Rebase on top of net-next/main
> > 
> >  drivers/net/ethernet/renesas/rswitch_main.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> Hi Yury,
> 
> I see this marked as Changes Requested in Patchwork.
> But no response on the netdev ML. So I'll provide one.
> 
> Unfortunately it seems that the posting is slightly mangled,

Yeah, bad luck.

> there was no Subject in the header (or an empty one), and what
> was supposed to be the Subject ended up at the top of the body.
> 
> I'm wondering if you could repost with that addressed,
> being sure to observe the 24h delay between postings.

Sure, will do shortly.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-09-16 21:23 Jay Vosburgh
@ 2025-09-16 21:56 ` Jay Vosburgh
  0 siblings, 0 replies; 1546+ messages in thread
From: Jay Vosburgh @ 2025-09-16 21:56 UTC (permalink / raw)
  Cc: netdev, Jamal Hadi Salim, Stephen Hemminger, David Ahern

Jay Vosburgh <jay.vosburgh@canonical.com> wrote:

>
>
>Subject: [PATCH v2 0/4 iproute2-next] tc/police: Allow 64 bit burst size
>
>	In summary, this patchset changes the user space handling of the
>tc police burst parameter to permit burst sizes that exceed 4 GB when the
>specified rate is high enough that the kernel API for burst can accomodate
>such.

	Ignore this, will fix and repost.

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-10-05 14:16 ssrane_b23
@ 2025-10-05 14:16 ` syzbot
  0 siblings, 0 replies; 1546+ messages in thread
From: syzbot @ 2025-10-05 14:16 UTC (permalink / raw)
  To: ssrane_b23
  Cc: linux-kernel, linux-trace-kernel, mathieu.desnoyers, mhiramat,
	rostedt, ssrane_b23, syzkaller-bugs

> #syz test on: linux-next

This crash does not have a reproducer. I cannot test it.

>
>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-10-08  7:06 ` Kamel Bouhara
@ 2025-10-08 20:46   ` Bence Csókás
  0 siblings, 0 replies; 1546+ messages in thread
From: Bence Csókás @ 2025-10-08 20:46 UTC (permalink / raw)
  To: Kamel Bouhara, Dharma Balasubiramani, g
  Cc: William Breathitt Gray, Bence Csókás, linux-arm-kernel,
	linux-iio, linux-kernel

Hi,

> On Mon, Oct 06, 2025 at 04:21:50PM +0530, Dharma Balasubiramani wrote:
> 
> Hello Dharma,
> 
>> Mark the interrupt as IRQF_SHARED to permit multiple counter channels to
>> share the same TCB IRQ line.
>>
>> Each Timer/Counter Block (TCB) instance shares a single IRQ line among its
>> three internal channels. When multiple counter channels (e.g., counter@0
>> and counter@1) within the same TCB are enabled, the second call to
>> devm_request_irq() fails because the IRQ line is already requested by the
>> first channel.
>>
>> Fixes: e5d581396821 ("counter: microchip-tcb-capture: Add IRQ handling")
>> Signed-off-by: Dharma Balasubiramani <dharma.b@microchip.com>
>> ---
>>   drivers/counter/microchip-tcb-capture.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/counter/microchip-tcb-capture.c b/drivers/counter/microchip-tcb-capture.c
>> index 1a299d1f350b..19d457ae4c3b 100644
>> --- a/drivers/counter/microchip-tcb-capture.c
>> +++ b/drivers/counter/microchip-tcb-capture.c
>> @@ -451,7 +451,7 @@ static void mchp_tc_irq_remove(void *ptr)
>>   static int mchp_tc_irq_enable(struct counter_device *const counter, int irq)
>>   {
>>   	struct mchp_tc_data *const priv = counter_priv(counter);
>> -	int ret = devm_request_irq(counter->parent, irq, mchp_tc_isr, 0,
>> +	int ret = devm_request_irq(counter->parent, irq, mchp_tc_isr, IRQF_SHARED,
>>   				   dev_name(counter->parent), counter);
>>
>>   	if (ret < 0)
>>
>> ---
>> base-commit: fd94619c43360eb44d28bd3ef326a4f85c600a07
>> change-id: 20251006-microchip-tcb-edd8aeae36c4
>>
> 
> This change makes sense, thanks!
> 
> Reviewed-by: Kamel Bouhara <kamel.bouhara@bootlin.com>
> 
>> Best regards,
>> --
>> Dharma Balasubiramani <dharma.b@microchip.com>
>>
> 
> --
> Kamel Bouhara, Bootlin
> Embedded Linux and kernel engineering
> https://bootlin.com

Looks reasonable to me as well.

Reviewed-by: Bence Csókás <bence98@sch.bme.hu>


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-11-04  9:22 Michael Roach
@ 2025-11-04 10:24 ` Kristoffer Haugsbakk
  2025-11-05 14:55   ` Re: Lucas Seiki Oshiro
  0 siblings, 1 reply; 1546+ messages in thread
From: Kristoffer Haugsbakk @ 2025-11-04 10:24 UTC (permalink / raw)
  To: Michael Roach, git

On Tue, Nov 4, 2025, at 10:22, Michael Roach wrote:
>[snip]
> One of my files, named `ensure-string-env.rb`, was printed with part
> of the path in colour,
> and the first dash of the filename replaced with a colon.

I have seen something similar when using the Delta pager. I’m pretty
sure that it replaced a hyphen with a colon.

https://github.com/dandavison/delta

I don’t think I’ve seen this behavior with `git --no-pager`.

Don’t know about the coloring part (despite `--color=never`).

>[snip]

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-11-05  3:38 niklaus.liu
@ 2025-11-05  8:56 ` AngeloGioacchino Del Regno
  0 siblings, 0 replies; 1546+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-11-05  8:56 UTC (permalink / raw)
  To: niklaus.liu, Matthias Brugger
  Cc: linux-kernel, linux-arm-kernel, linux-mediatek,
	Project_Global_Chrome_Upstream_Group, sirius.wang, vince-wl.liu,
	jh.hsu, zhigang.qin, sen.chu

On 05/11/25 04:38, niklaus.liu wrote:
> Refer to the discussion in the link:
> v3: https://patchwork.kernel.org/project/linux-mediatek/patch/20251104071252.12539-2-Niklaus.Liu@mediatek.com/
> 
> Subject: [PATCH v4 0/1] soc: mediatek: mtk-regulator-coupler: Add support for MT8189

The subject of this email is .. empty. That's really bad, and it's the second time
that it happens. Please make sure that you're sending emails the right way, and/or
fix your client.

While at it, please also fix your name, as it should appear as "Niklaus Liu" and
not as "niklaus.liu".

> 
> changes in v4:
>   - reply comment:

Niklaus, please just reply inline to the emails instead of sending an entirely new
version just for a reply: it's easier for everyone to follow, and it's also easier
for me to read, and for you to send a reply by clicking one button :-)

> 1. MTK hardware requires that vsram_gpu must be higher than vgpu; this rule must be satisfied.
> 
> 2. When the GPU powers on, the mtcmos driver first calls regulator_enable to turn on vgpu, then calls regulator_enable to
> turn on vsram_gpu. When enabling vgpu, mediatek_regulator_balance_voltage sets the voltages for both vgpu and vsram_gpu.
> However, when enabling vsram_gpu, mediatek_regulator_balance_voltage is also executed, and at this time, the vsram_gpu voltage
> is set to the minimum voltage specified in the DTS, which does not comply with the requirement that vsram_gpu must be higher than vgpu.
> 

2. -> There's your problem! VSRAM_GPU has to be turned on *first*, VGPU has to be
turned on *last* instead.

Logically, you need SRAM up before the GPU is up as if the GPU tries to use SRAM
it'll produce unpowered access issues: even though it's *very* unlikely for that
to happen on older Mali, it's still a logical mistake that might, one day, come
back at us and create instabilities.

Now, the easy fix is to just invert the regulators in MFG nodes. As I explained
*multiple* times, you have a misconfiguration in your DT.

GPU subsystem main MFG -> VSRAM
GPU core complex MFG -> VGPU
GPU per-core MFG -> nothing
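As a rough model of the software coupling discussed in this thread (vsram_gpu kept above vgpu by some margin, clamped to its own range), one could sketch; the voltages and margin below are hypothetical, not MT8189 values:

```python
def balance_vsram(vgpu_uv, min_gap_uv=100_000,
                  vsram_range=(850_000, 1_200_000)):
    """Pick a vsram_gpu voltage at least min_gap_uv above vgpu,
    clamped to vsram_gpu's own allowed range."""
    lo, hi = vsram_range
    return max(lo, min(hi, vgpu_uv + min_gap_uv))

assert balance_vsram(600_000) == 850_000       # floor of the vsram range
assert balance_vsram(900_000) == 1_000_000     # vgpu + gap
assert balance_vsram(1_200_000) == 1_200_000   # capped at range maximum
```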

> 3. During suspend, the voltages of vgpu and vsram_gpu should remain unchanged, and when resuming, vgpu and vsram_gpu should be
> restored to their previous voltages. When the vgpu voltage is adjusted, mediatek_regulator_balance_voltage already synchronizes the
> adjustment of vsram_gpu voltage. Therefore, adjusting the vsram_gpu voltage again in mediatek_regulator_balance_voltage is redundant.

If you fix your DT, N.3 won't happen.

Regards,
Angelo

> 
> 
> changes in v3:
>   - modify for comment [add the new entry in alphabetical order]
> 
> changes in v2:
>   - change title for patch
>   - reply comment: This is a software regulator coupler mechanism, and the regulator-coupled-with
> configuration has been added in the MT8189 device tree. This patch addresses an issue reported by a
> Chromebook customer. When the GPU regulator is turned on, mediatek_regulator_balance_voltage already
> sets both the GPU and GPU_SRAM voltages at the same time, so there is no need to adjust the GPU_SRAM
> voltage again in a second round. Therefore, a return is set for MT8189.
> If the user calls mediatek_regulator_balance_voltage again for GPU_SRAM, it may cause abnormal behavior of GPU_SRAM.
> 
> 
> changes in v1:
>   - mediatek-regulator-coupler mechanism for platform MT8189
> 
> *** BLURB HERE ***
> 
> Niklaus Liu (1):
>    soc: mediatek: mtk-regulator-coupler: Add support for MT8189
> 
>   drivers/soc/mediatek/mtk-regulator-coupler.c | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
> 


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-11-04 10:24 ` Kristoffer Haugsbakk
@ 2025-11-05 14:55   ` Lucas Seiki Oshiro
  2025-11-05 15:01     ` Re: Kristoffer Haugsbakk
  0 siblings, 1 reply; 1546+ messages in thread
From: Lucas Seiki Oshiro @ 2025-11-05 14:55 UTC (permalink / raw)
  To: Kristoffer Haugsbakk, Michael Roach; +Cc: git


> I have seen something similar when using the Delta pager. I’m pretty
> sure that it replaced a hyphen with a colon.

It's a known bug in Delta:

https://github.com/dandavison/delta/issues/1259

> I don’t think I’ve seen this behavior with `git --no-pager`.

I think it is a good idea to also see what happens when using another
pager, for example, less (`git -c core.pager=less ...`) or cat
(`git -c core.pager=cat ...`).

Michael, can you run with those three mentioned options and see what
happens? Last year I spent some hours trying to find the cause of the
same bug in Git but then I found out that it was actually a bug in
Delta.

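The three suggested invocations can be compared side by side in a throwaway
repository (repo contents and file names below are made up for illustration):

```shell
# Build a scratch repo so the pager configurations can be compared.
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
printf 'alpha-beta\n' > f.txt
git add f.txt
git -c user.name=t -c user.email=t@example.com commit -qm init

# Same search under three pager setups; if the pager is the culprit,
# only the Delta run should differ.
out_nopager=$(git --no-pager grep 'alpha-beta')
out_cat=$(git -c core.pager=cat grep 'alpha-beta')
out_less=$(git -c core.pager=less grep 'alpha-beta')

printf '%s\n' "$out_nopager" "$out_cat" "$out_less"
```

Note that git only spawns the pager when stdout is a terminal, so the Delta
comparison has to be run interactively to actually exercise it.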
^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-11-05 14:55   ` Re: Lucas Seiki Oshiro
@ 2025-11-05 15:01     ` Kristoffer Haugsbakk
  2025-11-06  0:05       ` Re: Lucas Seiki Oshiro
  0 siblings, 1 reply; 1546+ messages in thread
From: Kristoffer Haugsbakk @ 2025-11-05 15:01 UTC (permalink / raw)
  To: Lucas Seiki Oshiro, Michael Roach; +Cc: git

On Wed, Nov 5, 2025, at 15:55, Lucas Seiki Oshiro wrote:
>> I have seen something similar when using the Delta pager. I’m pretty
>> sure that it replaced a hyphen with a colon.
>
> It's a known bug in Delta:
>
> https://github.com/dandavison/delta/issues/1259
>
>> I don’t think I’ve seen this behavior with `git --no-pager`.
>
> I think it is a good idea to also see what happens when using another
> pager, for example, less (`git -c core.pager=less ...`) or cat
> (`git -c core.pager=cat ...`).
>
> Michael, can you run with those three mentioned options and see what
> happens? Last year I spent some hours trying to find the cause of the
> same bug in Git but then I found out that it was actually a bug in
> Delta.

Sorry, I didn’t see that he only replied to me previously:

On Tue, Nov 4, 2025, at 12:15, Michael Roach wrote:
> Just when I thought I had considered all the factors before reporting 
> this, you got it.
> I am using Delta as my pager. My tests with other git versions were via 
> Docker, so there was no pager.
>
> I confirmed that using `git --no-pager grep` doesn't have this issue.
>
> Sorry for the bug report noise and thanks for figuring that out.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-11-05 15:01     ` Re: Kristoffer Haugsbakk
@ 2025-11-06  0:05       ` Lucas Seiki Oshiro
  2025-11-06  8:09         ` Re: Michael Roach
  0 siblings, 1 reply; 1546+ messages in thread
From: Lucas Seiki Oshiro @ 2025-11-06  0:05 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: Michael Roach, git


> Sorry, I didn’t see that he only replied to me previously:

Thanks for forwarding that!

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2025-11-06  0:05       ` Re: Lucas Seiki Oshiro
@ 2025-11-06  8:09         ` Michael Roach
  0 siblings, 0 replies; 1546+ messages in thread
From: Michael Roach @ 2025-11-06  8:09 UTC (permalink / raw)
  To: Lucas Seiki Oshiro, Kristoffer Haugsbakk; +Cc: git

Sorry I didn't do a reply to all on Kristoffer's response. Can you tell it's my first time here? 
Thank you both for your time on this!



November 6, 2025 at 01:05, "Lucas Seiki Oshiro" <lucasseikioshiro@gmail.com> wrote:


> 
> > 
> > Sorry, I didn’t see that he only replied to me previously:
> > 
> Thanks for forwarding that!
>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2026-01-11 21:10 Wesley B
@ 2026-01-12 13:28 ` Miguel Ojeda
  0 siblings, 0 replies; 1546+ messages in thread
From: Miguel Ojeda @ 2026-01-12 13:28 UTC (permalink / raw)
  To: Wesley B; +Cc: rust-for-linux

On Sun, Jan 11, 2026 at 10:10 PM Wesley B <atticusfinch570@gmail.com> wrote:
>
> From 71c099600448cdb639136bb15bdd40767dbfc0fd Mon Sep 17 00:00:00 2001
> From: Wesley Bott <atticusfinch570@gmail.com>
> Date: Sat, 10 Jan 2026 18:26:47 -0700
> Subject: [PATCH] rust: Restore __new INVARIANT comment, updated docs as needed

These headers should be used to send an email with that subject etc.,
rather than embedding it. I would suggest trying to use `git
send-email` or `b4`.

> Link: https://github.com/Rust-for-Linux/linux/issues/1217
> Suggested-by: Miguel Ojeda <ojeda@kernel.org>

The suggestion was to remove a paragraph, not to edit it. Perhaps the
confusion is that the suggestion goes on top of the `rust-fixes`
branch, not mainline.

Thanks!

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2026-02-02 10:53 Anshumali Gaur
@ 2026-02-03  0:34 ` Jacob Keller
  0 siblings, 0 replies; 1546+ messages in thread
From: Jacob Keller @ 2026-02-03  0:34 UTC (permalink / raw)
  To: Anshumali Gaur
  Cc: netdev, linux-kernel, Sunil Goutham, Linu Cherian,
	Geetha sowjanya, Jerin Jacob, hariprasad, Subbaraya Sundeep,
	Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni



On 2/2/2026 2:53 AM, Anshumali Gaur wrote:
> On 2026-01-29 at 23:02:43, Jacob Keller (jacob.e.keller@intel.com) wrote:
>>
>>
>> On 1/29/2026 1:19 AM, Anshumali Gaur wrote:
>>> When both AF and PF drivers are built as modules, the PF driver in the
>>> kexec kernel may probe before the AF driver is ready. This leads to
>>> a crash due to uninitialized hardware state.
>>>
>>> This patch ensures the PF driver properly detects and waits for AF
>>> driver readiness before proceeding with initialization.
>>>
>>
>> To me, the patch description is not sufficient to describe the what and why
>> of this change.
>>
>> Could you please provide a better explanation of how the addition of the
>> provided shutdown handler fixes initialization?
>>
> Hi Jacob,
> The issue being addressed here is specific to kexec and persistent AF
> hardware state across kernel transitions. When both AF and PF drivers
> are built as modules and a kexec is performed, the PF driver in
> the new kernel may probe before the AF driver has completed probing and
> reinitializing the RVU hardware. In this scenario, the hardware state
> left behind by the AF driver in the old kernel is still visible to the
> PF driver in the new kernel, resulting in a crash due to stale state.
>>> Fixes: 54494aa5d1e6 ("octeontx2-af: Add Marvell OcteonTX2 RVU AF driver")
>>> Signed-off-by: Anshumali Gaur <agaur@marvell.com>
>>> ---
>>>    drivers/net/ethernet/marvell/octeontx2/af/rvu.c | 11 +++++++++++
>>>    1 file changed, 11 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
>>> index 747fbdf2a908..8530df8b3fda 100644
>>> --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
>>> +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
>>> @@ -3632,11 +3632,22 @@ static void rvu_remove(struct pci_dev *pdev)
>>>    	devm_kfree(&pdev->dev, rvu);
>>>    }
>>> +static void rvu_shutdown(struct pci_dev *pdev)
>>> +{
>>> +	struct rvu *rvu = pci_get_drvdata(pdev);
>>> +
>>> +	if (!rvu)
>>> +		return;
>>> +
>>> +	rvu_clear_rvum_blk_revid(rvu);
>>
>> Here, I guess you are clearing some data about the device status. Does that
>> mean that when you initialize later you will wait for the AF driver to
>> finish probing and configure this? It would be nice to explain how this
>> change fixes initialization.
>>
> The RVUM block revision field acts as an implicit indication that the AF
> driver has completed its initialization. If this value is left uncleared
> during kexec kernel booting, the PF driver may observe a non-zero/valid
> RVUM block revision and incorrectly assume that the AF is already
> initialized and ready, even though the AF driver in the kexec kernel has
> not yet probed. This leads to PF initialization proceeding against
> partially initialized hardware, resulting in a crash.

Makes sense. When shutting down you need to explicitly clear the stale
data so that booting up (without a power cycle, as in the kexec case) does
not pick it up again.

I'd appreciate a little more of this detail in the commit message 
personally. However, functionally it makes sense, so:

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2026-02-25 19:40 ` Bhargav Joshi
@ 2026-02-25 19:43   ` Andy Shevchenko
  0 siblings, 0 replies; 1546+ messages in thread
From: Andy Shevchenko @ 2026-02-25 19:43 UTC (permalink / raw)
  To: Bhargav Joshi
  Cc: lars, Michael.Hennerich, jic23, dlechner, nuno.sa, andy,
	linux-iio, linux-kernel, Andy Shevchenko

On Wed, Feb 25, 2026 at 9:42 PM Bhargav Joshi <rougueprince47@gmail.com> wrote:
>
> Subject: [PATCH v5 3/3] iio: frequency: ad9523: fix checkpatch warnings
> for symbolic permissions

Something went wrong...

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2026-03-13 11:01 ` Vyacheslav Vahnenko
@ 2026-03-13 12:04   ` Greg KH
  0 siblings, 0 replies; 1546+ messages in thread
From: Greg KH @ 2026-03-13 12:04 UTC (permalink / raw)
  To: Vyacheslav Vahnenko; +Cc: linux-usb

On Fri, Mar 13, 2026 at 02:01:41PM +0300, Vyacheslav Vahnenko wrote:
> Add USB_QUIRK_NO_BOS for the ezcap401 capture card. Without it, dmesg shows "unable to get BOS descriptor or descriptor too short"
> and "unable to read config index 0 descriptor/start: -71" errors, and the device is not able to work at its full 10 Gbps speed
> 
> Subject: ezcap401 needs USB_QUIRK_NO_BOS to function on 10gbs usb speed

Close, this should actually be the subject line, you forgot to have one
entirely in your email :)

And can you wrap the changelog text at 72 columns, like your editor
should have asked you to when making this change?  If you run your patch
through scripts/checkpatch.pl before sending it, it should tell you
about these types of things.

> Signed-off-by: Vyacheslav Vahnenko <vahnenko2003@gmail.com>
> ---
>  drivers/usb/core/quirks.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c
> index 9e7e49712..0010f41a3 100644
> --- a/drivers/usb/core/quirks.c
> +++ b/drivers/usb/core/quirks.c
> @@ -574,6 +574,9 @@ static const struct usb_device_id usb_quirk_list[] = {
>  	/* Alcor Link AK9563 SC Reader used in 2022 Lenovo ThinkPads */
>  	{ USB_DEVICE(0x2ce3, 0x9563), .driver_info = USB_QUIRK_NO_LPM },
>  
> +	/* ezcap401 - BOS descriptor fetch hangs at SuperSpeed Plus */
> +	{ USB_DEVICE(0x32ed, 0x0401), .driver_info = USB_QUIRK_NO_BOS },

This looks good, thanks for putting it in the right place.

Almost there!

greg k-h

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2026-03-31 11:14 ` Wang Jun
@ 2026-03-31 12:09   ` Eric Dumazet
  0 siblings, 0 replies; 1546+ messages in thread
From: Eric Dumazet @ 2026-03-31 12:09 UTC (permalink / raw)
  To: Wang Jun
  Cc: Andrew Lunn, David S . Miller, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel, gszhai, 25125332, 25125283, 23120469

On Tue, Mar 31, 2026 at 4:15 AM Wang Jun <1742789905@qq.com> wrote:
>
> Hi Paolo Abeni,
>
> This is v2 of the DMA mapping error handling fix for ns83820. Changes since v1:
>
> - Added queue restart check in error path to avoid potential TX queue stall
>   (as pointed out by the AI review)
> - Adjusted variable declarations to follow reverse christmas tree order
> - Switched from dma_unmap_single to dma_unmap_page for fragments
>
> Thanks to reviewers for the feedback.
>
> Subject: [PATCH v2] net: ns83820: fix DMA mapping error handling in
>  hard_start_xmit
>

This is not how a new version of a patch needs to be submitted.

Look at https://patchwork.kernel.org/project/netdevbpf/list/ and
compare your patch with others...

Thanks.

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re;
@ 2026-04-12  6:24 Erick Lorch
  0 siblings, 0 replies; 1546+ messages in thread
From: Erick Lorch @ 2026-04-12  6:24 UTC (permalink / raw)
  To: bridge

Good day

I reviewed your reputable profile which gives me the intuition that you will be a potential business partner in a crude oil venture with my company worth two Million (2,000 000) barrels monthly involving the NATIONAL OIL CORPORATION OF LIBYA (NOC) and our Oil Refinery Company. This crude oil venture will gain commissions value  in revenue approximately per trade with the NOC;
 
REPLY IF YOU ARE INTERESTED
 
Regards,
Erick Lorch
Procurement Supervisor 

^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2026-04-28 18:24 Fabio M. De Francesco
@ 2026-05-01 22:01 ` Dave Jiang
  0 siblings, 0 replies; 1546+ messages in thread
From: Dave Jiang @ 2026-05-01 22:01 UTC (permalink / raw)
  To: Fabio M. De Francesco, linux-cxl
  Cc: Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Bjorn Helgaas, linux-kernel, linux-pci



On 4/28/26 11:24 AM, Fabio M. De Francesco wrote:
> Subject: [PATCH 0/2] PCI/CXL: Recover CXL Downstream Ports from PM Init failure
> 
> CXL r4.0 sec 8.1.5.1 Implementation Note describes a scenario in which a
> Secondary Bus Reset, a Link Down, or Downstream Port Containment on a

I'm not sure if this series covers a Link Down event (i.e. hotplug). As I
recall, cxl_reset_bus_function() only happens via sysfs trigger.

DJ


> CXL Downstream Port prevents Port PM Init from completing when ACS
> Source Validation is enabled on the Downstream Port. The spec states
> that another SBR alone does not recover the port and describes a
> software recovery sequence.  
> 
> Patch 1 extends cxl_reset_bus_function(), the helper backing the cxl_bus
> PCI/CXL reset method exposed to userspace via sysfs. It saves, clears,
> and restores ACS Source Validation and Bus Master Enable on the CXL
> Downstream Port around the SBR it issues. This keeps the userspace
> cxl_bus reset path from leaving the port unable to complete PM Init.
> 
> Patch 2 adds a recovery pass during CXL enumeration. For each CXL
> Downstream Port in a memdev's ancestry, the CXL core checks whether PM
> Init has completed. If it has not, regardless of what caused the
> failure, it invokes cxl_reset_bus_function() on the child below the port
> in the hope of restoring the port to a usable state. CXL enumeration
> re-runs after events that tear down and re-probe the memdev, including
> DPC, AER, and Link Down, so those paths reach this recovery.
> 
> This small series is developed from an old RFC v3:
> https://lore.kernel.org/linux-cxl/20260330193347.25072-1-fabio.m.de.francesco@linux.intel.com/
> 
> Fabio M. De Francesco (2):
>   PCI/CXL: Allow PM Init to complete on cxl_bus reset if ACS SV enabled
>   cxl/core: Recover from PM Init failure via cxl_reset_bus_function()
> 
>  drivers/cxl/core/pci.c        | 30 ++++++++++++++++++++
>  drivers/cxl/core/port.c       | 22 +++++++++++++++
>  drivers/cxl/cxlpci.h          |  3 ++
>  drivers/pci/pci.c             | 52 ++++++++++++++++++++++++++++++++++-
>  include/linux/pci.h           |  1 + 
>  include/uapi/linux/pci_regs.h |  2 ++
>  6 files changed, 109 insertions(+), 1 deletion(-)
> 


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* RE:
  2026-04-28 14:48           ` Yongchao Wu
@ 2026-05-04  9:15             ` Pawel Laszczak
  0 siblings, 0 replies; 1546+ messages in thread
From: Pawel Laszczak @ 2026-05-04  9:15 UTC (permalink / raw)
  To: Yongchao Wu, Peter Chen (CIX)
  Cc: rogerq@kernel.org, gregkh@linuxfoundation.org,
	linux-usb@vger.kernel.org, stable@vger.kernel.org

>>
>>>
>>> On 26-04-27 09:01:47, Pawel Laszczak wrote:
>>>>>
>>>>>
>>>>> On 26-04-24 00:06:01, Yongchao Wu wrote:
>>>>>> According to the cdns3 datasheet, the EPRST (Endpoint Reset)
>>>>>> command causes the DMA engine to reposition its internal pointer
>>>>>> to the next Transfer Descriptor (TD) if it was already processing one.
>>>>>>
>>>>>> This issue is consistently observed during the ADB identification
>>>>>> process on macOS hosts, where the host issues a Clear_Halt. Although
>>>>>> commit 4bf2dd65135a ("usb: cdns3: gadget: toggle cycle bit before
>>>>>> reset endpoint") attempted to avoid DMA advance by toggling the
>>>>>> cycle bit, trace logs show that on certain hosts like macOS, the
>>>>>> DMA pointer
>>>>>> (EP_TRADDR) still shifts after EPRST:
>>>>>>
>>>>>>    cdns3_ctrl_req: Clear Endpoint Feature(Halt ep1out)
>>>>>>    cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04030  <- Should be f9c04000
>>>>>>    cdns3_gadget_giveback: ep1out: req: ... length: 16384/16384
>>>>>>
>>>>>> As shown above, the DMA pointer jumped to index 3 (offset 0x30),
>>>>>> causing the controller to skip the initial TRBs of the request.
>>>>>> This leads to data misalignment and ADB protocol hangs on macOS.
>>>>>
>>>>> Pawel, Is it a hardware issue? The cycle bit has already been
>>>>> toggled before the endpoint has been reset, why the DMA pointer still
>>>>> advances?
>>>>
>>>> Yongchao, could you confirm if the TD consists of three TRBs?
>>> In our case, each TD consists of 4 TRBs.
>>> The DMA pointer appears to advance within the same TD after EPRST.
>>>
>>> Each 16KB request is split into 4 TRBs (4KB each):
>>> - TRB0 - TRB2: CHAIN
>>> - TRB3: IOC (last TRB of the TD)
>>>
>>> After enqueue, the initial EP_TRADDR points to the first TRB:
>>>    EP_TRADDR = 0xf9c04000 (TRB0)
>>>
>>> After Clear_Halt (EPRST), it becomes:
>>>    EP_TRADDR = 0xf9c04030 (TRB3)
>>>
>>> Since each TRB is 12 bytes, the offset 0x30 corresponds to 4 TRBs.
>>> This indicates that after EPRST, the DMA pointer skipped the entire
>>> current Request and jumped directly to the start of the next Request
>>> at 0xf9c04030
>>>
>>> Below is the relevant trace (trimmed):
>>>
>>> // enqueue request (16KB -> 4 TRBs)
>>> cdns3_prepare_trb: dma buf: 0xf7abc000, size: 4096, ctrl: 0x00200415
>>> cdns3_prepare_trb: dma buf: 0xf7abd000, size: 4096, ctrl: 0x00000415
>>> cdns3_prepare_trb: dma buf: 0xf7abe000, size: 4096, ctrl: 0x00000415
>>> cdns3_prepare_trb: dma buf: 0xf7abf000, size: 4096, ctrl: 0x00000425
>>>
>>> cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04000
>>>
>>> // Clear_Halt
>>> cdns3_ctrl_req: Clear Endpoint Feature(Halt ep1out)
>>> cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04030
>>>
>>
>> Can you confirm whether the host had already sent some data for this
>> TD prior to the endpoint reset operation?
>>
>
>I confirm that the host sent no data prior to or during the EPRST operation.

According to the specification, the controller may fetch TRB descriptors after
the endpoint has been initialized.
In complex Transfer Descriptors (TDs) consisting of several TRBs with the CH=1
bit set, the controller may fetch additional TRBs because it treats them as a
single logical entity.

I have not been able to determine exactly how many TRBs can be prefetched
in such a situation. 

According to the description of the EPRST bit, after an endpoint reset the
software is responsible for re-writing the endpoint TRADDR.

This fix looks correct to me.

Can you confirm which controller version you have in the usb_cap6 register?

Pawel

>
>TotalPhase Trace:
>0,HS,2700,0:06.078.671,2.057.666 ms,0 B,,13,00,Set
>Configuration,Configuration=1
>0,HS,2710,0:06.080.811,1.125.266 ms,,,,,[10 SOF],[Frames: 1243.7 - 1245.0]
>0,HS,2711,0:06.080.955,992.550 us,2 B,,13,00,Get String Descriptor,Index=5
>Length=2
>0,HS,2733,0:06.082.061,125.083 us,,,,,[2 SOF],[Frames: 1245.1 - 1245.2]
>0,HS,2734,0:06.082.119,104.566 us,28 B,,13,00,Get String Descriptor,Index=5
>Length=28
>0,HS,2756,0:06.082.311,355.935.283 ms,,,,,[2848 SOF],[Frames: 1245.3 -
>1601.2]
>0,HS,2757,0:06.438.196,105.033 us,4 B,,13,00,Get String Descriptor,Index=0
>Length=256
>0,HS,2778,0:06.438.371,875.233 us,,,,,[8 SOF],[Frames: 1601.3 - 1602.2] //1.
>Host issues Clear_Halt
>0,HS,2779,0:06.439.278,51.433 us,0 B,,13,00,Clear Endpoint Feature,Halt
>Endpoint 01 OUT
>0,HS,2789,0:06.439.371,500.150 us,,,,,[5 SOF],[Frames: 1602.3 - 1602.7]
>0,HS,2790,0:06.439.874,51.416 us,0 B,,13,00,Clear Endpoint Feature,Halt
>Endpoint 01 IN
>0,HS,2800,0:06.439.996,250.116 us,,,,,[3 SOF],[Frames: 1603.0 - 1603.2] //2.
>First OUT transaction happens
>0,HS,2801,0:06.440.350,1.066 us,24 B,,13,01,OUT txn,43 4E 58 4E 01 00 00 01
>00 00 10 00..
>0,HS,2805,0:06.440.371,66 ns,,,,,[1 SOF],[Frame: 1603.3]
>0,HS,2806,0:06.440.453,4.283 us,218 B,,13,01,OUT txn,68 6F 73 74 3A 3A 66 65
>61 74 75 72..
>
>> Pawel
>>
>>> Best regards,
>>> Yongchao


^ permalink raw reply	[flat|nested] 1546+ messages in thread

* Re:
  2026-05-09 18:01 Andrea Righi
@ 2026-05-09 18:07 ` Andrea Righi
  0 siblings, 0 replies; 1546+ messages in thread
From: Andrea Righi @ 2026-05-09 18:07 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, K Prateek Nayak, Christian Loehle, Phil Auld,
	Koba Ko, Felix Abecassis, Balbir Singh, Joel Fernandes,
	Shrikanth Hegde, linux-kernel

On Sat, May 09, 2026 at 08:01:20PM +0200, Andrea Righi wrote:
> This series attempts to improve SD_ASYM_CPUCAPACITY scheduling by introducing
> SMT awareness.

Somehow I messed up the subject in the cover letter, I'll re-send.
Ignore this one and sorry for the noise.

-Andrea

^ permalink raw reply	[flat|nested] 1546+ messages in thread

end of thread, other threads:[~2026-05-09 18:07 UTC | newest]

Thread overview: 1546+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-08-09 13:14 [PATCH 0/24] make atomic_read() behave consistently across all architectures Chris Snook
2007-08-09 12:41 ` Arnd Bergmann
2007-08-09 14:29   ` Chris Snook
2007-08-09 15:30     ` Arnd Bergmann
2007-08-14 22:31 ` Christoph Lameter
2007-08-14 22:45   ` Chris Snook
2007-08-14 22:51     ` Christoph Lameter
2007-08-14 23:08   ` Satyam Sharma
2007-08-14 23:04     ` Chris Snook
2007-08-14 23:14       ` Christoph Lameter
2007-08-15  6:49       ` Herbert Xu
2007-08-15  6:49         ` Herbert Xu
2007-08-15  8:18         ` Heiko Carstens
2007-08-15 13:53           ` Stefan Richter
2007-08-15 14:35             ` Satyam Sharma
2007-08-15 14:52               ` Herbert Xu
2007-08-15 16:09                 ` Stefan Richter
2007-08-15 16:27                   ` Paul E. McKenney
2007-08-15 17:13                     ` Satyam Sharma
2007-08-15 18:31                     ` Segher Boessenkool
2007-08-15 18:57                       ` Paul E. McKenney
2007-08-15 19:54                         ` Satyam Sharma
2007-08-15 20:17                           ` Paul E. McKenney
2007-08-15 20:52                             ` Segher Boessenkool
2007-08-15 22:42                               ` Paul E. McKenney
2007-08-15 20:47                           ` Segher Boessenkool
2007-08-16  0:36                             ` Satyam Sharma
2007-08-16  0:32                               ` your mail Herbert Xu
2007-08-16  0:58                                 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Satyam Sharma
2007-08-16  0:51                                   ` Herbert Xu
2007-08-16  1:18                                     ` Satyam Sharma
2007-08-16  1:38                               ` Segher Boessenkool
2007-08-15 21:05                         ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Segher Boessenkool
2007-08-15 22:44                           ` Paul E. McKenney
2007-08-16  1:23                             ` Segher Boessenkool
2007-08-16  2:22                               ` Paul E. McKenney
2007-08-15 19:58               ` Stefan Richter
2007-08-16  0:39           ` [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert() Satyam Sharma
2007-08-24 11:59             ` Denys Vlasenko
2007-08-24 12:07               ` Andi Kleen
2007-08-24 12:12               ` Kenn Humborg
2007-08-24 12:12                 ` Kenn Humborg
2007-08-24 14:25                 ` Denys Vlasenko
2007-08-24 17:34                   ` Linus Torvalds
2007-08-24 13:30               ` Satyam Sharma
2007-08-24 17:06                 ` Christoph Lameter
2007-08-24 20:26                   ` Denys Vlasenko
2007-08-24 20:34                     ` Chris Snook
2007-08-24 16:19               ` Luck, Tony
2007-08-24 16:19                 ` Luck, Tony
2007-08-15 16:13         ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Chris Snook
2007-08-15 23:40           ` Herbert Xu
2007-08-15 23:51             ` Paul E. McKenney
2007-08-16  1:30               ` Segher Boessenkool
2007-08-16  2:30                 ` Paul E. McKenney
2007-08-16 19:33                   ` Segher Boessenkool
2007-08-16  1:26             ` Segher Boessenkool
2007-08-16  2:23               ` Nick Piggin
2007-08-16 19:32                 ` Segher Boessenkool
2007-08-17  2:19                   ` Nick Piggin
2007-08-17  3:16                     ` Paul Mackerras
2007-08-17  3:32                       ` Nick Piggin
2007-08-17  3:50                         ` Linus Torvalds
2007-08-17 23:59                           ` Paul E. McKenney
2007-08-18  0:09                             ` Herbert Xu
2007-08-18  1:08                               ` Paul E. McKenney
2007-08-18  1:24                                 ` Christoph Lameter
2007-08-18  1:41                                   ` Satyam Sharma
2007-08-18  4:13                                     ` Linus Torvalds
2007-08-18 13:36                                       ` Satyam Sharma
2007-08-18 21:54                                       ` Paul E. McKenney
2007-08-18 22:41                                         ` Linus Torvalds
2007-08-18 23:19                                           ` Paul E. McKenney
2007-08-24 12:19                                       ` Denys Vlasenko
2007-08-24 17:19                                         ` Linus Torvalds
2007-08-18 21:56                                   ` Paul E. McKenney
2007-08-20 13:31                                   ` Chris Snook
2007-08-20 22:04                                     ` Segher Boessenkool
2007-08-20 22:48                                       ` Russell King
2007-08-20 23:02                                         ` Segher Boessenkool
2007-08-21  0:05                                           ` Paul E. McKenney
2007-08-21  7:08                                             ` Russell King
2007-08-21  7:05                                           ` Russell King
2007-08-21  9:33                                             ` Paul Mackerras
2007-08-21 11:37                                               ` Andi Kleen
2007-08-21 14:48                                               ` Segher Boessenkool
2007-08-21 16:16                                                 ` Paul E. McKenney
2007-08-21 22:51                                                   ` Valdis.Kletnieks
2007-08-22  0:50                                                     ` Paul E. McKenney
2007-08-22 21:38                                                     ` Adrian Bunk
2007-08-21 14:39                                             ` Segher Boessenkool
2007-08-17  3:42                       ` Linus Torvalds
2007-08-17  5:18                         ` Paul E. McKenney
2007-08-17  5:56                         ` Satyam Sharma
2007-08-17  7:26                           ` Nick Piggin
2007-08-17  8:47                             ` Satyam Sharma
2007-08-17  9:15                               ` Nick Piggin
2007-08-17 10:12                                 ` Satyam Sharma
2007-08-17 12:14                                   ` Nick Piggin
2007-08-17 13:05                                     ` Satyam Sharma
2007-08-17  9:48                               ` Paul Mackerras
2007-08-17 10:23                                 ` Satyam Sharma
2007-08-17 22:49                           ` Segher Boessenkool
2007-08-17 23:51                             ` Satyam Sharma
2007-08-17 23:55                               ` Segher Boessenkool
2007-08-17  6:42                         ` Geert Uytterhoeven
2007-08-17  8:52                         ` Andi Kleen
2007-08-17 10:08                           ` Satyam Sharma
2007-08-17 22:29                         ` Segher Boessenkool
2007-08-17 17:37                     ` Segher Boessenkool
2007-08-14 23:26     ` Paul E. McKenney
2007-08-15 10:35     ` Stefan Richter
2007-08-15 12:04       ` Herbert Xu
2007-08-15 12:31       ` Satyam Sharma
2007-08-15 13:08         ` Stefan Richter
2007-08-15 13:11           ` Stefan Richter
2007-08-15 13:47           ` Satyam Sharma
2007-08-15 14:25             ` Paul E. McKenney
2007-08-15 15:33               ` Herbert Xu
2007-08-15 16:08                 ` Paul E. McKenney
2007-08-15 17:18                   ` Satyam Sharma
2007-08-15 17:33                     ` Paul E. McKenney
2007-08-15 18:05                       ` Satyam Sharma
2007-08-15 18:19                 ` David Howells
2007-08-15 18:45                   ` Paul E. McKenney
2007-08-15 23:41                     ` Herbert Xu
2007-08-15 23:53                       ` Paul E. McKenney
2007-08-16  0:12                         ` Herbert Xu
2007-08-16  0:23                           ` Paul E. McKenney
2007-08-16  0:30                             ` Herbert Xu
2007-08-16  0:49                               ` Paul E. McKenney
2007-08-16  0:53                                 ` Herbert Xu
2007-08-16  1:14                                   ` Paul E. McKenney
2007-08-15 17:55               ` Satyam Sharma
2007-08-15 19:07                 ` Paul E. McKenney
2007-08-15 21:07                   ` Segher Boessenkool
2007-08-15 20:58                 ` Segher Boessenkool
2007-08-15 18:31         ` Segher Boessenkool
2007-08-15 19:40           ` Satyam Sharma
2007-08-15 20:42             ` Segher Boessenkool
2007-08-16  1:23               ` Satyam Sharma
2007-08-15 23:22         ` Paul Mackerras
2007-08-16  0:26           ` Christoph Lameter
2007-08-16  0:34             ` Paul Mackerras
2007-08-16  0:40               ` Christoph Lameter
2007-08-16  0:39             ` Paul E. McKenney
2007-08-16  0:42               ` Christoph Lameter
2007-08-16  0:53                 ` Paul E. McKenney
2007-08-16  0:59                   ` Christoph Lameter
2007-08-16  1:14                     ` Paul E. McKenney
2007-08-16  1:41                       ` Christoph Lameter
2007-08-16  2:15                         ` Satyam Sharma
2007-08-16  2:08                           ` Herbert Xu
2007-08-16  2:18                             ` Christoph Lameter
2007-08-16  3:23                               ` Paul Mackerras
2007-08-16  3:33                                 ` Herbert Xu
2007-08-16  3:48                                   ` Paul Mackerras
2007-08-16  4:03                                     ` Herbert Xu
2007-08-16  4:34                                       ` Paul Mackerras
2007-08-16  5:37                                         ` Herbert Xu
2007-08-16  6:00                                           ` Paul Mackerras
2007-08-16 18:50                                             ` Christoph Lameter
2007-08-16 18:48                                 ` Christoph Lameter
2007-08-16 19:44                                 ` Segher Boessenkool
2007-08-16  2:18                             ` Chris Friesen
2007-08-16  2:32                         ` Paul E. McKenney
2007-08-16  1:51                 ` Paul Mackerras
2007-08-16  2:00                   ` Herbert Xu
2007-08-16  2:05                     ` Paul Mackerras
2007-08-16  2:11                       ` Herbert Xu
2007-08-16  2:35                         ` Paul E. McKenney
2007-08-16  3:15                         ` Paul Mackerras
2007-08-16  3:43                           ` Herbert Xu
2007-08-16  2:15                       ` Christoph Lameter
2007-08-16  2:17                         ` Christoph Lameter
2007-08-16  2:33                       ` Satyam Sharma
2007-08-16  3:01                         ` Satyam Sharma
2007-08-16  4:11                           ` Paul Mackerras
2007-08-16  5:39                             ` Herbert Xu
2007-08-16  6:56                               ` Paul Mackerras
2007-08-16  7:09                                 ` Herbert Xu
2007-08-16  8:06                                   ` Stefan Richter
2007-08-16  8:10                                     ` Herbert Xu
2007-08-16  9:54                                       ` Stefan Richter
2007-08-16 10:31                                         ` Stefan Richter
2007-08-16 10:42                                           ` Herbert Xu
2007-08-16 16:34                                             ` Paul E. McKenney
2007-08-16 23:59                                               ` Herbert Xu
2007-08-17  1:01                                                 ` Paul E. McKenney
2007-08-17  7:39                                                   ` Satyam Sharma
2007-08-17 14:31                                                     ` Paul E. McKenney
2007-08-17 18:31                                                       ` Satyam Sharma
2007-08-17 18:56                                                         ` Paul E. McKenney
2007-08-17  3:15                                               ` Nick Piggin
2007-08-17  4:02                                                 ` Paul Mackerras
2007-08-17  4:39                                                   ` Nick Piggin
2007-08-17  7:25                                                 ` Stefan Richter
2007-08-17  8:06                                                   ` Nick Piggin
2007-08-17  8:58                                                     ` Satyam Sharma
2007-08-17  9:15                                                       ` Nick Piggin
2007-08-17 10:03                                                         ` Satyam Sharma
2007-08-17 11:50                                                           ` Nick Piggin
2007-08-17 12:50                                                             ` Satyam Sharma
2007-08-17 12:56                                                               ` Nick Piggin
2007-08-18  2:15                                                                 ` Satyam Sharma
2007-08-17 10:48                                                     ` Stefan Richter
2007-08-17 10:58                                                       ` Stefan Richter
2007-08-18 14:35                                                     ` LDD3 pitfalls (was Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures) Stefan Richter
2007-08-20 13:28                                                       ` Chris Snook
2007-08-17 22:14                                                 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Segher Boessenkool
2007-08-17  5:04                                             ` Paul Mackerras
2007-08-16 10:35                                         ` Herbert Xu
2007-08-16 19:48                                       ` Chris Snook
2007-08-17  0:02                                         ` Herbert Xu
2007-08-17  2:04                                           ` Chris Snook
2007-08-17  2:13                                             ` Herbert Xu
2007-08-17  2:31                                             ` Nick Piggin
2007-08-17  5:09                                       ` Paul Mackerras
2007-08-17  5:32                                         ` Herbert Xu
2007-08-17  5:41                                           ` Paul Mackerras
2007-08-17  8:28                                             ` Satyam Sharma
2007-08-16 14:48                                   ` Ilpo Järvinen
2007-08-16 16:19                                     ` Stefan Richter
2007-08-16 19:55                                     ` Chris Snook
2007-08-16 20:20                                       ` Christoph Lameter
2007-08-17  1:02                                         ` Paul E. McKenney
2007-08-17  1:28                                           ` Herbert Xu
2007-08-17  5:07                                             ` Paul E. McKenney
2007-08-17  2:16                                         ` Paul Mackerras
2007-08-17  3:03                                           ` Linus Torvalds
2007-08-17  3:43                                             ` Paul Mackerras
2007-08-17  3:53                                               ` Herbert Xu
2007-08-17  6:26                                                 ` Satyam Sharma
2007-08-17  8:38                                                   ` Nick Piggin
2007-08-17  9:14                                                     ` Satyam Sharma
2007-08-17  9:31                                                       ` Nick Piggin
2007-08-17 10:55                                                         ` Satyam Sharma
2007-08-17 12:39                                                           ` Nick Piggin
2007-08-17 13:36                                                             ` Satyam Sharma
2007-08-17 16:48                                                             ` Linus Torvalds
2007-08-17 18:50                                                               ` Chris Friesen
2007-08-17 18:54                                                                 ` Arjan van de Ven
2007-08-17 19:49                                                                   ` Paul E. McKenney
2007-08-17 19:49                                                                     ` Arjan van de Ven
2007-08-17 20:12                                                                       ` Paul E. McKenney
2007-08-17 19:08                                                                 ` Linus Torvalds
2007-08-20 13:15                                                               ` Chris Snook
2007-08-20 13:32                                                                 ` Herbert Xu
2007-08-20 13:38                                                                   ` Chris Snook
2007-08-20 22:07                                                                     ` Segher Boessenkool
2007-08-21  5:46                                                                 ` Linus Torvalds
2007-08-21  7:04                                                                   ` David Miller
2007-08-21 13:50                                                                     ` Chris Snook
2007-08-21 14:59                                                                       ` Segher Boessenkool
2007-08-21 16:31                                                                       ` Satyam Sharma
2007-08-21 16:43                                                                       ` Linus Torvalds
2007-09-09 18:02                                                               ` Denys Vlasenko
2007-09-09 18:18                                                                 ` Arjan van de Ven
2007-09-10 10:56                                                                   ` Denys Vlasenko
2007-09-10 11:15                                                                     ` Herbert Xu
2007-09-10 12:22                                                                     ` Kyle Moffett
2007-09-10 13:38                                                                       ` Denys Vlasenko
2007-09-10 14:16                                                                         ` Denys Vlasenko
2007-09-10 15:09                                                                           ` Linus Torvalds
2007-09-10 16:46                                                                             ` Denys Vlasenko
2007-09-10 19:59                                                                               ` Kyle Moffett
2007-09-10 18:59                                                                             ` Christoph Lameter
2007-09-10 23:19                                                                             ` [PATCH] Document non-semantics of atomic_read() and atomic_set() Chris Snook
2007-09-10 23:44                                                                               ` Paul E. McKenney
2007-09-11 19:35                                                                               ` Christoph Lameter
2007-09-10 14:51                                                                     ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Arjan van de Ven
2007-09-10 14:38                                                                       ` Denys Vlasenko
2007-09-10 17:02                                                                         ` Arjan van de Ven
2007-08-17 11:08                                                     ` Stefan Richter
2007-08-17 22:09                                             ` Segher Boessenkool
2007-08-17 17:41                                         ` Segher Boessenkool
2007-08-17 18:38                                           ` Satyam Sharma
2007-08-17 23:17                                             ` Segher Boessenkool
2007-08-17 23:55                                               ` Satyam Sharma
2007-08-18  0:04                                                 ` Segher Boessenkool
2007-08-18  1:56                                                   ` Satyam Sharma
2007-08-18  2:15                                                     ` Segher Boessenkool
2007-08-18  3:33                                                       ` Satyam Sharma
2007-08-18  5:18                                                         ` Segher Boessenkool
2007-08-18 13:20                                                           ` Satyam Sharma
2007-09-10 18:59                                           ` Christoph Lameter
2007-09-10 20:54                                             ` Paul E. McKenney
2007-09-10 21:36                                               ` Christoph Lameter
2007-09-10 21:50                                                 ` Paul E. McKenney
2007-09-11  2:27                                             ` Segher Boessenkool
2007-08-16 21:08                                       ` Luck, Tony
2007-08-16 21:08                                         ` Luck, Tony
2007-08-16 19:55                                     ` Chris Snook
2007-08-16 18:54                             ` Christoph Lameter
2007-08-16 20:07                               ` Paul E. McKenney
2007-08-16  3:05                         ` Paul Mackerras
2007-08-16 19:39                           ` Segher Boessenkool
2007-08-16  2:07                   ` Segher Boessenkool
2007-08-24 12:50           ` Denys Vlasenko
2007-08-24 17:15             ` Christoph Lameter
2007-08-24 20:21               ` Denys Vlasenko
2007-08-16  3:37         ` Bill Fink
2007-08-16  5:20           ` Satyam Sharma
2007-08-16  5:57             ` Satyam Sharma
2007-08-16  9:25               ` Satyam Sharma
2007-08-16 21:00               ` Segher Boessenkool
2007-08-17  4:32                 ` Satyam Sharma
2007-08-17 22:38                   ` Segher Boessenkool
2007-08-18 14:42                     ` Satyam Sharma
2007-08-16 20:50             ` Segher Boessenkool
2007-08-16 22:40               ` David Schwartz
2007-08-17  4:36                 ` Satyam Sharma
2007-08-17  4:24               ` Satyam Sharma
2007-08-17 22:34                 ` Segher Boessenkool
2007-08-15 19:59       ` Christoph Lameter
2018-11-23  9:34   ` Re: Andy Leadbetter
2018-11-24 14:03 RE, Miss Sharifah Ahmad Mustahfa
2018-11-24 14:16 RE, Miss Sharifah Ahmad Mustahfa
2018-11-24 14:19 RE, Miss Sharifah Ahmad Mustahfa
     [not found] <20181130011234.32674-1-axboe@kernel.dk>
2018-11-30  2:09 ` Jens Axboe
2018-12-04  2:28 RE, Ms Sharifah Ahmad Mustahfa
2018-12-21 15:22 kenneth johansson
2018-12-22  8:18 ` Richard Weinberger
2019-01-07 17:28 [PATCH] arch/arm/mm: Remove duplicate header Souptick Joarder
2019-01-17 11:23 ` Souptick Joarder
2019-01-17 11:28   ` Mike Rapoport
2019-01-31  5:54     ` Souptick Joarder
2019-01-31 12:58       ` Vladimir Murzin
2019-02-01 12:32         ` Re: Souptick Joarder
2019-02-01 12:36           ` Re: Vladimir Murzin
2019-02-01 12:41             ` Re: Souptick Joarder
2019-02-01 13:02               ` Re: Vladimir Murzin
2019-02-01 15:15               ` Re: Russell King - ARM Linux admin
2019-02-01 15:22                 ` Re: Russell King - ARM Linux admin
2019-02-16  0:08 Re: Graham Loan Firm
2019-02-16  4:17 Re; Richard Wahl
2019-02-18 23:41 Pablo Mancilla
2019-02-19  2:20 Re: Pablo Mancilla
2019-05-21  0:06 [PATCH v6 0/3] add new ima hook ima_kexec_cmdline to measure kexec boot cmdline args Prakhar Srivastava
2019-05-21  0:06 ` [PATCH v6 2/3] add a new ima template field buf Prakhar Srivastava
2019-05-24 15:12   ` Mimi Zohar
2019-05-24 15:42     ` Roberto Sassu
2019-05-24 15:47       ` Re: Roberto Sassu
2019-05-24 18:09         ` Re: Mimi Zohar
2019-05-24 19:00           ` Re: prakhar srivastava
2019-05-24 19:15             ` Re: Mimi Zohar
2019-06-13  7:02 Re: Erling Persson Foundation
     [not found] <DM5PR19MB165765D43BE979AB51A9897E9EEB0@DM5PR19MB1657.namprd19.prod.outlook.com>
2019-06-18  9:41 ` Re: Enrico Weigelt, metux IT consult
     [not found] <20190703063132.GA27292@ls3530.dellerweb.de>
2019-07-03  6:38 ` Re: Helge Deller
2019-08-30 19:54 [PATCH] Revert "asm-generic: Remove unneeded __ARCH_WANT_SYS_LLSEEK macro" Arnd Bergmann
     [not found] ` <20190830202959.3539-1-msuchanek@suse.de>
2019-08-30 20:32   ` Arnd Bergmann
     [not found] <CAGkTAxsV0zS_E64criQM-WtPKpSyW2PL=+fjACvnx2=m7piwXg@mail.gmail.com>
2019-09-27  6:37 ` Re: Michael Kerrisk (man-pages)
2019-10-27 21:47 Re: Margaret Kwan Wing Han
2019-11-14 11:37 SGV INVESTMENT
     [not found] <20191205030032.GA26925@ray.huang@amd.com>
2019-12-09  1:26 ` RE: Quan, Evan
2019-12-19 12:31 liming wu
2019-12-20  1:13 ` Andreas Dilger
2019-12-20 17:06 [alsa-devel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples Ben Dooks
2019-12-22 17:08 ` Dmitry Osipenko
2020-01-05  0:04   ` Ben Dooks
2020-01-05  1:48     ` Dmitry Osipenko
2020-01-05 10:53       ` Ben Dooks
2020-01-06 19:00         ` [alsa-devel] [Linux-kernel] " Ben Dooks
2020-01-07  1:39           ` Dmitry Osipenko
2020-01-23 19:38             ` Ben Dooks
2020-01-24 16:50               ` Jon Hunter
2020-01-27 19:20                 ` Dmitry Osipenko
2020-01-28 12:13                   ` Mark Brown
2020-01-28 17:42                     ` Dmitry Osipenko
2020-01-28 18:19                       ` Jon Hunter
2020-01-29  0:17                         ` Dmitry Osipenko
     [not found]                           ` <2ff97414-f0a5-7224-0e53-6cad2ed0ccd2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2020-01-30  8:05                             ` Ben Dooks
     [not found] <mailman.6.1579205674.8101.b.a.t.m.a.n@lists.open-mesh.org>
2020-01-17  7:44 ` Re: Simon Wunderlich
2020-02-06  2:24 Re: Viviane Jose Pereira
2020-02-06  6:36 Re: Viviane Jose Pereira
2020-02-11 22:34 (unknown) Rajat Jain
     [not found] ` <20200211223400.107604-1-rajatja-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2020-02-12  9:30   ` Jarkko Nikula
2020-02-12  9:30     ` Re: Jarkko Nikula
     [not found]     ` <b3397374-0cb8-cf6c-0555-34541a1c108c-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2020-02-12 10:24       ` Re: Andy Shevchenko
2020-02-12 10:24         ` Re: Andy Shevchenko
     [not found] <20200224173733.16323-1-axboe@kernel.dk>
2020-02-24 17:38 ` Re: Jens Axboe
2020-02-26 11:57 (no subject) Ville Syrjälä
2020-02-26 12:08 ` Linus Walleij
2020-02-26 14:34   ` Re: Ville Syrjälä
2020-02-26 14:56     ` Re: Linus Walleij
2020-02-26 15:08       ` Re: Ville Syrjälä
2020-03-03 15:27 Gene Chen
2020-03-04 14:56 ` Matthias Brugger
2020-03-04 14:56   ` Re: Matthias Brugger
2020-03-04 15:15   ` Re: Lee Jones
2020-03-04 15:15     ` Re: Lee Jones
2020-03-04 18:00     ` Re: Matthias Brugger
2020-03-04 18:00       ` Re: Matthias Brugger
2020-03-08 17:19 Re: Francois Pinault
2020-03-08 17:33 Re: Francois Pinault
2020-03-08 17:33 ` Re: Francois Pinault
2020-03-08 19:12 Re: Francois Pinault
2020-03-16 23:07 Sankalp Bhardwaj
2020-03-17  9:13 ` Valdis Klētnieks
2020-03-17 10:10   ` Re: suvrojit
     [not found] <CALeDE9OeBx6v6nGVjeydgF1vpfX1Bus319h3M1=49PMETdaCtw@mail.gmail.com>
2020-03-20 11:49 ` Re: Josh Boyer
2020-03-27  8:36 (unknown) chenanqing
2020-03-27  8:59 ` Ilya Dryomov
2020-03-27  9:20 (unknown) chenanqing
     [not found] ` <5e7dc543.vYG3wru8B/me1sOV%chenanqing-Oq79sGaMObY@public.gmane.org>
2020-03-27 15:53   ` Lee Duncan
2020-03-27 15:53     ` Re: Lee Duncan
     [not found] <CAPXXXSDVGeEK_NCSkDMwTpuvVxYkWGdQk=L=bz+RN4XLiGZmcg@mail.gmail.com>
     [not found] ` <CAPXXXSBYcU1QamovmP-gVTXms67Xi_QpMCV=V3570q1nnuWqNw@mail.gmail.com>
2020-04-04 21:05   ` Re: Ruslan Bilovol
2020-04-05  1:27     ` Re: Alan Stern
2020-04-05  1:27       ` Re: Alan Stern
     [not found]       ` <CAPXXXSBLHYdHNSS4aM2Ax07+GQSB1WbPziOrk0iVWf-LXLmQRg@mail.gmail.com>
     [not found]         ` <CAPXXXSAajets4AqcBKt8aRd8V1AL4bjAmCyuBOKr8qBG-AHO1A@mail.gmail.com>
2020-04-05  2:51           ` Re: Colin Williams
2020-04-18 12:26 Re: Levi Brown
2020-05-06  5:52 Jiaxun Yang
2020-05-06 17:17 ` Nick Desaulniers
2020-05-14  8:17 Maksim Iushchenko
2020-05-14 10:29 ` fboehm
2020-05-21  0:22 Re: STOREBRAND
2020-06-24 13:54 Re; test02
2020-06-30 17:56 (unknown) Vasiliy Kupriakov
2020-07-10 20:36 ` Andy Shevchenko
2020-07-16 21:22 Mauro Rossi
2020-07-20  9:00 ` Christian König
2020-07-20  9:59   ` Re: Mauro Rossi
2020-07-22  2:51     ` Re: Alex Deucher
2020-07-22  7:56       ` Re: Mauro Rossi
2020-07-24 18:31         ` Re: Alex Deucher
2020-07-26 15:31           ` Re: Mauro Rossi
2020-07-27 18:31             ` Re: Alex Deucher
2020-07-27 19:46               ` Re: Mauro Rossi
2020-07-27 19:54                 ` Re: Alex Deucher
2020-08-05 11:02 [PATCH v4] arm64: dts: qcom: Add support for Xiaomi Poco F1 (Beryllium) Amit Pundir
2020-08-06 22:31 ` Konrad Dybcio
2020-08-12 13:37   ` Amit Pundir
2020-08-12 10:54 Re: Alex Anadi
2020-11-06 10:44 Luis Gerhorst
2020-11-06 14:34 ` Pavel Begunkov
2020-11-30 10:31 Oleksandr Tyshchenko
2020-11-30 16:21 ` Alex Bennée
2020-12-29 15:32   ` Re: Roger Pau Monné
2020-12-02  1:10 [PATCH] lib/find_bit: Add find_prev_*_bit functions Yun Levi
2020-12-02  9:47 ` Andy Shevchenko
2020-12-02 10:04   ` Rasmus Villemoes
2020-12-02 11:50     ` Yun Levi
     [not found]       ` <CAAH8bW-jUeFVU-0OrJzK-MuGgKJgZv38RZugEQzFRJHSXFRRDA@mail.gmail.com>
2020-12-02 18:22         ` Yun Levi
2020-12-02 21:26           ` Yury Norov
2020-12-02 22:51             ` Yun Levi
2020-12-03  1:23               ` Yun Levi
2020-12-03  8:33                 ` Rasmus Villemoes
2020-12-03  9:47                   ` Re: Yun Levi
2020-12-03 18:46                     ` Re: Yury Norov
2020-12-03 18:52                       ` Re: Willy Tarreau
2020-12-04  1:36                         ` Re: Yun Levi
2020-12-04 18:14                           ` Re: Yury Norov
2020-12-05  0:45                             ` Re: Yun Levi
2020-12-05 11:10                       ` Re: Rasmus Villemoes
2020-12-05 18:20                         ` Re: Yury Norov
     [not found] <CAGMNF6W8baS_zLYL8DwVsbfPWTP2ohzRB7xutW0X=MUzv93pbA@mail.gmail.com>
2020-12-02 17:09 ` Re: Kun Yi
2020-12-02 17:09   ` Re: Kun Yi
2021-01-08 10:32 Misono Tomohiro
2021-01-08 12:30 ` Arnd Bergmann
2021-01-08 12:30   ` Re: Arnd Bergmann
2021-01-08 10:35 misono.tomohiro
     [not found] <w2q9lf-sait7s-qswxlnzeof4i-7j13q0-zgu9pt-xk3x5enp994p-kewn2p-o86qyug0mutj-91m157sheva0-4k2l8v20kyjp-heu04baxqdc7op987-9zc0bxi0jcgo-wyl26layz5p9-esqncc-g48ass.1610618007875@email.android.com>
2021-01-14 10:09 ` Alexander Kapshuk
2021-01-19  0:10 David Howells
2021-01-20 14:46 ` Jarkko Sakkinen
     [not found] <CAMCTd2kkax9P-OFNHYYz8nKuaKOOkz-zoJ7h2nZ6maUGmjXC-g@mail.gmail.com>
2021-03-16 12:16 ` Re: westjoshuaalan
2021-04-05  0:01 Mitali Borkar
2021-04-06  7:03 ` Arnd Bergmann
2021-04-05 21:12 David Villasana Jiménez
2021-04-06  5:17 ` Greg KH
2021-04-15 13:41 Emmanuel Blot
2021-04-15 16:07 ` Palmer Dabbelt
2021-04-15 22:27 ` Re: Alistair Francis
     [not found] <b84772b0-e009-3b68-4e74-525ad8531f95@gmail.com>
2021-04-23 13:57 ` Re: Ivan Koveshnikov
2021-04-23 20:35   ` Re: Kajetan Puchalski
     [not found] <CAJr+-6ZR2oH0J4D_Ou13JvX8HLUUK=MKQwD0Kn53cmvAuT99bg@mail.gmail.com>
2021-04-27  7:56 ` Re: Fox Chen
2021-05-15 22:57 Dmitry Baryshkov
2021-06-02 21:45 ` Dmitry Baryshkov
     [not found] <60a57e3a.lbqA81rLGmtH2qoy%Radisson97@gmx.de>
2021-05-21 11:04 ` Re: Alejandro Colomar (man-pages)
2021-06-06 19:19 Davidlohr Bueso
2021-06-07 16:02 ` André Almeida
     [not found] <CAFBCWQJX4Xy8Sot7en5JBTuKrzy=_6xFkc+QgOxJEC7G6x+jzg@mail.gmail.com>
2021-06-12  3:43 ` Re: Ammar Faizi
2021-07-16 17:07 Subhasmita Swain
2021-07-16 18:15 ` Lukas Bulwahn
2021-07-27  2:59 [PATCH v9] iomap: Support file tail packing Gao Xiang
2021-07-27 15:10 ` Darrick J. Wong
2021-07-27 15:23   ` Andreas Grünbacher
2021-07-27 15:30   ` Re: Gao Xiang
     [not found] <CAKPXbjesQH_k1Z7k4kNwpoAf-jYgbUaPqPCgNTJZ35peVBy_pA@mail.gmail.com>
2021-08-29 12:01 ` Re: Lukas Bulwahn
2021-09-03 20:51 Mr. James Khmalo
2021-10-08  1:24 Dmitry Baryshkov
2021-10-12 23:59 ` Linus Walleij
2021-10-13  3:46   ` Re: Dmitry Baryshkov
2021-10-13 23:39     ` Re: Linus Walleij
2021-10-17 16:54   ` Re: Bjorn Andersson
2021-10-17 21:31     ` Re: Linus Walleij
2021-10-17 21:35 ` Re: Linus Walleij
     [not found] <20211011231530.GA22856@t>
2021-10-12  1:23 ` James Bottomley
2021-10-12  2:30   ` Bart Van Assche
2021-11-02  9:48 [PATCH v5 00/11] Add support for X86/ACPI camera sensor/PMIC setup with clk and regulator platform data Hans de Goede
2021-11-02  9:49 ` [PATCH v5 05/11] clk: Introduce clk-tps68470 driver Hans de Goede
     [not found]   ` <163588780885.2993099.2088131017920983969@swboyd.mtv.corp.google.com>
2021-11-25 15:01     ` Hans de Goede
     [not found] <CAGGnn3JZdc3ETS_AijasaFUqLY9e5Q1ZHK3+806rtsEBnAo5Og@mail.gmail.com>
2021-11-23 17:20 ` Re: Christian COMMARMOND
     [not found] <20211126221034.21331-1-lukasz.bartosik@semihalf.com--annotate>
2021-11-29 21:59 ` Re: sean.wang
2021-11-29 21:59   ` Re: sean.wang
2021-12-20  6:46 Ralf Beck
2021-12-20  7:55 ` Greg KH
2021-12-20 10:01 ` Re: Oliver Neukum
     [not found] <20211229092443.GA10533@L-PF27918B-1352.localdomain>
2022-01-05  6:05 ` Re: Jason Wang
2022-01-05  6:27   ` Re: Jason Wang
2022-01-13 17:53 Varun Sethi
2022-01-14 17:17 ` Fabio Estevam
2022-01-20 15:28 Myrtle Shah
2022-01-20 15:37 ` Vitaly Wool
2022-01-20 23:29   ` Re: Damien Le Moal
2022-02-04 21:45   ` Re: Palmer Dabbelt
2022-01-24 12:43 Arınç ÜNAL
2022-01-25 14:03 ` Sergio Paracuellos
2022-01-25 15:24   ` Re: Arınç ÜNAL
2022-01-25 15:50     ` Re: Sergio Paracuellos
     [not found] <10b1995b392e490aaa2db645f219015e@dji.com>
2022-01-17 12:54 ` 转发: Caine Chen
2022-02-03 11:49   ` Daniel Vacek
2022-02-10  7:10 [PATCH] net/failsafe: Fix crash due to global devargs syntax parsing from secondary process madhuker.mythri
2022-02-10 15:00 ` Ferruh Yigit
2022-02-10 16:08   ` Gaëtan Rivet
2022-02-11 15:06 Re: Caine Chen
2022-02-13 22:40 Ronnie Sahlberg
2022-02-14  7:52 ` ronnie sahlberg
2022-03-04  8:47 Re: Harald Hauge
     [not found] <20220301070226.2477769-1-jaydeepjd.8914>
2022-03-06 11:10 ` Jaydeep P Das
2022-03-06 11:22   ` Jaydeep Das
     [not found] <Yj1hkpyUqJE9sQ2p@redhat.com>
2022-03-25  7:52 ` Re: Jason Wang
2022-03-25  9:10   ` Re: Michael S. Tsirkin
2022-03-25  9:20     ` Re: Jason Wang
2022-03-25 10:09       ` Re: Michael S. Tsirkin
2022-03-28  4:56         ` Re: Jason Wang
2022-03-28  5:59           ` Re: Michael S. Tsirkin
2022-03-28  6:18             ` Re: Jason Wang
2022-03-28 10:40               ` Re: Michael S. Tsirkin
2022-03-29  7:12                 ` Re: Jason Wang
2022-03-29 14:08                   ` Re: Michael S. Tsirkin
2022-03-30  2:40                     ` Re: Jason Wang
2022-03-30  5:14                       ` Re: Michael S. Tsirkin
2022-03-30  5:53                         ` Re: Jason Wang
2022-03-29  8:35                 ` Re: Thomas Gleixner
2022-03-29 14:37                   ` Re: Michael S. Tsirkin
2022-03-29 18:13                     ` Re: Thomas Gleixner
2022-03-29 22:04                       ` Re: Michael S. Tsirkin
2022-03-30  2:38                         ` Re: Jason Wang
2022-03-30  5:09                           ` Re: Michael S. Tsirkin
2022-03-30  5:53                             ` Re: Jason Wang
2022-04-12  6:55                   ` Re: Michael S. Tsirkin
2022-04-05 21:41 rcu_sched self-detected stall on CPU Miguel Ojeda
2022-04-06  9:31 ` Zhouyi Zhou
2022-04-06 17:00   ` Paul E. McKenney
2022-04-08  7:23     ` Michael Ellerman
2022-04-08 14:42       ` Michael Ellerman
2022-04-13  5:11         ` Nicholas Piggin
2022-04-22 15:53           ` Thomas Gleixner
2022-04-22 15:53             ` Re: Thomas Gleixner
2022-04-23  2:29             ` Re: Nicholas Piggin
2022-04-23  2:29               ` Re: Nicholas Piggin
2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
2022-04-17 17:43 ` [PATCH v3 06/60] target/arm: Change CPUArchState.aarch64 to bool Richard Henderson
2022-04-19 11:17   ` Alex Bennée
2022-05-15 20:36 [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr Jiri Olsa
2023-05-20  9:47 ` Ze Gao
2023-05-21  3:58   ` Yonghong Song
2023-05-21 15:10     ` Re: Ze Gao
2023-05-21 20:26       ` Re: Jiri Olsa
2023-05-22  1:36         ` Re: Masami Hiramatsu
2023-05-22  2:07         ` Re: Ze Gao
2023-05-23  4:38           ` Re: Yonghong Song
2023-05-23  5:30           ` Re: Masami Hiramatsu
2023-05-23  6:59             ` Re: Paul E. McKenney
2023-05-25  0:13               ` Re: Masami Hiramatsu
2023-05-21  8:08   ` Re: Jiri Olsa
2023-05-21 10:09     ` Re: Masami Hiramatsu
2023-05-21 14:19       ` Re: Ze Gao
2022-06-06  5:33 Fenil Jain
2022-06-06  5:51 ` Greg Kroah-Hartman
2022-08-26 22:03 Zach O'Keefe
2022-08-31 21:47 ` Yang Shi
2022-09-01  0:24   ` Re: Zach O'Keefe
2022-08-28 21:01 Nick Neumann
2022-09-01 17:44 ` Nick Neumann
2022-09-12 12:36 Christian König
2022-09-13  2:04 ` Alex Deucher
2022-09-14 13:12 Amjad Ouled-Ameur
2022-09-14 13:18 ` Amjad Ouled-Ameur
2022-09-14 13:18   ` Re: Amjad Ouled-Ameur
2022-11-09 14:34 Denis Arefev
2022-11-09 14:44 ` Greg Kroah-Hartman
2022-11-18  2:00 Jiamei Xie
2022-11-18  7:47 ` Michal Orzel
2022-11-18  9:02   ` Re: Julien Grall
2022-11-18 18:11 Re: Mr. JAMES
2022-11-18 19:33 Re: Mr. JAMES
2022-11-21 11:11 Denis Arefev
2022-11-21 14:28 ` Jason Yan
2023-01-18 20:59 [PATCH v5 0/5] CXL Poison List Retrieval & Tracing alison.schofield
2023-01-27  1:59 ` Dan Williams
2023-01-27 16:10   ` Alison Schofield
2023-01-27 19:16     ` Re: Dan Williams
2023-01-27 21:36       ` Re: Alison Schofield
2023-01-27 22:04         ` Re: Dan Williams
     [not found] <20230122193117.GA28689@Debian-50-lenny-64-minimal>
2023-01-22 21:42 ` Re: Alejandro Colomar
2023-01-24 20:01   ` Re: Helge Kreutzmann
2023-02-28  6:32 Re: Mahmut Akten
2023-03-12  6:52 [PATCH v2] uas: Add US_FL_NO_REPORT_OPCODES for JMicron JMS583Gen 2 Greg Kroah-Hartman
2023-03-27 13:54 ` Yaroslav Furman
2023-03-27 14:19   ` Greg Kroah-Hartman
2023-05-11 12:58 Ryan Roberts
2023-05-11 13:13 ` Ryan Roberts
2023-05-30  1:31 RE; Olena Shevchenko
2023-05-30  2:46 RE; Olena Shevchenko
     [not found] <CAKEZqKKdQ9EhRobSmq0sV76arfpk6m5XqA-=XQP_M3VRG=M-eg@mail.gmail.com>
2023-06-08  8:13 ` chenlei0x
     [not found] <010d01d999f4$257ae020$7070a060$@mirroredgenetworks.com>
     [not found] ` <CAEhhANphwWt5iOMc5Yqp1tT1HGoG_GsCuUWBWeVX4zxL6JwUiw@mail.gmail.com>
     [not found]   ` <CAEhhANom-MGPCqEk5LXufMkxvnoY0YRUrr0r07s0_7F=eCQH5Q@mail.gmail.com>
2023-06-08 10:51     ` Re: Daniel Little
2023-06-27 11:10 Alvaro a-m
2023-06-27 11:15 ` Michael Kjörling
     [not found] <64b09dbb.630a0220.e80b9.e2ed@mx.google.com>
2023-07-14  8:05 ` Re: Andy Shevchenko
     [not found] <TXJgqLzlM6oCfTXKSqrSBk@txt.att.net>
2023-08-09  5:12 ` Re: Luna Jernberg
     [not found] <c8d43894-7e66-4a01-88fc-10708dc53b6b@amd.com>
     [not found] ` <878r7z4kb4.ffs@tglx>
     [not found]   ` <e79dea49-0c07-4ca2-b359-97dd1bc579c8@amd.com>
     [not found]     ` <87ttqhcotn.ffs@tglx>
     [not found]       ` <87v8avawe0.ffs@tglx>
     [not found]         ` <32bcaa8a-0413-4aa4-97a0-189830da8654@amd.com>
     [not found]           ` <ZTkzYA3w2p3L4SVA@localhost>
     [not found]             ` <87jzra6235.ffs@tglx>
     [not found]               ` <875y2u5s8g.ffs@tglx>
2023-10-25 22:11                 ` Re: Mario Limonciello
2023-10-26  9:27                   ` Re: Thomas Gleixner
2023-11-30 21:51 [NDCTL PATCH v3 2/2] cxl: Add check for regions before disabling memdev Dave Jiang
2024-04-17  6:46 ` Yao Xingtao
2024-04-17 18:14   ` Verma, Vishal L
2024-04-22  7:26     ` Re: Xingtao Yao (Fujitsu)
2023-12-07  4:40 Emma Tebibyte
2023-12-07  5:00 ` Christoph Anton Mitterer
2023-12-07  5:29   ` Re: Lawrence Velázquez
2024-01-16  6:46 meir elisha
2024-01-16  7:05 ` Dan Carpenter
2024-01-18 22:19 [RFC] [PATCH 0/3] xfs: use large folios for buffers Dave Chinner
2024-01-22 10:13 ` Andi Kleen
2024-01-22 11:53   ` Dave Chinner
2024-01-24  0:14 [PATCH v3 0/7] net/gve: RSS Support for GVE Driver Joshua Washington
     [not found] ` <20240126173317.2779230-1-joshwash@google.com>
2024-01-31 14:58   ` Ferruh Yigit
2024-03-07  6:07 KR Kim
2024-03-07  8:01 ` Miquel Raynal
2024-03-08  1:27   ` Re: Kyeongrho.Kim
     [not found]   ` <SE2P216MB210205B301549661575720CC833A2@SE2P216MB2102.KORP216.PROD.OUTLOOK.COM>
2024-03-29  4:41     ` Re: Kyeongrho.Kim
2024-04-19 15:46 George Guo
2024-04-23 16:48 ` Greg KH
2024-04-23 14:21 [PATCH dovetail] x86: irq_pipeline: Add missing definition for !CONFIG_IRQ_PIPELINE Philippe Gerum
2024-04-24  8:58 ` Fabian Scheler
2024-04-24  9:02   ` Scheler, Fabian
     [not found] <CGME20240520102002epcas2p3d0944968114a664556cbd74d53beddee@epcas2p3.samsung.com>
2024-05-20 10:09 ` Minwoo Im
2024-05-20 13:34   ` Vincent Fu
2024-05-21  0:00     ` Re: Minwoo Im
2024-06-11 16:54 Jacob Pan
2024-06-12  2:04 ` Sean Christopherson
2024-06-12  2:55   ` Re: Xin Li
2024-06-26  6:11 Totoro W
2024-06-26  7:09 ` Eduard Zingerman
2024-07-06 11:20 [PATCH v2 09/60] i2c: cp2615: reword according to newest specification Wolfram Sang
2024-07-10  6:41 ` [PATCH v3] " Wolfram Sang
2024-07-10 17:51   ` Bence Csókás
2024-07-14 19:59 raschupkin.ri
2024-07-15 20:20 ` Joe Lawrence
2024-07-15 22:45   ` Re: Roman Rashchupkin
2024-07-16  9:28   ` Re: Nicolai Stange
     [not found]   ` <66963d60.170a0220.70a9a.8866SMTPIN_ADDED_BROKEN@mx.google.com>
2024-07-16  9:53     ` Re: Roman Rashchupkin
2024-07-25 14:52       ` Re: Joe Lawrence
2024-07-16 17:33 ` Re: Song Liu
2024-07-15 21:06 Phil Dennis-Jordan
2024-07-16  6:07 ` Akihiko Odaki
2024-07-17 11:16   ` Re: Phil Dennis-Jordan
2024-08-14  8:03 howard_wang
2024-08-14 15:04 ` Stephen Hemminger
2024-08-16 11:07 Xi Ruoyao
2024-08-19 12:40 ` Huacai Chen
2024-08-19 13:01   ` Re: Jason A. Donenfeld
2024-08-19 15:22     ` Re: Xi Ruoyao
2024-08-19 15:54       ` Re: Xi Ruoyao
2024-08-19 15:22   ` Re: Xi Ruoyao
2024-08-27  9:45 ` Re: Jason A. Donenfeld
2024-08-22 20:54 [PATCH 2/2] scsi: ufs: core: Fix the code for entering hibernation Bao D. Nguyen
2024-08-22 21:08 ` Bart Van Assche
2024-08-23 12:01   ` Manivannan Sadhasivam
2024-08-23 14:23     ` Bart Van Assche
2024-08-23 14:58       ` Manivannan Sadhasivam
2024-08-23 16:07         ` Bart Van Assche
2024-08-23 16:48           ` Manivannan Sadhasivam
2024-08-23 18:05             ` Bart Van Assche
2024-08-24  2:29               ` Manivannan Sadhasivam
2024-08-24  2:48                 ` Bart Van Assche
2024-08-24  3:03                   ` Manivannan Sadhasivam
2024-08-26  6:48                     ` Can Guo
2024-09-13 17:11 David Hunter
2024-09-13 20:39 ` Shuah Khan
2024-09-17  7:10 Akhil P Oommen
2024-09-17  7:24 ` Dmitry Baryshkov
2024-10-10 22:44 Re: PRIVATE
2024-10-15 22:48 Daniel Yang
2024-10-16  1:27 ` Jakub Kicinski
2024-10-17  9:09 Paulo Miguel Almeida
2024-10-17  9:12 ` Paulo Miguel Almeida
2024-11-23  1:39 the Hide
2024-11-23  7:32 ` Christoph Biedl
2024-11-25 19:23 Re: Robert Harewood
2024-11-25 20:13 Re: Robert Harewood
2025-01-08 13:59 Jiang Liu
2025-01-08 14:10 ` Christian König
2025-01-08 16:33 ` Re: Mario Limonciello
2025-01-09  5:34   ` Re: Gerry Liu
2025-01-09 17:10     ` Re: Mario Limonciello
2025-01-13  1:19       ` Re: Gerry Liu
2025-01-13 21:59         ` Re: Mario Limonciello
2025-04-18  7:46 Shung-Hsi Yu
2025-04-18  7:49 ` Shung-Hsi Yu
2025-04-23 17:30 ` Re: patchwork-bot+netdevbpf
2025-04-22  1:53 [PATCH bpf-next] bpf: Remove bpf_get_smp_processor_id_proto Alexei Starovoitov
2025-04-22  8:04 ` Feng Yang
2025-04-22 14:37   ` Alexei Starovoitov
2025-04-24  0:40 Cong Wang
2025-04-24  0:59 ` Jiayuan Chen
2025-04-24  9:19   ` Re: Jiayuan Chen
2025-05-09 17:38 Shawn Anastasio
2025-05-10 19:50 ` Trilok Soni
2025-05-14 20:21 Nicolas Pitre
2025-05-15  8:33 ` Jiri Slaby
2025-07-01 13:44 Emanuele Ghidoli
2025-07-11  2:21 ` Fabio Estevam
     [not found] <CADU64hCr7mshqfBRE2Wp8zf4BHBdJoLLH=VJt2MrHeR+zHOV4w@mail.gmail.com>
2025-07-20 18:26 ` >
2025-07-20 19:30   ` David Lechner
2025-07-20 19:30     ` Re: David Lechner
2025-07-21  6:52     ` Re: Krzysztof Kozlowski
2025-07-21  6:52       ` Re: Krzysztof Kozlowski
     [not found]       ` <CADU64hDZeyaCpHXBmSG1rtHjpxmjejT7asK9oGBUMF55eYeh4w@mail.gmail.com>
2025-07-21 14:09         ` Re: David Lechner
2025-07-21 14:09           ` Re: David Lechner
2025-07-21  7:52   ` Re: Andy Shevchenko
2025-07-21  7:52     ` Re: Andy Shevchenko
2025-08-06  3:34 Sang-Heon Jeon
2025-08-06  3:44 ` Sang-Heon Jeon
2025-08-12 13:34 Baoquan He
2025-08-12 13:49 ` Baoquan He
2025-08-13 15:48 [PATCH 6.15 000/480] 6.15.10-rc1 review Jon Hunter
2025-08-13 17:25 ` Jon Hunter
2025-08-14 15:36   ` Greg KH
2025-08-15 16:20     ` Re: Jon Hunter
2025-08-15 16:53       ` Re: Greg KH
2025-08-18 16:08 [PATCH] documentation/arm64 : kdump fixed typo errors Jonathan Corbet
2025-09-08  9:54 ` hariconscious
2025-09-08 13:23   ` Jonathan Corbet
2025-08-20 14:33 Christian König
2025-08-20 15:23 ` David Hildenbrand
2025-08-21  8:10   ` Re: Christian König
2025-08-25 19:10     ` Re: David Hildenbrand
2025-08-26  8:38       ` Re: Christian König
2025-08-26  8:46         ` Re: David Hildenbrand
2025-08-26  9:00           ` Re: Christian König
2025-08-26  9:17             ` Re: David Hildenbrand
2025-08-26  9:56               ` Re: Christian König
2025-08-26 12:07                 ` Re: David Hildenbrand
2025-08-26 16:09                   ` Re: Christian König
2025-08-26 14:27                 ` Re: Thomas Hellström
2025-08-26 12:37         ` Re: David Hildenbrand
2025-08-23 22:53 [RFC PATCH] time: introduce BOOT_TIME_TRACKER and minimal boot timestamp Thomas Gleixner
2025-09-01  4:05 ` Kaiwan N Billimoria
2025-09-01  5:57   ` Kaiwan N Billimoria
2025-08-27  6:48 [PATCH] net/netfilter/ipvs: Fix data-race in ip_vs_add_service / ip_vs_out_hook Julian Anastasov
2025-08-27 14:43 ` Zhang Tengfei
2025-08-27 21:37   ` Pablo Neira Ayuso
2025-08-29  2:01 xinpeng.wang
2025-08-29  2:42 ` bluez.test.bot
2025-09-15 19:52 Yury Norov (NVIDIA)
2025-09-16 14:48 ` Simon Horman
2025-09-16 15:22   ` Re: Yury Norov
2025-09-16 21:23 Jay Vosburgh
2025-09-16 21:56 ` Jay Vosburgh
2025-10-05 14:16 ssrane_b23
2025-10-05 14:16 ` syzbot
2025-10-06 10:51 [PATCH] counter: microchip-tcb-capture: Allow shared IRQ for multi-channel TCBs Dharma Balasubiramani
2025-10-08  7:06 ` Kamel Bouhara
2025-10-08 20:46   ` Bence Csókás
2025-11-04  9:22 Michael Roach
2025-11-04 10:24 ` Kristoffer Haugsbakk
2025-11-05 14:55   ` Re: Lucas Seiki Oshiro
2025-11-05 15:01     ` Re: Kristoffer Haugsbakk
2025-11-06  0:05       ` Re: Lucas Seiki Oshiro
2025-11-06  8:09         ` Re: Michael Roach
2025-11-05  3:38 niklaus.liu
2025-11-05  8:56 ` AngeloGioacchino Del Regno
2026-01-11 21:10 Wesley B
2026-01-12 13:28 ` Miguel Ojeda
2026-02-02 10:53 Anshumali Gaur
2026-02-03  0:34 ` Jacob Keller
2026-02-25 19:40 [PATCH v5 0/3] iio: frequency: ad9523: fix checkpatch warnings Bhargav Joshi
2026-02-25 19:40 ` Bhargav Joshi
2026-02-25 19:43   ` Andy Shevchenko
2026-03-13 10:11 ezcap401 needs USB_QUIRK_NO_BOS to function on 10gbs usb speed Greg KH
2026-03-13 11:01 ` Vyacheslav Vahnenko
2026-03-13 12:04   ` Greg KH
2026-03-31 10:05 [PATCH] net: ns83820: check DMA mapping errors in hard_start_xmit Paolo Abeni
2026-03-31 11:14 ` Wang Jun
2026-03-31 12:09   ` Eric Dumazet
2026-04-12  6:24 Re; Erick Lorch
2026-04-12 10:09 Re; Erick Lorch
2026-04-12 13:42 Re; Erick Lorch
2026-04-23 16:06 [PATCH] usb: cdns3: gadget: fix request skipping after clearing halt Yongchao Wu
2026-04-27  1:22 ` Peter Chen (CIX)
2026-04-27  9:01   ` Pawel Laszczak
2026-04-27 22:59     ` Peter Chen (CIX)
2026-04-27 23:59       ` Yongchao Wu
2026-04-28  9:58         ` Pawel Laszczak
2026-04-28 14:48           ` Yongchao Wu
2026-05-04  9:15             ` Pawel Laszczak
2026-04-28 18:24 Fabio M. De Francesco
2026-05-01 22:01 ` Dave Jiang
2026-05-09 18:01 Andrea Righi
2026-05-09 18:07 ` Andrea Righi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.