public inbox for linux-kernel@vger.kernel.org
* spinlock_irqsave() && flags (Was: pm80xx: Spinlock fix)
       [not found]               ` <20131223163410.GA28220@redhat.com>
@ 2013-12-23 17:27                 ` Oleg Nesterov
  2013-12-23 18:12                   ` Linus Torvalds
  2013-12-23 18:23                   ` Ingo Molnar
  0 siblings, 2 replies; 12+ messages in thread
From: Oleg Nesterov @ 2013-12-23 17:27 UTC (permalink / raw)
  To: Jason Seba, Peter Zijlstra, Ingo Molnar, Linus Torvalds
  Cc: Tomas Henzl, Jack Wang, Suresh Thiagarajan, Viswas G,
	linux-scsi@vger.kernel.org, JBottomley@parallels.com,
	Vasanthalakshmi Tharmarajan, linux-kernel

On 12/23, Oleg Nesterov wrote:
>
> Perhaps we should ask the maintainers upstream? Even if this works, I am
> not sure this is _supposed_ to work. I mean, in theory spin_lock_irqsave()
> can be changed to, say
>
> 	#define spin_lock_irqsave(lock, flags)		\
> 		do {    				\
> 			local_irq_save(flags);		\
> 			spin_lock(lock);		\
> 		} while (0)
>
> (and iirc it was defined this way a long ago). In this case "flags" is
> obviously not protected.

Yes, let's ask the maintainers.

In short, is this code

	spinlock_t LOCK;
	unsigned long FLAGS;

	void my_lock(void)
	{
		spin_lock_irqsave(&LOCK, FLAGS);
	}

	void my_unlock(void)
	{
		spin_unlock_irqrestore(&LOCK, FLAGS);
	}

correct or not?

Initially I thought that this is obviously wrong, irqsave/irqrestore
assume that "flags" is owned by the caller, not by the lock. And iirc
this was certainly wrong in the past.

But when I look at spinlock.c it seems that this code can actually work.
_irqsave() writes to FLAGS after it takes the lock, and _irqrestore()
has a copy of FLAGS before it drops this lock.

And it turns out, some users assume this should work, for example

	arch/arm/mach-omap2/powerdomain.c:
		pwrdm_lock() and pwrdm_unlock()

	drivers/net/wireless/brcm80211/brcmfmac/fwsignal.c:
		brcmf_fws_lock() and brcmf_fws_unlock()

seem to do exactly this. Plus the pending patch for drivers/scsi/pm8001/.

So is it documented somewhere that this sequence is correct, or should
the code above be changed even if it happens to work?

Oleg.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: spinlock_irqsave() && flags (Was: pm80xx: Spinlock fix)
  2013-12-23 17:27                 ` spinlock_irqsave() && flags (Was: pm80xx: Spinlock fix) Oleg Nesterov
@ 2013-12-23 18:12                   ` Linus Torvalds
  2013-12-23 18:24                     ` Oleg Nesterov
  2013-12-23 18:23                   ` Ingo Molnar
  1 sibling, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2013-12-23 18:12 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Jason Seba, Peter Zijlstra, Ingo Molnar, Tomas Henzl, Jack Wang,
	Suresh Thiagarajan, Viswas G, linux-scsi@vger.kernel.org,
	JBottomley@parallels.com, Vasanthalakshmi Tharmarajan,
	Linux Kernel Mailing List

On Mon, Dec 23, 2013 at 9:27 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>
> In short, is this code
>
>         spinlock_t LOCK;
>         unsigned long FLAGS;
>
>         void my_lock(void)
>         {
>                 spin_lock_irqsave(&LOCK, FLAGS);
>         }
>
>         void my_unlock(void)
>         {
>                 spin_unlock_irqrestore(&LOCK, FLAGS);
>         }
>
> correct or not?

Hell no. "flags" needs to be a thread-private variable, or at least
protected some way (ie the above could work if everything is inside a
bigger lock, to serialize access to FLAGS).

                    Linus


* Re: spinlock_irqsave() && flags (Was: pm80xx: Spinlock fix)
  2013-12-23 17:27                 ` spinlock_irqsave() && flags (Was: pm80xx: Spinlock fix) Oleg Nesterov
  2013-12-23 18:12                   ` Linus Torvalds
@ 2013-12-23 18:23                   ` Ingo Molnar
  2013-12-23 18:33                     ` Oleg Nesterov
  1 sibling, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2013-12-23 18:23 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Jason Seba, Peter Zijlstra, Ingo Molnar, Linus Torvalds,
	Tomas Henzl, Jack Wang, Suresh Thiagarajan, Viswas G,
	linux-scsi@vger.kernel.org, JBottomley@parallels.com,
	Vasanthalakshmi Tharmarajan, linux-kernel


* Oleg Nesterov <oleg@redhat.com> wrote:

> On 12/23, Oleg Nesterov wrote:
> >
> > Perhaps we should ask the maintainers upstream? Even if this works, I am
> > not sure this is _supposed_ to work. I mean, in theory spin_lock_irqsave()
> > can be changed to, say
> >
> > 	#define spin_lock_irqsave(lock, flags)		\
> > 		do {    				\
> > 			local_irq_save(flags);		\
> > 			spin_lock(lock);		\
> > 		} while (0)
> >
> > (and iirc it was defined this way a long ago). In this case "flags" is
> > obviously not protected.
> 
> Yes, let's ask the maintainers.
> 
> In short, is this code
> 
> 	spinlock_t LOCK;
> 	unsigned long FLAGS;
> 
> 	void my_lock(void)
> 	{
> 		spin_lock_irqsave(&LOCK, FLAGS);
> 	}
> 
> 	void my_unlock(void)
> 	{
> 		spin_unlock_irqrestore(&LOCK, FLAGS);
> 	}
> 
> correct or not?
> 
> Initially I thought that this is obviously wrong, irqsave/irqrestore 
> assume that "flags" is owned by the caller, not by the lock. And 
> iirc this was certainly wrong in the past.
> 
> But when I look at spinlock.c it seems that this code can actually 
> work. _irqsave() writes to FLAGS after it takes the lock, and 
> _irqrestore() has a copy of FLAGS before it drops this lock.

I don't think that's true: if it was then the lock would not be 
irqsave, a hardware-irq could come in after the lock has been taken 
and before flags are saved+disabled.

So AFAICS this is an unsafe pattern, beyond being ugly as hell.

Thanks,

	Ingo


* Re: spinlock_irqsave() && flags (Was: pm80xx: Spinlock fix)
  2013-12-23 18:12                   ` Linus Torvalds
@ 2013-12-23 18:24                     ` Oleg Nesterov
  2013-12-23 18:43                       ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2013-12-23 18:24 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jason Seba, Peter Zijlstra, Ingo Molnar, Tomas Henzl, Jack Wang,
	Suresh Thiagarajan, Viswas G, linux-scsi@vger.kernel.org,
	JBottomley@parallels.com, Vasanthalakshmi Tharmarajan,
	Linux Kernel Mailing List

On 12/23, Linus Torvalds wrote:
>
> On Mon, Dec 23, 2013 at 9:27 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > In short, is this code
> >
> >         spinlock_t LOCK;
> >         unsigned long FLAGS;
> >
> >         void my_lock(void)
> >         {
> >                 spin_lock_irqsave(&LOCK, FLAGS);
> >         }
> >
> >         void my_unlock(void)
> >         {
> >                 spin_unlock_irqrestore(&LOCK, FLAGS);
> >         }
> >
> > correct or not?
>
> Hell no. "flags" needs to be a thread-private variable, or at least
> protected some way (ie the above could work if everything is inside a
> bigger lock, to serialize access to FLAGS).

This was my understanding (although, once again, it seems to me this can
surprisingly work with the current implementation).

However, the code above already has users. Do you think it makes
sense to add something like

	void spin_lock_irqsave_careful(spinlock_t *lock, unsigned long *flags)
	{
		unsigned long _flags;
		spin_lock_irqsave(lock, _flags);
		*flags = _flags;
	}

	void spin_unlock_irqrestore_careful(spinlock_t *lock, unsigned long *flags)
	{
		unsigned long _flags = *flags;
		spin_unlock_irqrestore(lock, _flags);
	}

into include/linux/spinlock.h ?

Oleg.



* Re: spinlock_irqsave() && flags (Was: pm80xx: Spinlock fix)
  2013-12-23 18:23                   ` Ingo Molnar
@ 2013-12-23 18:33                     ` Oleg Nesterov
  2013-12-24  8:29                       ` Ingo Molnar
  0 siblings, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2013-12-23 18:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jason Seba, Peter Zijlstra, Ingo Molnar, Linus Torvalds,
	Tomas Henzl, Jack Wang, Suresh Thiagarajan, Viswas G,
	linux-scsi@vger.kernel.org, JBottomley@parallels.com,
	Vasanthalakshmi Tharmarajan, linux-kernel

On 12/23, Ingo Molnar wrote:
>
> * Oleg Nesterov <oleg@redhat.com> wrote:
>
> > Initially I thought that this is obviously wrong, irqsave/irqrestore
> > assume that "flags" is owned by the caller, not by the lock. And
> > iirc this was certainly wrong in the past.
> >
> > But when I look at spinlock.c it seems that this code can actually
> > work. _irqsave() writes to FLAGS after it takes the lock, and
> > _irqrestore() has a copy of FLAGS before it drops this lock.
>
> I don't think that's true: if it was then the lock would not be
> irqsave, a hardware-irq could come in after the lock has been taken
> and before flags are saved+disabled.

I do agree that this pattern is not safe, that is why I decided to ask.

But, unless I missed something, with the current implementation
spin_lock_irqsave(lock, global_flags) does:

	unsigned long local_flags;

	local_irq_save(local_flags);
	spin_lock(lock);

	global_flags = local_flags;

so the access to global_flags is actually serialized by lock.

> So AFAICS this is an unsafe pattern, beyond being ugly as hell.

Yes, I think the same.

Oleg.



* Re: spinlock_irqsave() && flags (Was: pm80xx: Spinlock fix)
  2013-12-23 18:24                     ` Oleg Nesterov
@ 2013-12-23 18:43                       ` Linus Torvalds
  0 siblings, 0 replies; 12+ messages in thread
From: Linus Torvalds @ 2013-12-23 18:43 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Jason Seba, Peter Zijlstra, Ingo Molnar, Tomas Henzl, Jack Wang,
	Suresh Thiagarajan, Viswas G, linux-scsi@vger.kernel.org,
	JBottomley@parallels.com, Vasanthalakshmi Tharmarajan,
	Linux Kernel Mailing List

On Mon, Dec 23, 2013 at 10:24 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>
> However, the code above already has the users. Do you think it makes
> sense to add something like

No. I think it makes sense to put a big warning on any users you find,
and fart in the general direction of any developer who did that broken
pattern.

Because code like that is obviously crap.

                Linus


* Re: spinlock_irqsave() && flags (Was: pm80xx: Spinlock fix)
  2013-12-23 18:33                     ` Oleg Nesterov
@ 2013-12-24  8:29                       ` Ingo Molnar
  2013-12-24  9:13                         ` Suresh Thiagarajan
  0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2013-12-24  8:29 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Jason Seba, Peter Zijlstra, Ingo Molnar, Linus Torvalds,
	Tomas Henzl, Jack Wang, Suresh Thiagarajan, Viswas G,
	linux-scsi@vger.kernel.org, JBottomley@parallels.com,
	Vasanthalakshmi Tharmarajan, linux-kernel


* Oleg Nesterov <oleg@redhat.com> wrote:

> On 12/23, Ingo Molnar wrote:
> >
> > * Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > > Initially I thought that this is obviously wrong, irqsave/irqrestore
> > > assume that "flags" is owned by the caller, not by the lock. And
> > > iirc this was certainly wrong in the past.
> > >
> > > But when I look at spinlock.c it seems that this code can actually
> > > work. _irqsave() writes to FLAGS after it takes the lock, and
> > > _irqrestore() has a copy of FLAGS before it drops this lock.
> >
> > I don't think that's true: if it was then the lock would not be
> > irqsave, a hardware-irq could come in after the lock has been taken
> > and before flags are saved+disabled.
> 
> I do agree that this pattern is not safe, that is why I decided to ask.
> 
> But, unless I missed something, with the current implementation
> spin_lock_irqsave(lock, global_flags) does:
> 
> 	unsigned long local_flags;
> 
> 	local_irq_save(local_flags);
> 	spin_lock(lock);
> 
> 	global_flags = local_flags;
> 
> so the access to global_flags is actually serialized by lock.

You are right, today that's technically true because, IIRC due to Sparc 
quirks, we happen to return 'flags' as a return value - still it's very 
ugly and it could break at any time if we decide to do more aggressive 
optimizations and actually save directly into 'flags'.

Note that even today there's a narrow exception: on UP we happen to 
build it the other way around, so that we do:

	local_irq_save(global_flags);
	__acquire(lock);

This does not matter for any real code because on UP there is no 
physical lock and __acquire() is empty code-wise, but any compiler 
driven locking analysis tool using __attribute__ __context__(), if 
built on UP, would see the unsafe locking pattern.

Thanks,

	Ingo


* RE: spinlock_irqsave() && flags (Was: pm80xx: Spinlock fix)
  2013-12-24  8:29                       ` Ingo Molnar
@ 2013-12-24  9:13                         ` Suresh Thiagarajan
  2013-12-24 17:29                           ` James Bottomley
  2013-12-27 16:18                           ` Oleg Nesterov
  0 siblings, 2 replies; 12+ messages in thread
From: Suresh Thiagarajan @ 2013-12-24  9:13 UTC (permalink / raw)
  To: Ingo Molnar, Oleg Nesterov
  Cc: Jason Seba, Peter Zijlstra, Ingo Molnar, Linus Torvalds,
	Tomas Henzl, Jack Wang, Viswas G, linux-scsi@vger.kernel.org,
	JBottomley@parallels.com, Vasanthalakshmi Tharmarajan,
	linux-kernel@vger.kernel.org



On Tue, Dec 24, 2013 at 1:59 PM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Oleg Nesterov <oleg@redhat.com> wrote:
>
>> On 12/23, Ingo Molnar wrote:
>> >
>> > * Oleg Nesterov <oleg@redhat.com> wrote:
>> >
>> > > Initially I thought that this is obviously wrong, irqsave/irqrestore
>> > > assume that "flags" is owned by the caller, not by the lock. And
>> > > iirc this was certainly wrong in the past.
>> > >
>> > > But when I look at spinlock.c it seems that this code can actually
>> > > work. _irqsave() writes to FLAGS after it takes the lock, and
>> > > _irqrestore() has a copy of FLAGS before it drops this lock.
>> >
>> > I don't think that's true: if it was then the lock would not be
>> > irqsave, a hardware-irq could come in after the lock has been taken
>> > and before flags are saved+disabled.
>>
>> I do agree that this pattern is not safe, that is why I decided to ask.
>>
>> But, unless I missed something, with the current implementation
>> spin_lock_irqsave(lock, global_flags) does:
>>
>>       unsigned long local_flags;
>>
>>       local_irq_save(local_flags);
>>       spin_lock(lock);
>>
>>       global_flags = local_flags;
>>
>> so the access to global_flags is actually serialized by lock.

Below is a small pseudo-code example of protecting/serializing the flag for global access.
struct temp
{
	...
	spinlock_t lock;
	unsigned long lock_flags;
};
void my_lock(struct temp *t)
{
               unsigned long flag; // thread-private variable as suggested
               spin_lock_irqsave(&t->lock, flag);
               t->lock_flags = flag; //updating inside critical section now to serialize the access to flag
}

void my_unlock(struct temp *t)
{
               unsigned long flag = t->lock_flags;
               t->lock_flags = 0;  //clearing it before getting out of critical section
               spin_unlock_irqrestore(&t->lock, flag);
}

Here for unlocking, I could even use spin_unlock_irqrestore(&t->lock,
t->lock_flags) directly instead of my_unlock(), since t->lock_flags is
updated only in my_lock() and so there is no need to explicitly clear
t->lock_flags. Please let me know if I have missed anything here in
serializing the global lock flag.

Thanks,
Suresh
>
> You are right, today that's true technically because IIRC due to Sparc
> quirks we happen to return 'flags' as a return value - still it's very
> ugly and it could break anytime if we decide to do more aggressive
> optimizations and actually directly save into 'flags'.
>
> Note that even today there's a narrow exception: on UP we happen to
> build it the other way around, so that we do:
>
>         local_irq_save(global_flags);
>         __acquire(lock);
>
> This does not matter for any real code because on UP there is no
> physical lock and __acquire() is empty code-wise, but any compiler
> driven locking analysis tool using __attribute__ __context__(), if
> built on UP, would see the unsafe locking pattern.
>
> Thanks,
>
>         Ingo


* Re: spinlock_irqsave() && flags (Was: pm80xx: Spinlock fix)
  2013-12-24  9:13                         ` Suresh Thiagarajan
@ 2013-12-24 17:29                           ` James Bottomley
  2013-12-27 16:18                           ` Oleg Nesterov
  1 sibling, 0 replies; 12+ messages in thread
From: James Bottomley @ 2013-12-24 17:29 UTC (permalink / raw)
  To: Suresh Thiagarajan
  Cc: Ingo Molnar, Oleg Nesterov, Jason Seba, Peter Zijlstra,
	Ingo Molnar, Linus Torvalds, Tomas Henzl, Jack Wang, Viswas G,
	linux-scsi@vger.kernel.org, Vasanthalakshmi Tharmarajan,
	linux-kernel@vger.kernel.org

On Tue, 2013-12-24 at 09:13 +0000, Suresh Thiagarajan wrote:
> 
> On Tue, Dec 24, 2013 at 1:59 PM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > * Oleg Nesterov <oleg@redhat.com> wrote:
> >
> >> On 12/23, Ingo Molnar wrote:
> >> >
> >> > * Oleg Nesterov <oleg@redhat.com> wrote:
> >> >
> >> > > Initially I thought that this is obviously wrong, irqsave/irqrestore
> >> > > assume that "flags" is owned by the caller, not by the lock. And
> >> > > iirc this was certainly wrong in the past.
> >> > >
> >> > > But when I look at spinlock.c it seems that this code can actually
> >> > > work. _irqsave() writes to FLAGS after it takes the lock, and
> >> > > _irqrestore() has a copy of FLAGS before it drops this lock.
> >> >
> >> > I don't think that's true: if it was then the lock would not be
> >> > irqsave, a hardware-irq could come in after the lock has been taken
> >> > and before flags are saved+disabled.
> >>
> >> I do agree that this pattern is not safe, that is why I decided to ask.
> >>
> >> But, unless I missed something, with the current implementation
> >> spin_lock_irqsave(lock, global_flags) does:
> >>
> >>       unsigned long local_flags;
> >>
> >>       local_irq_save(local_flags);
> >>       spin_lock(lock);
> >>
> >>       global_flags = local_flags;
> >>
> >> so the access to global_flags is actually serialized by lock.
> 
> Below is a small pseudo code on protecting/serializing the flag for global access.
> struct temp
> {
> 	...
> 	spinlock_t lock;
> 	unsigned long lock_flags;
> };
> void my_lock(struct temp *t)
> {
>                unsigned long flag; // thread-private variable as suggested
>                spin_lock_irqsave(&t->lock, flag);
>                t->lock_flags = flag; //updating inside critical section now to serialize the access to flag
> }
> 
> void my_unlock(struct temp *t)
> {
>                unsigned long flag = t->lock_flags;
>                t->lock_flags = 0;  //clearing it before getting out of critical section
>                spin_unlock_irqrestore(&t->lock, flag);
> }
> 
> Here for unlocking, I could even use spin_unlock_irqrestore(&t->lock,
> t->lock_flags) directly instead of my_unlock() since t->lock_flags is
> updated only in my_lock and so there is no need to explicitly clear
> t->lock_flags. Please let me know if I miss anything here in
> serializing the global lock flag.

I don't think anyone's arguing that you can't do this.  The argument is
more you shouldn't: Lock contention is one of the biggest killers of
performance and getting locking right (avoiding inversions, or even
nested locks at all) is hard.  Therefore you need to keep the shortest
and clearest critical sections in Linux that you can, so the pattern for
locking is to lock and unlock in the same routine with a local flags
variable.

So, rather than develop a primitive that supports and encourages uses of
locking that violate the pattern and is far more likely to cause
performance and other problems, could you just audit the code to see if
the need to carry a lock across a routine could be eliminated (either by
shortening the critical section or by better nesting the calls)?

Thanks,

James




* Re: spinlock_irqsave() && flags (Was: pm80xx: Spinlock fix)
  2013-12-24  9:13                         ` Suresh Thiagarajan
  2013-12-24 17:29                           ` James Bottomley
@ 2013-12-27 16:18                           ` Oleg Nesterov
  2014-01-02 10:31                             ` Suresh Thiagarajan
  1 sibling, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2013-12-27 16:18 UTC (permalink / raw)
  To: Suresh Thiagarajan
  Cc: Ingo Molnar, Jason Seba, Peter Zijlstra, Ingo Molnar,
	Linus Torvalds, Tomas Henzl, Jack Wang, Viswas G,
	linux-scsi@vger.kernel.org, JBottomley@parallels.com,
	Vasanthalakshmi Tharmarajan, linux-kernel@vger.kernel.org

On 12/24, Suresh Thiagarajan wrote:
>
> Below is a small pseudo code on protecting/serializing the flag for global access.
> struct temp
> {
> 	...
> 	spinlock_t lock;
> 	unsigned long lock_flags;
> };
> void my_lock(struct temp *t)
> {
>                unsigned long flag; // thread-private variable as suggested
>                spin_lock_irqsave(&t->lock, flag);
>                t->lock_flags = flag; //updating inside critical section now to serialize the access to flag
> }
>
> void my_unlock(struct temp *t)
> {
>                unsigned long flag = t->lock_flags;
>                t->lock_flags = 0;  //clearing it before getting out of critical section
>                spin_unlock_irqrestore(&t->lock, flag);
> }

Yes, this should work as a quick fix. And you do not need to clear ->lock_flags
in my_unlock().

But when I look at the original patch again, I no longer understand why
you need pm8001_ha->lock_flags at all. Of course I do not understand this
code, I am sure I missed something, but at first glance it seems that only
this sequence

	spin_unlock_irq(&pm8001_ha->lock);
	t->task_done(t);
	spin_lock_irq(&pm8001_ha->lock);

should be fixed?

If yes, why can't you simply do spin_unlock() + spin_lock() around
t->task_done()? This won't enable irqs, but spin_unlock_irqrestore()
doesn't necessarily enable irqs either, so ->task_done() can run with
irqs disabled anyway?

And note that the pattern above has a lot of users; perhaps it makes
sense to start with something like the patch below?

Hmm, there is another oddity in this code; for example, mpi_sata_completion()
does

	} else if (t->uldd_task) {
		spin_unlock_irqrestore(&t->task_state_lock, flags);
		pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
		mb();/* ditto */
		spin_unlock_irq(&pm8001_ha->lock);
		t->task_done(t);
		spin_lock_irq(&pm8001_ha->lock);
	} else if (!t->uldd_task) {
		spin_unlock_irqrestore(&t->task_state_lock, flags);
		pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
		mb();/*ditto*/
		spin_unlock_irq(&pm8001_ha->lock);
		t->task_done(t);
		spin_lock_irq(&pm8001_ha->lock);
	}

and both branches run the same code? The same is true of mpi_sata_event()
and pm8001_chip_sata_req().

Imho, whatever you do, the changelog should tell more.

Oleg.


--- x/drivers/scsi/pm8001/pm8001_sas.h
+++ x/drivers/scsi/pm8001/pm8001_sas.h
@@ -619,6 +619,20 @@ u32 pm8001_get_ncq_tag(struct sas_task *
 void pm8001_ccb_free(struct pm8001_hba_info *pm8001_ha, u32 ccb_idx);
 void pm8001_ccb_task_free(struct pm8001_hba_info *pm8001_ha,
 	struct sas_task *task, struct pm8001_ccb_info *ccb, u32 ccb_idx);
+
+static inline void
+pm8001_ccb_task_free_done(struct pm8001_hba_info *pm8001_ha,
+			struct sas_task *task, struct pm8001_ccb_info *ccb,
+			u32 ccb_idx)
+{
+	pm8001_ccb_task_free(pm8001_ha, task, ccb, ccb_idx);
+	smp_mb(); /* comment */
+	spin_unlock(&pm8001_ha->lock);
+	task->task_done(task);
+	spin_lock(&pm8001_ha->lock);
+}
+
+
 int pm8001_phy_control(struct asd_sas_phy *sas_phy, enum phy_func func,
 	void *funcdata);
 void pm8001_scan_start(struct Scsi_Host *shost);
--- x/drivers/scsi/pm8001/pm8001_hwi.c
+++ x/drivers/scsi/pm8001/pm8001_hwi.c
@@ -2502,11 +2502,7 @@ mpi_sata_completion(struct pm8001_hba_in
 				IO_OPEN_CNX_ERROR_IT_NEXUS_LOSS);
 			ts->resp = SAS_TASK_UNDELIVERED;
 			ts->stat = SAS_QUEUE_FULL;
-			pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-			mb();/*in order to force CPU ordering*/
-			spin_unlock_irq(&pm8001_ha->lock);
-			t->task_done(t);
-			spin_lock_irq(&pm8001_ha->lock);
+			pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
 			return;
 		}
 		break;
@@ -2522,11 +2518,7 @@ mpi_sata_completion(struct pm8001_hba_in
 				IO_OPEN_CNX_ERROR_IT_NEXUS_LOSS);
 			ts->resp = SAS_TASK_UNDELIVERED;
 			ts->stat = SAS_QUEUE_FULL;
-			pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-			mb();/*ditto*/
-			spin_unlock_irq(&pm8001_ha->lock);
-			t->task_done(t);
-			spin_lock_irq(&pm8001_ha->lock);
+			pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
 			return;
 		}
 		break;
@@ -2550,11 +2542,7 @@ mpi_sata_completion(struct pm8001_hba_in
 				IO_OPEN_CNX_ERROR_STP_RESOURCES_BUSY);
 			ts->resp = SAS_TASK_UNDELIVERED;
 			ts->stat = SAS_QUEUE_FULL;
-			pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-			mb();/* ditto*/
-			spin_unlock_irq(&pm8001_ha->lock);
-			t->task_done(t);
-			spin_lock_irq(&pm8001_ha->lock);
+			pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
 			return;
 		}
 		break;
@@ -2617,11 +2605,7 @@ mpi_sata_completion(struct pm8001_hba_in
 				    IO_DS_NON_OPERATIONAL);
 			ts->resp = SAS_TASK_UNDELIVERED;
 			ts->stat = SAS_QUEUE_FULL;
-			pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-			mb();/*ditto*/
-			spin_unlock_irq(&pm8001_ha->lock);
-			t->task_done(t);
-			spin_lock_irq(&pm8001_ha->lock);
+			pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
 			return;
 		}
 		break;
@@ -2641,11 +2625,7 @@ mpi_sata_completion(struct pm8001_hba_in
 				    IO_DS_IN_ERROR);
 			ts->resp = SAS_TASK_UNDELIVERED;
 			ts->stat = SAS_QUEUE_FULL;
-			pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-			mb();/*ditto*/
-			spin_unlock_irq(&pm8001_ha->lock);
-			t->task_done(t);
-			spin_lock_irq(&pm8001_ha->lock);
+			pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
 			return;
 		}
 		break;
@@ -2674,20 +2654,9 @@ mpi_sata_completion(struct pm8001_hba_in
 			" resp 0x%x stat 0x%x but aborted by upper layer!\n",
 			t, status, ts->resp, ts->stat));
 		pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-	} else if (t->uldd_task) {
-		spin_unlock_irqrestore(&t->task_state_lock, flags);
-		pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-		mb();/* ditto */
-		spin_unlock_irq(&pm8001_ha->lock);
-		t->task_done(t);
-		spin_lock_irq(&pm8001_ha->lock);
-	} else if (!t->uldd_task) {
+	} else {
 		spin_unlock_irqrestore(&t->task_state_lock, flags);
-		pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-		mb();/*ditto*/
-		spin_unlock_irq(&pm8001_ha->lock);
-		t->task_done(t);
-		spin_lock_irq(&pm8001_ha->lock);
+		pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
 	}
 }
 
@@ -2796,11 +2765,7 @@ static void mpi_sata_event(struct pm8001
 				IO_OPEN_CNX_ERROR_IT_NEXUS_LOSS);
 			ts->resp = SAS_TASK_COMPLETE;
 			ts->stat = SAS_QUEUE_FULL;
-			pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-			mb();/*ditto*/
-			spin_unlock_irq(&pm8001_ha->lock);
-			t->task_done(t);
-			spin_lock_irq(&pm8001_ha->lock);
+			pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
 			return;
 		}
 		break;
@@ -2909,20 +2874,9 @@ static void mpi_sata_event(struct pm8001
 			" resp 0x%x stat 0x%x but aborted by upper layer!\n",
 			t, event, ts->resp, ts->stat));
 		pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-	} else if (t->uldd_task) {
-		spin_unlock_irqrestore(&t->task_state_lock, flags);
-		pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-		mb();/* ditto */
-		spin_unlock_irq(&pm8001_ha->lock);
-		t->task_done(t);
-		spin_lock_irq(&pm8001_ha->lock);
-	} else if (!t->uldd_task) {
+	} else {
 		spin_unlock_irqrestore(&t->task_state_lock, flags);
-		pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-		mb();/*ditto*/
-		spin_unlock_irq(&pm8001_ha->lock);
-		t->task_done(t);
-		spin_lock_irq(&pm8001_ha->lock);
+		pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
 	}
 }
 
@@ -4467,23 +4421,10 @@ static int pm8001_chip_sata_req(struct p
 					" stat 0x%x but aborted by upper layer "
 					"\n", task, ts->resp, ts->stat));
 				pm8001_ccb_task_free(pm8001_ha, task, ccb, tag);
-			} else if (task->uldd_task) {
-				spin_unlock_irqrestore(&task->task_state_lock,
-							flags);
-				pm8001_ccb_task_free(pm8001_ha, task, ccb, tag);
-				mb();/* ditto */
-				spin_unlock_irq(&pm8001_ha->lock);
-				task->task_done(task);
-				spin_lock_irq(&pm8001_ha->lock);
-				return 0;
-			} else if (!task->uldd_task) {
+			} else {
 				spin_unlock_irqrestore(&task->task_state_lock,
 							flags);
-				pm8001_ccb_task_free(pm8001_ha, task, ccb, tag);
-				mb();/*ditto*/
-				spin_unlock_irq(&pm8001_ha->lock);
-				task->task_done(task);
-				spin_lock_irq(&pm8001_ha->lock);
+				pm8001_ccb_task_free_done(pm8001_ha, task, ccb, tag);
 				return 0;
 			}
 		}
--- x/drivers/scsi/pm8001/pm80xx_hwi.c
+++ x/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -2175,11 +2175,7 @@ mpi_sata_completion(struct pm8001_hba_in
 				IO_OPEN_CNX_ERROR_IT_NEXUS_LOSS);
 			ts->resp = SAS_TASK_UNDELIVERED;
 			ts->stat = SAS_QUEUE_FULL;
-			pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-			mb();/*in order to force CPU ordering*/
-			spin_unlock_irq(&pm8001_ha->lock);
-			t->task_done(t);
-			spin_lock_irq(&pm8001_ha->lock);
+			pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
 			return;
 		}
 		break;
@@ -2195,11 +2191,7 @@ mpi_sata_completion(struct pm8001_hba_in
 				IO_OPEN_CNX_ERROR_IT_NEXUS_LOSS);
 			ts->resp = SAS_TASK_UNDELIVERED;
 			ts->stat = SAS_QUEUE_FULL;
-			pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-			mb();/*ditto*/
-			spin_unlock_irq(&pm8001_ha->lock);
-			t->task_done(t);
-			spin_lock_irq(&pm8001_ha->lock);
+			pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
 			return;
 		}
 		break;
@@ -2221,11 +2213,7 @@ mpi_sata_completion(struct pm8001_hba_in
 				IO_OPEN_CNX_ERROR_STP_RESOURCES_BUSY);
 			ts->resp = SAS_TASK_UNDELIVERED;
 			ts->stat = SAS_QUEUE_FULL;
-			pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-			mb();/* ditto*/
-			spin_unlock_irq(&pm8001_ha->lock);
-			t->task_done(t);
-			spin_lock_irq(&pm8001_ha->lock);
+			pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
 			return;
 		}
 		break;
@@ -2288,11 +2276,7 @@ mpi_sata_completion(struct pm8001_hba_in
 					IO_DS_NON_OPERATIONAL);
 			ts->resp = SAS_TASK_UNDELIVERED;
 			ts->stat = SAS_QUEUE_FULL;
-			pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-			mb();/*ditto*/
-			spin_unlock_irq(&pm8001_ha->lock);
-			t->task_done(t);
-			spin_lock_irq(&pm8001_ha->lock);
+			pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
 			return;
 		}
 		break;
@@ -2312,11 +2296,7 @@ mpi_sata_completion(struct pm8001_hba_in
 					IO_DS_IN_ERROR);
 			ts->resp = SAS_TASK_UNDELIVERED;
 			ts->stat = SAS_QUEUE_FULL;
-			pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-			mb();/*ditto*/
-			spin_unlock_irq(&pm8001_ha->lock);
-			t->task_done(t);
-			spin_lock_irq(&pm8001_ha->lock);
+			pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
 			return;
 		}
 		break;
@@ -2345,20 +2325,9 @@ mpi_sata_completion(struct pm8001_hba_in
 			" resp 0x%x stat 0x%x but aborted by upper layer!\n",
 			t, status, ts->resp, ts->stat));
 		pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-	} else if (t->uldd_task) {
-		spin_unlock_irqrestore(&t->task_state_lock, flags);
-		pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-		mb();/* ditto */
-		spin_unlock_irq(&pm8001_ha->lock);
-		t->task_done(t);
-		spin_lock_irq(&pm8001_ha->lock);
-	} else if (!t->uldd_task) {
+	} else {
 		spin_unlock_irqrestore(&t->task_state_lock, flags);
-		pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-		mb();/*ditto*/
-		spin_unlock_irq(&pm8001_ha->lock);
-		t->task_done(t);
-		spin_lock_irq(&pm8001_ha->lock);
+		pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
 	}
 }
 
@@ -2470,11 +2439,7 @@ static void mpi_sata_event(struct pm8001
 				IO_OPEN_CNX_ERROR_IT_NEXUS_LOSS);
 			ts->resp = SAS_TASK_COMPLETE;
 			ts->stat = SAS_QUEUE_FULL;
-			pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-			mb();/*ditto*/
-			spin_unlock_irq(&pm8001_ha->lock);
-			t->task_done(t);
-			spin_lock_irq(&pm8001_ha->lock);
+			pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
 			return;
 		}
 		break;
@@ -2596,20 +2561,9 @@ static void mpi_sata_event(struct pm8001
 			" resp 0x%x stat 0x%x but aborted by upper layer!\n",
 			t, event, ts->resp, ts->stat));
 		pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-	} else if (t->uldd_task) {
-		spin_unlock_irqrestore(&t->task_state_lock, flags);
-		pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-		mb();/* ditto */
-		spin_unlock_irq(&pm8001_ha->lock);
-		t->task_done(t);
-		spin_lock_irq(&pm8001_ha->lock);
-	} else if (!t->uldd_task) {
+	} else {
 		spin_unlock_irqrestore(&t->task_state_lock, flags);
-		pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
-		mb();/*ditto*/
-		spin_unlock_irq(&pm8001_ha->lock);
-		t->task_done(t);
-		spin_lock_irq(&pm8001_ha->lock);
+		pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
 	}
 }
 
@@ -4304,23 +4258,10 @@ static int pm80xx_chip_sata_req(struct p
 					"\n", task, ts->resp, ts->stat));
 				pm8001_ccb_task_free(pm8001_ha, task, ccb, tag);
 				return 0;
-			} else if (task->uldd_task) {
-				spin_unlock_irqrestore(&task->task_state_lock,
-							flags);
-				pm8001_ccb_task_free(pm8001_ha, task, ccb, tag);
-				mb();/* ditto */
-				spin_unlock_irq(&pm8001_ha->lock);
-				task->task_done(task);
-				spin_lock_irq(&pm8001_ha->lock);
-				return 0;
-			} else if (!task->uldd_task) {
+			} else {
 				spin_unlock_irqrestore(&task->task_state_lock,
 							flags);
-				pm8001_ccb_task_free(pm8001_ha, task, ccb, tag);
-				mb();/*ditto*/
-				spin_unlock_irq(&pm8001_ha->lock);
-				task->task_done(task);
-				spin_lock_irq(&pm8001_ha->lock);
+				pm8001_ccb_task_free_done(pm8001_ha, task, ccb, tag);
 				return 0;
 			}
 		}


^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: spinlock_irqsave() && flags (Was: pm80xx: Spinlock fix)
  2013-12-27 16:18                           ` Oleg Nesterov
@ 2014-01-02 10:31                             ` Suresh Thiagarajan
  2014-01-03 20:02                               ` Dan Williams
  0 siblings, 1 reply; 12+ messages in thread
From: Suresh Thiagarajan @ 2014-01-02 10:31 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Jason Seba, Peter Zijlstra, Ingo Molnar,
	Linus Torvalds, Tomas Henzl, Jack Wang, Viswas G,
	linux-scsi@vger.kernel.org, JBottomley@parallels.com,
	Vasanthalakshmi Tharmarajan, linux-kernel@vger.kernel.org



On Fri, Dec 27, 2013 at 9:48 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 12/24, Suresh Thiagarajan wrote:
>>
>> Below is a small piece of pseudo code showing how to protect/serialize the flags for global access.
>> struct temp
>> {
>>       ...
>>       spinlock_t lock;
>>       unsigned long lock_flags;
>> };
>> void my_lock(struct temp *t)
>> {
>>                unsigned long flag; // thread-private variable as suggested
>>                spin_lock_irqsave(&t->lock, flag);
>>                t->lock_flags = flag; //updating inside critical section now to serialize the access to flag
>> }
>>
>> void my_unlock(struct temp *t)
>> {
>>                unsigned long flag = t->lock_flags;
>>                t->lock_flags = 0;  //clearing it before getting out of critical section
>>                spin_unlock_irqrestore(&t->lock, flag);
>> }
>
> Yes, this should work as a quick fix. And you do not need to clear ->lock_flags
> in my_unlock().
>
> But when I look at the original patch again, I no longer understand why
> you need pm8001_ha->lock_flags at all. Of course, I do not understand this
> code and I am sure I missed something, but at first glance it seems that only
> this sequence
>
>         spin_unlock_irq(&pm8001_ha->lock);
>         t->task_done(t);
>         spin_lock_irq(&pm8001_ha->lock);
>
> should be fixed?
>
> If yes, why can't you simply do spin_unlock() + spin_lock() around
> t->task_done()? This won't enable irqs, but spin_unlock_irqrestore()
> doesn't necessarily enable irqs either, so ->task_done() can already run
> with irqs disabled?
>
> And note that the pattern above has a lot of users, perhaps it makes
> sense to start with something like the patch below?

Thanks James, Oleg, and all for your inputs.
We will start by reviewing and testing this patch, and then investigate keeping
the critical sections as short and clear as possible so that the lock and unlock
happen within the same routine.

Regards,
Suresh
>
> Hmm. there is another oddity in this code, for example mpi_sata_completion()
> does
>
>         } else if (t->uldd_task) {
>                 spin_unlock_irqrestore(&t->task_state_lock, flags);
>                 pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
>                 mb();/* ditto */
>                 spin_unlock_irq(&pm8001_ha->lock);
>                 t->task_done(t);
>                 spin_lock_irq(&pm8001_ha->lock);
>         } else if (!t->uldd_task) {
>                 spin_unlock_irqrestore(&t->task_state_lock, flags);
>                 pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
>                 mb();/*ditto*/
>                 spin_unlock_irq(&pm8001_ha->lock);
>                 t->task_done(t);
>                 spin_lock_irq(&pm8001_ha->lock);
>         }
>
> and both branches do the same code? mpi_sata_event(), pm8001_chip_sata_req()
> too.
>
> Imho, whatever you do, the changelog should tell more.
>
> Oleg.
>
>
> --- x/drivers/scsi/pm8001/pm8001_sas.h
> +++ x/drivers/scsi/pm8001/pm8001_sas.h
> @@ -619,6 +619,20 @@ u32 pm8001_get_ncq_tag(struct sas_task *
>  void pm8001_ccb_free(struct pm8001_hba_info *pm8001_ha, u32 ccb_idx);
>  void pm8001_ccb_task_free(struct pm8001_hba_info *pm8001_ha,
>         struct sas_task *task, struct pm8001_ccb_info *ccb, u32 ccb_idx);
> +
> +static inline void
> +pm8001_ccb_task_free_done(struct pm8001_hba_info *pm8001_ha,
> +                       struct sas_task *task, struct pm8001_ccb_info *ccb,
> +                       u32 ccb_idx)
> +{
> +       pm8001_ccb_task_free(pm8001_ha, task, ccb, ccb_idx);
> +       smp_mb(); /* in order to force CPU ordering */
> +       spin_unlock(&pm8001_ha->lock);
> +       task->task_done(task);
> +       spin_lock(&pm8001_ha->lock);
> +}
> +
> +
>  int pm8001_phy_control(struct asd_sas_phy *sas_phy, enum phy_func func,
>         void *funcdata);
>  void pm8001_scan_start(struct Scsi_Host *shost);
> --- x/drivers/scsi/pm8001/pm8001_hwi.c
> +++ x/drivers/scsi/pm8001/pm8001_hwi.c
> @@ -2502,11 +2502,7 @@ mpi_sata_completion(struct pm8001_hba_in
>                                 IO_OPEN_CNX_ERROR_IT_NEXUS_LOSS);
>                         ts->resp = SAS_TASK_UNDELIVERED;
>                         ts->stat = SAS_QUEUE_FULL;
> -                       pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -                       mb();/*in order to force CPU ordering*/
> -                       spin_unlock_irq(&pm8001_ha->lock);
> -                       t->task_done(t);
> -                       spin_lock_irq(&pm8001_ha->lock);
> +                       pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
>                         return;
>                 }
>                 break;
> @@ -2522,11 +2518,7 @@ mpi_sata_completion(struct pm8001_hba_in
>                                 IO_OPEN_CNX_ERROR_IT_NEXUS_LOSS);
>                         ts->resp = SAS_TASK_UNDELIVERED;
>                         ts->stat = SAS_QUEUE_FULL;
> -                       pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -                       mb();/*ditto*/
> -                       spin_unlock_irq(&pm8001_ha->lock);
> -                       t->task_done(t);
> -                       spin_lock_irq(&pm8001_ha->lock);
> +                       pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
>                         return;
>                 }
>                 break;
> @@ -2550,11 +2542,7 @@ mpi_sata_completion(struct pm8001_hba_in
>                                 IO_OPEN_CNX_ERROR_STP_RESOURCES_BUSY);
>                         ts->resp = SAS_TASK_UNDELIVERED;
>                         ts->stat = SAS_QUEUE_FULL;
> -                       pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -                       mb();/* ditto*/
> -                       spin_unlock_irq(&pm8001_ha->lock);
> -                       t->task_done(t);
> -                       spin_lock_irq(&pm8001_ha->lock);
> +                       pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
>                         return;
>                 }
>                 break;
> @@ -2617,11 +2605,7 @@ mpi_sata_completion(struct pm8001_hba_in
>                                     IO_DS_NON_OPERATIONAL);
>                         ts->resp = SAS_TASK_UNDELIVERED;
>                         ts->stat = SAS_QUEUE_FULL;
> -                       pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -                       mb();/*ditto*/
> -                       spin_unlock_irq(&pm8001_ha->lock);
> -                       t->task_done(t);
> -                       spin_lock_irq(&pm8001_ha->lock);
> +                       pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
>                         return;
>                 }
>                 break;
> @@ -2641,11 +2625,7 @@ mpi_sata_completion(struct pm8001_hba_in
>                                     IO_DS_IN_ERROR);
>                         ts->resp = SAS_TASK_UNDELIVERED;
>                         ts->stat = SAS_QUEUE_FULL;
> -                       pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -                       mb();/*ditto*/
> -                       spin_unlock_irq(&pm8001_ha->lock);
> -                       t->task_done(t);
> -                       spin_lock_irq(&pm8001_ha->lock);
> +                       pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
>                         return;
>                 }
>                 break;
> @@ -2674,20 +2654,9 @@ mpi_sata_completion(struct pm8001_hba_in
>                         " resp 0x%x stat 0x%x but aborted by upper layer!\n",
>                         t, status, ts->resp, ts->stat));
>                 pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -       } else if (t->uldd_task) {
> -               spin_unlock_irqrestore(&t->task_state_lock, flags);
> -               pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -               mb();/* ditto */
> -               spin_unlock_irq(&pm8001_ha->lock);
> -               t->task_done(t);
> -               spin_lock_irq(&pm8001_ha->lock);
> -       } else if (!t->uldd_task) {
> +       } else {
>                 spin_unlock_irqrestore(&t->task_state_lock, flags);
> -               pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -               mb();/*ditto*/
> -               spin_unlock_irq(&pm8001_ha->lock);
> -               t->task_done(t);
> -               spin_lock_irq(&pm8001_ha->lock);
> +               pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
>         }
>  }
>
> @@ -2796,11 +2765,7 @@ static void mpi_sata_event(struct pm8001
>                                 IO_OPEN_CNX_ERROR_IT_NEXUS_LOSS);
>                         ts->resp = SAS_TASK_COMPLETE;
>                         ts->stat = SAS_QUEUE_FULL;
> -                       pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -                       mb();/*ditto*/
> -                       spin_unlock_irq(&pm8001_ha->lock);
> -                       t->task_done(t);
> -                       spin_lock_irq(&pm8001_ha->lock);
> +                       pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
>                         return;
>                 }
>                 break;
> @@ -2909,20 +2874,9 @@ static void mpi_sata_event(struct pm8001
>                         " resp 0x%x stat 0x%x but aborted by upper layer!\n",
>                         t, event, ts->resp, ts->stat));
>                 pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -       } else if (t->uldd_task) {
> -               spin_unlock_irqrestore(&t->task_state_lock, flags);
> -               pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -               mb();/* ditto */
> -               spin_unlock_irq(&pm8001_ha->lock);
> -               t->task_done(t);
> -               spin_lock_irq(&pm8001_ha->lock);
> -       } else if (!t->uldd_task) {
> +       } else {
>                 spin_unlock_irqrestore(&t->task_state_lock, flags);
> -               pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -               mb();/*ditto*/
> -               spin_unlock_irq(&pm8001_ha->lock);
> -               t->task_done(t);
> -               spin_lock_irq(&pm8001_ha->lock);
> +               pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
>         }
>  }
>
> @@ -4467,23 +4421,10 @@ static int pm8001_chip_sata_req(struct p
>                                         " stat 0x%x but aborted by upper layer "
>                                         "\n", task, ts->resp, ts->stat));
>                                 pm8001_ccb_task_free(pm8001_ha, task, ccb, tag);
> -                       } else if (task->uldd_task) {
> -                               spin_unlock_irqrestore(&task->task_state_lock,
> -                                                       flags);
> -                               pm8001_ccb_task_free(pm8001_ha, task, ccb, tag);
> -                               mb();/* ditto */
> -                               spin_unlock_irq(&pm8001_ha->lock);
> -                               task->task_done(task);
> -                               spin_lock_irq(&pm8001_ha->lock);
> -                               return 0;
> -                       } else if (!task->uldd_task) {
> +                       } else {
>                                 spin_unlock_irqrestore(&task->task_state_lock,
>                                                         flags);
> -                               pm8001_ccb_task_free(pm8001_ha, task, ccb, tag);
> -                               mb();/*ditto*/
> -                               spin_unlock_irq(&pm8001_ha->lock);
> -                               task->task_done(task);
> -                               spin_lock_irq(&pm8001_ha->lock);
> +                               pm8001_ccb_task_free_done(pm8001_ha, task, ccb, tag);
>                                 return 0;
>                         }
>                 }
> --- x/drivers/scsi/pm8001/pm80xx_hwi.c
> +++ x/drivers/scsi/pm8001/pm80xx_hwi.c
> @@ -2175,11 +2175,7 @@ mpi_sata_completion(struct pm8001_hba_in
>                                 IO_OPEN_CNX_ERROR_IT_NEXUS_LOSS);
>                         ts->resp = SAS_TASK_UNDELIVERED;
>                         ts->stat = SAS_QUEUE_FULL;
> -                       pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -                       mb();/*in order to force CPU ordering*/
> -                       spin_unlock_irq(&pm8001_ha->lock);
> -                       t->task_done(t);
> -                       spin_lock_irq(&pm8001_ha->lock);
> +                       pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
>                         return;
>                 }
>                 break;
> @@ -2195,11 +2191,7 @@ mpi_sata_completion(struct pm8001_hba_in
>                                 IO_OPEN_CNX_ERROR_IT_NEXUS_LOSS);
>                         ts->resp = SAS_TASK_UNDELIVERED;
>                         ts->stat = SAS_QUEUE_FULL;
> -                       pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -                       mb();/*ditto*/
> -                       spin_unlock_irq(&pm8001_ha->lock);
> -                       t->task_done(t);
> -                       spin_lock_irq(&pm8001_ha->lock);
> +                       pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
>                         return;
>                 }
>                 break;
> @@ -2221,11 +2213,7 @@ mpi_sata_completion(struct pm8001_hba_in
>                                 IO_OPEN_CNX_ERROR_STP_RESOURCES_BUSY);
>                         ts->resp = SAS_TASK_UNDELIVERED;
>                         ts->stat = SAS_QUEUE_FULL;
> -                       pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -                       mb();/* ditto*/
> -                       spin_unlock_irq(&pm8001_ha->lock);
> -                       t->task_done(t);
> -                       spin_lock_irq(&pm8001_ha->lock);
> +                       pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
>                         return;
>                 }
>                 break;
> @@ -2288,11 +2276,7 @@ mpi_sata_completion(struct pm8001_hba_in
>                                         IO_DS_NON_OPERATIONAL);
>                         ts->resp = SAS_TASK_UNDELIVERED;
>                         ts->stat = SAS_QUEUE_FULL;
> -                       pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -                       mb();/*ditto*/
> -                       spin_unlock_irq(&pm8001_ha->lock);
> -                       t->task_done(t);
> -                       spin_lock_irq(&pm8001_ha->lock);
> +                       pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
>                         return;
>                 }
>                 break;
> @@ -2312,11 +2296,7 @@ mpi_sata_completion(struct pm8001_hba_in
>                                         IO_DS_IN_ERROR);
>                         ts->resp = SAS_TASK_UNDELIVERED;
>                         ts->stat = SAS_QUEUE_FULL;
> -                       pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -                       mb();/*ditto*/
> -                       spin_unlock_irq(&pm8001_ha->lock);
> -                       t->task_done(t);
> -                       spin_lock_irq(&pm8001_ha->lock);
> +                       pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
>                         return;
>                 }
>                 break;
> @@ -2345,20 +2325,9 @@ mpi_sata_completion(struct pm8001_hba_in
>                         " resp 0x%x stat 0x%x but aborted by upper layer!\n",
>                         t, status, ts->resp, ts->stat));
>                 pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -       } else if (t->uldd_task) {
> -               spin_unlock_irqrestore(&t->task_state_lock, flags);
> -               pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -               mb();/* ditto */
> -               spin_unlock_irq(&pm8001_ha->lock);
> -               t->task_done(t);
> -               spin_lock_irq(&pm8001_ha->lock);
> -       } else if (!t->uldd_task) {
> +       } else {
>                 spin_unlock_irqrestore(&t->task_state_lock, flags);
> -               pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -               mb();/*ditto*/
> -               spin_unlock_irq(&pm8001_ha->lock);
> -               t->task_done(t);
> -               spin_lock_irq(&pm8001_ha->lock);
> +               pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
>         }
>  }
>
> @@ -2470,11 +2439,7 @@ static void mpi_sata_event(struct pm8001
>                                 IO_OPEN_CNX_ERROR_IT_NEXUS_LOSS);
>                         ts->resp = SAS_TASK_COMPLETE;
>                         ts->stat = SAS_QUEUE_FULL;
> -                       pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -                       mb();/*ditto*/
> -                       spin_unlock_irq(&pm8001_ha->lock);
> -                       t->task_done(t);
> -                       spin_lock_irq(&pm8001_ha->lock);
> +                       pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
>                         return;
>                 }
>                 break;
> @@ -2596,20 +2561,9 @@ static void mpi_sata_event(struct pm8001
>                         " resp 0x%x stat 0x%x but aborted by upper layer!\n",
>                         t, event, ts->resp, ts->stat));
>                 pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -       } else if (t->uldd_task) {
> -               spin_unlock_irqrestore(&t->task_state_lock, flags);
> -               pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -               mb();/* ditto */
> -               spin_unlock_irq(&pm8001_ha->lock);
> -               t->task_done(t);
> -               spin_lock_irq(&pm8001_ha->lock);
> -       } else if (!t->uldd_task) {
> +       } else {
>                 spin_unlock_irqrestore(&t->task_state_lock, flags);
> -               pm8001_ccb_task_free(pm8001_ha, t, ccb, tag);
> -               mb();/*ditto*/
> -               spin_unlock_irq(&pm8001_ha->lock);
> -               t->task_done(t);
> -               spin_lock_irq(&pm8001_ha->lock);
> +               pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag);
>         }
>  }
>
> @@ -4304,23 +4258,10 @@ static int pm80xx_chip_sata_req(struct p
>                                         "\n", task, ts->resp, ts->stat));
>                                 pm8001_ccb_task_free(pm8001_ha, task, ccb, tag);
>                                 return 0;
> -                       } else if (task->uldd_task) {
> -                               spin_unlock_irqrestore(&task->task_state_lock,
> -                                                       flags);
> -                               pm8001_ccb_task_free(pm8001_ha, task, ccb, tag);
> -                               mb();/* ditto */
> -                               spin_unlock_irq(&pm8001_ha->lock);
> -                               task->task_done(task);
> -                               spin_lock_irq(&pm8001_ha->lock);
> -                               return 0;
> -                       } else if (!task->uldd_task) {
> +                       } else {
>                                 spin_unlock_irqrestore(&task->task_state_lock,
>                                                         flags);
> -                               pm8001_ccb_task_free(pm8001_ha, task, ccb, tag);
> -                               mb();/*ditto*/
> -                               spin_unlock_irq(&pm8001_ha->lock);
> -                               task->task_done(task);
> -                               spin_lock_irq(&pm8001_ha->lock);
> +                               pm8001_ccb_task_free_done(pm8001_ha, task, ccb, tag);
>                                 return 0;
>                         }
>                 }
>


* Re: spinlock_irqsave() && flags (Was: pm80xx: Spinlock fix)
  2014-01-02 10:31                             ` Suresh Thiagarajan
@ 2014-01-03 20:02                               ` Dan Williams
  0 siblings, 0 replies; 12+ messages in thread
From: Dan Williams @ 2014-01-03 20:02 UTC (permalink / raw)
  To: Suresh Thiagarajan
  Cc: Oleg Nesterov, Ingo Molnar, Jason Seba, Peter Zijlstra,
	Ingo Molnar, Linus Torvalds, Tomas Henzl, Jack Wang, Viswas G,
	linux-scsi@vger.kernel.org, JBottomley@parallels.com,
	Vasanthalakshmi Tharmarajan, linux-kernel@vger.kernel.org

On Thu, Jan 2, 2014 at 2:31 AM, Suresh Thiagarajan
<Suresh.Thiagarajan@pmcs.com> wrote:
>
>
> On Fri, Dec 27, 2013 at 9:48 PM, Oleg Nesterov <oleg@redhat.com> wrote:
>> On 12/24, Suresh Thiagarajan wrote:
>>>
>>> Below is a small piece of pseudo code showing how to protect/serialize the flags for global access.
>>> struct temp
>>> {
>>>       ...
>>>       spinlock_t lock;
>>>       unsigned long lock_flags;
>>> };
>>> void my_lock(struct temp *t)
>>> {
>>>                unsigned long flag; // thread-private variable as suggested
>>>                spin_lock_irqsave(&t->lock, flag);
>>>                t->lock_flags = flag; //updating inside critical section now to serialize the access to flag
>>> }
>>>
>>> void my_unlock(struct temp *t)
>>> {
>>>                unsigned long flag = t->lock_flags;
>>>                t->lock_flags = 0;  //clearing it before getting out of critical section
>>>                spin_unlock_irqrestore(&t->lock, flag);
>>> }
>>
>> Yes, this should work as a quick fix. And you do not need to clear ->lock_flags
>> in my_unlock().
>>
>> But when I look at the original patch again, I no longer understand why
>> you need pm8001_ha->lock_flags at all. Of course, I do not understand this
>> code and I am sure I missed something, but at first glance it seems that only
>> this sequence
>>
>>         spin_unlock_irq(&pm8001_ha->lock);
>>         t->task_done(t);
>>         spin_lock_irq(&pm8001_ha->lock);
>>
>> should be fixed?
>>
>> If yes, why can't you simply do spin_unlock() + spin_lock() around
>> t->task_done()? This won't enable irqs, but spin_unlock_irqrestore()
>> doesn't necessarily enable irqs either, so ->task_done() can already run
>> with irqs disabled?
>>
>> And note that the pattern above has a lot of users, perhaps it makes
>> sense to start with something like the patch below?
>
> Thanks James, Oleg, and all for your inputs.
> We will start by reviewing and testing this patch, and then investigate keeping
> the critical sections as short and clear as possible so that the lock and unlock
> happen within the same routine.
>

Fwiw, we solved this in libsas a while back with a similar pattern
proposed by Oleg:

unsigned long flags;

local_irq_save(flags);
spin_unlock(lock);
...
spin_lock(lock);
local_irq_restore(flags);

See commit 312d3e56119a "[SCSI] libsas: remove ata_port.lock
management duties from lldds"

--
Dan


end of thread, other threads:[~2014-01-03 20:02 UTC | newest]

Thread overview: 12+ messages
     [not found] <1387366123-3950-1-git-send-email-Viswas.G@pmcs.com>
     [not found] ` <52B8357D.60202@redhat.com>
     [not found]   ` <52B83B89.9040700@gmail.com>
     [not found]     ` <CE2C58F938F4C44EA74FFF880BAA7E5E1C68AA8A@BBYEXM01.pmc-sierra.internal>
     [not found]       ` <CAD0Z9T+69VoSnDY-Sj7a1QfJ-KLJYVyL9=WQ123XMfRvQmGdsg@mail.gmail.com>
     [not found]         ` <52B8518B.4060204@gmail.com>
     [not found]           ` <52B8569D.4050101@redhat.com>
     [not found]             ` <CAD0Z9TK-cn=X3ywSzPnEQN-FB+ue+0dHQ3EHuwonC+KfgrmPMA@mail.gmail.com>
     [not found]               ` <20131223163410.GA28220@redhat.com>
2013-12-23 17:27                 ` spinlock_irqsave() && flags (Was: pm80xx: Spinlock fix) Oleg Nesterov
2013-12-23 18:12                   ` Linus Torvalds
2013-12-23 18:24                     ` Oleg Nesterov
2013-12-23 18:43                       ` Linus Torvalds
2013-12-23 18:23                   ` Ingo Molnar
2013-12-23 18:33                     ` Oleg Nesterov
2013-12-24  8:29                       ` Ingo Molnar
2013-12-24  9:13                         ` Suresh Thiagarajan
2013-12-24 17:29                           ` James Bottomley
2013-12-27 16:18                           ` Oleg Nesterov
2014-01-02 10:31                             ` Suresh Thiagarajan
2014-01-03 20:02                               ` Dan Williams
