* [patch 0/6] 3.14-rt1 fixes @ 2014-05-02 11:12 Mike Galbraith 2014-05-02 11:39 ` Pavel Vasilyev 2014-05-02 13:44 ` Sebastian Andrzej Siewior 0 siblings, 2 replies; 12+ messages in thread From: Mike Galbraith @ 2014-05-02 11:12 UTC (permalink / raw) To: RT; +Cc: Sebastian Andrzej Siewior, Steven Rostedt, Thomas Gleixner The following patches are fixes that fell out of testing rt1. Patches 1-4 are intended to be folded into existing patches, 5-6 are replacements. fold: 1/6 - preempt-lazy-support.patch 2/6 - x86-preempt-lazy.patch 3/6 - hotplug-light-get-online-cpus.patch 4/6 - stomp-machine-raw-lock.patch replace: 5/6 - (prep) 6/6 - stomp-machine-deal-clever-with-stopper-lock.patch drop: (buggy - calls migrate_disable() _after_ maybe blocking) migrate_disable-pushd-down-in-atomic_dec_and_spin_lo.patch -Mike ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [patch 0/6] 3.14-rt1 fixes 2014-05-02 11:12 [patch 0/6] 3.14-rt1 fixes Mike Galbraith @ 2014-05-02 11:39 ` Pavel Vasilyev 2014-05-02 11:43 ` Mike Galbraith 2014-05-02 13:44 ` Sebastian Andrzej Siewior 1 sibling, 1 reply; 12+ messages in thread From: Pavel Vasilyev @ 2014-05-02 11:39 UTC (permalink / raw) To: Mike Galbraith, RT Cc: Sebastian Andrzej Siewior, Steven Rostedt, Thomas Gleixner 02.05.2014 15:12, Mike Galbraith пишет: > The following patches are fixes that fell out of testing rt1. Patches > 1-4 are intended to be folded into existing patches, 5-6 are > replacements. > > fold: > 1/6 - preempt-lazy-support.patch > 2/6 - x86-preempt-lazy.patch > 3/6 - hotplug-light-get-online-cpus.patch > 4/6 - stomp-machine-raw-lock.patch > > replace: > 5/6 - (prep) > 6/6 - stomp-machine-deal-clever-with-stopper-lock.patch > > drop: (buggy - calls migrate_disable() _after_ maybe blocking) > migrate_disable-pushd-down-in-atomic_dec_and_spin_lo.patch Cool, but you can the same, but for 3.14.2 ? -- Pavel. -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [patch 0/6] 3.14-rt1 fixes 2014-05-02 11:39 ` Pavel Vasilyev @ 2014-05-02 11:43 ` Mike Galbraith 2014-05-03 8:32 ` Nicholas Mc Guire 0 siblings, 1 reply; 12+ messages in thread From: Mike Galbraith @ 2014-05-02 11:43 UTC (permalink / raw) To: pavel; +Cc: RT, Sebastian Andrzej Siewior, Steven Rostedt, Thomas Gleixner On Fri, 2014-05-02 at 15:39 +0400, Pavel Vasilyev wrote: > 02.05.2014 15:12, Mike Galbraith пишет: > > The following patches are fixes that fell out of testing rt1. Patches > > 1-4 are intended to be folded into existing patches, 5-6 are > > replacements. > > > > fold: > > 1/6 - preempt-lazy-support.patch > > 2/6 - x86-preempt-lazy.patch > > 3/6 - hotplug-light-get-online-cpus.patch > > 4/6 - stomp-machine-raw-lock.patch > > > > replace: > > 5/6 - (prep) > > 6/6 - stomp-machine-deal-clever-with-stopper-lock.patch > > > > drop: (buggy - calls migrate_disable() _after_ maybe blocking) > > migrate_disable-pushd-down-in-atomic_dec_and_spin_lo.patch > > > Cool, but you can the same, but for 3.14.2 ? They'll apply fine. You'll just have to comment out the two patches in the rt1 tree which are no longer needed. I added those on top of 14.2-rt1 with those two commented out. -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [patch 0/6] 3.14-rt1 fixes 2014-05-02 11:43 ` Mike Galbraith @ 2014-05-03 8:32 ` Nicholas Mc Guire 2014-05-03 10:40 ` Mike Galbraith 0 siblings, 1 reply; 12+ messages in thread From: Nicholas Mc Guire @ 2014-05-03 8:32 UTC (permalink / raw) To: Mike Galbraith Cc: pavel, RT, Sebastian Andrzej Siewior, Steven Rostedt, Thomas Gleixner On Fri, 02 May 2014, Mike Galbraith wrote: > On Fri, 2014-05-02 at 15:39 +0400, Pavel Vasilyev wrote: > > 02.05.2014 15:12, Mike Galbraith ??????????: > > > The following patches are fixes that fell out of testing rt1. Patches > > > 1-4 are intended to be folded into existing patches, 5-6 are > > > replacements. > > > > > > fold: > > > 1/6 - preempt-lazy-support.patch > > > 2/6 - x86-preempt-lazy.patch > > > 3/6 - hotplug-light-get-online-cpus.patch > > > 4/6 - stomp-machine-raw-lock.patch > > > > > > replace: > > > 5/6 - (prep) > > > 6/6 - stomp-machine-deal-clever-with-stopper-lock.patch > > > > > > drop: (buggy - calls migrate_disable() _after_ maybe blocking) > > > migrate_disable-pushd-down-in-atomic_dec_and_spin_lo.patch > > > > Why would calling migrate_disable be buggy after blocking ? at that point any per cpu data is not yet being accessed so its perfectly fine to lock->block -> migrate -> unblock->lock -> migrate_disable on a new CPU -> access per_cpu object -> migrate_enable -> unlock what problems would there be ? thx! hofrat ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [patch 0/6] 3.14-rt1 fixes 2014-05-03 8:32 ` Nicholas Mc Guire @ 2014-05-03 10:40 ` Mike Galbraith 2014-05-03 12:31 ` Nicholas Mc Guire 0 siblings, 1 reply; 12+ messages in thread From: Mike Galbraith @ 2014-05-03 10:40 UTC (permalink / raw) To: Nicholas Mc Guire Cc: pavel, RT, Sebastian Andrzej Siewior, Steven Rostedt, Thomas Gleixner On Sat, 2014-05-03 at 10:32 +0200, Nicholas Mc Guire wrote: > On Fri, 02 May 2014, Mike Galbraith wrote: > > > On Fri, 2014-05-02 at 15:39 +0400, Pavel Vasilyev wrote: > > > 02.05.2014 15:12, Mike Galbraith ??????????: > > > > The following patches are fixes that fell out of testing rt1. Patches > > > > 1-4 are intended to be folded into existing patches, 5-6 are > > > > replacements. > > > > > > > > fold: > > > > 1/6 - preempt-lazy-support.patch > > > > 2/6 - x86-preempt-lazy.patch > > > > 3/6 - hotplug-light-get-online-cpus.patch > > > > 4/6 - stomp-machine-raw-lock.patch > > > > > > > > replace: > > > > 5/6 - (prep) > > > > 6/6 - stomp-machine-deal-clever-with-stopper-lock.patch > > > > > > > > drop: (buggy - calls migrate_disable() _after_ maybe blocking) > > > > migrate_disable-pushd-down-in-atomic_dec_and_spin_lo.patch > > > > > > > Why would calling migrate_disable be buggy after blocking ? > at that point any per cpu data is not yet being accessed so its perfectly fine > to > > lock->block > -> migrate > -> unblock->lock > -> migrate_disable on a new CPU > -> access per_cpu object > -> migrate_enable > -> unlock > > what problems would there be ? You tell me. Why do we bother at all if it's ok to migrate with a lock in your pocket, or the CPU can go away on you while you're acquiring? If this is in fact safe, you should be able to move each and every migrate_disable() to post acquisition. I have a virtual nickle that says your box will have a severe allergic reaction to such a patch. -Mike ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [patch 0/6] 3.14-rt1 fixes 2014-05-03 10:40 ` Mike Galbraith @ 2014-05-03 12:31 ` Nicholas Mc Guire 2014-05-03 14:03 ` Mike Galbraith 0 siblings, 1 reply; 12+ messages in thread From: Nicholas Mc Guire @ 2014-05-03 12:31 UTC (permalink / raw) To: Mike Galbraith Cc: pavel, RT, Sebastian Andrzej Siewior, Steven Rostedt, Thomas Gleixner On Sat, 03 May 2014, Mike Galbraith wrote: > On Sat, 2014-05-03 at 10:32 +0200, Nicholas Mc Guire wrote: > > On Fri, 02 May 2014, Mike Galbraith wrote: > > > > > On Fri, 2014-05-02 at 15:39 +0400, Pavel Vasilyev wrote: > > > > 02.05.2014 15:12, Mike Galbraith ??????????: > > > > > The following patches are fixes that fell out of testing rt1. Patches > > > > > 1-4 are intended to be folded into existing patches, 5-6 are > > > > > replacements. > > > > > > > > > > fold: > > > > > 1/6 - preempt-lazy-support.patch > > > > > 2/6 - x86-preempt-lazy.patch > > > > > 3/6 - hotplug-light-get-online-cpus.patch > > > > > 4/6 - stomp-machine-raw-lock.patch > > > > > > > > > > replace: > > > > > 5/6 - (prep) > > > > > 6/6 - stomp-machine-deal-clever-with-stopper-lock.patch > > > > > > > > > > drop: (buggy - calls migrate_disable() _after_ maybe blocking) > > > > > migrate_disable-pushd-down-in-atomic_dec_and_spin_lo.patch > > > > > > > > > > Why would calling migrate_disable be buggy after blocking ? > > at that point any per cpu data is not yet being accessed so its perfectly fine > > to > > > > lock->block > > -> migrate > > -> unblock->lock > > -> migrate_disable on a new CPU > > -> access per_cpu object > > -> migrate_enable > > -> unlock > > > > what problems would there be ? > > You tell me. Why do we bother at all if it's ok to migrate with a lock > in your pocket, or the CPU can go away on you while you're acquiring? The lock in your pocket is not the issue (unless its a per_cpu lock in which case we are pinned already anyway) and the cpu can go away with a global lock held in my opinion - why not ? > If this is in fact safe, you should be able to move each and every > migrate_disable() to post acquisition. yup > I have a virtual nickle that > says your box will have a severe allergic reaction to such a patch. > Actually that is what the pushdowns in the read_lock/write_lock api did and I did not notice any of the systems having problems with that. Not saying that there is no case in the pushdown that is wrongly handled but before looking at that lets clarify if the assumptions made for the migrate pushdowns was right in the first place. migrate_disable() provides referencial consistency on per_cpu objects so that preemption is safe with a per_cpu ref held - nothing else. So: if the lock is global then it is safe to migrate while blocked on a lock as the referencial consistency of the lock is not affected the reference to a protected per_cpu object not yet set (if it were that is a bug if the method used to access the per_cpu objects ref did not migrate_disable already). if it is a per-cpu lock then migrate_disable() would need to be called before accessing the lock object any way and the problem is not related to blocking on a lock - the get_ primitives would have disable migration before you got access to the lock object and thus you are safe. Regarding the CPU going away while blocked on a lock we have: if the lock is global then the CPU going away would push you off first without that the reference to the lock object is invalidated - so that is safe. if the lock is per_cpu then the get_cpu_var or what ever you use to get that per_cpu lock object would have disable migration BEFORE you got access to the per_cpu lock object, so once you have access to the per_cpu lock you would again be safe against the cpu going away as well via pin_current_cpu >From my understanding for a globally visible lock: per_cpu lock pin access unpin unlock pin lock access unlock unpin are semantically equivalent and safe even per_cpu lock pin access unlock unpin pin lock access unpin unlock is safe (while strange) - in no constelation would you lose referencial consistency of the per_cpu object within the crticial section (access). If code relies on these two (semantically orthogonal functions) to be in a particuular order I would suspect that the code has a bug. If one now looks at the cases of per_cpu lock pin access ... and assume that you got migrated away while aquireing the lock, or actually simply before you got temporarily pinned to the cpu, then this is safe for global locks - if the CPU went away and we got pushed to another CPU then the per_cpu object access is on the cpu we got pushed to but stays consistent and safe. for a per_cpu lock the pin/unpin is implied in access to the per_cpu lock object it self so the migrate_disable done in the locking api is not needed but no harme either and the position of the same (before or after aquisition of the lock) is irrelevant. So what case did I miss ? thx! hofrat ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [patch 0/6] 3.14-rt1 fixes 2014-05-03 12:31 ` Nicholas Mc Guire @ 2014-05-03 14:03 ` Mike Galbraith 2014-05-03 14:19 ` Nicholas Mc Guire 0 siblings, 1 reply; 12+ messages in thread From: Mike Galbraith @ 2014-05-03 14:03 UTC (permalink / raw) To: Nicholas Mc Guire Cc: pavel, RT, Sebastian Andrzej Siewior, Steven Rostedt, Thomas Gleixner On Sat, 2014-05-03 at 14:31 +0200, Nicholas Mc Guire wrote: > On Sat, 03 May 2014, Mike Galbraith wrote: > > If this is in fact safe, you should be able to move each and every > > migrate_disable() to post acquisition. > > yup Having just seen working -> brick transition, color me skeptical. > > I have a virtual nickle that > > says your box will have a severe allergic reaction to such a patch. > > > Actually that is what the pushdowns in the read_lock/write_lock api did and > I did not notice any of the systems having problems with that. If you had tested hotplug, you would have met the deadlock, and would have verified that the change to read_lock() was the culprit instead of me doing that. Steven also verified that. You too can flip back and forth, drive boxen into the wall as many times as it take to convince yourself that that change really really did induce the breakage. I'll read the rest later, I'm getting all grumpy. -Mike ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [patch 0/6] 3.14-rt1 fixes 2014-05-03 14:03 ` Mike Galbraith @ 2014-05-03 14:19 ` Nicholas Mc Guire 2014-05-03 15:24 ` Mike Galbraith 0 siblings, 1 reply; 12+ messages in thread From: Nicholas Mc Guire @ 2014-05-03 14:19 UTC (permalink / raw) To: Mike Galbraith Cc: pavel, RT, Sebastian Andrzej Siewior, Steven Rostedt, Thomas Gleixner On Sat, 03 May 2014, Mike Galbraith wrote: > On Sat, 2014-05-03 at 14:31 +0200, Nicholas Mc Guire wrote: > > On Sat, 03 May 2014, Mike Galbraith wrote: > > > > If this is in fact safe, you should be able to move each and every > > > migrate_disable() to post acquisition. > > > > yup > > Having just seen working -> brick transition, color me skeptical. > > > > I have a virtual nickle that > > > says your box will have a severe allergic reaction to such a patch. > > > > > Actually that is what the pushdowns in the read_lock/write_lock api did and > > I did not notice any of the systems having problems with that. > > If you had tested hotplug, you would have met the deadlock, and would > have verified that the change to read_lock() was the culprit instead of > me doing that. Steven also verified that. You too can flip back and > forth, drive boxen into the wall as many times as it take to convince > yourself that that change really really did induce the breakage. > I did not test hotplug - I did try and understand the code to verify the assumptions - but before looking at details of some code path (which in my opinion has a few problems of its own) I think it would be best to clarify first if the assumptions made for the migrate pushdown patches is right or not - if it is not there is no point in discussing individual code paths but then that patch set simply needs to go out, if the assumptions are right then we can discuss where the fix is needed. thx! hofrat ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [patch 0/6] 3.14-rt1 fixes 2014-05-03 14:19 ` Nicholas Mc Guire @ 2014-05-03 15:24 ` Mike Galbraith 0 siblings, 0 replies; 12+ messages in thread From: Mike Galbraith @ 2014-05-03 15:24 UTC (permalink / raw) To: Nicholas Mc Guire Cc: pavel, RT, Sebastian Andrzej Siewior, Steven Rostedt, Thomas Gleixner On Sat, 2014-05-03 at 16:19 +0200, Nicholas Mc Guire wrote: > On Sat, 03 May 2014, Mike Galbraith wrote: > > > On Sat, 2014-05-03 at 14:31 +0200, Nicholas Mc Guire wrote: > > > On Sat, 03 May 2014, Mike Galbraith wrote: > > > > > > If this is in fact safe, you should be able to move each and every > > > > migrate_disable() to post acquisition. > > > > > > yup > > > > Having just seen working -> brick transition, color me skeptical. > > > > > > I have a virtual nickle that > > > > says your box will have a severe allergic reaction to such a patch. > > > > > > > Actually that is what the pushdowns in the read_lock/write_lock api did and > > > I did not notice any of the systems having problems with that. > > > > If you had tested hotplug, you would have met the deadlock, and would > > have verified that the change to read_lock() was the culprit instead of > > me doing that. Steven also verified that. You too can flip back and > > forth, drive boxen into the wall as many times as it take to convince > > yourself that that change really really did induce the breakage. > > > > I did not test hotplug - I did try and understand the code to Not testing hotplug was obviously a mistake given what you were changing. Nor is that a tiny change, it's a glaringly huge order change... for read_lock(), with no order change to write_lock(). Things that make ya go hmmm. > verify the assumptions - but before looking at details of some code > path (which in my opinion has a few problems of its own) I think it > would be best to clarify first if the assumptions made for the > migrate pushdown patches is right or not - if it is not there is no point > in discussing individual code paths but then that patch set simply needs to > go out, if the assumptions are right then we can discuss where the fix is > needed. If your assumptions were correct, there would have been no breakage. -Mike ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [patch 0/6] 3.14-rt1 fixes 2014-05-02 11:12 [patch 0/6] 3.14-rt1 fixes Mike Galbraith 2014-05-02 11:39 ` Pavel Vasilyev @ 2014-05-02 13:44 ` Sebastian Andrzej Siewior 2014-05-02 13:45 ` Mike Galbraith 1 sibling, 1 reply; 12+ messages in thread From: Sebastian Andrzej Siewior @ 2014-05-02 13:44 UTC (permalink / raw) To: Mike Galbraith; +Cc: RT, Steven Rostedt, Thomas Gleixner * Mike Galbraith | 2014-05-02 13:12:39 [+0200]: >drop: (buggy - calls migrate_disable() _after_ maybe blocking) >migrate_disable-pushd-down-in-atomic_dec_and_spin_lo.patch In the thread you quoted also the delayed migrated_disable in rt_read_lock(). Is this one no longer required? >-Mike Sebastian ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [patch 0/6] 3.14-rt1 fixes 2014-05-02 13:44 ` Sebastian Andrzej Siewior @ 2014-05-02 13:45 ` Mike Galbraith 2014-05-02 13:49 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 12+ messages in thread From: Mike Galbraith @ 2014-05-02 13:45 UTC (permalink / raw) To: Sebastian Andrzej Siewior; +Cc: RT, Steven Rostedt, Thomas Gleixner On Fri, 2014-05-02 at 15:44 +0200, Sebastian Andrzej Siewior wrote: > * Mike Galbraith | 2014-05-02 13:12:39 [+0200]: > > >drop: (buggy - calls migrate_disable() _after_ maybe blocking) > >migrate_disable-pushd-down-in-atomic_dec_and_spin_lo.patch > > In the thread you quoted also the delayed migrated_disable in > rt_read_lock(). Is this one no longer required? Steven submitted a patch for that. -Mike ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [patch 0/6] 3.14-rt1 fixes 2014-05-02 13:45 ` Mike Galbraith @ 2014-05-02 13:49 ` Sebastian Andrzej Siewior 0 siblings, 0 replies; 12+ messages in thread From: Sebastian Andrzej Siewior @ 2014-05-02 13:49 UTC (permalink / raw) To: Mike Galbraith; +Cc: RT, Steven Rostedt, Thomas Gleixner On 05/02/2014 03:45 PM, Mike Galbraith wrote: > On Fri, 2014-05-02 at 15:44 +0200, Sebastian Andrzej Siewior wrote: >> * Mike Galbraith | 2014-05-02 13:12:39 [+0200]: >> >>> drop: (buggy - calls migrate_disable() _after_ maybe blocking) >>> migrate_disable-pushd-down-in-atomic_dec_and_spin_lo.patch >> >> In the thread you quoted also the delayed migrated_disable in >> rt_read_lock(). Is this one no longer required? > > Steven submitted a patch for that. Ah I see. Let me look at this. What it looks is that we revert the 3 three push down patches since they break hotplug. Thanks. > > -Mike > Sebastian ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2014-05-03 15:24 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-05-02 11:12 [patch 0/6] 3.14-rt1 fixes Mike Galbraith 2014-05-02 11:39 ` Pavel Vasilyev 2014-05-02 11:43 ` Mike Galbraith 2014-05-03 8:32 ` Nicholas Mc Guire 2014-05-03 10:40 ` Mike Galbraith 2014-05-03 12:31 ` Nicholas Mc Guire 2014-05-03 14:03 ` Mike Galbraith 2014-05-03 14:19 ` Nicholas Mc Guire 2014-05-03 15:24 ` Mike Galbraith 2014-05-02 13:44 ` Sebastian Andrzej Siewior 2014-05-02 13:45 ` Mike Galbraith 2014-05-02 13:49 ` Sebastian Andrzej Siewior
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).