From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tomasz Wroblewski
Subject: Re: [PATCH v2] Fix scheduler crash after s3 resume
Date: Fri, 25 Jan 2013 10:45:01 +0100
Message-ID: <5102541D.1070408@citrix.com>
References: <5100070F.7010808@citrix.com> <5100D229.4030906@ts.fujitsu.com> <510144A3.9060302@citrix.com> <5101630D02000078000B93AD@nat28.tlf.novell.com> <51016065.3080902@citrix.com> <510175E802000078000B94A1@nat28.tlf.novell.com> <51024B56.20706@citrix.com> <5102603302000078000B985C@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <5102603302000078000B985C@nat28.tlf.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: George Dunlap, Juergen Gross, "Keir (Xen.org)", "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

> I think I had already raised the question of the placement of
> this rcu_barrier() here, and the lack of a counterpart in the
> suspend portion of the path. Keir? Or should
> rcu_barrier_action() avoid calling process_pending_softirqs()
> while still resuming, and instead call __do_softirq() with all but
> RCU_SOFTIRQ masked (perhaps through a suitable wrapper,
> or alternatively by open-coding its effect)?
>
>
Though I recall these vcpu_wake crashes also happen from entry points in enter_state other than rcu_barrier, so I don't think removing that helps much. I just wasn't able to get a proper log of them today, because most of them were cut in half. I will try a bit more.

My belief is that as long as vcpu_migrate is not called in cpu_disable_scheduler, vcpu->processor will continue to point to the offline cpu, and that will crash if vcpu_wake is called for that vcpu.
If vcpu_migrate is called, then vcpu_wake will still be called with some frequency, but since vcpu->processor will then point to an online cpu, it won't crash. So the goal is probably not to avoid the wakes here completely, just the ones that target an offline cpu.
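
To make the argument above concrete, here is a toy model in plain C. It is not Xen code: every name prefixed with toy_, the two structs, and the idea that an offline cpu is represented by a NULL per-cpu scheduler pointer are assumptions made purely for illustration. It only shows the invariant I have in mind: a wake is safe as long as vcpu->processor refers to an online cpu, and migrating the vcpu while the cpu is being disabled is what keeps that true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-ins for per-cpu scheduler state and a vcpu. */
struct sched_data { int wakeups; };
struct vcpu { int processor; };

static struct sched_data cpu_data[NR_CPUS];
/* Assumption: NULL means the cpu is offline and has no scheduler state. */
static struct sched_data *per_cpu_sched[NR_CPUS];

/* Bring a cpu online and give it fresh scheduler state. */
static void toy_cpu_up(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    cpu_data[cpu].wakeups = 0;
    per_cpu_sched[cpu] = &cpu_data[cpu];
}

/* Return the lowest-numbered online cpu, or -1 if there is none. */
static int toy_pick_online_cpu(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (per_cpu_sched[cpu] != NULL)
            return cpu;
    return -1;
}

/*
 * Take a cpu offline. With migrate == true the vcpu is moved to an
 * online cpu first; with migrate == false its processor field is left
 * pointing at the cpu that just went away.
 */
static void toy_cpu_disable_scheduler(int cpu, struct vcpu *v, bool migrate)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    per_cpu_sched[cpu] = NULL;
    if (migrate && v->processor == cpu) {
        int target = toy_pick_online_cpu();
        if (target >= 0)
            v->processor = target;
    }
}

/*
 * Wake a vcpu. Returns false in the case where, in this model, a real
 * scheduler would touch the state of an offline cpu; true otherwise.
 */
static bool toy_vcpu_wake(struct vcpu *v)
{
    struct sched_data *sd = per_cpu_sched[v->processor];
    if (sd == NULL)
        return false;
    sd->wakeups++;
    return true;
}
```

With cpus 0 and 1 online and a vcpu on cpu 1, disabling cpu 1 without migration leaves toy_vcpu_wake() reporting the bad case, while disabling it with migration moves the vcpu to cpu 0 and the wake goes through. The wake itself is the same in both runs; only the target cpu differs.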