From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756267AbYEOC1Q (ORCPT ); Wed, 14 May 2008 22:27:16 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1752399AbYEOC1A (ORCPT ); Wed, 14 May 2008 22:27:00 -0400
Received: from mga14.intel.com ([143.182.124.37]:43066 "EHLO
 mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752263AbYEOC07 convert rfc822-to-8bit (ORCPT ); Wed, 14 May 2008 22:26:59 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="4.27,489,1204531200"; d="scan'208";a="247034763"
Subject: Re: [PATCH -mm] kexec jump -v9
From: "Huang, Ying"
To: Vivek Goyal
CC: "Eric W. Biederman", Pavel Machek, nigel@nigel.suspend2.net, "Rafael J. Wysocki", Andrew Morton, linux-kernel@vger.kernel.org, linux-pm@lists.linux-foundation.org, Kexec Mailing List
In-Reply-To: <20080514205204.GJ30469@redhat.com>
References: <1204773188.4707.109.camel@caritas-dev.intel.com> <20080514205204.GJ30469@redhat.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8BIT
Date: Thu, 15 May 2008 10:32:42 +0800
Message-ID: <1210818762.23707.102.camel@caritas-dev.intel.com>
MIME-Version: 1.0
X-Mailer: Evolution 2.22.1
X-OriginalArrivalTime: 15 May 2008 02:26:17.0066 (UTC) FILETIME=[0DC4FCA0:01C8B633]
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2008-05-14 at 16:52 -0400, Vivek Goyal wrote:
[...]
> Ok, I have done some testing on this patch. Currently I have just
> tested switching back and forth between two kernels and it is working for
> me.

Thanks.

[...]
> > +/*
> > + * Entry point for jumping back from kexeced kernel, the paging is
> > + * turned off.
> > + */
> > +kexec_jump_back_entry:
> > +	call	1f
> > +1:
> > +	popl	%ebx
> > +	subl	$(1b - kexec_relocate_page), %ebx
> > +	movl	%edi, KJUMP_ENTRY_OFF(%ebx)
> > +	movl	CP_VA_CONTROL_PAGE(%ebx), %edi
> > +	lea	STACK_TOP(%ebx), %esp
> > +	movl	CP_PA_SWAP_PAGE(%ebx), %eax
> > +	movl	CP_PA_BACKUP_PAGES_MAP(%ebx), %edx
> > +	pushl	%eax
> > +	pushl	%edx
> > +	call	swap_pages
> > +	addl	$8, %esp
> > +	movl	CP_PA_PGD(%ebx), %eax
> > +	movl	%eax, %cr3
> > +	movl	%cr0, %eax
> > +	orl	$(1<<31), %eax
> > +	movl	%eax, %cr0
> > +	lea	STACK_TOP(%edi), %esp
> > +	movl	%edi, %eax
> > +	addl	$(virtual_mapped - kexec_relocate_page), %eax
> > +	pushl	%eax
> > +	ret
>
> Upon re-entering the kernel, what happens to GDT table? So gdtr will be
> pointing to GDT of other kernel (which is not there as pages have been
> swapped)? Do we need to reload the gdtr upon re-entering the kernel.

After re-entering the kernel and returning from machine_kexec(),
restore_processor_state() is called. It restores the GDTR along with
other CPU state such as the FPU and the IDT.

> [..]
> > @@ -197,8 +282,54 @@ identity_mapped:
> >  	xorl	%eax, %eax
> >  	movl	%eax, %cr3
> >
> > +	movl	CP_PA_SWAP_PAGE(%edi), %eax
> > +	pushl	%eax
> > +	pushl	%ebx
> > +	call	swap_pages
> > +	addl	$8, %esp
> > +
> > +	/* To be certain of avoiding problems with self-modifying code
> > +	 * I need to execute a serializing instruction here.
> > +	 * So I flush the TLB, it's handy, and not processor dependent.
> > +	 */
> > +	xorl	%eax, %eax
> > +	movl	%eax, %cr3
> > +
> > +	/* set all of the registers to known values */
> > +	/* leave %esp alone */
> > +
> > +	movl	KJUMP_MAGIC_OFF(%edi), %eax
> > +	cmpl	$KJUMP_MAGIC_NUMBER, %eax
> > +	jz 1f
> > +	xorl	%edi, %edi
> > +	xorl	%eax, %eax
> > +	xorl	%ebx, %ebx
> > +	xorl	%ecx, %ecx
> > +	xorl	%edx, %edx
> > +	xorl	%esi, %esi
> > +	xorl	%ebp, %ebp
> > +	ret
> > +1:
> > +	popl	%edx
> > +	movl	CP_PA_SWAP_PAGE(%edi), %esp
> > +	addl	$PAGE_SIZE_asm, %esp
> > +	pushl	%edx
> > +2:
> > +	call	*%edx
>
> > +	movl	%edi, %edx
> > +	popl	%edi
> > +	pushl	%edx
> > +	jmp	2b
> > +
>
> What does above piece of code do? Looks like redundant for switching
> between the kernels? After call *%edx, we never return here. Instead
> we come back to "kexec_jump_back_entry"?

For switching between the kernels, this is redundant. Originally, another
feature of kexec jump was to call some code in physical mode, and this
code is used to provide a C ABI to the called code.

Now Eric suggests passing the jump back entry point in a C ABI
compatible way too, that is, using the return address on the stack
instead of %edi. I think that is reasonable. Maybe we can revise this
code to be compatible with the C ABI and provide a convenient interface
for both the kernel and other physical mode code.

> [..]
> > --- /dev/null
> > +++ b/Documentation/i386/jump_back_protocol.txt
> > @@ -0,0 +1,66 @@
> > +		THE LINUX/I386 JUMP BACK PROTOCOL
> > +		---------------------------------
> > +
> > +		Huang Ying
> > +		    Last update 2007-12-19
> > +
> > +Currently, the following versions of the jump back protocol exist.
> > +
> > +Protocol 1.00:	Jumping between original kernel and kexeced kernel
> > +		support. Calling ordinary C function support.
> > +
> > +
> > +*** JUMP BACK ENTRY
> > +
> > +At jump back entry of callee, the CPU must be in 32-bit protected mode
> > +with paging disabled; the CS, DS, ES and SS must be 4G flat segments;
> > +CS must have execute/read permission, and DS, ES and SS must have
> > +read/write permission; interrupt must be disabled; the contents of
> > +registers and corresponding memory must be as follow:
> > +
> > +Offset/Size	Meaning
> > +
> > +%edi		Real jump back entry of caller if supported,
> > +		otherwise 0.
> > +%esp		Stack top pointer, the size of stack is about 4k bytes.
> > +(%esp)/4	Helper jump back entry of caller if %edi != 0,
> > +		otherwise undefined.
> > +
>
> I am not sure what is helper jump back entry? I understand that you
> are using %edi to pass around entry point between two kernels. Can
> you please shed some more light on this?

The helper jump back entry is used to provide a C ABI to physical mode
code other than the kernel. It is the redundant code discussed above.

Best Regards,
Huang Ying
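
To make the quoted JUMP BACK ENTRY table a bit more concrete, here is a
minimal C sketch of the entry conditions it lists. This is only an
illustration, not code from the patch: the struct jump_back_state, its
fields, and the functions jump_back_entry_ok() and helper_entry() are
hypothetical names invented here. The only facts relied on are the
architectural CR0 bits (PE is bit 0, PG is bit 31, the same bit that
"orl $(1<<31), %eax" sets in kexec_jump_back_entry) and the register
table quoted above. The segment requirements (4G flat CS, DS, ES and SS)
are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural CR0 bits on x86. */
#define CR0_PE (UINT32_C(1) << 0)   /* protected mode enable */
#define CR0_PG (UINT32_C(1) << 31)  /* paging enable */

/* Hypothetical snapshot of CPU state at the jump back entry (illustration only). */
struct jump_back_state {
	uint32_t cr0;
	bool     interrupts_enabled;
	uint32_t edi;        /* real jump back entry of caller, or 0 */
	uint32_t esp;        /* stack top pointer */
	uint32_t stack_top;  /* the 4-byte word stored at (%esp) */
};

/* True if protected mode is on, paging is off and interrupts are disabled. */
bool jump_back_entry_ok(const struct jump_back_state *s)
{
	assert(s != NULL);
	return (s->cr0 & CR0_PE) && !(s->cr0 & CR0_PG) && !s->interrupts_enabled;
}

/*
 * Stores the helper jump back entry in *entry and returns true when
 * %edi != 0. Returns false otherwise, because the protocol leaves the
 * word at (%esp) undefined in that case.
 */
bool helper_entry(const struct jump_back_state *s, uint32_t *entry)
{
	assert(s != NULL && entry != NULL);
	if (s->edi == 0)
		return false;
	*entry = s->stack_top;
	return true;
}
```

A caller would fill in a jump_back_state, check jump_back_entry_ok()
first, and read the helper entry only when helper_entry() returns true.
Returning a flag instead of a bare address keeps the "otherwise
undefined" case from the table explicit.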