From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 31 May 2013 16:21:27 +0200
From: Michael Holzheu
Subject: Re: [PATCH 0/2] kdump/mmap: Fix mmap of /proc/vmcore for s390
Message-ID: <20130531162127.6d512233@holzheu>
In-Reply-To: <20130530203847.GB5968@redhat.com>
References: <20130524143644.GD18218@redhat.com>
	<20130524170626.2ac06efe@holzheu>
	<20130524152849.GF18218@redhat.com>
	<87mwrkatgu.fsf@xmission.com>
	<51A006CF.90105@gmail.com>
	<87k3mnahkf.fsf@xmission.com>
	<51A076FE.3060604@gmail.com>
	<20130525145217.0549138a@holzheu>
	<20130528135500.GC7088@redhat.com>
	<20130529135144.7f95c4c0@holzheu>
	<20130530203847.GB5968@redhat.com>
Content-Type: text/plain; charset="us-ascii"
To: Vivek Goyal
Cc: kexec@lists.infradead.org, Heiko Carstens,
 Jan Willeke, linux-kernel@vger.kernel.org, HATAYAMA Daisuke,
 "Eric W. Biederman", Martin Schwidefsky, Andrew Morton, Zhang Yanfei

On Thu, 30 May 2013 16:38:47 -0400
Vivek Goyal wrote:

> On Wed, May 29, 2013 at 01:51:44PM +0200, Michael Holzheu wrote:
>
> [..]
>
> > >>> START QUOTE
> >
> > [PATCH v3 1/3] kdump: Introduce ELF header in new memory feature
> >
> > Currently for s390 we create the ELF core header in the 2nd kernel
> > with a small trick. We relocate the addresses in the ELF header in
> > a way that for the /proc/vmcore code it seems to be in the 1st
> > kernel (old) memory, so that read_from_oldmem() returns the correct
> > data. This allows the /proc/vmcore code to use the ELF header in
> > the 2nd kernel.
> >
> > >>> END QUOTE
> >
> > For our current zfcpdump project (see "[PATCH 3/3] s390/kdump: Use
> > vmcore for zfcpdump") we could no longer use this trick. Therefore
> > we sent you the patches to get a clean interface for ELF header
> > creation in the 2nd kernel.
>
> Hi Michael,
>
> A few more questions.
>
> - What's the difference between zfcpdump and kdump? I thought zfcpdump
>   just boots a specific kernel from a fixed drive. If yes, why can't
>   that kernel prepare the headers the same way the regular kdump kernel
>   does and gain from the kdump kernel swap trick?

Correct, the zfcpdump kernel is booted from a fixed disk drive. The
difference is that the zfcpdump HSA memory is not mapped into real
memory. It is accessed through a read interface, memcpy_hsa(), that
copies memory from the hypervisor-owned HSA memory into Linux memory.
So it looks like the following:

  +----------+                  +------------+
  |          |   memcpy_hsa()   |            |
  | zfcpdump | <--------------- | HSA memory |
  |          |                  |            |
  +----------+                  +------------+
  |          |
  | old mem  |
  |          |
  +----------+

In the copy_oldmem_page() function for zfcpdump we do the following:

copy_oldmem_page_zfcpdump(...)
{
	if (src < ZFCPDUMP_HSA_SIZE) {
		if (memcpy_hsa(buf, src, csize, userbuf) < 0)
			return -EINVAL;
	} else {
		if (userbuf)
			copy_to_user_real(buf, src, csize);
		else
			memcpy_real(buf, src, csize);
	}
}

So I think for zfcpdump we can only use the read() interface of
/proc/vmcore. But this is sufficient for us, since we also provide the
s390-specific zfcpdump user space that copies /proc/vmcore.

> Also, we are accessing the contents of the elf headers using a
> physical address. If that's the case, does it make a difference
> whether the data is in the old kernel's memory or the new kernel's
> memory? We will use the physical address and create a temporary
> mapping, and it should not make a difference whether the same physical
> page is already mapped in the current kernel or not.
>
> The only restriction this places is that the ELF headers need to be
> contiguous. I see that the s390 code already creates the elf headers
> using kzalloc_panic(), so the allocated memory should be physically
> contiguous.
>
> So can't we just put __pa(elfcorebuf) in elfcorehdr_addr? And the same
> is true for the p_offset fields in the PT_NOTE headers, and everything
> should work fine?
>
> The only problem we can face is that at some point kzalloc() might not
> be able to satisfy a contiguous memory request. We can handle that
> once s390 runs into those issues. You are anyway allocating memory
> using kzalloc().
>
> And if this works for s390 kdump, it should work for zfcpdump too?

So your suggestion is that copy_oldmem_page() should also be used for
copying memory from the new kernel, correct?

For kdump on s390 I think this will work with the new "ELF header swap"
patch. With that patch, an access to [0, OLDMEM_SIZE] uniquely
identifies an address in the new kernel, and an access to [OLDMEM_BASE,
OLDMEM_BASE + OLDMEM_SIZE] identifies an address in the old kernel.

For zfcpdump we currently add a load from [0, HSA_SIZE] where p_offset
equals p_paddr. Therefore we can't distinguish in copy_oldmem_page()
whether we read from oldmem (HSA) or newmem.
The range [0, HSA_SIZE] is used twice. As a workaround we could use an
artificial p_offset for the HSA memory chunk that is not used by the
1st kernel's physical memory. This is not really beautiful, but
probably doable.

When I tried to implement this for kdump, I noticed another problem
with the vmcore mmap patches: Our copy_oldmem_page() function uses
memcpy_real() to access the old 1st kernel memory. This function
switches to real mode and therefore does not require any page tables.
But as a side effect of that, we can't copy to vmalloc memory. The mmap
patches use vmalloc memory for "notes_buf", so currently our
copy_oldmem_page() fails here.

If copy_oldmem_page() now also must be able to copy to vmalloc memory,
we would have to add new code for that:

* oldmem -> newmem (real): Use direct memcpy_real()
* oldmem -> newmem (vmalloc): Use an intermediate buffer with memcpy_real()
* newmem -> newmem: Use memcpy()

What do you think?

Best Regards,
Michael

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec