From mboxrd@z Thu Jan 1 00:00:00 1970
From: Julien Grall
Subject: Re: [PATCH] xen: arm: add missing flushing dcache to the copy to/clean guest functions
Date: Mon, 25 Nov 2013 17:06:07 +0000
Message-ID: <5293837F.1030906@citrix.com>
References: <1385396476-12675-1-git-send-email-oleksandr.dmytryshyn@globallogic.com> <1385397228.22002.93.camel@kazak.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1385397228.22002.93.camel@kazak.uk.xensource.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Campbell
Cc: Oleksandr Dmytryshyn , andrii.anisov@ti.com, xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

On 11/25/2013 04:33 PM, Ian Campbell wrote:
> On Mon, 2013-11-25 at 18:21 +0200, Oleksandr Dmytryshyn wrote:
>
> Thanks for the patch.
>
>> Without flushing dcache the hypervisor couldn't copy the device tree
>> correctly when booting the kernel dom0 Image (memory with device tree
>> is corrupted). As the result - when we try to load the kernel dom0
>> Image - dom0 hungs frequently. This issue is not reproduced with the
>> kernel dom0 zImage because the zImage decompressor code flushes all
>> dcache before starting the decompressed kernel Image. When the
>> hypervisor loads the kernel uImage or initrd, this memory region
>> isn't corrupted because the hypervisor code flushes the dcache.
>
> So if not then when/how is this reproduced?
>
> In general I would like to try and keep flushes out of this code path
> because they are used in the hypercall path, we have decreed that
> guest's must have caching enabled to make hypercalls (at least those
> which take in-memory arguments).
>
> I think the right fix is to do the flushes domain_build.c, similar to
> how kernel_zimage_load does it. This might need an opencoded version of
> copy_to_user. Or better, introduce a flushing variant which shares most
> code with the normal one via a common internal function.
>
> Or perhaps we should flush all of the new guest's RAM after building. I
> think Julien was looking at doing something along those lines for the
> domU building case.
>

I was planning to handle only domU, but the issue can also happen with
dom0 (which is, after all, a guest like any other :)).

Both issues have the same root cause: Linux starts with its caches
disabled. So if Xen/Dom0 copies data into a guest, some of that data can
stay in the cache for a while. The guest, reading memory with its caches
off, will then see corrupted data.

Flushing all the new RAM seems a bit complicated and slow (AFAIK, you
can only flush the data cache page by page). I would stay with a similar
solution and check whether the guest is already running or not.

Ian, what do you think?

-- 
Julien Grall
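
A minimal sketch of the "flushing variant which shares most code with
the normal one via a common internal function" idea from the quoted
text. It is not taken from the Xen tree. Every name below
(sketch_copy_common, sketch_copy_to_guest, sketch_copy_to_guest_flush,
sketch_clean_dcache_range) is hypothetical, guest memory is modelled as
a plain buffer, and the cache clean is a stub that only counts calls. A
real implementation would issue a per-page clean by virtual address
where the stub is.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u

/* Number of per-page clean operations issued so far (illustration only). */
static unsigned long sketch_flush_calls;

/* Stub standing in for a real data-cache clean of [p, p + len). */
static void sketch_clean_dcache_range(const void *p, size_t len)
{
    (void)p;
    (void)len;
    sketch_flush_calls++;
}

/*
 * Common internal helper: copy `len` bytes into the modelled guest
 * memory one page-sized chunk at a time, and clean each chunk from the
 * data cache only when `flush` is non-zero.
 * Returns the number of bytes NOT copied (always 0 in this model).
 */
static size_t sketch_copy_common(unsigned char *guest, size_t offset,
                                 const void *from, size_t len, int flush)
{
    const unsigned char *src = from;

    while (len) {
        /* Never let a chunk cross a page boundary. */
        size_t in_page = SKETCH_PAGE_SIZE - (offset % SKETCH_PAGE_SIZE);
        size_t chunk = len < in_page ? len : in_page;

        memcpy(guest + offset, src, chunk);
        if (flush)
            sketch_clean_dcache_range(guest + offset, chunk);

        offset += chunk;
        src += chunk;
        len -= chunk;
    }
    return 0;
}

/* Normal variant: hypercall path, guest is running with caches on. */
size_t sketch_copy_to_guest(unsigned char *guest, size_t offset,
                            const void *from, size_t len)
{
    return sketch_copy_common(guest, offset, from, len, 0);
}

/* Flushing variant: domain build path, guest has not started yet. */
size_t sketch_copy_to_guest_flush(unsigned char *guest, size_t offset,
                                  const void *from, size_t len)
{
    return sketch_copy_common(guest, offset, from, len, 1);
}
```

With this split the hypercall path pays nothing for the flush, and only
the domain builder (or a caller that has checked the guest is not yet
running) would use the flushing variant.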