LinuxPPC-Dev Archive on lore.kernel.org
* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Michael S. Tsirkin @ 2018-08-06 13:46 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Christoph Hellwig, Will Deacon, Anshuman Khandual, virtualization,
	linux-kernel, linuxppc-dev, aik, robh, joe, elfring, david,
	jasowang, mpe, linuxram, haren, paulus, srikar, robin.murphy,
	jean-philippe.brucker, marc.zyngier
In-Reply-To: <fd8fee94cf42e436878f179c7895de3a4dab3355.camel@kernel.crashing.org>

On Sun, Aug 05, 2018 at 02:52:54PM +1000, Benjamin Herrenschmidt wrote:
> On Sun, 2018-08-05 at 03:22 +0300, Michael S. Tsirkin wrote:
> > I see the allure of this, but I think down the road you will
> > discover passing a flag in libvirt XML saying
> > "please use a secure mode" or whatever is a good idea.
> > 
> > Even though it is probably not required to address this
> > specific issue.
> > 
> > For example, I don't think ballooning works in secure mode,
> > you will be able to teach libvirt not to try to add a
> > balloon to the guest.
> 
> Right, we'll need some quirk to disable balloons in the guest I
> suppose.
> 
> Passing something from libvirt is cumbersome because the end user may
> not even need to know about secure VMs. There are use cases where the
> security is a contract down to some special application running inside
> the secure VM, the sysadmin knows nothing about.
> 
> Also there's repercussions all the way to admin tools, web UIs etc...
> so it's fairly wide ranging.
> 
> So as long as we only need to quirk a couple of devices, it's much
> better contained that way.

So the balloon alone already means that yes, management, all the way up
to the user tools, must know this is going on. Otherwise the user
will try to inflate the balloon and wonder why it does not work.

> > > Later on, (we may have even already run Linux at that point,
> > > unsecurely, as we can use Linux as a bootloader under some
> > > circumstances), we start a "secure image".
> > > 
> > > This is a kernel zImage that includes a "ticket" that has the
> > > appropriate signature etc... so that when that kernel starts, it can
> > > authenticate with the ultravisor, be verified (along with its ramdisk)
> > > etc... and copied (by the UV) into secure memory & run from there.
> > > 
> > > At that point, the hypervisor is informed that the VM has become
> > > secure.
> > > 
> > > So at that point, we could exit to qemu to inform it of the change,
> > 
> > That's probably a good idea too.
> 
> We probably will have to tell qemu eventually for migration, as we'll
> need some kind of key exchange phase etc... to deal with the crypto
> aspects (the actual page copy is sorted via encrypting the secure pages
> back to normal pages in qemu, but we'll need extra metadata).
> 
> > > and
> > > have it walk the qtree and "Switch" all the virtio devices to use the
> > > IOMMU I suppose, but it feels a lot grosser to me.
> > 
> > That part feels gross, yes.
> > 
> > > That's the only other option I can think of.
> > > 
> > > > However in this specific case, the flag does not need to come from the
> > > > hypervisor, it can be set by arch boot code I think.
> > > > Christoph do you see a problem with that?
> > > 
> > > The above could do that yes. Another approach would be to do it from a
> > > small virtio "quirk" that pokes a bit in the device to force it to
> > > iommu mode when it detects that we are running in a secure VM. That's a
> > > > bit warty on the virtio side but probably not as much as having a qemu
> > > > one that walks all of the virtio devices to change how they behave.
> > > 
> > > What do you reckon ?
> > 
> > I think you are right that for the dma limit the hypervisor doesn't seem
> > to need to know.
> 
> It's not just a limit mind you. It's a range, at least if we allocate
> just a single pool of insecure pages. swiotlb feels like a better
> option for us.
> 
> > > What we want to avoid is to expose any of this to the *end user* or
> > > libvirt or any other higher level of the management stack. We really
> > > want that stuff to remain contained between the VM itself, KVM and
> > > maybe qemu.
> > > 
> > > We will need some other qemu changes for migration so that's ok. But
> > > the minute you start touching libvirt and the higher levels it becomes
> > > a nightmare.
> > > 
> > > Cheers,
> > > Ben.
> > 
> > I don't believe you'll be able to avoid that entirely. The split between
> > libvirt and qemu is more about community than about code, random bits of
> > functionality tend to land on random sides of that fence.  Better add a
> > tag in domain XML early is my advice. Having said that, it's your
> > hypervisor. I'm just suggesting that when hypervisor does somehow need
> > to care then I suspect most people won't be receptive to the argument
> > that changing libvirt is a nightmare.
> 
> It only needs to care at runtime. The problem isn't changing libvirt
> per-se, I don't have a problem with that. The problem is that it means
> creating two categories of machines "secure" and "non-secure", which is
> end-user visible, and thus has to be escalated to all the various
> management stacks, UIs, etc... out there.
> 
> In addition, there are some cases where the individual creating the VMs
> may not have any idea that they are secure.
> 
> But yes, if we have to, we'll do it. However, so far, we don't think
> it's a great idea.
> 
> Cheers,
> Ben.

Here's another example: you can't migrate a secure vm to hypervisor
which doesn't support this feature. Again management tools above libvirt
need to know otherwise they will try.

> > > > > >   To get swiotlb you'll need to then use the DT/ACPI
> > > > > > dma-range property to limit the addressable range, and a swiotlb
> > > > > > > capable platform will use swiotlb automatically.
> > > > > 
> > > > > This cannot be done as you describe it.
> > > > > 
> > > > > The VM is created as a *normal* VM. The DT stuff is generated by qemu
> > > > > at a point where it has *no idea* that the VM will later become secure
> > > > > and thus will have to restrict which pages can be used for "DMA".
> > > > > 
> > > > > The VM will *at runtime* turn itself into a secure VM via interactions
> > > > > with the security HW and the Ultravisor layer (which sits below the
> > > > > HV). This happens way after the DT has been created and consumed, the
> > > > > > qemu devices instantiated etc...
> > > > > 
> > > > > > Only the guest kernel knows because it initiates the transition. When
> > > > > that happens, the virtio devices have already been used by the guest
> > > > > firmware, bootloader, possibly another kernel that kexeced the "secure"
> > > > > one, etc... 
> > > > > 
> > > > > So instead of running around saying NAK NAK NAK, please explain how we
> > > > > can solve that differently.
> > > > > 
> > > > > Ben.


* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Michael S. Tsirkin @ 2018-08-06 13:36 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Benjamin Herrenschmidt, robh, srikar, aik, Jason Wang, linuxram,
	linux-kernel, virtualization, hch, paulus, joe, david,
	linuxppc-dev, elfring, haren
In-Reply-To: <74a1e1b8-81e0-84db-6d0d-d8bd9caebb4a@linux.vnet.ibm.com>

On Mon, Aug 06, 2018 at 02:32:28PM +0530, Anshuman Khandual wrote:
> On 08/05/2018 05:54 AM, Michael S. Tsirkin wrote:
> > On Fri, Aug 03, 2018 at 08:21:26PM -0500, Benjamin Herrenschmidt wrote:
> >> On Fri, 2018-08-03 at 22:08 +0300, Michael S. Tsirkin wrote:
> >>>>>> Please go through these patches and review whether this approach broadly
> >>>>>> makes sense. I will appreciate suggestions, inputs, comments regarding
> >>>>>> the patches or the approach in general. Thank you.
> >>>>>
> >>>>> Jason did some work on profiling this. Unfortunately he reports
> >>>>> about 4% extra overhead from this switch on x86 with no vIOMMU.
> >>>>
> >>>> The test is rather simple, just run pktgen (pktgen_sample01_simple.sh) in
> >>>> guest and measure PPS on tap on host.
> >>>>
> >>>> Thanks
> >>>
> >>> Could you supply host configuration involved please?
> >>
> >> I wonder how much of that could be caused by Spectre mitigations
> >> blowing up indirect function calls...
> >>
> >> Cheers,
> >> Ben.
> > 
> > I won't be surprised. If yes I suggested a way to mitigate the overhead.
> 
> Did we get better results (lower regression due to indirect calls) with
> the suggested mitigation ? Just curious.

I'm referring to this:
	I wonder whether we can support map_sg and friends being NULL, then use
	that when mapping is an identity. A conditional branch there is likely
	very cheap.

I don't think anyone tried implementing this yet.
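
The suggestion quoted above can be sketched roughly as follows. This is a
simplified userspace illustration, not kernel code: `struct sketch_dma_ops`,
`sketch_map_page()` and `offset_map_page()` are hypothetical names, and a
single `map_page` callback stands in for map_sg and friends. The assumption
is that a NULL callback means "identity mapping", so the common case pays
for a conditional branch rather than an indirect call.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a table of DMA mapping callbacks. */
struct sketch_dma_ops {
	uint64_t (*map_page)(uint64_t phys_addr);
};

/* Example non-identity mapping: bus address is offset from the physical one. */
uint64_t offset_map_page(uint64_t phys_addr)
{
	return phys_addr + 0x80000000ULL;
}

/*
 * Map one address. A NULL ops table or a NULL callback is treated as an
 * identity mapping, so no indirect call is made in that case.
 */
uint64_t sketch_map_page(const struct sketch_dma_ops *ops, uint64_t phys_addr)
{
	if (!ops || !ops->map_page)
		return phys_addr;	/* identity: branch only */
	return ops->map_page(phys_addr);	/* translated: indirect call */
}
```

Whether the branch is actually cheaper than the indirect call on a given
host would still need to be measured, for example with the pktgen setup
mentioned earlier in the thread.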

-- 
MST


* [PATCH] misc: ibmvsm: Fix wrong assignment of return code
From: Bryant G. Ly @ 2018-08-06 13:31 UTC (permalink / raw)
  To: gregkh; +Cc: linuxppc-dev, linux-kernel, Bryant G. Ly

From: "Bryant G. Ly" <bryantly@linux.ibm.com>

Currently the assignment is flipped and rc is always 0.

Signed-off-by: Bryant G. Ly <bryantly@linux.ibm.com>
Reviewed-by: Bradley Warrum <bwarrum@us.ibm.com>
---
 drivers/misc/ibmvmc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/ibmvmc.c b/drivers/misc/ibmvmc.c
index 8f82bb9..b8aaa68 100644
--- a/drivers/misc/ibmvmc.c
+++ b/drivers/misc/ibmvmc.c
@@ -2131,7 +2131,7 @@ static int ibmvmc_init_crq_queue(struct crq_server_adapter *adapter)
 	retrc = plpar_hcall_norets(H_REG_CRQ,
 				   vdev->unit_address,
 				   queue->msg_token, PAGE_SIZE);
-	retrc = rc;
+	rc = retrc;
 
 	if (rc == H_RESOURCE)
 		rc = ibmvmc_reset_crq_queue(adapter);
-- 
2.7.2
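
The effect of the flipped assignment can be reproduced outside the driver
with a small sketch. Everything here is a stand-in: `fake_hcall()` and
`SKETCH_H_RESOURCE` are hypothetical replacements for the hypervisor call
and its error code, and the variable types are assumed rather than taken
from the driver.

```c
/* Hypothetical error code; the value is arbitrary for this sketch. */
#define SKETCH_H_RESOURCE (-16)

/* Hypothetical stand-in for a hypervisor call that reports a failure. */
int fake_hcall(void)
{
	return SKETCH_H_RESOURCE;
}

/* Before the fix: the call's result is overwritten by the stale rc. */
int init_queue_buggy(void)
{
	int rc = 0;
	int retrc;

	retrc = fake_hcall();
	retrc = rc;	/* flipped: discards the result, rc stays 0 */
	(void)retrc;
	return rc;
}

/* After the fix: the call's result is propagated into rc. */
int init_queue_fixed(void)
{
	int rc = 0;
	int retrc;

	retrc = fake_hcall();
	rc = retrc;
	return rc;
}
```

With the buggy ordering a later `rc == H_RESOURCE` style check can never
fire, which is why the reset path in the patched function was unreachable.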


* Re: [PATCH v5 0/8] powerpc/fsl: Speculation barrier for NXP PowerPC Book3E
From: Diana Madalina Craciun @ 2018-08-06 13:28 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev@ozlabs.org
  Cc: oss@buserror.net, Leo Li, Bharat Bhushan
In-Reply-To: <20180727230639.25413-1-mpe@ellerman.id.au>

Hi Michael,

Sorry for the late answer, I was out of the office last week.

It looks fine to me, I have tested the patches on NXP PowerPC Book 3E
platforms and it worked well.

Thanks,

Diana

On 7/28/2018 2:06 AM, Michael Ellerman wrote:
> Implement barrier_nospec for NXP PowerPC Book3E processors.
>
> Hi Diana,
>
> This series interacts with another series of mine, so I wanted to rework it
> slightly. Let me know if this looks OK to you.
>
> cheers
>
> Diana Craciun (6):
>   powerpc/64: Disable the speculation barrier from the command line
>   powerpc/64: Make stf barrier PPC_BOOK3S_64 specific.
>   powerpc/64: Make meltdown reporting Book3S 64 specific
>   powerpc/fsl: Add barrier_nospec implementation for NXP PowerPC Book3E
>   powerpc/fsl: Sanitize the syscall table for NXP PowerPC 32 bit
>     platforms
>   Documentation: Add nospectre_v1 parameter
>
> Michael Ellerman (2):
>   powerpc/64: Add CONFIG_PPC_BARRIER_NOSPEC
>   powerpc/64: Call setup_barrier_nospec() from setup_arch()
>
>  Documentation/admin-guide/kernel-parameters.txt |  4 +++
>  arch/powerpc/Kconfig                            |  7 ++++-
>  arch/powerpc/include/asm/barrier.h              | 12 ++++++---
>  arch/powerpc/include/asm/setup.h                |  6 ++++-
>  arch/powerpc/kernel/Makefile                    |  3 ++-
>  arch/powerpc/kernel/entry_32.S                  | 10 +++++++
>  arch/powerpc/kernel/module.c                    |  4 ++-
>  arch/powerpc/kernel/security.c                  | 17 +++++++++++-
>  arch/powerpc/kernel/setup-common.c              |  2 ++
>  arch/powerpc/kernel/vmlinux.lds.S               |  4 ++-
>  arch/powerpc/lib/feature-fixups.c               | 35 ++++++++++++++++++++++++-
>  arch/powerpc/platforms/powernv/setup.c          |  1 -
>  arch/powerpc/platforms/pseries/setup.c          |  1 -
>  13 files changed, 94 insertions(+), 12 deletions(-)
>



* Re: [PATCH v2] selftests/powerpc: Avoid remaining process/threads
From: Michael Ellerman @ 2018-08-06 11:06 UTC (permalink / raw)
  To: Breno Leitao, linuxppc-dev; +Cc: Breno Leitao, Gustavo Romero
In-Reply-To: <1533307039-13744-1-git-send-email-leitao@debian.org>

Breno Leitao <leitao@debian.org> writes:

> diff --git a/tools/testing/selftests/powerpc/harness.c b/tools/testing/selftests/powerpc/harness.c
> index 66d31de60b9a..06c51e8d8ccb 100644
> --- a/tools/testing/selftests/powerpc/harness.c
> +++ b/tools/testing/selftests/powerpc/harness.c
> @@ -85,13 +85,16 @@ int run_test(int (test_function)(void), char *name)
>  	return status;
>  }
>  
> -static void alarm_handler(int signum)
> +static void sig_handler(int signum)
>  {
> -	/* Jut wake us up from waitpid */
> +	if (signum == SIGINT)
> +		kill(-pid, SIGTERM);

I don't think we need to do that here, if we just return then we'll pop
out of the waitpid() and go via the normal path.

Can you test with the existing signal handler, but wired up to SIGINT?

cheers


* Re: Build regressions/improvements in v4.17-rc1
From: Geert Uytterhoeven @ 2018-08-06 10:39 UTC (permalink / raw)
  To: Linux Kernel Mailing List, Dan Williams, Michael Ellerman,
	Andrew Morton, linuxppc-dev
In-Reply-To: <1523884165-17044-1-git-send-email-geert@linux-m68k.org>

CC Dan, Michael, AKPM, powerpc

On Mon, Apr 16, 2018 at 3:10 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> Below is the list of build error/warning regressions/improvements in
> v4.17-rc1[1] compared to v4.16[2].

I'd like to draw your attention to:

>   + warning: vmlinux.o(.text+0x376518): Section mismatch in reference from the function .devm_memremap_pages() to the function .meminit.text:.arch_add_memory():  => N/A
>   + warning: vmlinux.o(.text+0x376d64): Section mismatch in reference from the function .devm_memremap_pages_release() to the function .meminit.text:.arch_remove_memory():  => N/A
>   + warning: vmlinux.o(.text+0x37dfd8): Section mismatch in reference from the function .devm_memremap_pages() to the function .meminit.text:.arch_add_memory():  => N/A
>   + warning: vmlinux.o(.text+0x37e824): Section mismatch in reference from the function .devm_memremap_pages_release() to the function .meminit.text:.arch_remove_memory():  => N/A
>   + warning: vmlinux.o(.text+0x3944): Section mismatch in reference from the variable start_here_multiplatform to the function .init.text:early_setup():  => N/A
>   + warning: vmlinux.o(.text+0x3978): Section mismatch in reference from the variable start_here_common to the function .init.text:start_kernel():  => N/A
>   + warning: vmlinux.o(.text+0x3a66c): Section mismatch in reference from the function mips_sc_init() to the function .init.text:mips_sc_probe_cm3():  => N/A
>   + warning: vmlinux.o(.text+0x3e9908): Section mismatch in reference from the function .devm_memremap_pages() to the function .meminit.text:.arch_add_memory():  => N/A
>   + warning: vmlinux.o(.text+0x3ea154): Section mismatch in reference from the function .devm_memremap_pages_release() to the function .meminit.text:.arch_remove_memory():  => N/A
>   + warning: vmlinux.o(.text+0x498dbc): Section mismatch in reference from the function hmm_devmem_release() to the function .meminit.text:arch_remove_memory():  => N/A
>   + warning: vmlinux.o(.text+0x499130): Section mismatch in reference from the function hmm_devmem_pages_create() to the function .meminit.text:arch_add_memory():  => N/A
>   + warning: vmlinux.o(.text+0x4a59ec): Section mismatch in reference from the function .hmm_devmem_release() to the function .meminit.text:.arch_remove_memory():  => N/A
>   + warning: vmlinux.o(.text+0x4a5d08): Section mismatch in reference from the function .hmm_devmem_pages_create() to the function .meminit.text:.arch_add_memory():  => N/A
>   + warning: vmlinux.o(.text+0x4ad8ac): Section mismatch in reference from the function .hmm_devmem_release() to the function .meminit.text:.arch_remove_memory():  => N/A
>   + warning: vmlinux.o(.text+0x4adbc8): Section mismatch in reference from the function .hmm_devmem_pages_create() to the function .meminit.text:.arch_add_memory():  => N/A
>   + warning: vmlinux.o(.text+0x4ca7238): Section mismatch in reference from the function .create_device_attrs() to the function .init.text:.make_sensor_label():  => N/A
>   + warning: vmlinux.o(.text+0x51ffec): Section mismatch in reference from the function .hmm_devmem_release() to the function .meminit.text:.arch_remove_memory():  => N/A
>   + warning: vmlinux.o(.text+0x520308): Section mismatch in reference from the function .hmm_devmem_pages_create() to the function .meminit.text:.arch_add_memory():  => N/A

These are still seen on v4.18-rc8 on various powerpc all{mod,yes}config builds.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Christoph Hellwig @ 2018-08-06  9:42 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Christoph Hellwig, Michael S. Tsirkin, Will Deacon,
	Anshuman Khandual, virtualization, linux-kernel, linuxppc-dev,
	aik, robh, joe, elfring, david, jasowang, mpe, linuxram, haren,
	paulus, srikar, robin.murphy, jean-philippe.brucker, marc.zyngier
In-Reply-To: <b7e8294e3e70d24072883a7e8e5375719d5af870.camel@kernel.crashing.org>

On Mon, Aug 06, 2018 at 07:16:47AM +1000, Benjamin Herrenschmidt wrote:
> Who would set this bit ? qemu ? Under what circumstances ?

I don't really care who sets what.  The implementation might not even
involve qemu.

It is your job to write a coherent interface specification that does
not depend on the used components.  The hypervisor might be PAPR,
Linux + qemu, VMware, Hyperv or something so secret that you'd have
to shoot me if you had to tell me.  The guest might be Linux, FreeBSD,
AIX, OS400 or a Hipster project of the day in Rust.  As long as we
properly specify the interface it simply does not matter.

> What would be the effect of this bit while VIRTIO_F_IOMMU is NOT set,
> ie, what would qemu do and what would Linux do ? I'm not sure I fully
> understand your idea.

In a perfect world we'd just reuse VIRTIO_F_IOMMU and clarify the
description which currently is rather vague but basically captures
the use case.  Currently it is:

VIRTIO_F_IOMMU_PLATFORM(33)
    This feature indicates that the device is behind an IOMMU that
    translates bus addresses from the device into physical addresses in
    memory. If this feature bit is set to 0, then the device emits
    physical addresses which are not translated further, even though an
    IOMMU may be present.

And I'd change it to something like:

VIRTIO_F_PLATFORM_DMA(33)
    This feature indicates that the device emits platform specific
    bus addresses that might not be identical to physical addresses.
    The translation of physical to bus address is platform specific
    and defined by the platform specification for the bus that the virtio
    device is attached to.
    If this feature bit is set to 0, then the device emits
    physical addresses which are not translated further, even if
    the platform would normally require translations for the bus that
    the virtio device is attached to.

If we can't change the definition any more we should deprecate the
old VIRTIO_F_IOMMU_PLATFORM bit, and require that VIRTIO_F_IOMMU_PLATFORM
and VIRTIO_F_PLATFORM_DMA not be set at the same time.
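
For the "deprecate the old bit" variant, the mutual-exclusion rule could be
checked along these lines. This is only a sketch under assumptions: bit 33
is the number quoted above for VIRTIO_F_IOMMU_PLATFORM, while the number
used for a separate VIRTIO_F_PLATFORM_DMA bit is an arbitrary placeholder,
since nothing in this thread assigns one. All `sketch_` names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 33 is the number quoted above for VIRTIO_F_IOMMU_PLATFORM. */
#define SKETCH_F_IOMMU_PLATFORM	33
/* Placeholder only: no number has been assigned to a separate new bit. */
#define SKETCH_F_PLATFORM_DMA	63

bool sketch_has_feature(uint64_t features, unsigned int bit)
{
	return (features >> bit) & 1;
}

/* Reject a feature set that offers both the old and the new bit. */
bool sketch_features_valid(uint64_t features)
{
	return !(sketch_has_feature(features, SKETCH_F_IOMMU_PLATFORM) &&
		 sketch_has_feature(features, SKETCH_F_PLATFORM_DMA));
}

/*
 * With either bit set, addresses go through the platform's DMA
 * translation; with both clear, physical addresses are used directly.
 */
bool sketch_needs_platform_dma(uint64_t features)
{
	return sketch_has_feature(features, SKETCH_F_IOMMU_PLATFORM) ||
	       sketch_has_feature(features, SKETCH_F_PLATFORM_DMA);
}
```

The point of keeping the check on the feature word alone is that neither
side needs to know which hypervisor or guest sits on the other end.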

> I'm trying to understand because the limitation is not a device side
> limitation, it's not a qemu limitation, it's actually more of a VM
> limitation. It has most of its memory pages made inaccessible for
> security reasons. The platform from a qemu/KVM perspective is almost
> entirely normal.

Well, find a way to describe this either in the qemu specification using
new feature bits, or by using something like the above.


* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Anshuman Khandual @ 2018-08-06  9:02 UTC (permalink / raw)
  To: Michael S. Tsirkin, Benjamin Herrenschmidt
  Cc: robh, srikar, aik, Jason Wang, linuxram, linux-kernel,
	virtualization, hch, paulus, joe, david, linuxppc-dev, elfring,
	haren
In-Reply-To: <20180805032355-mutt-send-email-mst@kernel.org>

On 08/05/2018 05:54 AM, Michael S. Tsirkin wrote:
> On Fri, Aug 03, 2018 at 08:21:26PM -0500, Benjamin Herrenschmidt wrote:
>> On Fri, 2018-08-03 at 22:08 +0300, Michael S. Tsirkin wrote:
>>>>>> Please go through these patches and review whether this approach broadly
>>>>>> makes sense. I will appreciate suggestions, inputs, comments regarding
>>>>>> the patches or the approach in general. Thank you.
>>>>>
>>>>> Jason did some work on profiling this. Unfortunately he reports
>>>>> about 4% extra overhead from this switch on x86 with no vIOMMU.
>>>>
>>>> The test is rather simple, just run pktgen (pktgen_sample01_simple.sh) in
>>>> guest and measure PPS on tap on host.
>>>>
>>>> Thanks
>>>
>>> Could you supply host configuration involved please?
>>
>> I wonder how much of that could be caused by Spectre mitigations
>> blowing up indirect function calls...
>>
>> Cheers,
>> Ben.
> 
> I won't be surprised. If yes I suggested a way to mitigate the overhead.

Did we get better results (lower regression due to indirect calls) with
the suggested mitigation ? Just curious.


* Re: [PATCH] powerpc/fadump: handle crash memory ranges array overflow
From: Michael Ellerman @ 2018-08-06  8:13 UTC (permalink / raw)
  To: Hari Bathini, Mahesh Jagannath Salgaonkar, Hari Bathini
  Cc: stable, linuxppc-dev
In-Reply-To: <a552618a-64e0-6c54-29cd-7d8b49093350@linux.vnet.ibm.com>

Hari Bathini <hbathini@linux.vnet.ibm.com> writes:
> On Monday 06 August 2018 09:52 AM, Mahesh Jagannath Salgaonkar wrote:
>> On 07/31/2018 07:26 PM, Hari Bathini wrote:
>>> Crash memory ranges is an array of memory ranges of the crashing kernel
>>> to be exported as a dump via /proc/vmcore file. The size of the array
>>> is set based on INIT_MEMBLOCK_REGIONS, which works alright in most cases
>>> where memblock memory regions count is less than INIT_MEMBLOCK_REGIONS
>>> value. But this count can grow beyond INIT_MEMBLOCK_REGIONS value since
>>> commit 142b45a72e22 ("memblock: Add array resizing support").
>>>
...
>>>
>>> Fixes: 2df173d9e85d ("fadump: Initialize elfcore header and add PT_LOAD program headers.")
>>> Cc: stable@vger.kernel.org
>>> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
>>> ---
>>>   arch/powerpc/include/asm/fadump.h |    2 +
>>>   arch/powerpc/kernel/fadump.c      |   63 ++++++++++++++++++++++++++++++++++---
>>>   2 files changed, 59 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/fadump.h b/arch/powerpc/include/asm/fadump.h
>>> index 5a23010..ff708b3 100644
>>> --- a/arch/powerpc/include/asm/fadump.h
>>> +++ b/arch/powerpc/include/asm/fadump.h
>>> @@ -196,7 +196,7 @@ struct fadump_crash_info_header {
>>>   };
>>>
>>>   /* Crash memory ranges */
>>> -#define INIT_CRASHMEM_RANGES	(INIT_MEMBLOCK_REGIONS + 2)
>>> +#define INIT_CRASHMEM_RANGES	INIT_MEMBLOCK_REGIONS
>>>
>>>   struct fad_crash_memory_ranges {
>>>   	unsigned long long	base;
>>> diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
>>> index 07e8396..1c1df4f 100644
>>> --- a/arch/powerpc/kernel/fadump.c
>>> +++ b/arch/powerpc/kernel/fadump.c
...
>
>> Also, along with this change, should we also double the initial array size
>> (e.g. INIT_CRASHMEM_RANGES * 2) to reduce our chances of needing a memory
>> allocation?
>
> Agreed that doubling the static array size reduces the likelihood of
> needing dynamic array resizing. Will do that.
>
> Nonetheless, if we get to the point where a 2K memory allocation fails on
> a system with so many memory ranges, it is likely that the kernel has some
> basic problems to deal with first :)

Yes, this all seems a bit silly. 

Why not just allocate a 64K page and be done with it?

AFAICS we're not being called too early to do that, and if you can't
allocate a single page then the system is going to OOM anyway.
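
As a rough illustration of what a single 64K allocation would buy, here is
a userspace sketch. `struct sketch_range` mirrors the base and size fields
the patch uses, with the type of `size` assumed; calloc() stands in for
whatever kernel allocator would actually be used, and the `SKETCH_` names
are hypothetical. With two 8-byte fields per entry, 64K holds 4096 ranges.

```c
#include <stddef.h>
#include <stdlib.h>

/* Mirrors the two fields the patch uses; the type of size is assumed. */
struct sketch_range {
	unsigned long long base;
	unsigned long long size;
};

#define SKETCH_PAGE_SIZE	(64 * 1024)	/* the 64K page mentioned above */
#define SKETCH_MAX_RANGES	(SKETCH_PAGE_SIZE / sizeof(struct sketch_range))

/* One up-front allocation; returns NULL if even that fails. */
struct sketch_range *sketch_alloc_ranges(void)
{
	return calloc(SKETCH_MAX_RANGES, sizeof(struct sketch_range));
}
```

A fixed capacity like this removes the resize path entirely, at the cost of
a hard upper bound on the number of ranges.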

cheers


* Re: [PATCH] powerpc/fadump: handle crash memory ranges array overflow
From: Hari Bathini @ 2018-08-06  5:35 UTC (permalink / raw)
  To: Mahesh Jagannath Salgaonkar, Hari Bathini, Michael Ellerman
  Cc: stable, linuxppc-dev
In-Reply-To: <c0232b82-57d7-db09-6ddb-68370bee4ff2@linux.vnet.ibm.com>



On Monday 06 August 2018 09:52 AM, Mahesh Jagannath Salgaonkar wrote:
> On 07/31/2018 07:26 PM, Hari Bathini wrote:
>> Crash memory ranges is an array of memory ranges of the crashing kernel
>> to be exported as a dump via /proc/vmcore file. The size of the array
>> is set based on INIT_MEMBLOCK_REGIONS, which works alright in most cases
>> where memblock memory regions count is less than INIT_MEMBLOCK_REGIONS
>> value. But this count can grow beyond INIT_MEMBLOCK_REGIONS value since
>> commit 142b45a72e22 ("memblock: Add array resizing support").
>>
>> On large memory systems with a few DLPAR operations, the memblock memory
>> regions count could be larger than INIT_MEMBLOCK_REGIONS value. On such
>> systems, registering fadump results in crash or other system failures
>> like below:
>>
>>    task: c00007f39a290010 ti: c00000000b738000 task.ti: c00000000b738000
>>    NIP: c000000000047df4 LR: c0000000000f9e58 CTR: c00000000010f180
>>    REGS: c00000000b73b570 TRAP: 0300   Tainted: G          L   X  (4.4.140+)
>>    MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22004484  XER: 20000000
>>    CFAR: c000000000008500 DAR: 000007a450000000 DSISR: 40000000 SOFTE: 0
>>    GPR00: c0000000000f9e58 c00000000b73b7f0 c000000000f09a00 000000000000001a
>>    GPR04: c00007f3bf774c90 0000000000000004 c000000000eb9a00 0000000000000800
>>    GPR08: 0000000000000804 000007a450000000 c000000000fa9a00 c00007ffb169ca20
>>    GPR12: 0000000022004482 c00000000fa12c00 c00007f3a0ea97a8 0000000000000000
>>    GPR16: c00007f3a0ea9a50 c00000000b73bd60 0000000000000118 000000000001fe80
>>    GPR20: 0000000000000118 0000000000000000 c000000000b8c980 00000000000000d0
>>    GPR24: 000007ffb0b10000 c00007ffb169c980 0000000000000000 c000000000b8c980
>>    GPR28: 0000000000000004 c00007ffb169c980 000000000000001a c00007ffb169c980
>>    NIP [c000000000047df4] smp_send_reschedule+0x24/0x80
>>    LR [c0000000000f9e58] resched_curr+0x138/0x160
>>    Call Trace:
>>    [c00000000b73b7f0] [c0000000000f9e58] resched_curr+0x138/0x160 (unreliable)
>>    [c00000000b73b820] [c0000000000fb538] check_preempt_curr+0xc8/0xf0
>>    [c00000000b73b850] [c0000000000fb598] ttwu_do_wakeup+0x38/0x150
>>    [c00000000b73b890] [c0000000000fc9c4] try_to_wake_up+0x224/0x4d0
>>    [c00000000b73b900] [c00000000011ef34] __wake_up_common+0x94/0x100
>>    [c00000000b73b960] [c00000000034a78c] ep_poll_callback+0xac/0x1c0
>>    [c00000000b73b9b0] [c00000000011ef34] __wake_up_common+0x94/0x100
>>    [c00000000b73ba10] [c00000000011f810] __wake_up_sync_key+0x70/0xa0
>>    [c00000000b73ba60] [c00000000067c3e8] sock_def_readable+0x58/0xa0
>>    [c00000000b73ba90] [c0000000007848ac] unix_stream_sendmsg+0x2dc/0x4c0
>>    [c00000000b73bb70] [c000000000675a38] sock_sendmsg+0x68/0xa0
>>    [c00000000b73bba0] [c00000000067673c] ___sys_sendmsg+0x2cc/0x2e0
>>    [c00000000b73bd30] [c000000000677dbc] __sys_sendmsg+0x5c/0xc0
>>    [c00000000b73bdd0] [c0000000006789bc] SyS_socketcall+0x36c/0x3f0
>>    [c00000000b73be30] [c000000000009488] system_call+0x3c/0x100
>>    Instruction dump:
>>    4e800020 60000000 60420000 3c4c00ec 38421c30 7c0802a6 f8010010 60000000
>>    3d42000a e92ab420 2fa90000 4dde0020 <e9290000> 2fa90000 419e0044 7c0802a6
>>    ---[ end trace a6d1dd4bab5f8253 ]---
>>
>> as array index overflow is not checked for while setting up crash memory
>> ranges causing memory corruption. To resolve this issue, resize crash
>> memory ranges array on hitting array size limit.
>>
>> But without a hard limit on the number of crash memory ranges, there is
>> a possibility of program headers count overflow in the /proc/vmcore ELF
>> file while exporting each of these memory ranges as PT_LOAD segments. To
>> reduce the likelihood of such a scenario, fold adjacent memory ranges to
>> minimize the total number of crash memory ranges.
>>
>> Fixes: 2df173d9e85d ("fadump: Initialize elfcore header and add PT_LOAD program headers.")
>> Cc: stable@vger.kernel.org
>> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
>> ---
>>   arch/powerpc/include/asm/fadump.h |    2 +
>>   arch/powerpc/kernel/fadump.c      |   63 ++++++++++++++++++++++++++++++++++---
>>   2 files changed, 59 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/fadump.h b/arch/powerpc/include/asm/fadump.h
>> index 5a23010..ff708b3 100644
>> --- a/arch/powerpc/include/asm/fadump.h
>> +++ b/arch/powerpc/include/asm/fadump.h
>> @@ -196,7 +196,7 @@ struct fadump_crash_info_header {
>>   };
>>
>>   /* Crash memory ranges */
>> -#define INIT_CRASHMEM_RANGES	(INIT_MEMBLOCK_REGIONS + 2)
>> +#define INIT_CRASHMEM_RANGES	INIT_MEMBLOCK_REGIONS
>>
>>   struct fad_crash_memory_ranges {
>>   	unsigned long long	base;
>> diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
>> index 07e8396..1c1df4f 100644
>> --- a/arch/powerpc/kernel/fadump.c
>> +++ b/arch/powerpc/kernel/fadump.c
>> @@ -47,7 +47,9 @@ static struct fadump_mem_struct fdm;
>>   static const struct fadump_mem_struct *fdm_active;
>>
>>   static DEFINE_MUTEX(fadump_mutex);
>> -struct fad_crash_memory_ranges crash_memory_ranges[INIT_CRASHMEM_RANGES];
>> +struct fad_crash_memory_ranges init_crash_memory_ranges[INIT_CRASHMEM_RANGES];
>> +int max_crash_mem_ranges = INIT_CRASHMEM_RANGES;
>> +struct fad_crash_memory_ranges *crash_memory_ranges = init_crash_memory_ranges;
>>   int crash_mem_ranges;
>>
>>   /* Scan the Firmware Assisted dump configuration details. */
>> @@ -871,14 +873,65 @@ static int __init process_fadump(const struct fadump_mem_struct *fdm_active)
>>   static inline void fadump_add_crash_memory(unsigned long long base,
>>   					unsigned long long end)
>>   {
>> +	u64  start, size;
>> +	bool is_adjacent = false;
>> +
>>   	if (base == end)
>>   		return;
>>
>> +	/*
>> +	 * Fold adjacent memory ranges to bring down the memory ranges/
>> +	 * PT_LOAD segments count.
>> +	 */
>> +	if (crash_mem_ranges) {
>> +		start = crash_memory_ranges[crash_mem_ranges-1].base;
>> +		size = crash_memory_ranges[crash_mem_ranges-1].size;
>> +
>> +		if ((start + size) == base)
>> +			is_adjacent = true;
>> +	}
>> +
>> +	if (!is_adjacent) {
>> +		/* resize the array on reaching the limit */
>> +		if (crash_mem_ranges == max_crash_mem_ranges) {
>> +			u64 old_size, new_max;
>> +			struct fad_crash_memory_ranges *new_array;
>> +
>> +			old_size = max_crash_mem_ranges;
>> +			old_size *= sizeof(struct fad_crash_memory_ranges);
>> +
>> +			new_max = max_crash_mem_ranges + INIT_CRASHMEM_RANGES;
>> +			size = new_max * sizeof(struct fad_crash_memory_ranges);
>> +
>> +			pr_debug("Resizing crash memory ranges count from %d to %d\n",
>> +				 max_crash_mem_ranges, new_max);
>> +
>> +			new_array = kmalloc(size, GFP_KERNEL);
>> +			if (new_array == NULL) {
>> +				pr_warn("Insufficient memory for setting up crash memory ranges\n");
>> +				return;
> Looks like we are still going ahead with fadump registration in this case.
> This will give us a partial dump. Should we not fail fadump registration
> if we are not able to cover all the crash memory ranges?

I preferred to register even if we failed to set up all memory ranges, as a
partial dump is better than no dump at all. As that dump may not always be
useful, should I just error out to avoid false expectations?

> Also, along with this change, should we also double the initial array size
> (e.g. INIT_CRASHMEM_RANGES * 2) to reduce our chances of going for memory
> allocation?

Agreed that doubling the static array size reduces the likelihood of the
need for dynamic array resizing. Will do that.

Nonetheless, if we get to the point where a 2K memory allocation fails on a
system with so many memory ranges, it is likely that the kernel has some
basic problems to deal with first :)

Thanks
Hari

^ permalink raw reply

* Re: [PATCH] powerpc/fadump: handle crash memory ranges array overflow
From: Mahesh Jagannath Salgaonkar @ 2018-08-06  4:22 UTC (permalink / raw)
  To: Hari Bathini, Michael Ellerman; +Cc: stable, linuxppc-dev
In-Reply-To: <153304539025.23724.13483958866671131484.stgit@hbathini.in.ibm.com>

On 07/31/2018 07:26 PM, Hari Bathini wrote:
> Crash memory ranges is an array of memory ranges of the crashing kernel
> to be exported as a dump via /proc/vmcore file. The size of the array
> is set based on INIT_MEMBLOCK_REGIONS, which works alright in most cases
> where memblock memory regions count is less than INIT_MEMBLOCK_REGIONS
> value. But this count can grow beyond INIT_MEMBLOCK_REGIONS value since
> commit 142b45a72e22 ("memblock: Add array resizing support").
> 
> On large memory systems with a few DLPAR operations, the memblock memory
> regions count could be larger than INIT_MEMBLOCK_REGIONS value. On such
> systems, registering fadump results in crash or other system failures
> like below:
> 
>   task: c00007f39a290010 ti: c00000000b738000 task.ti: c00000000b738000
>   NIP: c000000000047df4 LR: c0000000000f9e58 CTR: c00000000010f180
>   REGS: c00000000b73b570 TRAP: 0300   Tainted: G          L   X  (4.4.140+)
>   MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22004484  XER: 20000000
>   CFAR: c000000000008500 DAR: 000007a450000000 DSISR: 40000000 SOFTE: 0
>   GPR00: c0000000000f9e58 c00000000b73b7f0 c000000000f09a00 000000000000001a
>   GPR04: c00007f3bf774c90 0000000000000004 c000000000eb9a00 0000000000000800
>   GPR08: 0000000000000804 000007a450000000 c000000000fa9a00 c00007ffb169ca20
>   GPR12: 0000000022004482 c00000000fa12c00 c00007f3a0ea97a8 0000000000000000
>   GPR16: c00007f3a0ea9a50 c00000000b73bd60 0000000000000118 000000000001fe80
>   GPR20: 0000000000000118 0000000000000000 c000000000b8c980 00000000000000d0
>   GPR24: 000007ffb0b10000 c00007ffb169c980 0000000000000000 c000000000b8c980
>   GPR28: 0000000000000004 c00007ffb169c980 000000000000001a c00007ffb169c980
>   NIP [c000000000047df4] smp_send_reschedule+0x24/0x80
>   LR [c0000000000f9e58] resched_curr+0x138/0x160
>   Call Trace:
>   [c00000000b73b7f0] [c0000000000f9e58] resched_curr+0x138/0x160 (unreliable)
>   [c00000000b73b820] [c0000000000fb538] check_preempt_curr+0xc8/0xf0
>   [c00000000b73b850] [c0000000000fb598] ttwu_do_wakeup+0x38/0x150
>   [c00000000b73b890] [c0000000000fc9c4] try_to_wake_up+0x224/0x4d0
>   [c00000000b73b900] [c00000000011ef34] __wake_up_common+0x94/0x100
>   [c00000000b73b960] [c00000000034a78c] ep_poll_callback+0xac/0x1c0
>   [c00000000b73b9b0] [c00000000011ef34] __wake_up_common+0x94/0x100
>   [c00000000b73ba10] [c00000000011f810] __wake_up_sync_key+0x70/0xa0
>   [c00000000b73ba60] [c00000000067c3e8] sock_def_readable+0x58/0xa0
>   [c00000000b73ba90] [c0000000007848ac] unix_stream_sendmsg+0x2dc/0x4c0
>   [c00000000b73bb70] [c000000000675a38] sock_sendmsg+0x68/0xa0
>   [c00000000b73bba0] [c00000000067673c] ___sys_sendmsg+0x2cc/0x2e0
>   [c00000000b73bd30] [c000000000677dbc] __sys_sendmsg+0x5c/0xc0
>   [c00000000b73bdd0] [c0000000006789bc] SyS_socketcall+0x36c/0x3f0
>   [c00000000b73be30] [c000000000009488] system_call+0x3c/0x100
>   Instruction dump:
>   4e800020 60000000 60420000 3c4c00ec 38421c30 7c0802a6 f8010010 60000000
>   3d42000a e92ab420 2fa90000 4dde0020 <e9290000> 2fa90000 419e0044 7c0802a6
>   ---[ end trace a6d1dd4bab5f8253 ]---
> 
> as array index overflow is not checked for while setting up crash memory
> ranges causing memory corruption. To resolve this issue, resize crash
> memory ranges array on hitting array size limit.
> 
> But without a hard limit on the number of crash memory ranges, there is
> a possibility of program headers count overflow in the /proc/vmcore ELF
> file while exporting each of this memory ranges as PT_LOAD segments. To
> reduce the likelihood of such scenario, fold adjacent memory ranges to
> minimize the total number of crash memory ranges.
> 
> Fixes: 2df173d9e85d ("fadump: Initialize elfcore header and add PT_LOAD program headers.")
> Cc: stable@vger.kernel.org
> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/fadump.h |    2 +
>  arch/powerpc/kernel/fadump.c      |   63 ++++++++++++++++++++++++++++++++++---
>  2 files changed, 59 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/fadump.h b/arch/powerpc/include/asm/fadump.h
> index 5a23010..ff708b3 100644
> --- a/arch/powerpc/include/asm/fadump.h
> +++ b/arch/powerpc/include/asm/fadump.h
> @@ -196,7 +196,7 @@ struct fadump_crash_info_header {
>  };
> 
>  /* Crash memory ranges */
> -#define INIT_CRASHMEM_RANGES	(INIT_MEMBLOCK_REGIONS + 2)
> +#define INIT_CRASHMEM_RANGES	INIT_MEMBLOCK_REGIONS
> 
>  struct fad_crash_memory_ranges {
>  	unsigned long long	base;
> diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
> index 07e8396..1c1df4f 100644
> --- a/arch/powerpc/kernel/fadump.c
> +++ b/arch/powerpc/kernel/fadump.c
> @@ -47,7 +47,9 @@ static struct fadump_mem_struct fdm;
>  static const struct fadump_mem_struct *fdm_active;
> 
>  static DEFINE_MUTEX(fadump_mutex);
> -struct fad_crash_memory_ranges crash_memory_ranges[INIT_CRASHMEM_RANGES];
> +struct fad_crash_memory_ranges init_crash_memory_ranges[INIT_CRASHMEM_RANGES];
> +int max_crash_mem_ranges = INIT_CRASHMEM_RANGES;
> +struct fad_crash_memory_ranges *crash_memory_ranges = init_crash_memory_ranges;
>  int crash_mem_ranges;
> 
>  /* Scan the Firmware Assisted dump configuration details. */
> @@ -871,14 +873,65 @@ static int __init process_fadump(const struct fadump_mem_struct *fdm_active)
>  static inline void fadump_add_crash_memory(unsigned long long base,
>  					unsigned long long end)
>  {
> +	u64  start, size;
> +	bool is_adjacent = false;
> +
>  	if (base == end)
>  		return;
> 
> +	/*
> +	 * Fold adjacent memory ranges to bring down the memory ranges/
> +	 * PT_LOAD segments count.
> +	 */
> +	if (crash_mem_ranges) {
> +		start = crash_memory_ranges[crash_mem_ranges-1].base;
> +		size = crash_memory_ranges[crash_mem_ranges-1].size;
> +
> +		if ((start + size) == base)
> +			is_adjacent = true;
> +	}
> +
> +	if (!is_adjacent) {
> +		/* resize the array on reaching the limit */
> +		if (crash_mem_ranges == max_crash_mem_ranges) {
> +			u64 old_size, new_max;
> +			struct fad_crash_memory_ranges *new_array;
> +
> +			old_size = max_crash_mem_ranges;
> +			old_size *= sizeof(struct fad_crash_memory_ranges);
> +
> +			new_max = max_crash_mem_ranges + INIT_CRASHMEM_RANGES;
> +			size = new_max * sizeof(struct fad_crash_memory_ranges);
> +
> +			pr_debug("Resizing crash memory ranges count from %d to %d\n",
> +				 max_crash_mem_ranges, new_max);
> +
> +			new_array = kmalloc(size, GFP_KERNEL);
> +			if (new_array == NULL) {
> +				pr_warn("Insufficient memory for setting up crash memory ranges\n");
> +				return;

Looks like we are still going ahead with fadump registration in this case.
This will give us a partial dump. Should we not fail fadump registration
if we are not able to cover all the crash memory ranges?

Also, along with this change, should we also double the initial array size
(e.g. INIT_CRASHMEM_RANGES * 2) to reduce our chances of going for memory
allocation?

Thanks,
-Mahesh.

> +			}
> +
> +			/*
> +			 * Copy the old memory ranges into the new array before
> +			 * free'ing it.
> +			 */
> +			memcpy(new_array, crash_memory_ranges, old_size);
> +			if (crash_memory_ranges != init_crash_memory_ranges)
> +				kfree(crash_memory_ranges);
> +
> +			crash_memory_ranges = new_array;
> +			max_crash_mem_ranges = new_max;
> +		}
> +		start = base;
> +		crash_memory_ranges[crash_mem_ranges].base = start;
> +		crash_mem_ranges++;
> +	}
> +
> +	crash_memory_ranges[crash_mem_ranges-1].size = (end - start);
> +
>  	pr_debug("crash_memory_range[%d] [%#016llx-%#016llx], %#llx bytes\n",
> -		crash_mem_ranges, base, end - 1, (end - base));
> -	crash_memory_ranges[crash_mem_ranges].base = base;
> -	crash_memory_ranges[crash_mem_ranges].size = end - base;
> -	crash_mem_ranges++;
> +		(crash_mem_ranges - 1), start, end - 1, (end - start));
>  }
> 
>  static void fadump_exclude_reserved_area(unsigned long long start,
> 

^ permalink raw reply
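
The folding and resizing logic in the patch above can be sketched in user
space as follows. This is only an illustration under stated assumptions, not
the fadump code: malloc()/free() stand in for kmalloc()/kfree(), and the names
range_list, range_list_init, range_list_add and INIT_RANGES are hypothetical.
Unlike the posted patch, the add helper returns -ENOMEM on allocation failure,
so a caller could choose to fail registration instead of ending up with a
partial dump, which is the question raised in the review.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define INIT_RANGES 4	/* hypothetical small value, for illustration only */

struct mem_range {
	unsigned long long base;
	unsigned long long size;
};

struct range_list {
	struct mem_range init[INIT_RANGES];	/* static initial storage */
	struct mem_range *ranges;		/* points at init[] or at heap memory */
	int count;				/* entries in use */
	int max;				/* entries available */
};

static void range_list_init(struct range_list *rl)
{
	rl->ranges = rl->init;
	rl->count = 0;
	rl->max = INIT_RANGES;
}

/* Returns 0 on success, -ENOMEM if the array could not be grown. */
static int range_list_add(struct range_list *rl,
			  unsigned long long base, unsigned long long end)
{
	struct mem_range *last, *new_array;
	int new_max;

	if (base == end)
		return 0;

	/* Fold into the previous range when the new one starts where it ends. */
	if (rl->count) {
		last = &rl->ranges[rl->count - 1];
		if (last->base + last->size == base) {
			last->size = end - last->base;
			return 0;
		}
	}

	/* Grow the array by INIT_RANGES entries on reaching the limit. */
	if (rl->count == rl->max) {
		new_max = rl->max + INIT_RANGES;
		new_array = malloc(new_max * sizeof(*new_array));
		if (!new_array)
			return -ENOMEM;

		/* Copy the old entries before releasing the old heap array. */
		memcpy(new_array, rl->ranges, rl->count * sizeof(*new_array));
		if (rl->ranges != rl->init)
			free(rl->ranges);

		rl->ranges = new_array;
		rl->max = new_max;
	}

	rl->ranges[rl->count].base = base;
	rl->ranges[rl->count].size = end - base;
	rl->count++;
	return 0;
}
```

A caller that wants all-or-nothing behaviour would stop at the first nonzero
return value and skip registration; a caller that prefers a partial dump would
log the error and carry on.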

* Re: [PATCH] misc: cxl: changed asterisk position
From: Andrew Donnellan @ 2018-08-06  0:57 UTC (permalink / raw)
  To: Parth Y Shah, fbarrat, arnd, gregkh; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <1533291638-22224-1-git-send-email-sparth1292@gmail.com>

On 03/08/18 20:20, Parth Y Shah wrote:
> Resolved <"foo* bar" should be "foo *bar"> error
> 
> Signed-off-by: Parth Y Shah <sparth1292@gmail.com>

Thanks for picking this up.

Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>

> ---
>   drivers/misc/cxl/fault.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/misc/cxl/fault.c b/drivers/misc/cxl/fault.c
> index 70dbb6d..d45f3e6 100644
> --- a/drivers/misc/cxl/fault.c
> +++ b/drivers/misc/cxl/fault.c
> @@ -33,7 +33,7 @@ static bool sste_matches(struct cxl_sste *sste, struct copro_slb *slb)
>    * This finds a free SSTE for the given SLB, or returns NULL if it's already in
>    * the segment table.
>    */
> -static struct cxl_sste* find_free_sste(struct cxl_context *ctx,
> +static struct cxl_sste *find_free_sste(struct cxl_context *ctx,
>   				       struct copro_slb *slb)
>   {
>   	struct cxl_sste *primary, *sste, *ret = NULL;
> 

-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnellan@au1.ibm.com  IBM Australia Limited

^ permalink raw reply

* Re: [v2, 3/3] ptp_qoriq: support automatic configuration for ptp timer
From: David Miller @ 2018-08-06  0:12 UTC (permalink / raw)
  To: yangbo.lu
  Cc: netdev, madalin.bucur, richardcochran, robh+dt, shawnguo,
	devicetree, linuxppc-dev, linux-arm-kernel, linux-kernel
In-Reply-To: <20180801100554.36634-3-yangbo.lu@nxp.com>

From: Yangbo Lu <yangbo.lu@nxp.com>
Date: Wed,  1 Aug 2018 18:05:54 +0800

> This patch is to support automatic configuration for ptp timer.
> If required ptp dts properties are not provided, driver could
> try to calculate a set of default configurations to initialize
> the ptp timer. This makes the driver work for many boards which
> don't have the required ptp dts properties in current kernel.
> Also the users could set dts properties by themselves according
> to their requirement.
> 
> Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
> ---
> Changes for v2:
> 	- Dropped module_param.

Applied.

^ permalink raw reply

* Re: [v2, 2/3] powerpc/mpc85xx: add clocks property for fman ptp timer node
From: David Miller @ 2018-08-06  0:12 UTC (permalink / raw)
  To: yangbo.lu
  Cc: netdev, madalin.bucur, richardcochran, robh+dt, shawnguo,
	devicetree, linuxppc-dev, linux-arm-kernel, linux-kernel
In-Reply-To: <20180801100554.36634-2-yangbo.lu@nxp.com>

From: Yangbo Lu <yangbo.lu@nxp.com>
Date: Wed,  1 Aug 2018 18:05:53 +0800

> This patch is to add clocks property for fman ptp timer node.
> 
> Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
> ---
> Changes for v2:
> 	- None.

Applied.

^ permalink raw reply

* Re: [v2, 1/3] arm64: dts: fsl: add clocks property for fman ptp timer node
From: David Miller @ 2018-08-06  0:12 UTC (permalink / raw)
  To: yangbo.lu
  Cc: netdev, madalin.bucur, richardcochran, robh+dt, shawnguo,
	devicetree, linuxppc-dev, linux-arm-kernel, linux-kernel
In-Reply-To: <20180801100554.36634-1-yangbo.lu@nxp.com>

From: Yangbo Lu <yangbo.lu@nxp.com>
Date: Wed,  1 Aug 2018 18:05:52 +0800

> This patch is to add clocks property for fman ptp timer node.
> 
> Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
> ---
> Changes for v2:
> 	- None.

Applied.

^ permalink raw reply

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Benjamin Herrenschmidt @ 2018-08-05 21:30 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Michael S. Tsirkin, Will Deacon, Anshuman Khandual,
	virtualization, linux-kernel, linuxppc-dev, aik, robh, joe,
	elfring, david, jasowang, mpe, linuxram, haren, paulus, srikar,
	robin.murphy, jean-philippe.brucker, marc.zyngier
In-Reply-To: <b7e8294e3e70d24072883a7e8e5375719d5af870.camel@kernel.crashing.org>

On Mon, 2018-08-06 at 07:16 +1000, Benjamin Herrenschmidt wrote:
> I'm trying to understand because the limitation is not a device side
> limitation, it's not a qemu limitation, it's actually more of a VM
> limitation. It has most of its memory pages made inaccessible for
> security reasons. The platform from a qemu/KVM perspective is almost
> entirely normal.

In fact this is probably the best image of what's going on:

It's a normal VM from a KVM/qemu perspective (and thus virtio). It
boots normally, can run firmware, linux, etc... normally, it's not
created with any different XML or qemu command line definition etc...

It's just that once it reaches the kernel with the secure stuff enabled
(could be via kexec from a normal kernel), that kernel will "stash
away" most of the VM's memory into some secure space that nothing else
(not even the hypervisor) can access.

It can keep around a pool or two of normal memory for bounce buffering
IOs but that's about it.

I think that's the clearest way I could find to explain what's going
on, and why I'm so resistant to adding things on the qemu side.

That said, we *can* (and will) notify KVM and qemu of the transition,
and we can/will do so after virtio has been instantiated and used by
the bootloader, but before it will be used (or even probed) by the
secure VM itself, so there's an opportunity to poke at things, either
from the VM itself (a quirk poking at virtio config space for example)
or from qemu (though I find the idea of iterating all virtio devices
from qemu to change a setting rather gross).

Cheers,
Ben.

^ permalink raw reply

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Benjamin Herrenschmidt @ 2018-08-05 21:16 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Michael S. Tsirkin, Will Deacon, Anshuman Khandual,
	virtualization, linux-kernel, linuxppc-dev, aik, robh, joe,
	elfring, david, jasowang, mpe, linuxram, haren, paulus, srikar,
	robin.murphy, jean-philippe.brucker, marc.zyngier
In-Reply-To: <20180805072930.GB23288@infradead.org>

On Sun, 2018-08-05 at 00:29 -0700, Christoph Hellwig wrote:
> On Sun, Aug 05, 2018 at 11:10:15AM +1000, Benjamin Herrenschmidt wrote:
> >  - One you have rejected, which is to have a way for "no-iommu" virtio
> > (which still doesn't use an iommu on the qemu side and doesn't need
> > to), to be forced to use some custom DMA ops on the VM side.
> > 
> >  - One, which sadly has more overhead and will require modifying more
> > pieces of the puzzle, which is to make qemu use an emulated iommu.
> > Once we make qemu do that, we can then layer swiotlb on top of the
> > emulated iommu on the guest side, and pass that as dma_ops to virtio.
> 
> Or number three:  have a virtio feature bit that tells the VM
> to use whatever dma ops the platform thinks are appropriate for
> the bus it pretends to be on.  Then set a dma-range that is limited
> to your secure memory range (if you really need it to be runtime
> enabled only after a device reset that rescans) and use the normal
> dma mapping code to bounce buffer.

Who would set this bit ? qemu ? Under what circumstances ?

What would be the effect of this bit while VIRTIO_F_IOMMU is NOT set,
ie, what would qemu do and what would Linux do ? I'm not sure I fully
understand your idea.

I'm trying to understand because the limitation is not a device side
limitation, it's not a qemu limitation, it's actually more of a VM
limitation. It has most of its memory pages made inaccessible for
security reasons. The platform from a qemu/KVM perspective is almost
entirely normal.

So I don't understand when would qemu set this bit, or should it be set
by the VM at runtime ?

Cheers,
Ben.

^ permalink raw reply

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Christoph Hellwig @ 2018-08-05  7:25 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Christoph Hellwig, Benjamin Herrenschmidt, Will Deacon,
	Anshuman Khandual, virtualization, linux-kernel, linuxppc-dev,
	aik, robh, joe, elfring, david, jasowang, mpe, linuxram, haren,
	paulus, srikar, robin.murphy, jean-philippe.brucker, marc.zyngier
In-Reply-To: <20180805030326-mutt-send-email-mst@kernel.org>

On Sun, Aug 05, 2018 at 03:09:55AM +0300, Michael S. Tsirkin wrote:
> So in this case however I'm not sure what exactly do we want to add. It
> seems that from point of view of the device, there is nothing special -
> it just gets a PA and writes there.  It also seems that guest does not
> need to get any info from the device either. Instead guest itself needs
> device to DMA into specific addresses, for its own reasons.
> 
> It seems that the fact that within guest it's implemented using a bounce
> buffer and that it's easiest to do by switching virtio to use the DMA API
> isn't something virtio spec concerns itself with.

And that is exactly what we added bus_dma_mask for - the case where
the device itself has no limitation (or a bigger limitation), but
the platform limits the accessible dma ranges.  One typical case is
a PCIe root port that is only connected to the CPU through an
interconnect that is limited to 32 address bits for example.
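
A minimal sketch of that idea, assuming a simplified model rather than the
kernel's actual implementation: the effective limit is the smaller of the
device's own mask and the bus mask, and a buffer that ends above that limit
has to be bounced. The struct dma_limits type and the helpers effective_mask
and needs_bounce are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-device DMA limits. */
struct dma_limits {
	uint64_t dev_mask;	/* highest address the device itself can reach */
	uint64_t bus_mask;	/* platform/bus limit; 0 means "no bus limit" */
};

/* The tighter of the two limits wins. */
static uint64_t effective_mask(const struct dma_limits *l)
{
	if (l->bus_mask && l->bus_mask < l->dev_mask)
		return l->bus_mask;
	return l->dev_mask;
}

/* True when the buffer [addr, addr + size) is not fully reachable. */
static bool needs_bounce(const struct dma_limits *l, uint64_t addr,
			 uint64_t size)
{
	uint64_t mask = effective_mask(l);

	if (size == 0)
		return false;
	if (addr > mask)
		return true;
	/* Written this way so that addr + size cannot overflow. */
	return size - 1 > mask - addr;
}
```

With a 64-bit capable device behind a 32-bit interconnect, dev_mask would be
all ones and bus_mask would be 0xffffffff, so any buffer crossing the 4 GiB
boundary is reported as needing a bounce buffer.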

^ permalink raw reply

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Christoph Hellwig @ 2018-08-05  7:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Christoph Hellwig, Michael S. Tsirkin, Will Deacon,
	Anshuman Khandual, virtualization, linux-kernel, linuxppc-dev,
	aik, robh, joe, elfring, david, jasowang, mpe, linuxram, haren,
	paulus, srikar, robin.murphy, jean-philippe.brucker, marc.zyngier
In-Reply-To: <a3ec0a454928e880dabc3782ea906e318634abf9.camel@kernel.crashing.org>

On Sun, Aug 05, 2018 at 11:10:15AM +1000, Benjamin Herrenschmidt wrote:
>  - One you have rejected, which is to have a way for "no-iommu" virtio
> (which still doesn't use an iommu on the qemu side and doesn't need
> to), to be forced to use some custom DMA ops on the VM side.
> 
>  - One, which sadly has more overhead and will require modifying more
> pieces of the puzzle, which is to make qemu use an emulated iommu.
> Once we make qemu do that, we can then layer swiotlb on top of the
> emulated iommu on the guest side, and pass that as dma_ops to virtio.

Or number three:  have a virtio feature bit that tells the VM
to use whatever dma ops the platform thinks are appropriate for
the bus it pretends to be on.  Then set a dma-range that is limited
to your secure memory range (if you really need it to be runtime
enabled only after a device reset that rescans) and use the normal
dma mapping code to bounce buffer.

^ permalink raw reply

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Benjamin Herrenschmidt @ 2018-08-05  4:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Christoph Hellwig, Will Deacon, Anshuman Khandual, virtualization,
	linux-kernel, linuxppc-dev, aik, robh, joe, elfring, david,
	jasowang, mpe, linuxram, haren, paulus, srikar, robin.murphy,
	jean-philippe.brucker, marc.zyngier
In-Reply-To: <20180805031046-mutt-send-email-mst@kernel.org>

On Sun, 2018-08-05 at 03:22 +0300, Michael S. Tsirkin wrote:
> I see the allure of this, but I think down the road you will
> discover passing a flag in libvirt XML saying
> "please use a secure mode" or whatever is a good idea.
> 
> Even though it is probably not required to address this
> specific issue.
> 
> For example, I don't think ballooning works in secure mode,
> you will be able to teach libvirt not to try to add a
> balloon to the guest.

Right, we'll need some quirk to disable balloons in the guest I
suppose.

Passing something from libvirt is cumbersome because the end user may
not even need to know about secure VMs. There are use cases where the
security is a contract down to some special application running inside
the secure VM, the sysadmin knows nothing about.

Also there's repercussions all the way to admin tools, web UIs etc...
so it's fairly wide ranging.

So as long as we only need to quirk a couple of devices, it's much
better contained that way.

> > Later on, (we may have even already run Linux at that point,
> > unsecurely, as we can use Linux as a bootloader under some
> > circumstances), we start a "secure image".
> > 
> > This is a kernel zImage that includes a "ticket" that has the
> > appropriate signature etc... so that when that kernel starts, it can
> > authenticate with the ultravisor, be verified (along with its ramdisk)
> > etc... and copied (by the UV) into secure memory & run from there.
> > 
> > At that point, the hypervisor is informed that the VM has become
> > secure.
> > 
> > So at that point, we could exit to qemu to inform it of the change,
> 
> That's probably a good idea too.

We probably will have to tell qemu eventually for migration, as we'll
need some kind of key exchange phase etc... to deal with the crypto
aspects (the actual page copy is sorted via encrypting the secure pages
back to normal pages in qemu, but we'll need extra metadata).

> > and
> > have it walk the qtree and "Switch" all the virtio devices to use the
> > IOMMU I suppose, but it feels a lot grosser to me.
> 
> That part feels gross, yes.
> 
> > That's the only other option I can think of.
> > 
> > > However in this specific case, the flag does not need to come from the
> > > hypervisor, it can be set by arch boot code I think.
> > > Christoph do you see a problem with that?
> > 
> > The above could do that yes. Another approach would be to do it from a
> > small virtio "quirk" that pokes a bit in the device to force it to
> > iommu mode when it detects that we are running in a secure VM. That's a
> > bit warty on the virtio side but probably not as much as having a qemu
> > one that walks the virtio devices to change how they behave.
> > 
> > What do you reckon ?
> 
> I think you are right that for the dma limit the hypervisor doesn't seem
> to need to know.

It's not just a limit mind you. It's a range, at least if we allocate
just a single pool of insecure pages. swiotlb feels like a better
option for us.

> > What we want to avoid is to expose any of this to the *end user* or
> > libvirt or any other higher level of the management stack. We really
> > want that stuff to remain contained between the VM itself, KVM and
> > maybe qemu.
> > 
> > We will need some other qemu changes for migration so that's ok. But
> > the minute you start touching libvirt and the higher levels it becomes
> > a nightmare.
> > 
> > Cheers,
> > Ben.
> 
> I don't believe you'll be able to avoid that entirely. The split between
> libvirt and qemu is more about community than about code, random bits of
> functionality tend to land on random sides of that fence.  Better add a
> tag in domain XML early is my advice. Having said that, it's your
> hypervisor. I'm just suggesting that when hypervisor does somehow need
> to care then I suspect most people won't be receptive to the argument
> that changing libvirt is a nightmare.

It only needs to care at runtime. The problem isn't changing libvirt
per-se, I don't have a problem with that. The problem is that it means
creating two categories of machines "secure" and "non-secure", which is
end-user visible, and thus has to be escalated to all the various
management stacks, UIs, etc... out there.

In addition, there are some cases where the individual creating the VMs
may not have any idea that they are secure.

But yes, if we have to, we'll do it. However, so far, we don't think
it's a great idea.

Cheers,
Ben.

> > > > >   To get swiotlb you'll need to then use the DT/ACPI
> > > > > dma-range property to limit the addressable range, and a swiotlb
> > > > > > capable platform will use swiotlb automatically.
> > > > 
> > > > This cannot be done as you describe it.
> > > > 
> > > > The VM is created as a *normal* VM. The DT stuff is generated by qemu
> > > > at a point where it has *no idea* that the VM will later become secure
> > > > and thus will have to restrict which pages can be used for "DMA".
> > > > 
> > > > The VM will *at runtime* turn itself into a secure VM via interactions
> > > > with the security HW and the Ultravisor layer (which sits below the
> > > > HV). This happens way after the DT has been created and consumed, the
> > > > > qemu devices instantiated etc...
> > > > 
> > > > > Only the guest kernel knows because it initiates the transition. When
> > > > that happens, the virtio devices have already been used by the guest
> > > > firmware, bootloader, possibly another kernel that kexeced the "secure"
> > > > one, etc... 
> > > > 
> > > > So instead of running around saying NAK NAK NAK, please explain how we
> > > > can solve that differently.
> > > > 
> > > > Ben.

^ permalink raw reply

* Re: [PATCH 2/2] powerpc/cpu: post the event cpux add/remove instead of online/offline during hotplug
From: Pingfan Liu @ 2018-08-05  2:52 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, Benjamin Herrenschmidt, Rafael J . Wysocki,
	Tyrel Datwyler, linux-pm
In-Reply-To: <87tvoag835.fsf@concordia.ellerman.id.au>

On Sat, Aug 4, 2018 at 10:48 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Hi Pingfan,
>
> Pingfan Liu <kernelfans@gmail.com> writes:
> > Technically speaking, echo 1/0 > cpuX/online is only a subset of cpu
> > hotplug/unplug, i.e. add/remove. The latter one includes the physical
> > adding/removing of a cpu device. Some user space tools such as kexec-tools
> > resort to the event add/remove to automatically rebuild dtb.
> > If the dtb is not rebuilt correctly, we may hang on 2nd kernel due to
> > lack the info of boot-cpu-hwid in dtb.
>
> I notice you also sent a patch for ppc64_cpu to deal with CPUs being
> removed rather than just offlined.
>
> If I apply this patch then existing ppc64_cpu (without your other patch)
> will break. Is that right?
>
Yes. Since removing cpu will make a hole in cpu map, but ppc64_cpu
code takes the assumption that the cpu map is contiguous.

Thanks,
Pingfan

^ permalink raw reply

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Benjamin Herrenschmidt @ 2018-08-05  1:11 UTC (permalink / raw)
  To: Michael S. Tsirkin, Christoph Hellwig
  Cc: Will Deacon, Anshuman Khandual, virtualization, linux-kernel,
	linuxppc-dev, aik, robh, joe, elfring, david, jasowang, mpe,
	linuxram, haren, paulus, srikar, robin.murphy,
	jean-philippe.brucker, marc.zyngier
In-Reply-To: <20180805030326-mutt-send-email-mst@kernel.org>

On Sun, 2018-08-05 at 03:09 +0300, Michael S. Tsirkin wrote:
> It seems that the fact that within guest it's implemented using a bounce
> buffer and that it's easiest to do by switching virtio to use the DMA API
> isn't something virtio spec concerns itself with.

Right, this is my reasoning as well. See this other (long) email I just
sent to Christoph to explain the whole flow.

> I'm open to suggestions.

Cheers,
Ben.

^ permalink raw reply

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Benjamin Herrenschmidt @ 2018-08-05  1:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Michael S. Tsirkin, Will Deacon, Anshuman Khandual,
	virtualization, linux-kernel, linuxppc-dev, aik, robh, joe,
	elfring, david, jasowang, mpe, linuxram, haren, paulus, srikar,
	robin.murphy, jean-philippe.brucker, marc.zyngier
In-Reply-To: <20180804082120.GB4421@infradead.org>

On Sat, 2018-08-04 at 01:21 -0700, Christoph Hellwig wrote:
> No matter if you like it or not (I don't!) virtio is defined to bypass
> dma translations, it is very clearly stated in the spec.  It has some
> ill-defined bits to bypass it, so if you want the dma mapping API
> to be used you'll have to set that bit (in its original form, a refined
> form, or an entirely newly defined sane form) and make sure your
> hypervisor always sets it.  It's not rocket science, just a little bit
> of work to make sure your setup is actually going to work reliably
> and portably.

I think you are conflating completely different things, let me try to
clarify, we might actually be talking past each other.

> > We aren't going to cancel years of HW and SW development for our
> 
> Maybe you should have actually read the specs you are claiming to
> implement before spending all that effort.

Anyway, let's cool our respective jets and sort that out, there are
indeed other approaches than overriding the DMA ops with special ones,
though I find them less tasty ... but here's my attempt at a (simpler)
description.

Bear with me for the long-ish email, this tries to describe the system
so you get an idea where we come from, and options we can use to get
out of this.

So we *are* implementing the spec, since qemu is currently unmodified:

Default virtio will bypass the iommu emulated by qemu as per spec etc..

On the Linux side, thus, virtio "sees" a normal iommu-bypassing device
and will treat it as such.

The problem is the assumption in the middle that qemu can access all
guest pages directly, which holds true for traditional VMs, but breaks
when the VM in our case turns itself into a secure VM. This isn't under
the action (or due to changes in) the hypervisor. KVM operates (almost)
normally here.

But there's this (very thin and open source btw) layer underneath
called ultravisor, which exploits some HW facilities to maintain a
separate pool of "secure" memory, which cannot be physically accessed
by a non-secure entity.

So in our scenario, qemu and KVM create a VM totally normally, there are
no changes required to the VM firmware, bootloader(s), etc... in fact
we support Linux based bootloaders, and those will work as normal linux
would in a VM, virtio works normally, etc...

Until that VM (via grub or kexec for example) loads a "secure image".

That secure image is a Linux kernel which has been "wrapped" (to simply
imagine a modified zImage wrapper though that's not entirely exact).

When that is run, before it modifies its .data, it will interact with
the ultravisor using a specific HW facility to make itself secure. What
happens then is that the UV cryptographically verifies the kernel and
ramdisk, and copies them to the secure memory where execution returns.

The Ultravisor is then involved as a small shim for hypercalls between
the secure VM and KVM to prevent leakage of information (sanitize
registers etc...).

Now at this point, qemu can no longer access the secure VM pages
(there's more to this, such as using HMM to allow migration/encryption
across etc... but let's not get bogged down).

So virtio can no longer access any page in the VM.

Now the VM *can* request from the Ultravisor some selected pages to be
made "insecure" and thus shared with qemu. This is how we handle some
of the pages used in our paravirt stuff, and that's how we want to deal
with virtio, by creating an insecure swiotlb pool.
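
A minimal sketch of the bounce-buffering idea, assuming a toy model in plain
user-space C: a small fixed pool stands in for the pages the VM has asked to
be made insecure, and data is copied into a pool slot before the other side
touches it and copied back afterwards. This is not the kernel's swiotlb code;
bounce_pool, bounce_map, bounce_unmap, SLOT_SIZE and NR_SLOTS are hypothetical
names, and the copy is always done in both directions for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE 2048	/* hypothetical slot size */
#define NR_SLOTS  8	/* hypothetical number of slots */

struct bounce_pool {
	unsigned char mem[NR_SLOTS][SLOT_SIZE];	/* the shared (insecure) pool */
	void *orig[NR_SLOTS];			/* private buffer; NULL = slot free */
	size_t len[NR_SLOTS];			/* bytes in use per slot */
};

/*
 * Copy a private buffer into a free slot and return the shared address,
 * or NULL if the request does not fit or no slot is free.
 */
static void *bounce_map(struct bounce_pool *p, void *buf, size_t len)
{
	int i;

	if (!buf || len == 0 || len > SLOT_SIZE)
		return NULL;

	for (i = 0; i < NR_SLOTS; i++) {
		if (!p->orig[i]) {
			p->orig[i] = buf;
			p->len[i] = len;
			memcpy(p->mem[i], buf, len);
			return p->mem[i];
		}
	}
	return NULL;
}

/* Copy the slot contents back to the private buffer and free the slot. */
static void bounce_unmap(struct bounce_pool *p, void *shared)
{
	int i;

	for (i = 0; i < NR_SLOTS; i++) {
		if (p->orig[i] && shared == (void *)p->mem[i]) {
			memcpy(p->orig[i], p->mem[i], p->len[i]);
			p->orig[i] = NULL;
			return;
		}
	}
}
```

In this model only the pool itself ever needs to be visible to the other
side; every other buffer stays private and is reached through the copies.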

At this point, thus, there are two options.

 - One you have rejected, which is to have a way for "no-iommu" virtio
(which still doesn't use an iommu on the qemu side and doesn't need
to), to be forced to use some custom DMA ops on the VM side.

 - One, which sadly has more overhead and will require modifying more
pieces of the puzzle, which is to make qemu use an emulated iommu.
Once we make qemu do that, we can then layer swiotlb on top of the
emulated iommu on the guest side, and pass that as dma_ops to virtio.

Now, assuming you still absolutely want us to go down the second
option, there are several ways to get there. We would prefer to avoid
requiring the user to pass some special option to qemu. That has an
impact up the food chain (libvirt, management tools etc...) and users
probably won't understand what it's about. In fact the *end user* might
not even need to know a VM is secure, though applications inside might.

There's the additional annoyance that currently our guest FW (SLOF)
cannot deal with virtio in IOMMU mode, but that's fixable.

From there, refer to the email chain between Michael and me where we are
discussing options to "switch" virtio at runtime on the qemu side.

Any comment or suggestion ?

Cheers,
Ben.

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Benjamin Herrenschmidt @ 2018-08-05  0:53 UTC (permalink / raw)
  To: Christoph Hellwig, Michael S. Tsirkin
  Cc: Will Deacon, Anshuman Khandual, virtualization, linux-kernel,
	linuxppc-dev, aik, robh, joe, elfring, david, jasowang, mpe,
	linuxram, haren, paulus, srikar, robin.murphy,
	jean-philippe.brucker, marc.zyngier
In-Reply-To: <20180804081500.GA1455@infradead.org>

On Sat, 2018-08-04 at 01:15 -0700, Christoph Hellwig wrote:
>   b) a way to document in a virtio-related spec how the bus handles
>      dma for Ben's totally fucked up hypervisor.  Without that there
>      is no way we'll get interoperable implementations.

Christoph, this isn't a totally fucked up hypervisor. It's not even
about the hypervisor itself, I mean seriously, man, can you at least
bother reading what I described is going on with the security
architecture ?

Anyway, Michael is onto what could possibly be an alternative approach,
by having us tell qemu to flip to iommu mode at secure VM boot time.
Let's see where that leads.

Cheers,
Ben.

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Michael S. Tsirkin @ 2018-08-05  0:27 UTC (permalink / raw)
  To: Will Deacon
  Cc: Benjamin Herrenschmidt, Christoph Hellwig, Anshuman Khandual,
	virtualization, linux-kernel, linuxppc-dev, aik, robh, joe,
	elfring, david, jasowang, mpe, linuxram, haren, paulus, srikar,
	robin.murphy, jean-philippe.brucker, marc.zyngier
In-Reply-To: <20180801081637.GA14438@arm.com>

On Wed, Aug 01, 2018 at 09:16:38AM +0100, Will Deacon wrote:
> On Tue, Jul 31, 2018 at 03:36:22PM -0500, Benjamin Herrenschmidt wrote:
> > On Tue, 2018-07-31 at 10:30 -0700, Christoph Hellwig wrote:
> > > > However the question people raise is that DMA API is already full of
> > > > arch-specific tricks the likes of which are outlined in your post linked
> > > > above. How is this one much worse?
> > > 
> > > None of these warts is visible to the driver, they are all handled in
> > > the architecture (possibly on a per-bus basis).
> > > 
> > > So for virtio we really need to decide if it has one set of behavior
> > > as specified in the virtio spec, or if it behaves exactly as if it
> > > was on a PCI bus, or in fact probably both as you lined up.  But no
> > > magic arch specific behavior inbetween.
> > 
> > The only arch specific behaviour is needed in the case where it doesn't
> > behave like PCI. In this case, the PCI DMA ops are not suitable, but in
> > our secure VMs, we still need to make it use swiotlb in order to bounce
> > through non-secure pages.
> 
> On arm/arm64, the problem we have is that legacy virtio devices on the MMIO
> transport (so definitely not PCI) have historically been advertised by qemu
> as not being cache coherent, but because the virtio core has bypassed DMA
> ops then everything has happened to work. If we blindly enable the arch DMA
> ops, we'll plumb in the non-coherent ops and start getting data corruption,
> so we do need a way to quirk virtio as being "always coherent" if we want to
> use the DMA ops (which we do, because our emulation platforms have an IOMMU
> for all virtio devices).
> 
> Will

Right, that's not very different from placing the device within the IOMMU
domain but in fact bypassing the IOMMU. I wonder whether anyone ever
needs a non-coherent virtio-mmio. If yes, we can extend
PLATFORM_IOMMU to cover that or add another bit.

What exactly do the non-coherent ops do that causes the corruption?

-- 
MST
