All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexander Graf <agraf@suse.de>
To: Mihai Caraman <mihai.caraman@freescale.com>, kvm-ppc@vger.kernel.org
Cc: kvm@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 5/5 v4] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation
Date: Thu, 03 Jul 2014 13:53:02 +0000	[thread overview]
Message-ID: <53B5603E.807@suse.de> (raw)
In-Reply-To: <1403909347-31622-6-git-send-email-mihai.caraman@freescale.com>


On 28.06.14 00:49, Mihai Caraman wrote:
> On book3e, KVM uses load external pid (lwepx) dedicated instruction to read
> guest last instruction on the exit path. lwepx exceptions (DTLB_MISS, DSI
> and LRAT), generated by loading a guest address, needs to be handled by KVM.
> These exceptions are generated in a substituted guest translation context
> (EPLC[EGS] = 1) from host context (MSR[GS] = 0).
>
> Currently, KVM hooks only interrupts generated from guest context (MSR[GS] = 1),
> doing minimal checks on the fast path to avoid host performance degradation.
> lwepx exceptions originate from host state (MSR[GS] = 0) which implies
> additional checks in DO_KVM macro (beside the current MSR[GS] = 1) by looking
> at the Exception Syndrome Register (ESR[EPID]) and the External PID Load Context
> Register (EPLC[EGS]). Doing this on each Data TLB miss exception is obvious
> too intrusive for the host.
>
> Read guest last instruction from kvmppc_load_last_inst() by searching for the
> physical address and kmap it. This address the TODO for TLB eviction and
> execute-but-not-read entries, and allow us to get rid of lwepx until we are
> able to handle failures.
>
> A simple stress benchmark shows a 1% sys performance degradation compared with
> previous approach (lwepx without failure handling):
>
> time for i in `seq 1 10000`; do /bin/echo > /dev/null; done
>
> real    0m 8.85s
> user    0m 4.34s
> sys     0m 4.48s
>
> vs
>
> real    0m 8.84s
> user    0m 4.36s
> sys     0m 4.44s
>
> An alternative solution, to handle lwepx exceptions in KVM, is to temporary
> highjack the interrupt vector from host. Some cores share host IVOR registers
> between hardware threads, which is the case of FSL e6500, which impose additional
> synchronization logic for this solution to work. The optimization can be addressed
> later on top of this patch.
>
> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
> ---
> v4:
>   - add switch and new function when getting last inst earlier
>   - use enum instead of prev semnatic
>   - get rid of mas0, optimize mas7_mas3
>   - give more context in visible messages
>   - check storage attributes mismatch on MMUv2
>   - get rid of pfn_valid check
>
> v3:
>   - reworked patch description
>   - use unaltered kmap addr for kunmap
>   - get last instruction before beeing preempted
>
> v2:
>   - reworked patch description
>   - used pr_* functions
>   - addressed cosmetic feedback
>
>   arch/powerpc/kvm/booke.c              | 44 +++++++++++++++++
>   arch/powerpc/kvm/bookehv_interrupts.S | 37 ++++----------
>   arch/powerpc/kvm/e500_mmu_host.c      | 91 +++++++++++++++++++++++++++++++++++
>   3 files changed, 144 insertions(+), 28 deletions(-)
>
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 34a42b9..843077b 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -869,6 +869,28 @@ static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
>   	}
>   }
>   
> +static int kvmppc_resume_inst_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
> +				  enum emulation_result emulated, u32 last_inst)
> +{
> +	switch (emulated) {
> +	case EMULATE_AGAIN:
> +		return RESUME_GUEST;
> +
> +	case EMULATE_FAIL:
> +		pr_debug("%s: load instruction from guest address %lx failed\n",
> +		       __func__, vcpu->arch.pc);
> +		/* For debugging, encode the failing instruction and
> +		 * report it to userspace. */
> +		run->hw.hardware_exit_reason = ~0ULL << 32;
> +		run->hw.hardware_exit_reason |= last_inst;
> +		kvmppc_core_queue_program(vcpu, ESR_PIL);
> +		return RESUME_HOST;
> +
> +	default:
> +		BUG();
> +	}
> +}
> +
>   /**
>    * kvmppc_handle_exit
>    *
> @@ -880,6 +902,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   	int r = RESUME_HOST;
>   	int s;
>   	int idx;
> +	u32 last_inst = KVM_INST_FETCH_FAILED;
> +	enum emulation_result emulated = EMULATE_DONE;
>   
>   	/* update before a new last_exit_type is rewritten */
>   	kvmppc_update_timing_stats(vcpu);
> @@ -887,6 +911,20 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   	/* restart interrupts if they were meant for the host */
>   	kvmppc_restart_interrupt(vcpu, exit_nr);
>   
> +	/*
> +	 * get last instruction before beeing preempted
> +	 * TODO: for e6500 check also BOOKE_INTERRUPT_LRAT_ERROR & ESR_DATA
> +	 */
> +	switch (exit_nr) {
> +	case BOOKE_INTERRUPT_DATA_STORAGE:
> +	case BOOKE_INTERRUPT_DTLB_MISS:
> +	case BOOKE_INTERRUPT_HV_PRIV:
> +		emulated = kvmppc_get_last_inst(vcpu, false, &last_inst);
> +		break;
> +	default:
> +		break;
> +	}
> +
>   	local_irq_enable();
>   
>   	trace_kvm_exit(exit_nr, vcpu);
> @@ -895,6 +933,11 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   	run->exit_reason = KVM_EXIT_UNKNOWN;
>   	run->ready_for_interrupt_injection = 1;
>   
> +	if (emulated != EMULATE_DONE) {
> +		r = kvmppc_resume_inst_load(run, vcpu, emulated, last_inst);
> +		goto out;
> +	}
> +
>   	switch (exit_nr) {
>   	case BOOKE_INTERRUPT_MACHINE_CHECK:
>   		printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR));
> @@ -1184,6 +1227,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   		BUG();
>   	}
>   
> +out:
>   	/*
>   	 * To avoid clobbering exit_reason, only check for signals if we
>   	 * aren't already exiting to userspace for some other reason.
> diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S
> index 6ff4480..e000b39 100644
> --- a/arch/powerpc/kvm/bookehv_interrupts.S
> +++ b/arch/powerpc/kvm/bookehv_interrupts.S
> @@ -121,38 +121,14 @@
>   1:
>   
>   	.if	\flags & NEED_EMU
> -	/*
> -	 * This assumes you have external PID support.
> -	 * To support a bookehv CPU without external PID, you'll
> -	 * need to look up the TLB entry and create a temporary mapping.
> -	 *
> -	 * FIXME: we don't currently handle if the lwepx faults.  PR-mode
> -	 * booke doesn't handle it either.  Since Linux doesn't use
> -	 * broadcast tlbivax anymore, the only way this should happen is
> -	 * if the guest maps its memory execute-but-not-read, or if we
> -	 * somehow take a TLB miss in the middle of this entry code and
> -	 * evict the relevant entry.  On e500mc, all kernel lowmem is
> -	 * bolted into TLB1 large page mappings, and we don't use
> -	 * broadcast invalidates, so we should not take a TLB miss here.
> -	 *
> -	 * Later we'll need to deal with faults here.  Disallowing guest
> -	 * mappings that are execute-but-not-read could be an option on
> -	 * e500mc, but not on chips with an LRAT if it is used.
> -	 */
> -
> -	mfspr	r3, SPRN_EPLC	/* will already have correct ELPID and EGS */
>   	PPC_STL	r15, VCPU_GPR(R15)(r4)
>   	PPC_STL	r16, VCPU_GPR(R16)(r4)
>   	PPC_STL	r17, VCPU_GPR(R17)(r4)
>   	PPC_STL	r18, VCPU_GPR(R18)(r4)
>   	PPC_STL	r19, VCPU_GPR(R19)(r4)
> -	mr	r8, r3
>   	PPC_STL	r20, VCPU_GPR(R20)(r4)
> -	rlwimi	r8, r6, EPC_EAS_SHIFT - MSR_IR_LG, EPC_EAS
>   	PPC_STL	r21, VCPU_GPR(R21)(r4)
> -	rlwimi	r8, r6, EPC_EPR_SHIFT - MSR_PR_LG, EPC_EPR
>   	PPC_STL	r22, VCPU_GPR(R22)(r4)
> -	rlwimi	r8, r10, EPC_EPID_SHIFT, EPC_EPID
>   	PPC_STL	r23, VCPU_GPR(R23)(r4)
>   	PPC_STL	r24, VCPU_GPR(R24)(r4)
>   	PPC_STL	r25, VCPU_GPR(R25)(r4)
> @@ -162,10 +138,15 @@
>   	PPC_STL	r29, VCPU_GPR(R29)(r4)
>   	PPC_STL	r30, VCPU_GPR(R30)(r4)
>   	PPC_STL	r31, VCPU_GPR(R31)(r4)
> -	mtspr	SPRN_EPLC, r8
> -	isync
> -	lwepx   r9, 0, r5
> -	mtspr	SPRN_EPLC, r3
> +
> +	/*
> +	 * We don't use external PID support. lwepx faults would need to be
> +	 * handled by KVM and this implies aditional code in DO_KVM (for
> +	 * DTB_MISS, DSI and LRAT) to check ESR[EPID] and EPLC[EGS] which
> +	 * is too intrusive for the host. Get last instuction in
> +	 * kvmppc_get_last_inst().
> +	 */
> +	li	r9, KVM_INST_FETCH_FAILED
>   	stw	r9, VCPU_LAST_INST(r4)
>   	.endif
>   
> diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
> index 4b4e8d6..57463e5 100644
> --- a/arch/powerpc/kvm/e500_mmu_host.c
> +++ b/arch/powerpc/kvm/e500_mmu_host.c
> @@ -606,11 +606,102 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr,
>   	}
>   }
>   
> +#ifdef CONFIG_KVM_BOOKE_HV
>   int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, enum instruction_type type,
>   			  u32 *instr)
>   {
> +	gva_t geaddr;
> +	hpa_t addr;
> +	hfn_t pfn;
> +	hva_t eaddr;
> +	u32 mas1, mas2, mas3;
> +	u64 mas7_mas3;
> +	struct page *page;
> +	unsigned int addr_space, psize_shift;
> +	bool pr;
> +	unsigned long flags;
> +
> +	/* Search TLB for guest pc to get the real address */
> +	geaddr = kvmppc_get_pc(vcpu);
> +
> +	addr_space = (vcpu->arch.shared->msr & MSR_IS) >> MSR_IR_LG;
> +
> +	local_irq_save(flags);
> +	mtspr(SPRN_MAS6, (vcpu->arch.pid << MAS6_SPID_SHIFT) | addr_space);
> +	mtspr(SPRN_MAS5, MAS5_SGS | vcpu->kvm->arch.lpid);
> +	asm volatile("tlbsx 0, %[geaddr]\n" : :
> +		     [geaddr] "r" (geaddr));
> +	mtspr(SPRN_MAS5, 0);
> +	mtspr(SPRN_MAS8, 0);
> +	mas1 = mfspr(SPRN_MAS1);
> +	mas2 = mfspr(SPRN_MAS2);
> +	mas3 = mfspr(SPRN_MAS3);
> +#ifdef CONFIG_64BIT
> +	mas7_mas3 = mfspr(SPRN_MAS7_MAS3);
> +#else
> +	mas7_mas3 = ((u64)mfspr(SPRN_MAS7) << 32) | mas3;
> +#endif
> +	local_irq_restore(flags);
> +
> +	/*
> +	 * If the TLB entry for guest pc was evicted, return to the guest.
> +	 * There are high chances to find a valid TLB entry next time.
> +	 */
> +	if (!(mas1 & MAS1_VALID))
> +		return EMULATE_AGAIN;
> +
> +	/*
> +	 * Another thread may rewrite the TLB entry in parallel, don't
> +	 * execute from the address if the execute permission is not set
> +	 */
> +	pr = vcpu->arch.shared->msr & MSR_PR;
> +	if (unlikely((pr && !(mas3 & MAS3_UX)) ||
> +		     (!pr && !(mas3 & MAS3_SX)))) {
> +		pr_debug("%s: Instuction emulation from guest addres %08lx without execute permission\n",
> +			 __func__, geaddr);
> +		return EMULATE_FAIL;

In this case how did we ever get here? Why can't we just evict the entry 
and return EMULATE_AGAIN?

> +	}
> +
> +	/*
> +	 * The real address will be mapped by a cacheable, memory coherent,
> +	 * write-back page. Check for mismatches when LRAT is used.
> +	 */
> +	if (has_feature(vcpu, VCPU_FTR_MMU_V2) &&
> +	    unlikely((mas2 & MAS2_I) || (mas2 & MAS2_W) || !(mas2 & MAS2_M))) {
> +		pr_debug("%s: Instuction emulation from guest addres %08lx mismatches storage attributes\n",
> +			__func__, geaddr);
> +		return EMULATE_FAIL;

Hrm - do we really want to deal with injecting faults here? I'd say it's 
ok to just end up in an endless EMULATE_AGAIN loop.

> +	}
> +
> +	/* Get page size */
> +	psize_shift = MAS1_GET_TSIZE(mas1) + 10;
> +
> +	/* Map a page and get guest's instruction */
> +	addr = (mas7_mas3 & (~0ULL << psize_shift)) |
> +	       (geaddr & ((1ULL << psize_shift) - 1ULL));
> +	pfn = addr >> PAGE_SHIFT;
> +
> +	/* Guard us against emulation from devices area */
> +	if (unlikely(!page_is_ram(pfn))) {
> +		pr_debug("%s: Instruction emulation from non-RAM host addres %08llx is not supported\n",
> +			 __func__, addr);
> +		return EMULATE_FAIL;

Same here :).

> +	}
> +
> +	page = pfn_to_page(pfn);
> +	eaddr = (unsigned long)kmap_atomic(page);
> +	*instr = *(u32 *)(eaddr | (addr & ~PAGE_MASK));
> +	kunmap_atomic((u32 *)eaddr);

Doesn't kmap_atomic() have to be guarded somehow?


Alex


WARNING: multiple messages have this Message-ID (diff)
From: Alexander Graf <agraf@suse.de>
To: Mihai Caraman <mihai.caraman@freescale.com>, kvm-ppc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org, kvm@vger.kernel.org
Subject: Re: [PATCH 5/5 v4] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation
Date: Thu, 03 Jul 2014 15:53:02 +0200	[thread overview]
Message-ID: <53B5603E.807@suse.de> (raw)
In-Reply-To: <1403909347-31622-6-git-send-email-mihai.caraman@freescale.com>


On 28.06.14 00:49, Mihai Caraman wrote:
> On book3e, KVM uses load external pid (lwepx) dedicated instruction to read
> guest last instruction on the exit path. lwepx exceptions (DTLB_MISS, DSI
> and LRAT), generated by loading a guest address, needs to be handled by KVM.
> These exceptions are generated in a substituted guest translation context
> (EPLC[EGS] = 1) from host context (MSR[GS] = 0).
>
> Currently, KVM hooks only interrupts generated from guest context (MSR[GS] = 1),
> doing minimal checks on the fast path to avoid host performance degradation.
> lwepx exceptions originate from host state (MSR[GS] = 0) which implies
> additional checks in DO_KVM macro (beside the current MSR[GS] = 1) by looking
> at the Exception Syndrome Register (ESR[EPID]) and the External PID Load Context
> Register (EPLC[EGS]). Doing this on each Data TLB miss exception is obvious
> too intrusive for the host.
>
> Read guest last instruction from kvmppc_load_last_inst() by searching for the
> physical address and kmap it. This address the TODO for TLB eviction and
> execute-but-not-read entries, and allow us to get rid of lwepx until we are
> able to handle failures.
>
> A simple stress benchmark shows a 1% sys performance degradation compared with
> previous approach (lwepx without failure handling):
>
> time for i in `seq 1 10000`; do /bin/echo > /dev/null; done
>
> real    0m 8.85s
> user    0m 4.34s
> sys     0m 4.48s
>
> vs
>
> real    0m 8.84s
> user    0m 4.36s
> sys     0m 4.44s
>
> An alternative solution, to handle lwepx exceptions in KVM, is to temporary
> highjack the interrupt vector from host. Some cores share host IVOR registers
> between hardware threads, which is the case of FSL e6500, which impose additional
> synchronization logic for this solution to work. The optimization can be addressed
> later on top of this patch.
>
> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
> ---
> v4:
>   - add switch and new function when getting last inst earlier
>   - use enum instead of prev semnatic
>   - get rid of mas0, optimize mas7_mas3
>   - give more context in visible messages
>   - check storage attributes mismatch on MMUv2
>   - get rid of pfn_valid check
>
> v3:
>   - reworked patch description
>   - use unaltered kmap addr for kunmap
>   - get last instruction before beeing preempted
>
> v2:
>   - reworked patch description
>   - used pr_* functions
>   - addressed cosmetic feedback
>
>   arch/powerpc/kvm/booke.c              | 44 +++++++++++++++++
>   arch/powerpc/kvm/bookehv_interrupts.S | 37 ++++----------
>   arch/powerpc/kvm/e500_mmu_host.c      | 91 +++++++++++++++++++++++++++++++++++
>   3 files changed, 144 insertions(+), 28 deletions(-)
>
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 34a42b9..843077b 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -869,6 +869,28 @@ static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
>   	}
>   }
>   
> +static int kvmppc_resume_inst_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
> +				  enum emulation_result emulated, u32 last_inst)
> +{
> +	switch (emulated) {
> +	case EMULATE_AGAIN:
> +		return RESUME_GUEST;
> +
> +	case EMULATE_FAIL:
> +		pr_debug("%s: load instruction from guest address %lx failed\n",
> +		       __func__, vcpu->arch.pc);
> +		/* For debugging, encode the failing instruction and
> +		 * report it to userspace. */
> +		run->hw.hardware_exit_reason = ~0ULL << 32;
> +		run->hw.hardware_exit_reason |= last_inst;
> +		kvmppc_core_queue_program(vcpu, ESR_PIL);
> +		return RESUME_HOST;
> +
> +	default:
> +		BUG();
> +	}
> +}
> +
>   /**
>    * kvmppc_handle_exit
>    *
> @@ -880,6 +902,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   	int r = RESUME_HOST;
>   	int s;
>   	int idx;
> +	u32 last_inst = KVM_INST_FETCH_FAILED;
> +	enum emulation_result emulated = EMULATE_DONE;
>   
>   	/* update before a new last_exit_type is rewritten */
>   	kvmppc_update_timing_stats(vcpu);
> @@ -887,6 +911,20 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   	/* restart interrupts if they were meant for the host */
>   	kvmppc_restart_interrupt(vcpu, exit_nr);
>   
> +	/*
> +	 * get last instruction before beeing preempted
> +	 * TODO: for e6500 check also BOOKE_INTERRUPT_LRAT_ERROR & ESR_DATA
> +	 */
> +	switch (exit_nr) {
> +	case BOOKE_INTERRUPT_DATA_STORAGE:
> +	case BOOKE_INTERRUPT_DTLB_MISS:
> +	case BOOKE_INTERRUPT_HV_PRIV:
> +		emulated = kvmppc_get_last_inst(vcpu, false, &last_inst);
> +		break;
> +	default:
> +		break;
> +	}
> +
>   	local_irq_enable();
>   
>   	trace_kvm_exit(exit_nr, vcpu);
> @@ -895,6 +933,11 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   	run->exit_reason = KVM_EXIT_UNKNOWN;
>   	run->ready_for_interrupt_injection = 1;
>   
> +	if (emulated != EMULATE_DONE) {
> +		r = kvmppc_resume_inst_load(run, vcpu, emulated, last_inst);
> +		goto out;
> +	}
> +
>   	switch (exit_nr) {
>   	case BOOKE_INTERRUPT_MACHINE_CHECK:
>   		printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR));
> @@ -1184,6 +1227,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   		BUG();
>   	}
>   
> +out:
>   	/*
>   	 * To avoid clobbering exit_reason, only check for signals if we
>   	 * aren't already exiting to userspace for some other reason.
> diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S
> index 6ff4480..e000b39 100644
> --- a/arch/powerpc/kvm/bookehv_interrupts.S
> +++ b/arch/powerpc/kvm/bookehv_interrupts.S
> @@ -121,38 +121,14 @@
>   1:
>   
>   	.if	\flags & NEED_EMU
> -	/*
> -	 * This assumes you have external PID support.
> -	 * To support a bookehv CPU without external PID, you'll
> -	 * need to look up the TLB entry and create a temporary mapping.
> -	 *
> -	 * FIXME: we don't currently handle if the lwepx faults.  PR-mode
> -	 * booke doesn't handle it either.  Since Linux doesn't use
> -	 * broadcast tlbivax anymore, the only way this should happen is
> -	 * if the guest maps its memory execute-but-not-read, or if we
> -	 * somehow take a TLB miss in the middle of this entry code and
> -	 * evict the relevant entry.  On e500mc, all kernel lowmem is
> -	 * bolted into TLB1 large page mappings, and we don't use
> -	 * broadcast invalidates, so we should not take a TLB miss here.
> -	 *
> -	 * Later we'll need to deal with faults here.  Disallowing guest
> -	 * mappings that are execute-but-not-read could be an option on
> -	 * e500mc, but not on chips with an LRAT if it is used.
> -	 */
> -
> -	mfspr	r3, SPRN_EPLC	/* will already have correct ELPID and EGS */
>   	PPC_STL	r15, VCPU_GPR(R15)(r4)
>   	PPC_STL	r16, VCPU_GPR(R16)(r4)
>   	PPC_STL	r17, VCPU_GPR(R17)(r4)
>   	PPC_STL	r18, VCPU_GPR(R18)(r4)
>   	PPC_STL	r19, VCPU_GPR(R19)(r4)
> -	mr	r8, r3
>   	PPC_STL	r20, VCPU_GPR(R20)(r4)
> -	rlwimi	r8, r6, EPC_EAS_SHIFT - MSR_IR_LG, EPC_EAS
>   	PPC_STL	r21, VCPU_GPR(R21)(r4)
> -	rlwimi	r8, r6, EPC_EPR_SHIFT - MSR_PR_LG, EPC_EPR
>   	PPC_STL	r22, VCPU_GPR(R22)(r4)
> -	rlwimi	r8, r10, EPC_EPID_SHIFT, EPC_EPID
>   	PPC_STL	r23, VCPU_GPR(R23)(r4)
>   	PPC_STL	r24, VCPU_GPR(R24)(r4)
>   	PPC_STL	r25, VCPU_GPR(R25)(r4)
> @@ -162,10 +138,15 @@
>   	PPC_STL	r29, VCPU_GPR(R29)(r4)
>   	PPC_STL	r30, VCPU_GPR(R30)(r4)
>   	PPC_STL	r31, VCPU_GPR(R31)(r4)
> -	mtspr	SPRN_EPLC, r8
> -	isync
> -	lwepx   r9, 0, r5
> -	mtspr	SPRN_EPLC, r3
> +
> +	/*
> +	 * We don't use external PID support. lwepx faults would need to be
> +	 * handled by KVM and this implies aditional code in DO_KVM (for
> +	 * DTB_MISS, DSI and LRAT) to check ESR[EPID] and EPLC[EGS] which
> +	 * is too intrusive for the host. Get last instuction in
> +	 * kvmppc_get_last_inst().
> +	 */
> +	li	r9, KVM_INST_FETCH_FAILED
>   	stw	r9, VCPU_LAST_INST(r4)
>   	.endif
>   
> diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
> index 4b4e8d6..57463e5 100644
> --- a/arch/powerpc/kvm/e500_mmu_host.c
> +++ b/arch/powerpc/kvm/e500_mmu_host.c
> @@ -606,11 +606,102 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr,
>   	}
>   }
>   
> +#ifdef CONFIG_KVM_BOOKE_HV
>   int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, enum instruction_type type,
>   			  u32 *instr)
>   {
> +	gva_t geaddr;
> +	hpa_t addr;
> +	hfn_t pfn;
> +	hva_t eaddr;
> +	u32 mas1, mas2, mas3;
> +	u64 mas7_mas3;
> +	struct page *page;
> +	unsigned int addr_space, psize_shift;
> +	bool pr;
> +	unsigned long flags;
> +
> +	/* Search TLB for guest pc to get the real address */
> +	geaddr = kvmppc_get_pc(vcpu);
> +
> +	addr_space = (vcpu->arch.shared->msr & MSR_IS) >> MSR_IR_LG;
> +
> +	local_irq_save(flags);
> +	mtspr(SPRN_MAS6, (vcpu->arch.pid << MAS6_SPID_SHIFT) | addr_space);
> +	mtspr(SPRN_MAS5, MAS5_SGS | vcpu->kvm->arch.lpid);
> +	asm volatile("tlbsx 0, %[geaddr]\n" : :
> +		     [geaddr] "r" (geaddr));
> +	mtspr(SPRN_MAS5, 0);
> +	mtspr(SPRN_MAS8, 0);
> +	mas1 = mfspr(SPRN_MAS1);
> +	mas2 = mfspr(SPRN_MAS2);
> +	mas3 = mfspr(SPRN_MAS3);
> +#ifdef CONFIG_64BIT
> +	mas7_mas3 = mfspr(SPRN_MAS7_MAS3);
> +#else
> +	mas7_mas3 = ((u64)mfspr(SPRN_MAS7) << 32) | mas3;
> +#endif
> +	local_irq_restore(flags);
> +
> +	/*
> +	 * If the TLB entry for guest pc was evicted, return to the guest.
> +	 * There are high chances to find a valid TLB entry next time.
> +	 */
> +	if (!(mas1 & MAS1_VALID))
> +		return EMULATE_AGAIN;
> +
> +	/*
> +	 * Another thread may rewrite the TLB entry in parallel, don't
> +	 * execute from the address if the execute permission is not set
> +	 */
> +	pr = vcpu->arch.shared->msr & MSR_PR;
> +	if (unlikely((pr && !(mas3 & MAS3_UX)) ||
> +		     (!pr && !(mas3 & MAS3_SX)))) {
> +		pr_debug("%s: Instuction emulation from guest addres %08lx without execute permission\n",
> +			 __func__, geaddr);
> +		return EMULATE_FAIL;

In this case how did we ever get here? Why can't we just evict the entry 
and return EMULATE_AGAIN?

> +	}
> +
> +	/*
> +	 * The real address will be mapped by a cacheable, memory coherent,
> +	 * write-back page. Check for mismatches when LRAT is used.
> +	 */
> +	if (has_feature(vcpu, VCPU_FTR_MMU_V2) &&
> +	    unlikely((mas2 & MAS2_I) || (mas2 & MAS2_W) || !(mas2 & MAS2_M))) {
> +		pr_debug("%s: Instuction emulation from guest addres %08lx mismatches storage attributes\n",
> +			__func__, geaddr);
> +		return EMULATE_FAIL;

Hrm - do we really want to deal with injecting faults here? I'd say it's 
ok to just end up in an endless EMULATE_AGAIN loop.

> +	}
> +
> +	/* Get page size */
> +	psize_shift = MAS1_GET_TSIZE(mas1) + 10;
> +
> +	/* Map a page and get guest's instruction */
> +	addr = (mas7_mas3 & (~0ULL << psize_shift)) |
> +	       (geaddr & ((1ULL << psize_shift) - 1ULL));
> +	pfn = addr >> PAGE_SHIFT;
> +
> +	/* Guard us against emulation from devices area */
> +	if (unlikely(!page_is_ram(pfn))) {
> +		pr_debug("%s: Instruction emulation from non-RAM host addres %08llx is not supported\n",
> +			 __func__, addr);
> +		return EMULATE_FAIL;

Same here :).

> +	}
> +
> +	page = pfn_to_page(pfn);
> +	eaddr = (unsigned long)kmap_atomic(page);
> +	*instr = *(u32 *)(eaddr | (addr & ~PAGE_MASK));
> +	kunmap_atomic((u32 *)eaddr);

Doesn't kmap_atomic() have to be guarded somehow?


Alex

WARNING: multiple messages have this Message-ID (diff)
From: Alexander Graf <agraf@suse.de>
To: Mihai Caraman <mihai.caraman@freescale.com>, kvm-ppc@vger.kernel.org
Cc: kvm@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 5/5 v4] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation
Date: Thu, 03 Jul 2014 15:53:02 +0200	[thread overview]
Message-ID: <53B5603E.807@suse.de> (raw)
In-Reply-To: <1403909347-31622-6-git-send-email-mihai.caraman@freescale.com>


On 28.06.14 00:49, Mihai Caraman wrote:
> On book3e, KVM uses load external pid (lwepx) dedicated instruction to read
> guest last instruction on the exit path. lwepx exceptions (DTLB_MISS, DSI
> and LRAT), generated by loading a guest address, needs to be handled by KVM.
> These exceptions are generated in a substituted guest translation context
> (EPLC[EGS] = 1) from host context (MSR[GS] = 0).
>
> Currently, KVM hooks only interrupts generated from guest context (MSR[GS] = 1),
> doing minimal checks on the fast path to avoid host performance degradation.
> lwepx exceptions originate from host state (MSR[GS] = 0) which implies
> additional checks in DO_KVM macro (beside the current MSR[GS] = 1) by looking
> at the Exception Syndrome Register (ESR[EPID]) and the External PID Load Context
> Register (EPLC[EGS]). Doing this on each Data TLB miss exception is obvious
> too intrusive for the host.
>
> Read guest last instruction from kvmppc_load_last_inst() by searching for the
> physical address and kmap it. This address the TODO for TLB eviction and
> execute-but-not-read entries, and allow us to get rid of lwepx until we are
> able to handle failures.
>
> A simple stress benchmark shows a 1% sys performance degradation compared with
> previous approach (lwepx without failure handling):
>
> time for i in `seq 1 10000`; do /bin/echo > /dev/null; done
>
> real    0m 8.85s
> user    0m 4.34s
> sys     0m 4.48s
>
> vs
>
> real    0m 8.84s
> user    0m 4.36s
> sys     0m 4.44s
>
> An alternative solution, to handle lwepx exceptions in KVM, is to temporary
> highjack the interrupt vector from host. Some cores share host IVOR registers
> between hardware threads, which is the case of FSL e6500, which impose additional
> synchronization logic for this solution to work. The optimization can be addressed
> later on top of this patch.
>
> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
> ---
> v4:
>   - add switch and new function when getting last inst earlier
>   - use enum instead of prev semnatic
>   - get rid of mas0, optimize mas7_mas3
>   - give more context in visible messages
>   - check storage attributes mismatch on MMUv2
>   - get rid of pfn_valid check
>
> v3:
>   - reworked patch description
>   - use unaltered kmap addr for kunmap
>   - get last instruction before beeing preempted
>
> v2:
>   - reworked patch description
>   - used pr_* functions
>   - addressed cosmetic feedback
>
>   arch/powerpc/kvm/booke.c              | 44 +++++++++++++++++
>   arch/powerpc/kvm/bookehv_interrupts.S | 37 ++++----------
>   arch/powerpc/kvm/e500_mmu_host.c      | 91 +++++++++++++++++++++++++++++++++++
>   3 files changed, 144 insertions(+), 28 deletions(-)
>
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 34a42b9..843077b 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -869,6 +869,28 @@ static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
>   	}
>   }
>   
> +static int kvmppc_resume_inst_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
> +				  enum emulation_result emulated, u32 last_inst)
> +{
> +	switch (emulated) {
> +	case EMULATE_AGAIN:
> +		return RESUME_GUEST;
> +
> +	case EMULATE_FAIL:
> +		pr_debug("%s: load instruction from guest address %lx failed\n",
> +		       __func__, vcpu->arch.pc);
> +		/* For debugging, encode the failing instruction and
> +		 * report it to userspace. */
> +		run->hw.hardware_exit_reason = ~0ULL << 32;
> +		run->hw.hardware_exit_reason |= last_inst;
> +		kvmppc_core_queue_program(vcpu, ESR_PIL);
> +		return RESUME_HOST;
> +
> +	default:
> +		BUG();
> +	}
> +}
> +
>   /**
>    * kvmppc_handle_exit
>    *
> @@ -880,6 +902,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   	int r = RESUME_HOST;
>   	int s;
>   	int idx;
> +	u32 last_inst = KVM_INST_FETCH_FAILED;
> +	enum emulation_result emulated = EMULATE_DONE;
>   
>   	/* update before a new last_exit_type is rewritten */
>   	kvmppc_update_timing_stats(vcpu);
> @@ -887,6 +911,20 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   	/* restart interrupts if they were meant for the host */
>   	kvmppc_restart_interrupt(vcpu, exit_nr);
>   
> +	/*
> +	 * get last instruction before beeing preempted
> +	 * TODO: for e6500 check also BOOKE_INTERRUPT_LRAT_ERROR & ESR_DATA
> +	 */
> +	switch (exit_nr) {
> +	case BOOKE_INTERRUPT_DATA_STORAGE:
> +	case BOOKE_INTERRUPT_DTLB_MISS:
> +	case BOOKE_INTERRUPT_HV_PRIV:
> +		emulated = kvmppc_get_last_inst(vcpu, false, &last_inst);
> +		break;
> +	default:
> +		break;
> +	}
> +
>   	local_irq_enable();
>   
>   	trace_kvm_exit(exit_nr, vcpu);
> @@ -895,6 +933,11 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   	run->exit_reason = KVM_EXIT_UNKNOWN;
>   	run->ready_for_interrupt_injection = 1;
>   
> +	if (emulated != EMULATE_DONE) {
> +		r = kvmppc_resume_inst_load(run, vcpu, emulated, last_inst);
> +		goto out;
> +	}
> +
>   	switch (exit_nr) {
>   	case BOOKE_INTERRUPT_MACHINE_CHECK:
>   		printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR));
> @@ -1184,6 +1227,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   		BUG();
>   	}
>   
> +out:
>   	/*
>   	 * To avoid clobbering exit_reason, only check for signals if we
>   	 * aren't already exiting to userspace for some other reason.
> diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S
> index 6ff4480..e000b39 100644
> --- a/arch/powerpc/kvm/bookehv_interrupts.S
> +++ b/arch/powerpc/kvm/bookehv_interrupts.S
> @@ -121,38 +121,14 @@
>   1:
>   
>   	.if	\flags & NEED_EMU
> -	/*
> -	 * This assumes you have external PID support.
> -	 * To support a bookehv CPU without external PID, you'll
> -	 * need to look up the TLB entry and create a temporary mapping.
> -	 *
> -	 * FIXME: we don't currently handle if the lwepx faults.  PR-mode
> -	 * booke doesn't handle it either.  Since Linux doesn't use
> -	 * broadcast tlbivax anymore, the only way this should happen is
> -	 * if the guest maps its memory execute-but-not-read, or if we
> -	 * somehow take a TLB miss in the middle of this entry code and
> -	 * evict the relevant entry.  On e500mc, all kernel lowmem is
> -	 * bolted into TLB1 large page mappings, and we don't use
> -	 * broadcast invalidates, so we should not take a TLB miss here.
> -	 *
> -	 * Later we'll need to deal with faults here.  Disallowing guest
> -	 * mappings that are execute-but-not-read could be an option on
> -	 * e500mc, but not on chips with an LRAT if it is used.
> -	 */
> -
> -	mfspr	r3, SPRN_EPLC	/* will already have correct ELPID and EGS */
>   	PPC_STL	r15, VCPU_GPR(R15)(r4)
>   	PPC_STL	r16, VCPU_GPR(R16)(r4)
>   	PPC_STL	r17, VCPU_GPR(R17)(r4)
>   	PPC_STL	r18, VCPU_GPR(R18)(r4)
>   	PPC_STL	r19, VCPU_GPR(R19)(r4)
> -	mr	r8, r3
>   	PPC_STL	r20, VCPU_GPR(R20)(r4)
> -	rlwimi	r8, r6, EPC_EAS_SHIFT - MSR_IR_LG, EPC_EAS
>   	PPC_STL	r21, VCPU_GPR(R21)(r4)
> -	rlwimi	r8, r6, EPC_EPR_SHIFT - MSR_PR_LG, EPC_EPR
>   	PPC_STL	r22, VCPU_GPR(R22)(r4)
> -	rlwimi	r8, r10, EPC_EPID_SHIFT, EPC_EPID
>   	PPC_STL	r23, VCPU_GPR(R23)(r4)
>   	PPC_STL	r24, VCPU_GPR(R24)(r4)
>   	PPC_STL	r25, VCPU_GPR(R25)(r4)
> @@ -162,10 +138,15 @@
>   	PPC_STL	r29, VCPU_GPR(R29)(r4)
>   	PPC_STL	r30, VCPU_GPR(R30)(r4)
>   	PPC_STL	r31, VCPU_GPR(R31)(r4)
> -	mtspr	SPRN_EPLC, r8
> -	isync
> -	lwepx   r9, 0, r5
> -	mtspr	SPRN_EPLC, r3
> +
> +	/*
> +	 * We don't use external PID support. lwepx faults would need to be
> +	 * handled by KVM and this implies aditional code in DO_KVM (for
> +	 * DTB_MISS, DSI and LRAT) to check ESR[EPID] and EPLC[EGS] which
> +	 * is too intrusive for the host. Get last instuction in
> +	 * kvmppc_get_last_inst().
> +	 */
> +	li	r9, KVM_INST_FETCH_FAILED
>   	stw	r9, VCPU_LAST_INST(r4)
>   	.endif
>   
> diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
> index 4b4e8d6..57463e5 100644
> --- a/arch/powerpc/kvm/e500_mmu_host.c
> +++ b/arch/powerpc/kvm/e500_mmu_host.c
> @@ -606,11 +606,102 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr,
>   	}
>   }
>   
> +#ifdef CONFIG_KVM_BOOKE_HV
>   int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, enum instruction_type type,
>   			  u32 *instr)
>   {
> +	gva_t geaddr;
> +	hpa_t addr;
> +	hfn_t pfn;
> +	hva_t eaddr;
> +	u32 mas1, mas2, mas3;
> +	u64 mas7_mas3;
> +	struct page *page;
> +	unsigned int addr_space, psize_shift;
> +	bool pr;
> +	unsigned long flags;
> +
> +	/* Search TLB for guest pc to get the real address */
> +	geaddr = kvmppc_get_pc(vcpu);
> +
> +	addr_space = (vcpu->arch.shared->msr & MSR_IS) >> MSR_IR_LG;
> +
> +	local_irq_save(flags);
> +	mtspr(SPRN_MAS6, (vcpu->arch.pid << MAS6_SPID_SHIFT) | addr_space);
> +	mtspr(SPRN_MAS5, MAS5_SGS | vcpu->kvm->arch.lpid);
> +	asm volatile("tlbsx 0, %[geaddr]\n" : :
> +		     [geaddr] "r" (geaddr));
> +	mtspr(SPRN_MAS5, 0);
> +	mtspr(SPRN_MAS8, 0);
> +	mas1 = mfspr(SPRN_MAS1);
> +	mas2 = mfspr(SPRN_MAS2);
> +	mas3 = mfspr(SPRN_MAS3);
> +#ifdef CONFIG_64BIT
> +	mas7_mas3 = mfspr(SPRN_MAS7_MAS3);
> +#else
> +	mas7_mas3 = ((u64)mfspr(SPRN_MAS7) << 32) | mas3;
> +#endif
> +	local_irq_restore(flags);
> +
> +	/*
> +	 * If the TLB entry for guest pc was evicted, return to the guest.
> +	 * There are high chances to find a valid TLB entry next time.
> +	 */
> +	if (!(mas1 & MAS1_VALID))
> +		return EMULATE_AGAIN;
> +
> +	/*
> +	 * Another thread may rewrite the TLB entry in parallel, don't
> +	 * execute from the address if the execute permission is not set
> +	 */
> +	pr = vcpu->arch.shared->msr & MSR_PR;
> +	if (unlikely((pr && !(mas3 & MAS3_UX)) ||
> +		     (!pr && !(mas3 & MAS3_SX)))) {
> +		pr_debug("%s: Instuction emulation from guest addres %08lx without execute permission\n",
> +			 __func__, geaddr);
> +		return EMULATE_FAIL;

In this case how did we ever get here? Why can't we just evict the entry 
and return EMULATE_AGAIN?

> +	}
> +
> +	/*
> +	 * The real address will be mapped by a cacheable, memory coherent,
> +	 * write-back page. Check for mismatches when LRAT is used.
> +	 */
> +	if (has_feature(vcpu, VCPU_FTR_MMU_V2) &&
> +	    unlikely((mas2 & MAS2_I) || (mas2 & MAS2_W) || !(mas2 & MAS2_M))) {
> +		pr_debug("%s: Instuction emulation from guest addres %08lx mismatches storage attributes\n",
> +			__func__, geaddr);
> +		return EMULATE_FAIL;

Hrm - do we really want to deal with injecting faults here? I'd say it's 
ok to just end up in an endless EMULATE_AGAIN loop.

> +	}
> +
> +	/* Get page size */
> +	psize_shift = MAS1_GET_TSIZE(mas1) + 10;
> +
> +	/* Map a page and get guest's instruction */
> +	addr = (mas7_mas3 & (~0ULL << psize_shift)) |
> +	       (geaddr & ((1ULL << psize_shift) - 1ULL));
> +	pfn = addr >> PAGE_SHIFT;
> +
> +	/* Guard us against emulation from devices area */
> +	if (unlikely(!page_is_ram(pfn))) {
> +		pr_debug("%s: Instruction emulation from non-RAM host addres %08llx is not supported\n",
> +			 __func__, addr);
> +		return EMULATE_FAIL;

Same here :).

> +	}
> +
> +	page = pfn_to_page(pfn);
> +	eaddr = (unsigned long)kmap_atomic(page);
> +	*instr = *(u32 *)(eaddr | (addr & ~PAGE_MASK));
> +	kunmap_atomic((u32 *)eaddr);

Doesn't kmap_atomic() have to be guarded somehow?


Alex

  reply	other threads:[~2014-07-03 13:53 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-27 22:49 [PATCH 0/5 v4] Read guest last instruction from kvmppc_get_last_inst() Mihai Caraman
2014-06-27 22:49 ` Mihai Caraman
2014-06-27 22:49 ` Mihai Caraman
2014-06-27 22:49 ` [PATCH 1/5 v4] KVM: PPC: e500mc: Revert "add load inst fixup" Mihai Caraman
2014-06-27 22:49   ` Mihai Caraman
2014-06-27 22:49   ` Mihai Caraman
2014-06-27 22:49 ` [PATCH 2/5 v4] KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1 Mihai Caraman
2014-06-27 22:49   ` Mihai Caraman
2014-06-27 22:49   ` Mihai Caraman
2014-06-27 22:49 ` [PATCH 3/5 v4] KVM: PPC: Book3s: Remove kvmppc_read_inst() function Mihai Caraman
2014-06-27 22:49   ` Mihai Caraman
2014-06-27 22:49   ` Mihai Caraman
2014-07-03 13:37   ` Alexander Graf
2014-07-03 13:37     ` Alexander Graf
2014-07-03 13:37     ` Alexander Graf
2014-07-03 16:18     ` mihai.caraman
2014-07-03 16:18       ` mihai.caraman
2014-07-03 16:18       ` mihai.caraman
2014-07-03 16:25       ` Alexander Graf
2014-07-03 16:25         ` Alexander Graf
2014-07-03 16:25         ` Alexander Graf
2014-06-27 22:49 ` [PATCH 4/5 v4] KVM: PPC: Alow kvmppc_get_last_inst() to fail Mihai Caraman
2014-06-27 22:49   ` Mihai Caraman
2014-06-27 22:49   ` Mihai Caraman
2014-07-03 13:44   ` Alexander Graf
2014-07-03 13:44     ` Alexander Graf
2014-07-03 13:44     ` Alexander Graf
2014-06-27 22:49 ` [PATCH 5/5 v4] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation Mihai Caraman
2014-06-27 22:49   ` Mihai Caraman
2014-06-27 22:49   ` Mihai Caraman
2014-07-03 13:53   ` Alexander Graf [this message]
2014-07-03 13:53     ` Alexander Graf
2014-07-03 13:53     ` Alexander Graf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53B5603E.807@suse.de \
    --to=agraf@suse.de \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mihai.caraman@freescale.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.