diff for duplicates of <1373395605.8183.198@snotra>

diff --git a/a/1.txt b/N1/1.txt
index bc043f9..a0fe415 100644
--- a/a/1.txt
+++ b/N1/1.txt
@@ -1,75 +1,75 @@
 On 07/09/2013 12:44:32 PM, Alexander Graf wrote:
 > On 07/09/2013 07:13 PM, Scott Wood wrote:
 >> On 07/08/2013 08:39:05 AM, Alexander Graf wrote:
->>> 
+>>>=20
 >>> On 28.06.2013, at 11:20, Mihai Caraman wrote:
->>> 
->>> > lwepx faults needs to be handled by KVM and this implies 
+>>>=20
+>>> > lwepx faults needs to be handled by KVM and this implies =20
 >>> additional code
->>> > in DO_KVM macro to identify the source of the exception 
+>>> > in DO_KVM macro to identify the source of the exception =20
 >>> originated from
->>> > host context. This requires to check the Exception Syndrome 
+>>> > host context. This requires to check the Exception Syndrome =20
 >>> Register
->>> > (ESR[EPID]) and External PID Load Context Register (EPLC[EGS]) 
+>>> > (ESR[EPID]) and External PID Load Context Register (EPLC[EGS]) =20
 >>> for DTB_MISS,
 >>> > DSI and LRAT exceptions which is too intrusive for the host.
 >>> >
->>> > Get rid of lwepx and acquire last instuction in 
+>>> > Get rid of lwepx and acquire last instuction in =20
 >>> kvmppc_handle_exit() by
->>> > searching for the physical address and kmap it. This fixes an 
+>>> > searching for the physical address and kmap it. This fixes an =20
 >>> infinite loop
->>> 
+>>>=20
 >>> What's the difference in speed for this?
->>> 
->>> Also, could we call lwepx later in host code, when 
+>>>=20
+>>> Also, could we call lwepx later in host code, when =20
 >>> kvmppc_get_last_inst() gets invoked?
->> 
->> Any use of lwepx is problematic unless we want to add overhead to 
+>>=20
+>> Any use of lwepx is problematic unless we want to add overhead to =20
 >> the main Linux TLB miss handler.
-> 
+>=20
 > What exactly would be missing?
 
-If lwepx faults, it goes to the normal host TLB miss handler. Without 
-adding code to it to recognize that it's an external-PID fault, it will 
-try to search the normal Linux page tables and insert a normal host 
-entry. If it thinks it has succeeded, it will retry the instruction 
-rather than search for an exception handler. The instruction will 
+If lwepx faults, it goes to the normal host TLB miss handler. Without =20
+adding code to it to recognize that it's an external-PID fault, it will =20
+try to search the normal Linux page tables and insert a normal host =20
+entry. If it thinks it has succeeded, it will retry the instruction =20
+rather than search for an exception handler. The instruction will =20
 fault again, and you get a hang.
 
-> I'd also still like to see some performance benchmarks on this to 
+> I'd also still like to see some performance benchmarks on this to =20
 > make sure we're not walking into a bad direction.
 
-I doubt it'll be significantly different. There's overhead involved in 
-setting up for lwepx as well. It doesn't hurt to test, though this is 
-a functional correctness issue, so I'm not sure what better 
-alternatives we have. I don't want to slow down non-KVM TLB misses for 
+I doubt it'll be significantly different. There's overhead involved in =20
+setting up for lwepx as well. It doesn't hurt to test, though this is =20
+a functional correctness issue, so I'm not sure what better =20
+alternatives we have. I don't want to slow down non-KVM TLB misses for =20
 this.
 
->>> > + addr = (mas7_mas3 & (~0ULL << psize_shift)) |
+>>> > + addr =3D (mas7_mas3 & (~0ULL << psize_shift)) |
 >>> > + (geaddr & ((1ULL << psize_shift) - 1ULL));
 >>> > +
 >>> > + /* Map a page and get guest's instruction */
->>> > + page = pfn_to_page(addr >> PAGE_SHIFT);
->>> 
->>> So it seems to me like you're jumping through a lot of hoops to 
->>> make sure this works for LRAT and non-LRAT at the same time. Can't 
+>>> > + page =3D pfn_to_page(addr >> PAGE_SHIFT);
+>>>=20
+>>> So it seems to me like you're jumping through a lot of hoops to =20
+>>> make sure this works for LRAT and non-LRAT at the same time. Can't =20
 >>> we just treat them as the different things they are?
->>> 
->>> What if we have different MMU backends for LRAT and non-LRAT? The 
->>> non-LRAT case could then try lwepx, if that fails, fall back to 
->>> read the shadow TLB. For the LRAT case, we'd do lwepx, if that 
+>>>=20
+>>> What if we have different MMU backends for LRAT and non-LRAT? The =20
+>>> non-LRAT case could then try lwepx, if that fails, fall back to =20
+>>> read the shadow TLB. For the LRAT case, we'd do lwepx, if that =20
 >>> fails fall back to this logic.
->> 
->> This isn't about LRAT; it's about hardware threads. It also fixes 
+>>=20
+>> This isn't about LRAT; it's about hardware threads. It also fixes =20
 >> the handling of execute-only pages on current chips.
-> 
-> On non-LRAT systems we could always check our shadow copy of the 
-> guest's TLB, no? I'd really like to know what the performance 
+>=20
+> On non-LRAT systems we could always check our shadow copy of the =20
+> guest's TLB, no? I'd really like to know what the performance =20
 > difference would be for the 2 approaches.
 
-I suspect that tlbsx is faster, or at worst similar. And unlike 
-comparing tlbsx to lwepx (not counting a fix for the threading 
-problem), we don't already have code to search the guest TLB, so 
+I suspect that tlbsx is faster, or at worst similar. And unlike =20
+comparing tlbsx to lwepx (not counting a fix for the threading =20
+problem), we don't already have code to search the guest TLB, so =20
 testing would be more work.
 
--Scott
+-Scott=
diff --git a/a/content_digest b/N1/content_digest
index 45d48c3..63bf3c1 100644
--- a/a/content_digest
+++ b/N1/content_digest
@@ -1,88 +1,88 @@
  "ref\051DC4C00.70509@suse.de\0"
  "From\0Scott Wood <scottwood@freescale.com>\0"
  "Subject\0Re: [PATCH 2/2] KVM: PPC: Book3E: Get vcpu's last instruction for emulation\0"
- "Date\0Tue, 09 Jul 2013 18:46:45 +0000\0"
+ "Date\0Tue, 9 Jul 2013 13:46:45 -0500\0"
  "To\0Alexander Graf <agraf@suse.de>\0"
  "Cc\0Mihai Caraman <mihai.caraman@freescale.com>"
- kvm-ppc@vger.kernel.org
+ linuxppc-dev@lists.ozlabs.org
  kvm@vger.kernel.org
- " linuxppc-dev@lists.ozlabs.org\0"
+ " kvm-ppc@vger.kernel.org\0"
  "\00:1\0"
  "b\0"
  "On 07/09/2013 12:44:32 PM, Alexander Graf wrote:\n"
  "> On 07/09/2013 07:13 PM, Scott Wood wrote:\n"
  ">> On 07/08/2013 08:39:05 AM, Alexander Graf wrote:\n"
- ">>> \n"
+ ">>>=20\n"
  ">>> On 28.06.2013, at 11:20, Mihai Caraman wrote:\n"
- ">>> \n"
- ">>> > lwepx faults needs to be handled by KVM and this implies \n"
+ ">>>=20\n"
+ ">>> > lwepx faults needs to be handled by KVM and this implies =20\n"
  ">>> additional code\n"
- ">>> > in DO_KVM macro to identify the source of the exception \n"
+ ">>> > in DO_KVM macro to identify the source of the exception =20\n"
  ">>> originated from\n"
- ">>> > host context. This requires to check the Exception Syndrome \n"
+ ">>> > host context. This requires to check the Exception Syndrome =20\n"
  ">>> Register\n"
- ">>> > (ESR[EPID]) and External PID Load Context Register (EPLC[EGS]) \n"
+ ">>> > (ESR[EPID]) and External PID Load Context Register (EPLC[EGS]) =20\n"
  ">>> for DTB_MISS,\n"
  ">>> > DSI and LRAT exceptions which is too intrusive for the host.\n"
  ">>> >\n"
- ">>> > Get rid of lwepx and acquire last instuction in \n"
+ ">>> > Get rid of lwepx and acquire last instuction in =20\n"
  ">>> kvmppc_handle_exit() by\n"
- ">>> > searching for the physical address and kmap it. This fixes an \n"
+ ">>> > searching for the physical address and kmap it. This fixes an =20\n"
  ">>> infinite loop\n"
- ">>> \n"
+ ">>>=20\n"
  ">>> What's the difference in speed for this?\n"
- ">>> \n"
- ">>> Also, could we call lwepx later in host code, when \n"
+ ">>>=20\n"
+ ">>> Also, could we call lwepx later in host code, when =20\n"
  ">>> kvmppc_get_last_inst() gets invoked?\n"
- ">> \n"
- ">> Any use of lwepx is problematic unless we want to add overhead to \n"
+ ">>=20\n"
+ ">> Any use of lwepx is problematic unless we want to add overhead to =20\n"
  ">> the main Linux TLB miss handler.\n"
- "> \n"
+ ">=20\n"
  "> What exactly would be missing?\n"
  "\n"
- "If lwepx faults, it goes to the normal host TLB miss handler. Without \n"
- "adding code to it to recognize that it's an external-PID fault, it will \n"
- "try to search the normal Linux page tables and insert a normal host \n"
- "entry. If it thinks it has succeeded, it will retry the instruction \n"
- "rather than search for an exception handler. The instruction will \n"
+ "If lwepx faults, it goes to the normal host TLB miss handler. Without =20\n"
+ "adding code to it to recognize that it's an external-PID fault, it will =20\n"
+ "try to search the normal Linux page tables and insert a normal host =20\n"
+ "entry. If it thinks it has succeeded, it will retry the instruction =20\n"
+ "rather than search for an exception handler. The instruction will =20\n"
  "fault again, and you get a hang.\n"
  "\n"
- "> I'd also still like to see some performance benchmarks on this to \n"
+ "> I'd also still like to see some performance benchmarks on this to =20\n"
  "> make sure we're not walking into a bad direction.\n"
  "\n"
- "I doubt it'll be significantly different. There's overhead involved in \n"
- "setting up for lwepx as well. It doesn't hurt to test, though this is \n"
- "a functional correctness issue, so I'm not sure what better \n"
- "alternatives we have. I don't want to slow down non-KVM TLB misses for \n"
+ "I doubt it'll be significantly different. There's overhead involved in =20\n"
+ "setting up for lwepx as well. It doesn't hurt to test, though this is =20\n"
+ "a functional correctness issue, so I'm not sure what better =20\n"
+ "alternatives we have. I don't want to slow down non-KVM TLB misses for =20\n"
  "this.\n"
  "\n"
- ">>> > + addr = (mas7_mas3 & (~0ULL << psize_shift)) |\n"
+ ">>> > + addr =3D (mas7_mas3 & (~0ULL << psize_shift)) |\n"
  ">>> > + (geaddr & ((1ULL << psize_shift) - 1ULL));\n"
  ">>> > +\n"
  ">>> > + /* Map a page and get guest's instruction */\n"
- ">>> > + page = pfn_to_page(addr >> PAGE_SHIFT);\n"
- ">>> \n"
- ">>> So it seems to me like you're jumping through a lot of hoops to \n"
- ">>> make sure this works for LRAT and non-LRAT at the same time. Can't \n"
+ ">>> > + page =3D pfn_to_page(addr >> PAGE_SHIFT);\n"
+ ">>>=20\n"
+ ">>> So it seems to me like you're jumping through a lot of hoops to =20\n"
+ ">>> make sure this works for LRAT and non-LRAT at the same time. Can't =20\n"
  ">>> we just treat them as the different things they are?\n"
- ">>> \n"
- ">>> What if we have different MMU backends for LRAT and non-LRAT? The \n"
- ">>> non-LRAT case could then try lwepx, if that fails, fall back to \n"
- ">>> read the shadow TLB. For the LRAT case, we'd do lwepx, if that \n"
+ ">>>=20\n"
+ ">>> What if we have different MMU backends for LRAT and non-LRAT? The =20\n"
+ ">>> non-LRAT case could then try lwepx, if that fails, fall back to =20\n"
+ ">>> read the shadow TLB. For the LRAT case, we'd do lwepx, if that =20\n"
  ">>> fails fall back to this logic.\n"
- ">> \n"
- ">> This isn't about LRAT; it's about hardware threads. It also fixes \n"
+ ">>=20\n"
+ ">> This isn't about LRAT; it's about hardware threads. It also fixes =20\n"
  ">> the handling of execute-only pages on current chips.\n"
- "> \n"
- "> On non-LRAT systems we could always check our shadow copy of the \n"
- "> guest's TLB, no? I'd really like to know what the performance \n"
+ ">=20\n"
+ "> On non-LRAT systems we could always check our shadow copy of the =20\n"
+ "> guest's TLB, no? I'd really like to know what the performance =20\n"
  "> difference would be for the 2 approaches.\n"
  "\n"
- "I suspect that tlbsx is faster, or at worst similar. And unlike \n"
- "comparing tlbsx to lwepx (not counting a fix for the threading \n"
- "problem), we don't already have code to search the guest TLB, so \n"
+ "I suspect that tlbsx is faster, or at worst similar. And unlike =20\n"
+ "comparing tlbsx to lwepx (not counting a fix for the threading =20\n"
+ "problem), we don't already have code to search the guest TLB, so =20\n"
  "testing would be more work.\n"
  "\n"
- -Scott
+ -Scott=
 
-1550757c8efed5b7570c75f5a3fd0aea3cc6dfc9a2e8236e3810b27ddafc9d53
+a14614f520f5bbcd1fb7e8bc4a9c3ec79c14b5e8ee3c8b6ddf9dd9b13b9a2662
diff --git a/a/content_digest b/N2/content_digest
index 45d48c3..907a030 100644
--- a/a/content_digest
+++ b/N2/content_digest
@@ -1,12 +1,12 @@
  "ref\051DC4C00.70509@suse.de\0"
  "From\0Scott Wood <scottwood@freescale.com>\0"
  "Subject\0Re: [PATCH 2/2] KVM: PPC: Book3E: Get vcpu's last instruction for emulation\0"
- "Date\0Tue, 09 Jul 2013 18:46:45 +0000\0"
+ "Date\0Tue, 9 Jul 2013 13:46:45 -0500\0"
  "To\0Alexander Graf <agraf@suse.de>\0"
  "Cc\0Mihai Caraman <mihai.caraman@freescale.com>"
- kvm-ppc@vger.kernel.org
- kvm@vger.kernel.org
- " linuxppc-dev@lists.ozlabs.org\0"
+ <kvm-ppc@vger.kernel.org>
+ <kvm@vger.kernel.org>
+ " <linuxppc-dev@lists.ozlabs.org>\0"
  "\00:1\0"
  "b\0"
  "On 07/09/2013 12:44:32 PM, Alexander Graf wrote:\n"
@@ -85,4 +85,4 @@
  "\n"
  -Scott
 
-1550757c8efed5b7570c75f5a3fd0aea3cc6dfc9a2e8236e3810b27ddafc9d53
+f76b761ebe1e8f2a9baedc40c5efea8bfc986e5293e083b99334501b5d1a259b
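
The quoted patch fragment in the message above builds a physical address by taking the page-frame bits from `mas7_mas3` and the in-page offset from `geaddr`. A minimal standalone sketch of that bit arithmetic follows. The helper name `combine_addr` and the sample values are hypothetical and not taken from the patch, and the sketch assumes `psize_shift` is below 64, since the shifts are undefined otherwise.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: mirrors the expression
 * quoted in the message,
 *   (mas7_mas3 & (~0ULL << psize_shift)) |
 *   (geaddr & ((1ULL << psize_shift) - 1ULL))
 * Assumes psize_shift < 64.
 */
static uint64_t combine_addr(uint64_t mas7_mas3, uint64_t geaddr,
                             unsigned int psize_shift)
{
    /* Upper bits: the translated page frame from the TLB entry. */
    uint64_t frame_mask = ~0ULL << psize_shift;
    /* Lower bits: the offset within the page from the guest address. */
    uint64_t offset_mask = (1ULL << psize_shift) - 1ULL;

    return (mas7_mas3 & frame_mask) | (geaddr & offset_mask);
}
```

With a 4 KiB page (`psize_shift` of 12), a frame value of `0x123456000` and a guest address of `0xC0000ABC` combine to `0x123456ABC`: the frame comes from the first argument and the low 12 bits from the second.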