All of lore.kernel.org
 help / color / mirror / Atom feed
From: zhu <vanbas.han@gmail.com>
To: "Robert S. Phillips" <rphillips@virtualiron.com>
Cc: Xen-devel@lists.xensource.com
Subject: [PATCH - proposed] XI Shadow Page Table Mechanism]
Date: Fri, 30 Jun 2006 10:37:16 +0800	[thread overview]
Message-ID: <44A48E5C.8060609@gmail.com> (raw)
In-Reply-To: <44A3F073.50305@virtualiron.com>

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=gb18030; format=flowed, Size: 11900 bytes --]

Hi, Robert,
I found out another confusing code snippet:
in void xi_invl_mfn(struct domain *d, unsigned long mfn)
     if (ext && pfn < ext->large_page_aligned_size)

According to the code, it should be
     if (ext && (pfn>>SPT_ENTRIES_ORDER) < ext->large_page_aligned_size)

If I made any mistake, please point it out.
_______________________________________________________
Best Regards,
hanzhu


Robert S. Phillips дµÀ:
> Hi Han,
> Yes it is a bug but of limited impact.  The effect is that we will fail 
> to keep the 'accessed' flag up-to-date but only for L3 guest page table 
> entries in 64-bit guests.   Still I'm very pleased that you spotted it.
> -- rsp
> 
> Robert S. Phillips wrote:
>> Hi Han,
>> I think you're right.  I'll get back to you soon after we review the 
>> ramifications.
>> Thanks!
>> -- rsp
>> p.s. Out of curiosity ... where do you work?
>>
>> han wrote:
>>> Hi Robert,
>>> Thanks a lot for your thorough explanation.
>>> However, in dt_init_action I notice this code snippet:
>>> +            int _32pae_l3 = c_nr_levels <= 3 && c_level == 3;
>>> Seems c_nr_levels is always <= 3, isn't it?
>>> I'm afraid the code means that c_nr_levels == 3.
>>>
>>> I suppose the c_nr_levels should be 3 if the guest is 32bit(pae or 
>>> not), and c_nr_levels should be 4 if the guest is 64bit. So I 
>>> mentioned you about the code for uncombine_nr_levels.
>>>
>>>
>>> _______________________________________________________
>>> Best Regards,
>>> hanzhu
>>>
>>>
>>> Robert S. Phillips дµÀ:
>>>> Hello Han,
>>>>
>>>> The puzzling function, uncombine_nr_levels(),  should be reviewed 
>>>> together with the previous two functions: combine nr_levels() and 
>>>> uncombine_level(), shown below.  The problem I'm attempting to solve 
>>>> is to minimize the size of the decision table.  It is currently 4096 
>>>> entries.
>>>>
>>>> The decision function depends on "level" (that is, the level of the 
>>>> current shadow page table) and, to a lesser extent, on "nr_levels" 
>>>> (that is, the number of guest paging levels).  It really only cares 
>>>> if nr_levels is 3 or not, which is why uncombine_nr_levels() returns 
>>>> 3 or not 3.
>>>>
>>>> Normally each of the two values can each have three values (2, 3, or 
>>>> 4) making 9 combinations.
>>>> I squeeze them down to 4 combinations, which cuts the decision table 
>>>> size by more than half.
>>>>
>>>> The function combine_levels() is used to encode nr_levels  and 
>>>> levels into a number from 0 to 3.
>>>>
>>>> The functions uncombine_nr_levels() and uncombine_level() do the 
>>>> inverse, sort of:
>>>> uncombine_level() returns the number of levels (1, 2, 3, or 4)
>>>> and uncombine_nr_levels() returns 3 if the number of guest paging 
>>>> levels was 3, 0 otherwise.  This is all dt_init_action() needs to 
>>>> initialize the decision table.  It only tests if c_nr_levels is 3.
>>>>
>>>> I hope this explanation helps.
>>>>
>>>> -- rsp
>>>>
>>>> /* Combine nr_levels and level into 4 combinations by
>>>> * taking advantage of the fact that the rules for PTE levels
>>>> * 3 and 4 are identical
>>>> * except that the PDPT (ie nr_level == 3, level == 3)
>>>> * is slightly irregular. */
>>>> static inline int combine_levels(int c_nr_levels,
>>>>                                 int c_level)
>>>> {
>>>>    switch (c_level) {
>>>>    case 1: return 0;
>>>>    case 2: return 1;
>>>>    case 3: return c_nr_levels == 3
>>>>            ? 2   /* PDPT  */
>>>>            : 3;  /* treat like level 4*/
>>>>    default:      /* the compiler needs this */
>>>>    case 4: return 3;
>>>>    }
>>>> }
>>>>
>>>> static inline int uncombine_level(int combo)
>>>> {
>>>>    return combo + 1;
>>>> }
>>>>
>>>> static inline int uncombine_nr_levels(int combo)
>>>> {
>>>>    return combo == 2 ? 3 : 0;
>>>> }
>>>>
>>>> Ben Thomas wrote:
>>>>>
>>>>>
>>>>> -------- Original Message --------
>>>>> Subject: Re: [Xen-devel] [PATCH - proposed] XI Shadow Page Table 
>>>>> Mechanism
>>>>> Date: Thu, 29 Jun 2006 20:25:13 +0800
>>>>> From: han <vanbas.han@gmail.com>
>>>>> To: Ben Thomas <bthomas@virtualiron.com>
>>>>> References: <44A055FA.1010405@virtualiron.com>
>>>>>
>>>>> Hi, Ben
>>>>> I'm learning about your patch for the rewritten shadow pagetable code.
>>>>> However, I feel confused about some part of your code:
>>>>> +static inline int uncombine_nr_levels(int combo)
>>>>> +{
>>>>> +    return combo == 2 ? 3 : 0;
>>>>> +}
>>>>> shouldn't it be like below snippet:
>>>>> +static inline int uncombine_nr_levels(int combo)
>>>>> +{
>>>>> +    return combo <= 2 ? 3 : 4;
>>>>> +}
>>>>>
>>>>> _______________________________________________________
>>>>> Best Regards,
>>>>> hanzhu
>>>>>
>>>>>
>>>>> Ben Thomas дµÀ:
>>>>>> A post last week contained the design document for a shadow page
>>>>>> table mechanism. This post contains more information, and a pointer
>>>>>> to the code.
>>>>>>
>>>>>> As a recap, the "XI Shadow Mechanism" is a design for shadow
>>>>>> page table code for fully virtualized HVM domains running on a 64-bit
>>>>>> Xen hypervisor.
>>>>>>
>>>>>> This work was undertaken to address a number of goals.  These are
>>>>>> enumerated in the document posted last week and include:
>>>>>>
>>>>>> - ability to run fully virtualized 32, 32PAE and 64 bit guest
>>>>>>   domains concurrently on a 64-bit hypervisor
>>>>>>
>>>>>> - start support of live migration of fully virtualized domains
>>>>>>
>>>>>> - provide good performance and robustness
>>>>>>
>>>>>> - limit size and scope of change to Xen hypervisor
>>>>>>
>>>>>> This design has been coded, and tested on a variety of operating
>>>>>> systems. I've included this weekend's test report.
>>>>>>
>>>>>> Of particular interest from the following table are the Average Time
>>>>>> comparisions (smaller is better):
>>>>>>
>>>>>> pfault   10.56 vs 2.43 (XI)
>>>>>> sync_all 39.72 vs 33.51 (XI)
>>>>>> invl_pg  22.75 vs 4.00 (XI)
>>>>>>
>>>>>> The patch is too large for the mailing list, and may be obtained from
>>>>>>
>>>>>> http://s3.amazonaws.com/VIDownloads/xi-20060626.patch
>>>>>>
>>>>>> The patch applies against changeset 10352 (and possibly others)
>>>>>>
>>>>>> We look forward to any comments and suggestions.
>>>>>>
>>>>>> Thanks,
>>>>>> -b
>>>>>>
>>>>>>
>>>>>>
>>>>>> XI Shadow Test Report
>>>>>>
>>>>>> This report shows a comparsion of test runs done on xen-unstable 
>>>>>> changeset
>>>>>> 10352 with and without the XI patch applied.  It also compares 
>>>>>> pfault,
>>>>>> sync_all, and invl_pg performance.
>>>>>>
>>>>>> With the XI patch 32bit SMP (PAE) guests now boot and pass the 
>>>>>> regression
>>>>>> tests, closing Bug #661.  With the XI patch 64bit SMP guests are 
>>>>>> much more
>>>>>> stable and are no longer experiencing SEGVs and GPFs, closing Bug 
>>>>>> #678.
>>>>>>
>>>>>>  ---------------------------------------------------------------------- 
>>>>>>
>>>>>> | XEN      | Guest Kernel SMP kernels booted with 2 
>>>>>> CPUs               |
>>>>>> | 
>>>>>> Changeset|-----------------------------------------------------------| 
>>>>>>
>>>>>> |          | 32bit Non-XI | 32bit XI     | 64bit Non-XI | 64bit 
>>>>>> XI     |
>>>>>> |          
>>>>>> |--------------|--------------|--------------|--------------|
>>>>>> |          | Boot | Test  | Boot | Test  | Boot | Test  | Boot | 
>>>>>> Test  |
>>>>>> |----------|------|-------|------|-------|------|-------|------|-------| 
>>>>>>
>>>>>> | 10352    | Fail |       | Pass | 852/1 | Pass | 852/0 | Pass | 
>>>>>> 852/0 |
>>>>>> |          | (1)  |       |      | (2)   |      |(3,4,5)|      
>>>>>> |(3,4)  |
>>>>>>  ---------------------------------------------------------------------- 
>>>>>>
>>>>>>
>>>>>> 1. BUG 661: 32bit SMP guest crashes on boot:
>>>>>>    "(XEN) __hvm_bug at vmx.c:2289"
>>>>>> 2. BUG 666: 32bit UP guest fail ltp gettimeofday02:
>>>>>>     "Time is going backwards"
>>>>>> 3. BUG 663: 64bit SMP guest report these messages while running:
>>>>>>     "(XEN) spurious IRQ irq got=-1"
>>>>>> 4. BUG 662: 64bit SMP guest report this message in ltp pth_str01/2/3:
>>>>>>     "(XEN) <vlapic_accept_irq>level trig mode repeatedly for 
>>>>>> vector 252"
>>>>>> 5. BUG 678: 64bit 2CPU SMP guest failed several ltp tests with 
>>>>>> SEGVs or GPF
>>>>>>     "sshd[23654] general protection rip:2a9555d426 rsp:7fbfffd000 
>>>>>> error 0"
>>>>>>
>>>>>> The XI patch can be enabled or disabled at build time and includes 
>>>>>> some
>>>>>> performance statistics that can be examined whether or not it's 
>>>>>> enabled.
>>>>>> This allows us to compare the performance of XI vs the existing 
>>>>>> shadow code.
>>>>>> You access the statistics from the XEN console (^a^a^a). Typing a 'Y'
>>>>>> clears the counters and a 'y' displays the counters since the last 
>>>>>> clear.
>>>>>> Here is a comparsion of XI vs Non-XI shadow code done on xen-unstable
>>>>>> changeset 10352. The guest is a RHEL4U2-64bit 2 CPU SMP 256MB 
>>>>>> Guest running
>>>>>> a usex -b24 load for 5 minutes.  Shadow statistics cleared at 
>>>>>> start of run.
>>>>>>
>>>>>> (XEN) Non-XI shadow performance statistics for the last 303 seconds
>>>>>> (XEN) Op type                                   # of 
>>>>>> Operations                                     Total       
>>>>>> Avg          Tick Profile
>>>>>> (XEN)              <2usec    <3usec    <4usec    <8usec   
>>>>>> <16usec   <32usec   <64usec  >=64usec  ops/msec      usec     In 
>>>>>> Op    Locked     Total
>>>>>> (XEN) pfault      6750685   1128922    624988   3249437   
>>>>>> 3833201   2863229   1224908    315992  19991362     10.56         
>>>>>> 0         0         0
>>>>>> (XEN)               33.7%      5.6%      3.1%     16.2%     
>>>>>> 19.1%     14.3%      6.1%      1.5%    211175
>>>>>> (XEN) sync_all       8176       380       333    159011    
>>>>>> 167619    305470    814598    298411   1753998     39.72         
>>>>>> 0         0         0
>>>>>> (XEN)                0.4%      0.0%      0.0%      9.0%      
>>>>>> 9.5%     17.4%     46.4%     17.0%     69670
>>>>>> (XEN) invl_pg         219        17       114     23333     
>>>>>> 40087     21904     24622      4639    114935     22.75         
>>>>>> 0         0         0
>>>>>> (XEN)                0.1%      0.0%      0.0%     20.3%     
>>>>>> 34.8%     19.0%     21.4%      4.0%      2615
>>>>>>
>>>>>> (XEN) XI shadow performance statistics for the last 302 seconds
>>>>>> (XEN) Op type                                   # of 
>>>>>> Operations                                     Total       
>>>>>> Avg          Tick Profile
>>>>>> (XEN)              <2usec    <3usec    <4usec    <8usec   
>>>>>> <16usec   <32usec   <64usec  >=64usec  ops/msec      usec     In 
>>>>>> Op    Locked     Total
>>>>>> (XEN) pfault     23078414   4694139    485499    613419    
>>>>>> 396088   1427265     11976      8228  30715028      2.43         
>>>>>> 0         0         0
>>>>>> (XEN)               75.1%     15.2%      1.5%      1.9%      
>>>>>> 1.2%      4.6%      0.0%      0.0%     74690
>>>>>> (XEN) sync_all       1734        97       169    305433    
>>>>>> 484991    174300    150199    130405   1247328     33.51         
>>>>>> 0         0         0
>>>>>> (XEN)                0.1%      0.0%      0.0%     24.4%     
>>>>>> 38.8%     13.9%     12.0%     10.4%     41810
>>>>>> (XEN) invl_pg       10169    302977    125270     44927      
>>>>>> 5385     17603       776      1656    508763      4.00         
>>>>>> 0         0         0
>>>>>> (XEN)                1.9%     59.5%     24.6%      8.8%      
>>>>>> 1.0%      3.4%      0.1%      0.3%      2037
>>>>>>
>>>>>>
>>>>>
>>>>
>>
> 

       reply	other threads:[~2006-06-30  2:37 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <44A3CA74.4000206@virtualiron.com>
     [not found] ` <44A3D3CE.3040408@virtualiron.com>
     [not found]   ` <44A3E08B.9090103@gmail.com>
     [not found]     ` <44A3E98B.9040602@virtualiron.com>
     [not found]       ` <44A3F073.50305@virtualiron.com>
2006-06-30  2:37         ` zhu [this message]
2006-06-30 12:24           ` [PATCH - proposed] XI Shadow Page Table Mechanism] Robert Phillips
     [not found]             ` <44A51DE9.1020909@gmail.com>
2006-06-30 13:18               ` Robert Phillips
2006-06-30 13:33                 ` Steve Ofsthun
2006-06-30 13:47                 ` zhu
     [not found]               ` <44A52215.8050505@virtualiron.com>
2006-07-01  9:22                 ` zhu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=44A48E5C.8060609@gmail.com \
    --to=vanbas.han@gmail.com \
    --cc=Xen-devel@lists.xensource.com \
    --cc=rphillips@virtualiron.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.