* RE: problem with netfront.c
@ 2005-04-03 13:33 Ian Pratt
  2005-04-04  3:06 ` Jacob Gorm Hansen
  0 siblings, 1 reply; 11+ messages in thread
From: Ian Pratt @ 2005-04-03 13:33 UTC (permalink / raw)
  To: Ling, Xiaofeng, xen-devel

[-- Attachment #1: Type: text/plain, Size: 1064 bytes --]

> Yes, but does it still need hypervisor to modify the 
> phys_to_machine table?
> I just hope to save some page faults or tlb flush vmexit to 
> get high performance and also have less modification to the 
> original netfront.c. So still only one multicall needed each 
> time when gets a net event.

The netfront/netback drivers need to be switched over to using the
grant tables interface.

Chris Clark has already checked in the grant tables implementation, but
no-one has got around to updating netfront/back to use it. There are
patches to switch blkfront/back over to use grant tables, which will be
checked in soon, but since the blk driver uses the foreign access rather
than the page transfer mechanism it's probably not too much help.

Using grant tables, the front end doesn't need to know about machine
addresses, and the whole thing ends up rather cleaner, particularly for
domains running with virtualized VMs.

I've attached a document from Chris that gives a rough description of
the grant table interface.

Best,
Ian


[-- Attachment #2: grant-tables.txt --]
[-- Type: text/plain, Size: 9905 bytes --]

********************************************************************************
 A Rough Introduction to Using Grant Tables
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                              Christopher Clark, March, 2005.

Grant tables are a mechanism for sharing and transferring frames between
domains, without requiring the participating domains to be privileged.

The first mode of use allows domA to grant domB access to a specific frame,
whilst retaining ownership. The block front driver uses this to grant memory
access to the block back driver, so that it may read or write as requested.

 1. domA creates a grant access reference, and transmits the ref id to domB.
 2. domB uses the reference to map the granted frame.
 3. domB performs the memory access.
 4. domB unmaps the granted frame.
 5. domA removes its grant.
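
The five steps above can be sketched as a small user-space model. This is
not the Xen or Linux code: the toy_ names, the table size and the mapping
counter are assumptions made for illustration, and only the order of
operations follows the list.

```c
#include <stdint.h>

#define TOY_NR_REFS 8          /* hypothetical table size */
#define TOY_PERMIT  0x1        /* entry may be mapped by 'domid' */
#define TOY_RDONLY  0x2        /* mappings must be read-only */

struct toy_grant {
    uint16_t      flags;       /* 0 means the entry is unused */
    uint16_t      domid;       /* domain allowed to map the frame */
    unsigned long frame;       /* frame being shared */
    unsigned int  mapped;      /* live mappings, tracked by the "hypervisor" */
};

static struct toy_grant toy_table[TOY_NR_REFS];

/* Step 1 (domA): publish a grant and return its reference, or -1. */
static int toy_grant_access(uint16_t domid, unsigned long frame, int readonly)
{
    for (int ref = 0; ref < TOY_NR_REFS; ref++) {
        if (toy_table[ref].flags == 0 && toy_table[ref].mapped == 0) {
            toy_table[ref].frame = frame;
            toy_table[ref].domid = domid;
            toy_table[ref].flags = TOY_PERMIT | (readonly ? TOY_RDONLY : 0);
            return ref;
        }
    }
    return -1;
}

/* Step 2 (domB): map the frame; fails unless the grant names the caller. */
static int toy_map(uint16_t caller, int ref, int want_write,
                   unsigned long *frame)
{
    struct toy_grant *g;

    if (ref < 0 || ref >= TOY_NR_REFS)
        return -1;
    g = &toy_table[ref];
    if (!(g->flags & TOY_PERMIT) || g->domid != caller)
        return -1;
    if (want_write && (g->flags & TOY_RDONLY))
        return -1;
    g->mapped++;
    *frame = g->frame;         /* step 3, the access itself, happens here */
    return 0;
}

/* Step 4 (domB): drop one mapping. */
static void toy_unmap(int ref)
{
    if (ref >= 0 && ref < TOY_NR_REFS && toy_table[ref].mapped > 0)
        toy_table[ref].mapped--;
}

/* Step 5 (domA): remove the grant, but only once nothing is mapped. */
static int toy_end_access(int ref)
{
    if (ref < 0 || ref >= TOY_NR_REFS || toy_table[ref].mapped != 0)
        return -1;
    toy_table[ref].flags = 0;
    return 0;
}
```

In this model a reference is only an index into domA's own table, so a
domain that guesses a reference gains nothing unless the entry names it.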


The second mode allows domA to accept a transfer of ownership of a frame from
domB. The net front and back driver will use this for packet tx/rx. This
mechanism is still being implemented, though the xen<->guest interface design
is complete.

 1. domA creates an accept transfer grant reference, and transmits it to domB.
 2. domB uses the ref to hand over a frame it owns.
 3. domA accepts the transfer.
 4. domA clears the used reference.
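
The transfer steps can be modelled the same way. The sketch below is a
user-space illustration under assumptions: the toy_ names and flag values
are invented here, and the real mechanism is still being implemented, so
this shows only the intended order of events.

```c
#include <stdint.h>

#define TOY_ACCEPT_TRANSFER 0x1   /* entry is waiting for a frame */
#define TOY_TRANSFER_DONE   0x2   /* a frame has been handed over */

struct toy_xfer_entry {
    uint16_t      flags;
    uint16_t      domid;          /* domain expected to donate the frame */
    unsigned long frame;          /* filled in when the transfer completes */
};

/* Step 1 (domA): advertise willingness to accept a frame from 'from'. */
static void toy_accept_from(struct toy_xfer_entry *e, uint16_t from)
{
    e->frame = 0;
    e->domid = from;
    e->flags = TOY_ACCEPT_TRANSFER;
}

/* Step 2 (domB, via the "hypervisor"): hand over a frame it owns. */
static int toy_transfer(struct toy_xfer_entry *e, uint16_t caller,
                        unsigned long frame)
{
    /* Reject unknown donors and entries that were already used. */
    if (e->flags != TOY_ACCEPT_TRANSFER || e->domid != caller)
        return -1;
    e->frame  = frame;
    e->flags |= TOY_TRANSFER_DONE;
    return 0;
}

/*
 * Steps 3 and 4 (domA): take the frame and clear the used reference.
 * Returns 0 if no frame had arrived.
 */
static unsigned long toy_end_transfer(struct toy_xfer_entry *e)
{
    unsigned long frame = 0;

    if (e->flags & TOY_TRANSFER_DONE)
        frame = e->frame;
    e->flags = 0;
    return frame;
}
```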


********************************************************************************

 Granting a foreign domain access to frames
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 domA [frame]--> domB


 domA:  #include <asm-xen/gnttab.h>
        grant_ref_t gref[BATCH_SIZE];

        /* mfn[i]: machine frame number of the i-th frame to share */
        for ( i = 0; i < BATCH_SIZE; i++ )
            gref[i] = gnttab_grant_foreign_access( domBid, mfn[i], (readonly ? 1 : 0) );


 .. gref is then somehow transmitted to domB for use.


 Mapping foreign frames
 ~~~~~~~~~~~~~~~~~~~~~~

 domB:  #include <asm-xen/hypervisor.h>
        unsigned long       mmap_vstart;
        gnttab_op_t         aop[BATCH_SIZE];
        grant_ref_t         mapped_handle[BATCH_SIZE];

        if ( (mmap_vstart = allocate_empty_lowmem_region(BATCH_SIZE)) == 0 )
            BUG();

        for ( i = 0; i < BATCH_SIZE; i++ )
        {
            aop[i].u.map_grant_ref.host_virt_addr =
                                              mmap_vstart + (i * PAGE_SIZE);
            aop[i].u.map_grant_ref.dom      = domAid;
            aop[i].u.map_grant_ref.ref      = gref[i];
            aop[i].u.map_grant_ref.flags    = ( GNTMAP_host_map | GNTMAP_readonly );
        }

        if ( unlikely(HYPERVISOR_grant_table_op(
                        GNTTABOP_map_grant_ref, aop, BATCH_SIZE)))
            BUG();

        for ( i = 0; i < BATCH_SIZE; i++ )
        {
            if ( unlikely(aop[i].u.map_grant_ref.dev_bus_addr == 0) )
            {
                tidyup_all(aop, i);
                goto panic;
            }

            phys_to_machine_mapping[__pa(mmap_vstart + (i * PAGE_SIZE))>>PAGE_SHIFT] =
                FOREIGN_FRAME(aop[i].u.map_grant_ref.dev_bus_addr);

            mapped_handle[i] = aop[i].u.map_grant_ref.handle;
        }
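
The result check above stops at the first failed operation, and
tidyup_all() is not defined in this document. One possible clean-up policy,
shown as a user-space sketch with hypothetical toy_ names (a guess at the
intent, not the actual helper), is to undo every operation in the batch
that did succeed, whether it comes before or after the failure:

```c
#include <stddef.h>

struct toy_map_op {
    unsigned long  dev_bus_addr;  /* 0 means this op failed, as above */
    unsigned short handle;        /* needed later to unmap */
};

/* Hypothetical stand-in for unmapping one frame; it only counts calls. */
static int toy_unmapped_count;

static void toy_unmap_one(unsigned short handle)
{
    (void)handle;
    toy_unmapped_count++;
}

/*
 * Inspect the results of a batched map. If every op succeeded, copy the
 * handles out and return 0. If any op failed, unmap all ops that did
 * succeed and return -1, so that no mapping is leaked.
 */
static int toy_check_batch(const struct toy_map_op *ops, size_t n,
                           unsigned short *handles)
{
    size_t i;
    int failed = 0;

    for (i = 0; i < n; i++)
        if (ops[i].dev_bus_addr == 0)
            failed = 1;

    for (i = 0; i < n; i++) {
        if (ops[i].dev_bus_addr == 0)
            continue;
        if (failed)
            toy_unmap_one(ops[i].handle);
        else
            handles[i] = ops[i].handle;
    }
    return failed ? -1 : 0;
}
```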



 Unmapping foreign frames
 ~~~~~~~~~~~~~~~~~~~~~~~~

 domB:
        for ( i = 0; i < BATCH_SIZE; i++ )
        {
            aop[i].u.unmap_grant_ref.host_virt_addr = mmap_vstart + (i * PAGE_SIZE);
            aop[i].u.unmap_grant_ref.dev_bus_addr   = 0;
            aop[i].u.unmap_grant_ref.handle         = mapped_handle[i];
        }
        if ( unlikely(HYPERVISOR_grant_table_op(
                        GNTTABOP_unmap_grant_ref, aop, BATCH_SIZE)))
            BUG();


 Ending foreign access
 ~~~~~~~~~~~~~~~~~~~~~

    Note that this only prevents further mappings; it does _not_ revoke access.
    Should _only_ be used when the remote domain has unmapped the frame.
    gnttab_query_foreign_access( gref ) will indicate the state of any mapping.

 domA:
        if ( gnttab_query_foreign_access( gref[i] ) == 0 )
            gnttab_end_foreign_access( gref[i], readonly );

        TODO: readonly yet to be implemented.


********************************************************************************

 Transferring ownership of a frame to another domain
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 [ XXX: Transfer mechanism is alpha-calibre code, untested, use at own risk XXX ]
 [ XXX: show use of batch operations below, rather than single frame XXX ]
 [ XXX: linux internal interface could/should be wrapped to be tidier XXX ]


 Prepare to accept a frame from a foreign domain
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  domA:
        if ( (p = alloc_page(GFP_HIGHUSER)) == NULL )
        {
            printk("Cannot alloc a frame to surrender\n");
            break;
        }
        pfn = p - mem_map;
        mfn = phys_to_machine_mapping[pfn];
                                                                                       
        if ( !PageHighMem(p) )
        {
            v = phys_to_virt(pfn << PAGE_SHIFT);
            scrub_pages(v, 1);
            queue_l1_entry_update(get_ptep((unsigned long)v), 0);
        }
                                                                                       
        /* Ensure that ballooned highmem pages don't have cached mappings. */
        kmap_flush_unused();

        /* Flush updates through and flush the TLB. */
        xen_tlb_flush();
                                                                                       
        phys_to_machine_mapping[pfn] = INVALID_P2M_ENTRY;
                                                                                       
        if ( HYPERVISOR_dom_mem_op(
            MEMOP_decrease_reservation, &mfn, 1, 0) != 1 )
        {
            printk("MEMOP_decrease_reservation failed\n");
            /* er... ok. free the page then */
            __free_page(p);
            break;
        }
                                                                                       
        accepting_pfn = pfn;
        ref = gnttab_grant_foreign_transfer( (domid_t) args.arg[0], pfn );
        printk("Accepting dom %lu frame at ref (%d)\n", args.arg[0], ref);
                                                                                       

 Transfer a frame to a foreign domain
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  domB:
        mmu_update_t            update;
        domid_t                 domid;
        grant_ref_t             gref;
        unsigned long           pfn, mfn, *v;
        struct page            *transfer_page = 0;
                                                                                       
        /* alloc a page and grant access.
         * alloc page returns a page struct. */
        if ( (transfer_page = alloc_page(GFP_HIGHUSER)) == NULL )
            return -ENOMEM;

        pfn = transfer_page - mem_map;
        mfn = phys_to_machine_mapping[pfn];

        /* need to remove all references to this page */
        if ( !PageHighMem(transfer_page) )
        {
            v = phys_to_virt(pfn << PAGE_SHIFT);
            scrub_pages(v, 1);
            sprintf((char *)v, "This page (%lx) was transferred.\n", mfn);
            queue_l1_entry_update(get_ptep((unsigned long)v), 0);
        }
#ifdef CONFIG_XEN_SCRUB_PAGES
        else
        {
            v = kmap(transfer_page);
            scrub_pages(v, 1);
            sprintf((char *)v, "This page (%lx) was transferred.\n", mfn);
            kunmap(transfer_page);
        }
#endif
        /* Delete any cached kmappings */
        kmap_flush_unused();

        /* Flush updates through and flush the TLB */
        xen_tlb_flush();

        /* invalidate in P2M */
        phys_to_machine_mapping[pfn] = INVALID_P2M_ENTRY;

        domid = (domid_t)args.arg[0];
        gref  = (grant_ref_t)args.arg[1];

        update.ptr  = MMU_EXTENDED_COMMAND;
        update.ptr |= ((gref & 0x00FF) << 2);
        update.ptr |= mfn << PAGE_SHIFT;
                                                                                       
        update.val  = MMUEXT_TRANSFER_PAGE;
        update.val |= (domid << 16);
        update.val |= (gref & 0xFF00);
                                                                                       
        ret = HYPERVISOR_mmu_update(&update, 1, NULL);
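
The 16-bit grant reference is split across update.ptr and update.val above,
which is easy to get wrong. Here is a stand-alone sketch of just that bit
layout. The command constants (MMU_EXTENDED_COMMAND, MMUEXT_TRANSFER_PAGE)
are left out because their values are not given in this document; the toy_
names and a page shift of 12 are assumptions.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

struct toy_update {
    uint64_t ptr;
    uint64_t val;
};

/* Pack the fields the same way the domB code above does. */
static struct toy_update toy_pack(uint16_t gref, uint16_t domid, uint64_t mfn)
{
    struct toy_update u;

    u.ptr  = (uint64_t)(gref & 0x00FF) << 2;  /* low byte of gref: ptr bits 2..9 */
    u.ptr |= mfn << TOY_PAGE_SHIFT;           /* frame number above the page offset */
    u.val  = (uint64_t)domid << 16;           /* target domain: val bits 16 and up */
    u.val |= (uint64_t)(gref & 0xFF00);       /* high byte of gref: val bits 8..15 */
    return u;
}

/* Reassemble the grant reference from its two halves. */
static uint16_t toy_unpack_gref(struct toy_update u)
{
    return (uint16_t)(((u.ptr >> 2) & 0x00FF) | (u.val & 0xFF00));
}

static uint16_t toy_unpack_domid(struct toy_update u)
{
    return (uint16_t)(u.val >> 16);
}

static uint64_t toy_unpack_mfn(struct toy_update u)
{
    return u.ptr >> TOY_PAGE_SHIFT;
}
```

The low byte fits in ptr because bits 2..9 lie below the page shift and so
never overlap the frame number; the bottom two bits of ptr and the low byte
of val stay free for the command codes.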
                                                                                       

 Map a transferred frame
 ~~~~~~~~~~~~~~~~~~~~~~~

 TODO:


 Clear the used transfer reference
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 TODO:


********************************************************************************

 Using a private reserve of grant references
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Where it is known in advance how many grant references are required, and
failure to allocate them on demand would cause difficulty, a batch can be
allocated and held in a private reserve.

To reserve a private batch:

    /* housekeeping data - treat as opaque: */
    grant_ref_t gref_head, gref_terminal;

    if ( 0 > gnttab_alloc_grant_references( number_to_reserve,
                                            &gref_head, &gref_terminal ))
        return -ENOSPC;


To release a batch back to the shared pool:

    gnttab_free_grant_references( number_reserved, gref_head );


To claim a reserved reference:

    ref = gnttab_claim_grant_reference( &gref_head, gref_terminal );


To release a claimed reference back to the reserve pool:

    gnttab_release_grant_reference( &gref_head, gref );


To use a claimed reference to grant access, use these alternative functions
that take an additional parameter of the grant reference to use:

    gnttab_grant_foreign_access_ref
    gnttab_grant_foreign_transfer_ref
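
How the head/terminal pair delimits a private batch can be seen in a
single-threaded user-space model of a linked free list. The toy_ names and
the table size are assumptions; the real allocator lives in gnttab.c and
has to cope with concurrent callers, which this sketch ignores.

```c
#include <stdint.h>

#define TOY_NR_ENTRIES 16
typedef uint16_t toy_ref_t;

static toy_ref_t toy_free_list[TOY_NR_ENTRIES];  /* "next" links */
static toy_ref_t toy_free_head;

static void toy_init(void)
{
    for (int i = 0; i < TOY_NR_ENTRIES; i++)
        toy_free_list[i] = (toy_ref_t)(i + 1);   /* TOY_NR_ENTRIES ends the list */
    toy_free_head = 0;
}

/* Detach 'count' entries from the shared list into a private chain. */
static int toy_alloc_refs(unsigned count, toy_ref_t *head, toy_ref_t *terminal)
{
    toy_ref_t cur = toy_free_head;

    for (unsigned i = 0; i < count; i++) {
        if (cur == TOY_NR_ENTRIES)
            return -1;               /* shared list is left untouched */
        cur = toy_free_list[cur];
    }
    *head = toy_free_head;           /* first private entry */
    *terminal = cur;                 /* first entry NOT in the batch */
    toy_free_head = cur;
    return 0;
}

/* Take one reference from the private chain, or -1 if it is exhausted. */
static int toy_claim(toy_ref_t *private_head, toy_ref_t terminal)
{
    toy_ref_t g = *private_head;

    if (g == terminal)
        return -1;
    *private_head = toy_free_list[g];
    return g;
}

/* Push a claimed reference back onto the front of the private chain. */
static void toy_release(toy_ref_t *private_head, toy_ref_t ref)
{
    toy_free_list[ref] = *private_head;
    *private_head = ref;
}
```

The terminal value is what makes the housekeeping data opaque: the private
chain is simply "everything from head up to, but not including, terminal".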

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

* RE: problem with netfront.c
@ 2005-04-04 12:42 Ian Pratt
  0 siblings, 0 replies; 11+ messages in thread
From: Ian Pratt @ 2005-04-04 12:42 UTC (permalink / raw)
  To: Ling, Xiaofeng, xen-devel

 
> > It's not actually a security problem, but using mfns is a bit ugly.
> > 
> I mean for a full-virtualization domain, if the guest can map 
> any mfn to its pfn, it will not be secure. 

It can't unless the fully virtualized domain is fully privileged, which
it shouldn't be.

> I have a quick look at the grant table, Is the main point 
> that put the mfn to the table and get an id, and then give 
> other domain an id, so the other domain is allowed to map that mfn?

Yes, that's how it works.

Thanks,
Ian

* RE: problem with netfront.c
@ 2005-04-04 10:06 Ling, Xiaofeng
  0 siblings, 0 replies; 11+ messages in thread
From: Ling, Xiaofeng @ 2005-04-04 10:06 UTC (permalink / raw)
  To: Ian Pratt, xen-devel



Ian Pratt <mailto:m+Ian.Pratt@cl.cam.ac.uk> wrote:
>>> Using grant tables, the front end doesn't need to know about machine
>>> addresses, and the whole thing ends up rather cleaner, particulary
>>> for domains running with virtualized VMs.
>> Yes, there do have security problem to use machine address in
>> netfront.
> 
> It's not actually a security problem, but using mfns is a bit ugly.
> 
I mean that for a fully virtualized domain, if the guest can map any mfn to
its pfn, it will not be secure.
I had a quick look at the grant table. Is the main point that a domain puts
the mfn into the table and gets an id, then gives that id to another domain,
so that the other domain is allowed to map that mfn?

* RE: problem with netfront.c
@ 2005-04-04  7:21 Ian Pratt
  0 siblings, 0 replies; 11+ messages in thread
From: Ian Pratt @ 2005-04-04  7:21 UTC (permalink / raw)
  To: Jacob Gorm Hansen, xen-devel

> Are the grant references capabilities, or how do you prevent 
> domains from inventing their own? 

Domains create and maintain their own grant tables. They don't have to
be capabilities to be secure.

> Who takes care of garbage-collecting them when a domain exits or
> dies?

Since Xen tracks active grant references, revocation is possible, but it is
a slow-path operation.

> Can a domain DoS a Xen-system by allocating all the grant refs in 
> the system?

No...


Ian

* RE: problem with netfront.c
@ 2005-04-03 15:33 Ian Pratt
  0 siblings, 0 replies; 11+ messages in thread
From: Ian Pratt @ 2005-04-03 15:33 UTC (permalink / raw)
  To: Ling, Xiaofeng, xen-devel

[-- Attachment #1: Type: text/plain, Size: 1011 bytes --]

> > Using grant tables, the front end doesn't need to know about machine
> > addresses, and the whole thing ends up rather cleaner, particularly for
> > domains running with virtualized VMs.
> Yes, there do have security problem to use machine address in 
> netfront.

It's not actually a security problem, but using mfns is a bit ugly.

> I'm originally  thinking that we can pass virtual address to 
> backend through the ring buffer and let the backend to set 
> the mapping for the frontend. 
> I've already done a blkfront module driver in unmodified
> linux with an event channel and ctrlif module
> driver.

Great.

> I'll wait for the new blkfront with grant table. For 
> netfront, if no one has worked on the grant table update, I 
> can do it together.

I've attached the patch that updates the grant tables code and switches
the blk dev over to use it.

It would be great if you could work up something similar for
netfront/back.

Thanks,
Ian

[-- Attachment #2: block-over-gnttab.patch --]
[-- Type: application/octet-stream, Size: 66463 bytes --]

# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
#   2005/03/25 20:21:53+00:00 cwc22@centipede.cl.cam.ac.uk 
#   Support for using grant tables for communication between front and backend block devices.
#   Grant table support for mapping of remote domain memory now considered working.
# 
# xen/include/xen/grant_table.h
#   2005/03/25 20:21:52+00:00 cwc22@centipede.cl.cam.ac.uk +5 -2
#   Support for using grant tables for communication between front and backend block devices
# 
# xen/include/public/io/blkif.h
#   2005/03/25 20:21:51+00:00 cwc22@centipede.cl.cam.ac.uk +10 -2
#   Support for using grant tables for communication between front and backend block devices
# 
# xen/include/public/grant_table.h
#   2005/03/25 20:21:51+00:00 cwc22@centipede.cl.cam.ac.uk +6 -2
#   Support for using grant tables for communication between front and backend block devices
# 
# xen/common/grant_table.c
#   2005/03/25 20:21:51+00:00 cwc22@centipede.cl.cam.ac.uk +319 -207
#   Support for using grant tables for communication between front and backend block devices.
#   Grant table support for mapping of remote domain memory substantially more robust.
# 
# xen/arch/x86/mm.c
#   2005/03/25 20:21:51+00:00 cwc22@centipede.cl.cam.ac.uk +1 -1
#   Support for using grant tables for communication between front and backend block devices.
# 
# linux-2.6.11-xen-sparse/include/asm-xen/gnttab.h
#   2005/03/25 20:21:51+00:00 cwc22@centipede.cl.cam.ac.uk +34 -1
#   Added support for reserving batches of grant references.
#   Foreign transfer must specify pfn.
# 
# linux-2.6.11-xen-sparse/include/asm-xen/asm-i386/fixmap.h
#   2005/03/25 20:21:51+00:00 cwc22@centipede.cl.cam.ac.uk +3 -1
#   Support for multiple pages of grant references.
# 
# linux-2.6.11-xen-sparse/drivers/xen/blkfront/vbd.c
#   2005/03/25 20:21:51+00:00 cwc22@centipede.cl.cam.ac.uk +6 -0
#   Support for using grant tables for communication between front and backend block devices
# 
# linux-2.6.11-xen-sparse/drivers/xen/blkfront/block.h
#   2005/03/25 20:21:51+00:00 cwc22@centipede.cl.cam.ac.uk +4 -0
#   Support for using grant tables for communication between front and backend block devices
# 
# linux-2.6.11-xen-sparse/drivers/xen/blkfront/blkfront.c
#   2005/03/25 20:21:51+00:00 cwc22@centipede.cl.cam.ac.uk +114 -7
#   Support for using grant tables for communication between front and backend block devices
# 
# linux-2.6.11-xen-sparse/drivers/xen/blkback/blkback.c
#   2005/03/25 20:21:51+00:00 cwc22@centipede.cl.cam.ac.uk +129 -2
#   Support for using grant tables for communication between front and backend block devices
# 
# linux-2.6.11-xen-sparse/arch/xen/kernel/gnttab.c
#   2005/03/25 20:21:51+00:00 cwc22@centipede.cl.cam.ac.uk +106 -15
#   Support for using grant tables for communication between front and backend block devices
#   Grant table support for mapping of remote domain memory improved.
# 
# linux-2.6.11-xen-sparse/arch/xen/Kconfig
#   2005/03/25 20:21:51+00:00 cwc22@centipede.cl.cam.ac.uk +10 -0
#   Support for using grant tables for communication between front and backend block devices
# 
diff -Nru a/linux-2.6.11-xen-sparse/arch/xen/Kconfig b/linux-2.6.11-xen-sparse/arch/xen/Kconfig
--- a/linux-2.6.11-xen-sparse/arch/xen/Kconfig	2005-03-25 20:24:42 +00:00
+++ b/linux-2.6.11-xen-sparse/arch/xen/Kconfig	2005-03-25 20:24:42 +00:00
@@ -61,6 +61,16 @@
           with the blktap.  This option will be removed as the block drivers are
           modified to use grant tables.
 
+config XEN_BLKDEV_GRANT
+        bool "Grant table substrate for block drivers (DANGEROUS)"
+        depends on !XEN_BLKDEV_TAP_BE
+        default n
+        help
+          This introduces the use of grant tables as a data exchange mechanism
+          between the frontend and backend block drivers. This currently
+          conflicts with the block tap, and should be considered untested
+          and likely to render your system unstable.
+
 config XEN_NETDEV_BACKEND
 	bool "Network-device backend driver"
 	depends on XEN_PHYSDEV_ACCESS
diff -Nru a/linux-2.6.11-xen-sparse/arch/xen/kernel/gnttab.c b/linux-2.6.11-xen-sparse/arch/xen/kernel/gnttab.c
--- a/linux-2.6.11-xen-sparse/arch/xen/kernel/gnttab.c	2005-03-25 20:24:42 +00:00
+++ b/linux-2.6.11-xen-sparse/arch/xen/kernel/gnttab.c	2005-03-25 20:24:42 +00:00
@@ -41,9 +41,14 @@
 EXPORT_SYMBOL(gnttab_query_foreign_access);
 EXPORT_SYMBOL(gnttab_grant_foreign_transfer);
 EXPORT_SYMBOL(gnttab_end_foreign_transfer);
+EXPORT_SYMBOL(gnttab_alloc_grant_references);
+EXPORT_SYMBOL(gnttab_free_grant_references);
+EXPORT_SYMBOL(gnttab_claim_grant_reference);
+EXPORT_SYMBOL(gnttab_release_grant_reference);
+EXPORT_SYMBOL(gnttab_grant_foreign_access_ref);
+EXPORT_SYMBOL(gnttab_grant_foreign_transfer_ref);
 
-#define NR_GRANT_REFS 512
-static grant_ref_t gnttab_free_list[NR_GRANT_REFS];
+static grant_ref_t gnttab_free_list[NR_GRANT_ENTRIES];
 static grant_ref_t gnttab_free_head;
 
 static grant_entry_t *shared;
@@ -61,7 +66,7 @@
     void)
 {
     grant_ref_t fh, nfh = gnttab_free_head;
-    do { if ( unlikely((fh = nfh) == NR_GRANT_REFS) ) return -1; }
+    do { if ( unlikely((fh = nfh) == NR_GRANT_ENTRIES) ) return -1; }
     while ( unlikely((nfh = cmpxchg(&gnttab_free_head, fh,
                                     gnttab_free_list[fh])) != fh) );
     return fh;
@@ -97,6 +102,17 @@
     return ref;
 }
 
+void
+gnttab_grant_foreign_access_ref(
+    grant_ref_t ref, domid_t domid, unsigned long frame, int readonly)
+{
+    shared[ref].frame = frame;
+    shared[ref].domid = domid;
+    wmb();
+    shared[ref].flags = GTF_permit_access | (readonly ? GTF_readonly : 0);
+}
+
+
 int
 gnttab_query_foreign_access( grant_ref_t ref )
 {
@@ -124,14 +140,14 @@
 
 int
 gnttab_grant_foreign_transfer(
-    domid_t domid)
+    domid_t domid, unsigned long pfn )
 {
     int ref;
 
     if ( unlikely((ref = get_free_entry()) == -1) )
         return -ENOSPC;
 
-    shared[ref].frame = 0;
+    shared[ref].frame = pfn;
     shared[ref].domid = domid;
     wmb();
     shared[ref].flags = GTF_accept_transfer;
@@ -139,6 +155,16 @@
     return ref;
 }
 
+void
+gnttab_grant_foreign_transfer_ref(
+    grant_ref_t ref, domid_t domid, unsigned long pfn )
+{
+    shared[ref].frame = pfn;
+    shared[ref].domid = domid;
+    wmb();
+    shared[ref].flags = GTF_accept_transfer;
+}
+
 unsigned long
 gnttab_end_foreign_transfer(
     grant_ref_t ref)
@@ -163,6 +189,62 @@
     return frame;
 }
 
+void
+gnttab_free_grant_references( u16 count, grant_ref_t head )
+{
+    /* TODO: O(N)...? */
+    grant_ref_t to_die = 0, next = head;
+    int i;
+
+    for ( i = 0; i < count; i++ )
+    {
+        to_die = next;
+        next = gnttab_free_list[next];
+        put_free_entry( to_die );
+    }
+}
+
+int
+gnttab_alloc_grant_references( u16 count,
+                               grant_ref_t *head,
+                               grant_ref_t *terminal )
+{
+    int i;
+    grant_ref_t h = gnttab_free_head;
+
+    for ( i = 0; i < count; i++ )
+        if ( unlikely(get_free_entry() == -1) )
+            goto not_enough_refs;
+
+    *head = h;
+    *terminal = gnttab_free_head;
+
+    return 0;
+
+not_enough_refs:
+    gnttab_free_head = h;
+    return -ENOSPC;
+}
+
+int
+gnttab_claim_grant_reference( grant_ref_t *private_head,
+                              grant_ref_t  terminal )
+{
+    grant_ref_t g;
+    if ( unlikely((g = *private_head) == terminal) )
+        return -ENOSPC;
+    *private_head = gnttab_free_list[g];
+    return g;
+}
+
+void
+gnttab_release_grant_reference( grant_ref_t *private_head,
+                                grant_ref_t  release )
+{
+    gnttab_free_list[release] = *private_head;
+    *private_head = release;
+}
+
 static int grant_ioctl(struct inode *inode, struct file *file,
                        unsigned int cmd, unsigned long data)
 {
@@ -194,7 +274,7 @@
         TRAP_INSTR "; "
         "popl %%edi; popl %%esi; popl %%edx; popl %%ecx; popl %%ebx"
         : "=a" (ret) : "0" (&hypercall) : "memory" );
-                                                                                    
+
     return ret;
 }
 
@@ -212,7 +292,14 @@
     gt = (grant_entry_t *)shared;
     len = 0;
 
-    for ( i = 0; i < NR_GRANT_REFS; i++ )
+    for ( i = 0; i < NR_GRANT_ENTRIES; i++ )
+        /* TODO: safety catch here until this can handle >PAGE_SIZE output */
+        if (len > (PAGE_SIZE - 200))
+        {
+            len += sprintf( page + len, "Truncated.\n");
+            break;
+        }
+
         if ( gt[i].flags )
             len += sprintf( page + len,
                     "Grant: ref (0x%x) flags (0x%hx) dom (0x%hx) frame (0x%x)\n", 
@@ -235,22 +322,25 @@
 static int __init gnttab_init(void)
 {
     gnttab_setup_table_t setup;
-    unsigned long        frame;
+    unsigned long        frames[NR_GRANT_FRAMES];
     int                  i;
 
-    for ( i = 0; i < NR_GRANT_REFS; i++ )
-        gnttab_free_list[i] = i + 1;
-
     setup.dom        = DOMID_SELF;
-    setup.nr_frames  = 1;
-    setup.frame_list = &frame;
+    setup.nr_frames  = NR_GRANT_FRAMES;
+    setup.frame_list = frames;
+
     if ( HYPERVISOR_grant_table_op(GNTTABOP_setup_table, &setup, 1) != 0 )
         BUG();
     if ( setup.status != 0 )
         BUG();
 
-    set_fixmap_ma(FIX_GNTTAB, frame << PAGE_SHIFT);
-    shared = (grant_entry_t *)fix_to_virt(FIX_GNTTAB);
+    for ( i = 0; i < NR_GRANT_FRAMES; i++ )
+        set_fixmap_ma(FIX_GNTTAB_END - i, frames[i] << PAGE_SHIFT);
+
+    shared = (grant_entry_t *)fix_to_virt(FIX_GNTTAB_END);
+
+    for ( i = 0; i < NR_GRANT_ENTRIES; i++ )
+        gnttab_free_list[i] = i + 1;
 
     /*
      *  /proc/xen/grant : used by libxc to access grant tables
@@ -269,6 +359,7 @@
     grant_pde->read_proc  = &grant_read;
     grant_pde->write_proc = &grant_write;
 
+    printk("Grant table initialized\n");
     return 0;
 }
 
diff -Nru a/linux-2.6.11-xen-sparse/drivers/xen/blkback/blkback.c b/linux-2.6.11-xen-sparse/drivers/xen/blkback/blkback.c
--- a/linux-2.6.11-xen-sparse/drivers/xen/blkback/blkback.c	2005-03-25 20:24:42 +00:00
+++ b/linux-2.6.11-xen-sparse/drivers/xen/blkback/blkback.c	2005-03-25 20:24:42 +00:00
@@ -8,10 +8,14 @@
  *  arch/xen/drivers/blkif/frontend
  * 
  * Copyright (c) 2003-2004, Keir Fraser & Steve Hand
+ * Copyright (c) 2005, Christopher Clark
  */
 
 #include "common.h"
 #include <asm-xen/evtchn.h>
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+#include <asm-xen/xen-public/grant_table.h>
+#endif
 
 /*
  * These are rather arbitrary. They are fairly large because adjacent requests
@@ -69,6 +73,17 @@
 static kmem_cache_t *buffer_head_cachep;
 #endif
 
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+/* When using grant tables to map a frame for device access then the
+ * handle returned must be used to unmap the frame. This is needed to
+ * drop the ref count on the frame.
+ */
+static u16 pending_grant_handles[MMAP_PAGES_PER_REQUEST * MAX_PENDING_REQS];
+#define pending_handle(_idx, _i) \
+    (pending_grant_handles[((_idx) * MMAP_PAGES_PER_REQUEST) + (_i)])
+#define BLKBACK_INVALID_HANDLE (0xFFFF)
+#endif
+
 #ifdef CONFIG_XEN_BLKDEV_TAP_BE
 /*
  * If the tap driver is used, we may get pages belonging to either the tap
@@ -89,6 +104,26 @@
 
 static void fast_flush_area(int idx, int nr_pages)
 {
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+    gnttab_op_t       aop[MMAP_PAGES_PER_REQUEST];
+    unsigned int      i, invcount = 0;
+    u16               handle;
+
+    for ( i = 0; i < nr_pages; i++ )
+    {
+        if ( BLKBACK_INVALID_HANDLE != ( handle = pending_handle(idx, i) ) )
+        {
+            aop[invcount].u.unmap_grant_ref.host_virt_addr = MMAP_VADDR(idx, i);
+            aop[invcount].u.unmap_grant_ref.dev_bus_addr   = 0;
+            aop[invcount].u.unmap_grant_ref.handle         = handle;
+            pending_handle(idx, i) = BLKBACK_INVALID_HANDLE;
+            invcount++;
+        }
+    }
+    if ( unlikely(HYPERVISOR_grant_table_op(
+                    GNTTABOP_unmap_grant_ref, aop, invcount)))
+        BUG();
+#else
     multicall_entry_t mcl[MMAP_PAGES_PER_REQUEST];
     int               i;
 
@@ -103,6 +138,7 @@
     mcl[nr_pages-1].args[2] = UVMF_FLUSH_TLB;
     if ( unlikely(HYPERVISOR_multicall(mcl, nr_pages) != 0) )
         BUG();
+#endif
 }
 
 
@@ -334,6 +370,23 @@
          (blkif_last_sect(req->frame_and_sects[0]) != 7) )
         goto out;
 
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+    {
+        gnttab_op_t     op;
+
+        op.u.map_grant_ref.host_virt_addr = MMAP_VADDR(pending_idx, 0);
+        op.u.map_grant_ref.flags = GNTMAP_host_map;
+        op.u.map_grant_ref.ref = blkif_gref_from_fas(req->frame_and_sects[0]);
+        op.u.map_grant_ref.dom = blkif->domid;
+
+        if ( unlikely(HYPERVISOR_grant_table_op(
+                        GNTTABOP_map_grant_ref, &op, 1)))
+            BUG();
+
+        pending_handle(pending_idx, 0) = op.u.map_grant_ref.handle;
+    }
+#else /* else CONFIG_XEN_BLKDEV_GRANT */
+
 #ifdef CONFIG_XEN_BLKDEV_TAP_BE
     /* Grab the real frontend out of the probe message. */
     if (req->frame_and_sects[1] == BLKTAP_COOKIE) 
@@ -356,7 +409,8 @@
         
         goto out;
 #endif
-    
+#endif /* endif CONFIG_XEN_BLKDEV_GRANT */
+   
     rsp = vbd_probe(blkif, (vdisk_t *)MMAP_VADDR(pending_idx, 0), 
                     PAGE_SIZE / sizeof(vdisk_t));
 
@@ -373,8 +427,12 @@
     unsigned long buffer, fas;
     int i, tot_sects, pending_idx = pending_ring[MASK_PEND_IDX(pending_cons)];
     pending_req_t *pending_req;
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+    gnttab_op_t       aop[MMAP_PAGES_PER_REQUEST];
+#else
     unsigned long  remap_prot;
     multicall_entry_t mcl[MMAP_PAGES_PER_REQUEST];
+#endif
 
     /* We map virtual scatter/gather segments to physical segments. */
     int new_segs, nr_psegs = 0;
@@ -388,6 +446,54 @@
         goto bad_descriptor;
     }
 
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+    /* cwc22: FIXME: There is a BUG here. TODO: XXX
+     * The frames are mapped via grant tables, and _then_ the
+     * extents are determined, leading to possibly more physical
+     * scatter/gather segments than virtual segments.
+     * The front end needs to generate all required grant refs
+     * so that all required buffers can be mapped.
+     */
+
+    /* loop mapping all granted frames */
+    for ( i = 0; i < req->nr_segments; i++ )
+    {
+        fas      = req->frame_and_sects[i];
+        nr_sects = blkif_last_sect(fas) - blkif_first_sect(fas) + 1;
+
+        if ( nr_sects <= 0 )
+            goto bad_descriptor;
+
+        aop[i].u.map_grant_ref.host_virt_addr = MMAP_VADDR(pending_idx, i);
+
+        aop[i].u.map_grant_ref.dom = blkif->domid;
+        aop[i].u.map_grant_ref.ref = blkif_gref_from_fas(fas);
+        aop[i].u.map_grant_ref.flags = ( GNTMAP_host_map   |
+                                       ( ( operation == READ ) ?
+                                             0 : GNTMAP_readonly ) );
+    }
+
+    if ( unlikely(HYPERVISOR_grant_table_op(
+                    GNTTABOP_map_grant_ref, aop, req->nr_segments)))
+        BUG();
+
+    for ( i = 0; i < req->nr_segments; i++ )
+    {
+        if ( unlikely(aop[i].u.map_grant_ref.dev_bus_addr == 0) )
+        {
+            DPRINTK("invalid buffer -- could not remap it\n");
+            fast_flush_area(pending_idx, req->nr_segments);
+            goto bad_descriptor;
+        }
+
+        phys_to_machine_mapping[__pa(MMAP_VADDR(pending_idx, i))>>PAGE_SHIFT] =
+            FOREIGN_FRAME(aop[i].u.map_grant_ref.dev_bus_addr);
+
+        pending_handle(pending_idx, i) = aop[i].u.map_grant_ref.handle;
+
+    }
+#endif
+
     /*
      * Check each address/size pair is sane, and convert into a
      * physical device and block offset. Note that if the offset and size
@@ -397,12 +503,18 @@
     for ( i = tot_sects = 0; i < req->nr_segments; i++, tot_sects += nr_sects )
     {
         fas      = req->frame_and_sects[i];
-        buffer   = (fas & PAGE_MASK) | (blkif_first_sect(fas) << 9);
         nr_sects = blkif_last_sect(fas) - blkif_first_sect(fas) + 1;
 
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+        buffer   = (aop[i].u.map_grant_ref.dev_bus_addr << PAGE_SHIFT) |
+                   (blkif_first_sect(fas) << 9);
+#else
         if ( nr_sects <= 0 )
             goto bad_descriptor;
 
+        buffer   = (fas & PAGE_MASK) | (blkif_first_sect(fas) << 9);
+#endif
+
         phys_seg[nr_psegs].dev           = req->device;
         phys_seg[nr_psegs].sector_number = req->sector_number + tot_sects;
         phys_seg[nr_psegs].buffer        = buffer;
@@ -419,6 +531,13 @@
                     req->device); 
             goto bad_descriptor;
         }
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+        if ( new_segs > 1 )
+        {
+            DPRINTK("Block translation: bad: virt to phys seg overrun\n");
+            goto bad_descriptor;
+        }
+#endif
   
         nr_psegs += new_segs;
         ASSERT(nr_psegs <= (BLKIF_MAX_SEGMENTS_PER_REQUEST+1));
@@ -428,6 +547,7 @@
     if ( unlikely(nr_psegs == 0) )
         goto bad_descriptor;
 
+#ifndef CONFIG_XEN_BLKDEV_GRANT
     if ( operation == READ )
         remap_prot = _PAGE_PRESENT|_PAGE_DIRTY|_PAGE_ACCESSED|_PAGE_RW;
     else
@@ -460,6 +580,7 @@
             goto bad_descriptor;
         }
     }
+#endif /* end ifndef CONFIG_XEN_BLKDEV_GRANT */
 
     pending_req = &pending_reqs[pending_idx];
     pending_req->blkif     = blkif;
@@ -612,9 +733,15 @@
 
     blkif_ctrlif_init();
     
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+    memset( pending_grant_handles,  BLKBACK_INVALID_HANDLE,
+            MMAP_PAGES_PER_REQUEST * MAX_PENDING_REQS );
+#endif
+
 #ifdef CONFIG_XEN_BLKDEV_TAP_BE
     printk(KERN_ALERT "NOTE: Blkif backend is running with tap support on!\n");
 #endif
+
     return 0;
 }
 
diff -Nru a/linux-2.6.11-xen-sparse/drivers/xen/blkfront/blkfront.c b/linux-2.6.11-xen-sparse/drivers/xen/blkfront/blkfront.c
--- a/linux-2.6.11-xen-sparse/drivers/xen/blkfront/blkfront.c	2005-03-25 20:24:42 +00:00
+++ b/linux-2.6.11-xen-sparse/drivers/xen/blkfront/blkfront.c	2005-03-25 20:24:42 +00:00
@@ -7,6 +7,7 @@
  * Modifications by Mark A. Williamson are (c) Intel Research Cambridge
  * Copyright (c) 2004, Christian Limpach
  * Copyright (c) 2004, Andrew Warfield
+ * Copyright (c) 2005, Christopher Clark
  * 
  * This file may be distributed separately from the Linux kernel, or
  * incorporated into other software packages, subject to the following license:
@@ -30,6 +31,14 @@
  * IN THE SOFTWARE.
  */
 
+#if 1
+#define ASSERT(_p) \
+    if ( !(_p) ) { printk("Assertion '%s' failed, line %d, file %s\n", #_p , \
+    __LINE__, __FILE__); *(int*)0=0; }
+#else
+#define ASSERT(_p)
+#endif
+
 #include <linux/version.h>
 
 #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0)
@@ -46,6 +55,10 @@
 #include <scsi/scsi.h>
 #include <asm-xen/ctrl_if.h>
 #include <asm-xen/evtchn.h>
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+#include <asm-xen/xen-public/grant_table.h>
+#include <asm-xen/gnttab.h>
+#endif
 
 typedef unsigned char byte; /* from linux/ide.h */
 
@@ -74,6 +87,13 @@
 
 static blkif_front_ring_t blk_ring;
 
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+static domid_t rdomid = 0;
+static grant_ref_t gref_head, gref_terminal;
+#define MAXIMUM_OUTSTANDING_BLOCK_REQS \
+    (BLKIF_MAX_SEGMENTS_PER_REQUEST * BLKIF_RING_SIZE)
+#endif
+
 unsigned long rec_ring_free;
 blkif_request_t rec_ring[RING_SIZE(&blk_ring)];
 
@@ -129,7 +149,11 @@
     xreq->sector_number = req->sector_number;
 
     for ( i = 0; i < req->nr_segments; i++ )
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+        xreq->frame_and_sects[i] = req->frame_and_sects[i];
+#else
         xreq->frame_and_sects[i] = machine_to_phys(req->frame_and_sects[i]);
+#endif
 }
 
 static inline void translate_req_to_mfn(blkif_request_t *xreq,
@@ -144,7 +168,11 @@
     xreq->sector_number = req->sector_number;
 
     for ( i = 0; i < req->nr_segments; i++ )
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+        xreq->frame_and_sects[i] = req->frame_and_sects[i];
+#else
         xreq->frame_and_sects[i] = phys_to_machine(req->frame_and_sects[i]);
+#endif
 }
 
 
@@ -333,6 +361,9 @@
     int idx;
     unsigned long id;
     unsigned int fsect, lsect;
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+    int ref;
+#endif
 
     if ( unlikely(blkif_state != BLKIF_STATE_CONNECTED) )
         return 1;
@@ -358,8 +389,23 @@
             buffer_ma = page_to_phys(bvec->bv_page);
             fsect = bvec->bv_offset >> 9;
             lsect = fsect + (bvec->bv_len >> 9) - 1;
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+            /* install a grant reference. */
+            ref = gnttab_claim_grant_reference(&gref_head, gref_terminal);
+            ASSERT( ref >= 0 );
+
+            gnttab_grant_foreign_access_ref(
+                        ref,
+                        rdomid,
+                        buffer_ma >> PAGE_SHIFT,
+                        rq_data_dir(req) );
+
+            ring_req->frame_and_sects[ring_req->nr_segments++] =
+                (((u32) ref) << 16) | (fsect << 3) | lsect;
+#else
             ring_req->frame_and_sects[ring_req->nr_segments++] =
                 buffer_ma | (fsect << 3) | lsect;
+#endif
         }
     }
 
@@ -779,6 +825,9 @@
     blkif_request_t    *req;
     struct buffer_head *bh;
     unsigned int        fsect, lsect;
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+    int ref;
+#endif
 
     fsect = (buffer_ma & ~PAGE_MASK) >> 9;
     lsect = fsect + nr_sectors - 1;
@@ -826,11 +875,25 @@
      
             bh->b_reqnext = (struct buffer_head *)rec_ring[req->id].id;
      
-
             rec_ring[req->id].id = id;
+
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+            /* install a grant reference. */
+            ref = gnttab_claim_grant_reference(&gref_head, gref_terminal);
+            ASSERT( ref >= 0 );
+
+            gnttab_grant_foreign_access_ref(
+                        ref,
+                        rdomid,
+                        buffer_ma >> PAGE_SHIFT,
+                        rq_data_dir(req) );
 
-            req->frame_and_sects[req->nr_segments] = 
-                buffer_ma | (fsect<<3) | lsect;
+            req->frame_and_sects[req->nr_segments] =
+                (((u32) ref ) << 16) | (fsect << 3) | lsect;
+#else
+            req->frame_and_sects[req->nr_segments] =
+                buffer_ma | (fsect << 3) | lsect;
+#endif
             if ( ++req->nr_segments < BLKIF_MAX_SEGMENTS_PER_REQUEST )
                 sg_next_sect += nr_sectors;
             else
@@ -868,7 +931,21 @@
     req->sector_number = (blkif_sector_t)sector_number;
     req->device        = device; 
     req->nr_segments   = 1;
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+    /* install a grant reference. */
+    ref = gnttab_claim_grant_reference(&gref_head, gref_terminal);
+    ASSERT( ref >= 0 );
+
+    gnttab_grant_foreign_access_ref(
+                ref,
+                rdomid,
+                buffer_ma >> PAGE_SHIFT,
+                rq_data_dir(req) );
+
+    req->frame_and_sects[0] = (((u32) ref)<<16)  | (fsect<<3) | lsect;
+#else
     req->frame_and_sects[0] = buffer_ma | (fsect<<3) | lsect;
+#endif
 
     /* Keep a private copy so we can reissue requests when recovering. */    
     translate_req_to_pfn(&rec_ring[xid], req );
@@ -1026,6 +1103,20 @@
 
 /*****************************  COMMON CODE  *******************************/
 
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+void blkif_control_probe_send(blkif_request_t *req, blkif_response_t *rsp,
+                              unsigned long address)
+{
+    int ref = gnttab_claim_grant_reference(&gref_head, gref_terminal);
+    ASSERT( ref >= 0 );
+
+    gnttab_grant_foreign_access_ref( ref, rdomid, address >> PAGE_SHIFT, 0 );
+
+    req->frame_and_sects[0] = (((u32) ref) << 16)  | 7;
+
+    blkif_control_send(req, rsp);
+}
+#endif
 
 void blkif_control_send(blkif_request_t *req, blkif_response_t *rsp)
 {
@@ -1206,6 +1297,9 @@
 
     blkif_evtchn = status->evtchn;
     blkif_irq    = bind_evtchn_to_irq(blkif_evtchn);
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+    rdomid       = status->domid;
+#endif
 
     err = request_irq(blkif_irq, blkif_int, SA_SAMPLE_RANDOM, "blkif", NULL);
     if ( err )
@@ -1367,7 +1461,13 @@
 int __init xlblk_init(void)
 {
     int i;
-    
+
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+    if ( 0 > gnttab_alloc_grant_references( MAXIMUM_OUTSTANDING_BLOCK_REQS,
+                                            &gref_head, &gref_terminal ))
+        return 1;
+#endif
+
     if ( (xen_start_info.flags & SIF_INITDOMAIN) ||
          (xen_start_info.flags & SIF_BLK_BE_DOMAIN) )
         return 0;
@@ -1396,12 +1496,19 @@
     send_driver_status(1);
 }
 
-/* XXXXX THIS IS A TEMPORARY FUNCTION UNTIL WE GET GRANT TABLES */
-
 void blkif_completion(blkif_request_t *req)
 {
     int i;
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+    grant_ref_t gref;
 
+    for ( i = 0; i < req->nr_segments; i++ )
+    {
+        gref = blkif_gref_from_fas(req->frame_and_sects[i]);
+        gnttab_release_grant_reference(&gref_head, gref);
+    }
+#else
+    /* This is a hack to get the dirty logging bits set */
     switch ( req->operation )
     {
     case BLKIF_OP_READ:
@@ -1413,5 +1520,5 @@
         }
         break;
     }
-    
+#endif
 }
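
With CONFIG_XEN_BLKDEV_GRANT the frontend no longer puts a machine address in
frame_and_sects[]; each entry carries a grant reference in the upper 16 bits
and the first/last sector numbers in the low six bits. The following is a
minimal standalone sketch of that packing, not code from the patch. The helper
names (fas_pack, fas_gref, fas_first_sect, fas_last_sect) are hypothetical,
and the 3-bit sector fields are an assumption based on eight 512-byte sectors
per 4K page.

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only; not part of the patch. */

/* Pack a grant reference and a first/last sector pair into one entry. */
static uint32_t fas_pack(uint16_t ref, unsigned fsect, unsigned lsect)
{
    return ((uint32_t)ref << 16) | ((fsect & 7u) << 3) | (lsect & 7u);
}

/* Recover the grant reference (upper 16 bits). */
static uint16_t fas_gref(uint32_t fas)
{
    return (uint16_t)(fas >> 16);
}

/* Recover the first sector within the page (bits 3..5). */
static unsigned fas_first_sect(uint32_t fas)
{
    return (fas >> 3) & 7u;
}

/* Recover the last sector within the page (bits 0..2). */
static unsigned fas_last_sect(uint32_t fas)
{
    return fas & 7u;
}
```

The probe request in the patch ORs in the constant 7, which under this layout
means "first sector 0, last sector 7", i.e. the whole page.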
diff -Nru a/linux-2.6.11-xen-sparse/drivers/xen/blkfront/block.h b/linux-2.6.11-xen-sparse/drivers/xen/blkfront/block.h
--- a/linux-2.6.11-xen-sparse/drivers/xen/blkfront/block.h	2005-03-25 20:24:42 +00:00
+++ b/linux-2.6.11-xen-sparse/drivers/xen/blkfront/block.h	2005-03-25 20:24:42 +00:00
@@ -104,6 +104,10 @@
 extern int blkif_check(dev_t dev);
 extern int blkif_revalidate(dev_t dev);
 extern void blkif_control_send(blkif_request_t *req, blkif_response_t *rsp);
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+extern void blkif_control_probe_send(
+    blkif_request_t *req, blkif_response_t *rsp, unsigned long address);
+#endif
 extern void do_blkif_request (request_queue_t *rq); 
 
 extern void xlvbd_update_vbds(void);
diff -Nru a/linux-2.6.11-xen-sparse/drivers/xen/blkfront/vbd.c b/linux-2.6.11-xen-sparse/drivers/xen/blkfront/vbd.c
--- a/linux-2.6.11-xen-sparse/drivers/xen/blkfront/vbd.c	2005-03-25 20:24:42 +00:00
+++ b/linux-2.6.11-xen-sparse/drivers/xen/blkfront/vbd.c	2005-03-25 20:24:42 +00:00
@@ -112,9 +112,15 @@
     memset(&req, 0, sizeof(req));
     req.operation   = BLKIF_OP_PROBE;
     req.nr_segments = 1;
+
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+    blkif_control_probe_send(&req, &rsp,
+                             (unsigned long)(virt_to_machine(buf)));
+#else
     req.frame_and_sects[0] = virt_to_machine(buf) | 7;
 
     blkif_control_send(&req, &rsp);
+#endif
 
     if ( rsp.status <= 0 )
     {
diff -Nru a/linux-2.6.11-xen-sparse/include/asm-xen/asm-i386/fixmap.h b/linux-2.6.11-xen-sparse/include/asm-xen/asm-i386/fixmap.h
--- a/linux-2.6.11-xen-sparse/include/asm-xen/asm-i386/fixmap.h	2005-03-25 20:24:42 +00:00
+++ b/linux-2.6.11-xen-sparse/include/asm-xen/asm-i386/fixmap.h	2005-03-25 20:24:42 +00:00
@@ -27,6 +27,7 @@
 #include <asm/acpi.h>
 #include <asm/apicdef.h>
 #include <asm/page.h>
+#include <asm-xen/gnttab.h>
 #ifdef CONFIG_HIGHMEM
 #include <linux/threads.h>
 #include <asm/kmap_types.h>
@@ -84,7 +85,8 @@
 	FIX_PCIE_MCFG,
 #endif
 	FIX_SHARED_INFO,
-	FIX_GNTTAB,
+	FIX_GNTTAB_BEGIN,
+	FIX_GNTTAB_END = FIX_GNTTAB_BEGIN + NR_GRANT_FRAMES - 1,
 #ifdef CONFIG_XEN_PHYSDEV_ACCESS
 #define NR_FIX_ISAMAPS	256
 	FIX_ISAMAP_END,
diff -Nru a/linux-2.6.11-xen-sparse/include/asm-xen/gnttab.h b/linux-2.6.11-xen-sparse/include/asm-xen/gnttab.h
--- a/linux-2.6.11-xen-sparse/include/asm-xen/gnttab.h	2005-03-25 20:24:42 +00:00
+++ b/linux-2.6.11-xen-sparse/include/asm-xen/gnttab.h	2005-03-25 20:24:42 +00:00
@@ -7,6 +7,7 @@
  * (i.e., mechanisms for both sender and recipient of grant references)
  * 
  * Copyright (c) 2004, K A Fraser
+ * Copyright (c) 2005, Christopher Clark
  */
 
 #ifndef __ASM_GNTTAB_H__
@@ -16,6 +17,10 @@
 #include <asm-xen/hypervisor.h>
 #include <asm-xen/xen-public/grant_table.h>
 
+/* NR_GRANT_FRAMES must be less than or equal to that configured in Xen */
+#define NR_GRANT_FRAMES 4
+#define NR_GRANT_ENTRIES (NR_GRANT_FRAMES * PAGE_SIZE / sizeof(grant_entry_t))
+
 int
 gnttab_grant_foreign_access(
     domid_t domid, unsigned long frame, int readonly);
@@ -26,7 +31,7 @@
 
 int
 gnttab_grant_foreign_transfer(
-    domid_t domid);
+    domid_t domid, unsigned long pfn);
 
 unsigned long
 gnttab_end_foreign_transfer(
@@ -35,5 +40,33 @@
 int
 gnttab_query_foreign_access( 
     grant_ref_t ref );
+
+/*
+ * operations on reserved batches of grant references
+ */
+int
+gnttab_alloc_grant_references(
+    u16 count, grant_ref_t *pprivate_head, grant_ref_t *private_terminal );
+
+void
+gnttab_free_grant_references(
+    u16 count, grant_ref_t private_head );
+
+int
+gnttab_claim_grant_reference(
+    grant_ref_t *pprivate_head, grant_ref_t terminal );
+
+void
+gnttab_release_grant_reference(
+    grant_ref_t *private_head, grant_ref_t release );
+
+void
+gnttab_grant_foreign_access_ref(
+    grant_ref_t ref, domid_t domid, unsigned long frame, int readonly);
+
+void
+gnttab_grant_foreign_transfer_ref(
+    grant_ref_t ref, domid_t domid, unsigned long pfn);
+
 
 #endif /* __ASM_GNTTAB_H__ */
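
The "reserved batch" prototypes above suggest a private free list that a
driver allocates once and then claims from and releases to per request. The
implementation is not part of this excerpt, so what follows is only a sketch
of one way such a list could behave, assuming a simple next-pointer array.
All names here (gref_t, NREFS, TERMINAL, next_ref, pool_alloc, pool_claim,
pool_release) are hypothetical and are not the gnttab.c code.

```c
#include <stdint.h>
#include <errno.h>

typedef uint16_t gref_t;              /* stand-in for grant_ref_t */

#define NREFS    16                   /* hypothetical pool size */
#define TERMINAL ((gref_t)0xFFFF)     /* hypothetical end-of-list marker */

static gref_t next_ref[NREFS];        /* next_ref[r] = entry following r */

/* Build a private list 0 -> 1 -> ... -> count-1 -> TERMINAL. */
static int pool_alloc(uint16_t count, gref_t *head, gref_t *terminal)
{
    uint16_t i;

    if (count == 0 || count > NREFS)
        return -ENOSPC;
    for (i = 0; i < count; i++)
        next_ref[i] = (i + 1 < count) ? (gref_t)(i + 1) : TERMINAL;
    *head = 0;
    *terminal = TERMINAL;
    return 0;
}

/* Pop the head of the list; -ENOSPC when the list is exhausted. */
static int pool_claim(gref_t *head, gref_t terminal)
{
    gref_t r = *head;

    if (r == terminal)
        return -ENOSPC;
    *head = next_ref[r];
    return r;
}

/* Push a reference back onto the front of the list. */
static void pool_release(gref_t *head, gref_t ref)
{
    next_ref[ref] = *head;
    *head = ref;
}
```

In this model a claim either returns a non-negative reference or -ENOSPC,
which matches the ASSERT( ref >= 0 ) checks the block frontend applies after
each claim.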
diff -Nru a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
--- a/xen/arch/x86/mm.c	2005-03-25 20:24:42 +00:00
+++ b/xen/arch/x86/mm.c	2005-03-25 20:24:42 +00:00
@@ -1527,7 +1527,7 @@
         spin_unlock(&e->page_alloc_lock);
 
         /* Transfer is all done: tell the guest about its new page frame. */
-        gnttab_notify_transfer(e, gntref, pfn);
+        gnttab_notify_transfer(e, d, gntref, pfn);
         
         put_domain(e);
         break;
diff -Nru a/xen/common/grant_table.c b/xen/common/grant_table.c
--- a/xen/common/grant_table.c	2005-03-25 20:24:42 +00:00
+++ b/xen/common/grant_table.c	2005-03-25 20:24:42 +00:00
@@ -26,14 +26,14 @@
 
 #include <xen/config.h>
 #include <xen/sched.h>
-#include <asm-x86/mm.h>
-#include <asm-x86/shadow.h>
+#include <asm/mm.h>
+#include <asm/shadow.h>
 
-#define PIN_FAIL(_rc, _f, _a...)   \
+#define PIN_FAIL(_lbl, _rc, _f, _a...)   \
     do {                           \
         DPRINTK( _f, ## _a );      \
         rc = (_rc);                \
-        goto fail;                 \
+        goto _lbl;                 \
     } while ( 0 )
 
 static inline int
@@ -58,23 +58,38 @@
 }
 
 static int
-__gnttab_map_grant_ref(
-    gnttab_map_grant_ref_t *uop,
-    unsigned long *va)
+__gnttab_activate_grant_ref(
+    struct domain          *mapping_d,          /* IN */
+    struct exec_domain     *mapping_ed,
+    struct domain          *granting_d,
+    grant_ref_t             ref,
+    u16                     dev_hst_ro_flags,
+    unsigned long           host_virt_addr,
+    unsigned long          *pframe )            /* OUT */
 {
-    domid_t               dom, sdom;
-    grant_ref_t           ref;
-    struct domain        *ld, *rd;
-    struct exec_domain   *led;
-    u16                   flags, sflags;
-    int                   handle;
+    domid_t               sdom;
+    u16                   sflags;
     active_grant_entry_t *act;
     grant_entry_t        *sha;
-    s16                   rc = 0;
-    unsigned long         frame = 0, host_virt_addr;
+    s16                   rc = 1;
+    unsigned long         frame = 0;
+    int                   retries = 0;
 
-    /* Returns 0 if TLB flush / invalidate required by caller.
-     * va will indicate the address to be invalidated. */
+    /*
+     * Objectives of this function:
+     * . Make the record ( granting_d, ref ) active, if not already.
+     * . Update shared grant entry of owner, indicating frame is mapped.
+     * . Increment the owner act->pin reference counts.
+     * . get_page on shared frame if new mapping.
+     * . get_page_type if this is first RW mapping of frame.
+     * . Add PTE to virtual address space of mapping_d, if necessary.
+     * Returns:
+     * .  -ve: error
+     * .    1: ok
+     * .    0: ok and TLB invalidate of host_virt_addr needed.
+     *
+     * On success, *pframe contains mfn.
+     */
 
     /*
      * We bound the number of times we retry CMPXCHG on memory locations that
@@ -84,61 +99,11 @@
      * the guest to race our updates (e.g., to change the GTF_readonly flag),
      * so we allow a few retries before failing.
      */
-    int            retries = 0;
-
-    led = current;
-    ld = led->domain;
 
-    /* Bitwise-OR avoids short-circuiting which screws control flow. */
-    if ( unlikely(__get_user(dom, &uop->dom) |
-                  __get_user(ref, &uop->ref) |
-                  __get_user(host_virt_addr, &uop->host_virt_addr) |
-                  __get_user(flags, &uop->flags)) )
-    {
-        DPRINTK("Fault while reading gnttab_map_grant_ref_t.\n");
-        return -EFAULT; /* don't set status */
-    }
-
-    if ( ((host_virt_addr != 0) || (flags & GNTMAP_host_map) ) &&
-         unlikely(!__addr_ok(host_virt_addr)))
-    {
-        DPRINTK("Bad virtual address (%x) or flags (%x).\n", host_virt_addr, flags);
-        (void)__put_user(GNTST_bad_virt_addr, &uop->handle);
-        return GNTST_bad_gntref;
-    }
+    act = &granting_d->grant_table->active[ref];
+    sha = &granting_d->grant_table->shared[ref];
 
-    if ( unlikely(ref >= NR_GRANT_ENTRIES) ||
-         unlikely((flags & (GNTMAP_device_map|GNTMAP_host_map)) == 0) )
-    {
-        DPRINTK("Bad ref (%d) or flags (%x).\n", ref, flags);
-        (void)__put_user(GNTST_bad_gntref, &uop->handle);
-        return GNTST_bad_gntref;
-    }
-
-    if ( unlikely((rd = find_domain_by_id(dom)) == NULL) ||
-         unlikely(ld == rd) )
-    {
-        if ( rd != NULL )
-            put_domain(rd);
-        DPRINTK("Could not find domain %d\n", dom);
-        (void)__put_user(GNTST_bad_domain, &uop->handle);
-        return GNTST_bad_domain;
-    }
-
-    if ( unlikely((handle = get_maptrack_handle(ld->grant_table)) == -1) )
-    {
-        put_domain(rd);
-        DPRINTK("No more map handles available\n");
-        (void)__put_user(GNTST_no_device_space, &uop->handle);
-        return GNTST_no_device_space;
-    }
-    DPRINTK("Mapping grant ref (%hu) for domain (%hu) with flags (%x)\n",
-            ref, dom, flags);
-
-    act = &rd->grant_table->active[ref];
-    sha = &rd->grant_table->shared[ref];
-
-    spin_lock(&rd->grant_table->lock);
+    spin_lock(&granting_d->grant_table->lock);
 
     if ( act->pin == 0 )
     {
@@ -152,21 +117,21 @@
             u32 scombo, prev_scombo, new_scombo;
 
             if ( unlikely((sflags & GTF_type_mask) != GTF_permit_access) ||
-                 unlikely(sdom != ld->id) )
-                PIN_FAIL(GNTST_general_error,
+                 unlikely(sdom != mapping_d->id) )
+                PIN_FAIL(unlock_out, GNTST_general_error,
                          "Bad flags (%x) or dom (%d). (NB. expected dom %d)\n",
-                        sflags, sdom, ld->id);
+                        sflags, sdom, mapping_d->id);
 
             /* Merge two 16-bit values into a 32-bit combined update. */
             /* NB. Endianness! */
             prev_scombo = scombo = ((u32)sdom << 16) | (u32)sflags;
 
             new_scombo = scombo | GTF_reading;
-            if ( !(flags & GNTMAP_readonly) )
+            if ( !(dev_hst_ro_flags & GNTMAP_readonly) )
             {
                 new_scombo |= GTF_writing;
                 if ( unlikely(sflags & GTF_readonly) )
-                    PIN_FAIL(GNTST_general_error,
+                    PIN_FAIL(unlock_out, GNTST_general_error,
                              "Attempt to write-pin a r/o grant entry.\n");
             }
 
@@ -174,7 +139,7 @@
             if ( unlikely(cmpxchg_user((u32 *)&sha->flags,
                                        prev_scombo,
                                        new_scombo)) )
-                PIN_FAIL(GNTST_general_error,
+                PIN_FAIL(unlock_out, GNTST_general_error,
                          "Fault while modifying shared flags and domid.\n");
 
             /* Did the combined update work (did we see what we expected?). */
@@ -182,7 +147,7 @@
                 break;
 
             if ( retries++ == 4 )
-                PIN_FAIL(GNTST_general_error,
+                PIN_FAIL(unlock_out, GNTST_general_error,
                          "Shared grant entry is unstable.\n");
 
             /* Didn't see what we expected. Split out the seen flags & dom. */
@@ -193,25 +158,25 @@
 
         /* rmb(); */ /* not on x86 */
 
-        frame = __translate_gpfn_to_mfn(rd, sha->frame);
+        frame = __translate_gpfn_to_mfn(granting_d, sha->frame);
 
         if ( unlikely(!pfn_is_ram(frame)) ||
-             unlikely(!((flags & GNTMAP_readonly) ?
-                        get_page(&frame_table[frame], rd) :
-                        get_page_and_type(&frame_table[frame], rd,
+             unlikely(!((dev_hst_ro_flags & GNTMAP_readonly) ?
+                        get_page(&frame_table[frame], granting_d) :
+                        get_page_and_type(&frame_table[frame], granting_d,
                                           PGT_writable_page))) )
         {
             clear_bit(_GTF_writing, &sha->flags);
             clear_bit(_GTF_reading, &sha->flags);
-            PIN_FAIL(GNTST_general_error,
-                     "Could not pin the granted frame!\n");
+            PIN_FAIL(unlock_out, GNTST_general_error,
+                     "Could not pin the granted frame (%lx)!\n", frame);
         }
 
-        if ( flags & GNTMAP_device_map )
-            act->pin += (flags & GNTMAP_readonly) ? 
+        if ( dev_hst_ro_flags & GNTMAP_device_map )
+            act->pin += (dev_hst_ro_flags & GNTMAP_readonly) ?
                 GNTPIN_devr_inc : GNTPIN_devw_inc;
-        if ( flags & GNTMAP_host_map )
-            act->pin += (flags & GNTMAP_readonly) ?
+        if ( dev_hst_ro_flags & GNTMAP_host_map )
+            act->pin += (dev_hst_ro_flags & GNTMAP_readonly) ?
                 GNTPIN_hstr_inc : GNTPIN_hstw_inc;
         act->domid = sdom;
         act->frame = frame;
@@ -225,11 +190,11 @@
          * A more accurate check cannot be done with a single comparison.
          */
         if ( (act->pin & 0x80808080U) != 0 )
-            PIN_FAIL(ENOSPC, "Risk of counter overflow %08x\n", act->pin);
+            PIN_FAIL(unlock_out, -ENOSPC, "Risk of counter overflow %08x\n", act->pin);
 
         frame = act->frame;
 
-        if ( !(flags & GNTMAP_readonly) && 
+        if ( !(dev_hst_ro_flags & GNTMAP_readonly) && 
              !((sflags = sha->flags) & GTF_writing) )
         {
             for ( ; ; )
@@ -237,7 +202,7 @@
                 u16 prev_sflags;
                 
                 if ( unlikely(sflags & GTF_readonly) )
-                    PIN_FAIL(GNTST_general_error,
+                    PIN_FAIL(unlock_out, GNTST_general_error,
                              "Attempt to write-pin a r/o grant entry.\n");
 
                 prev_sflags = sflags;
@@ -245,14 +210,14 @@
                 /* NB. prev_sflags is updated in place to seen value. */
                 if ( unlikely(cmpxchg_user(&sha->flags, prev_sflags, 
                                            prev_sflags | GTF_writing)) )
-                    PIN_FAIL(GNTST_general_error,
+                    PIN_FAIL(unlock_out, GNTST_general_error,
                          "Fault while modifying shared flags.\n");
 
                 if ( likely(prev_sflags == sflags) )
                     break;
 
                 if ( retries++ == 4 )
-                    PIN_FAIL(GNTST_general_error,
+                    PIN_FAIL(unlock_out, GNTST_general_error,
                              "Shared grant entry is unstable.\n");
 
                 sflags = prev_sflags;
@@ -262,97 +227,176 @@
                                          PGT_writable_page)) )
             {
                 clear_bit(_GTF_writing, &sha->flags);
-                PIN_FAIL(GNTST_general_error,
+                PIN_FAIL(unlock_out, GNTST_general_error,
                          "Attempt to write-pin a unwritable page.\n");
             }
         }
 
-        if ( flags & GNTMAP_device_map )
-            act->pin += (flags & GNTMAP_readonly) ? 
+        if ( dev_hst_ro_flags & GNTMAP_device_map )
+            act->pin += (dev_hst_ro_flags & GNTMAP_readonly) ? 
                 GNTPIN_devr_inc : GNTPIN_devw_inc;
-        if ( flags & GNTMAP_host_map )
-            act->pin += (flags & GNTMAP_readonly) ?
+        if ( dev_hst_ro_flags & GNTMAP_host_map )
+            act->pin += (dev_hst_ro_flags & GNTMAP_readonly) ?
                 GNTPIN_hstr_inc : GNTPIN_hstw_inc;
     }
 
     /* At this point:
-     * act->pin updated to reflect mapping
-     * sha->flags updated to indicate to granting domain mapping done
-     * frame contains the mfn
+     * act->pin updated to reflect mapping.
+     * sha->flags updated to indicate to granting domain mapping done.
+     * frame contains the mfn.
      */
 
-    if ( (host_virt_addr != 0) && (flags & GNTMAP_host_map) )
+    spin_unlock(&granting_d->grant_table->lock);
+
+    if ( (host_virt_addr != 0) && (dev_hst_ro_flags & GNTMAP_host_map) )
     {
         /* Write update into the pagetable
          */
 
-        /* cwc22: TODO: check locking... */
-
-        spin_unlock(&rd->grant_table->lock);
+        /* The grant table lock has already been dropped above. */
 
         rc = update_grant_va_mapping( host_virt_addr,
                                 (frame << PAGE_SHIFT) | _PAGE_PRESENT  |
                                                         _PAGE_ACCESSED |
                                                         _PAGE_DIRTY    |
-                       ((flags & GNTMAP_readonly) ? 0 : _PAGE_RW),
-                       ld, led );
+                       ((dev_hst_ro_flags & GNTMAP_readonly) ? 0 : _PAGE_RW),
+                       mapping_d, mapping_ed );
 
-        spin_lock(&rd->grant_table->lock);
+        /* IMPORTANT: (rc == 0) => must flush / invalidate entry in TLB.
+         * This is done in the outer gnttab_map_grant_ref.
+         */
 
         if ( 0 > rc )
         {
             /* Abort. */
-            act->pin -= (flags & GNTMAP_readonly) ?
-                GNTPIN_hstr_inc : GNTPIN_hstw_inc;
 
-            if ( flags & GNTMAP_readonly )
+            spin_lock(&granting_d->grant_table->lock);
+
+            if ( dev_hst_ro_flags & GNTMAP_readonly )
                 act->pin -= GNTPIN_hstr_inc;
             else
             {
                 act->pin -= GNTPIN_hstw_inc;
                 if ( (act->pin & (GNTPIN_hstw_mask|GNTPIN_devw_mask)) == 0 )
                 {
-                    put_page_type(&frame_table[frame]);
                     clear_bit(_GTF_writing, &sha->flags);
+                    put_page_type(&frame_table[frame]);
                 }
             }
             if ( act->pin == 0 )
             {
-                put_page(&frame_table[frame]);
                 clear_bit(_GTF_reading, &sha->flags);
+                put_page(&frame_table[frame]);
             }
-            goto fail;
+
+            spin_unlock(&granting_d->grant_table->lock);
         }
 
-        rc = 0;
-        *va = host_virt_addr;
+    }
+    *pframe = frame;
+    return rc;
 
-        /* IMPORTANT: must flush / invalidate entry in TLB.
-         * This is done in the outer gnttab_map_grant_ref when return 0.
-         */
+ unlock_out:
+    spin_unlock(&granting_d->grant_table->lock);
+    return rc;
+}
+
+static int
+__gnttab_map_grant_ref(
+    gnttab_map_grant_ref_t *uop,
+    unsigned long *va)
+{
+    domid_t               dom;
+    grant_ref_t           ref;
+    struct domain        *ld, *rd;
+    struct exec_domain   *led;
+    u16                   dev_hst_ro_flags;
+    int                   handle;
+    unsigned long         frame, host_virt_addr;
+    int                   rc;
+
+    /* Returns 0 if TLB flush / invalidate required by caller.
+     * va will indicate the address to be invalidated. */
+
+    led = current;
+    ld = led->domain;
+
+    /* Bitwise-OR avoids short-circuiting which screws control flow. */
+    if ( unlikely(__get_user(dom, &uop->dom) |
+                  __get_user(ref, &uop->ref) |
+                  __get_user(host_virt_addr, &uop->host_virt_addr) |
+                  __get_user(dev_hst_ro_flags, &uop->flags)) )
+    {
+        DPRINTK("Fault while reading gnttab_map_grant_ref_t.\n");
+        return -EFAULT; /* don't set status */
     }
 
-    /* Only make the maptrack live _after_ writing the pte, in case
-     * we overwrite the same frame number, causing a maptrack walk to find it */
-    ld->grant_table->maptrack[handle].domid         = dom;
-    ld->grant_table->maptrack[handle].ref_and_flags =
-        (ref << MAPTRACK_REF_SHIFT) | (flags & MAPTRACK_GNTMAP_MASK);
-
-    /* Unchecked and unconditional writes to user uop. */
-    if ( flags & GNTMAP_device_map )
-        (void)__put_user(frame,  &uop->dev_bus_addr);
 
-    (void)__put_user(handle, &uop->handle);
+    if ( ((host_virt_addr != 0) || (dev_hst_ro_flags & GNTMAP_host_map) ) &&
+         unlikely(!__addr_ok(host_virt_addr)))
+    {
+        DPRINTK("Bad virtual address (%lx) or flags (%x).\n",
+                host_virt_addr, dev_hst_ro_flags);
+        (void)__put_user(GNTST_bad_virt_addr, &uop->handle);
+        return GNTST_bad_virt_addr;
+    }
 
-    spin_unlock(&rd->grant_table->lock);
-    put_domain(rd);
-    return 0;
+    if ( unlikely(ref >= NR_GRANT_ENTRIES) ||
+         unlikely((dev_hst_ro_flags &
+                   (GNTMAP_device_map|GNTMAP_host_map)) == 0) )
+    {
+        DPRINTK("Bad ref (%d) or flags (%x).\n", ref, dev_hst_ro_flags);
+        (void)__put_user(GNTST_bad_gntref, &uop->handle);
+        return GNTST_bad_gntref;
+    }
+
+    if ( unlikely((rd = find_domain_by_id(dom)) == NULL) ||
+         unlikely(ld == rd) )
+    {
+        if ( rd != NULL )
+            put_domain(rd);
+        DPRINTK("Could not find domain %d\n", dom);
+        (void)__put_user(GNTST_bad_domain, &uop->handle);
+        return GNTST_bad_domain;
+    }
+
+    /* get a maptrack handle */
+    if ( unlikely((handle = get_maptrack_handle(ld->grant_table)) == -1) )
+    {
+        put_domain(rd);
+        DPRINTK("No more map handles available\n");
+        (void)__put_user(GNTST_no_device_space, &uop->handle);
+        return GNTST_no_device_space;
+    }
+
+    if ( 0 <= ( rc = __gnttab_activate_grant_ref( ld, led, rd, ref,
+                                                  dev_hst_ro_flags,
+                                                  host_virt_addr, &frame)))
+    {
+        /* Only make the maptrack live _after_ writing the pte,
+         * in case we overwrite the same frame number, causing a
+         * maptrack walk to find it.
+         */
+        ld->grant_table->maptrack[handle].domid = dom;
+
+        ld->grant_table->maptrack[handle].ref_and_flags
+            = (ref << MAPTRACK_REF_SHIFT) |
+              (dev_hst_ro_flags & MAPTRACK_GNTMAP_MASK);
+
+        (void)__put_user(frame, &uop->dev_bus_addr);
+
+        if ( dev_hst_ro_flags & GNTMAP_host_map )
+            *va = host_virt_addr;
+
+        (void)__put_user(handle, &uop->handle);
+    }
+    else
+    {
+        (void)__put_user(rc, &uop->handle);
+        put_maptrack_handle(ld->grant_table, handle);
+    }
 
- fail:
-    (void)__put_user(rc, &uop->handle);
-    spin_unlock(&rd->grant_table->lock);
     put_domain(rd);
-    put_maptrack_handle(ld->grant_table, handle);
     return rc;
 }
 
@@ -388,6 +432,7 @@
     active_grant_entry_t *act;
     grant_entry_t *sha;
     grant_mapping_t *map;
+    u16            flags;
     s16            rc = 1;
     unsigned long  frame, virt;
 
@@ -412,8 +457,9 @@
         return GNTST_bad_handle;
     }
 
-    dom = map->domid;
-    ref = map->ref_and_flags >> MAPTRACK_REF_SHIFT;
+    dom   = map->domid;
+    ref   = map->ref_and_flags >> MAPTRACK_REF_SHIFT;
+    flags = map->ref_and_flags & MAPTRACK_GNTMAP_MASK;
 
     if ( unlikely((rd = find_domain_by_id(dom)) == NULL) ||
          unlikely(ld == rd) )
@@ -424,33 +470,40 @@
         (void)__put_user(GNTST_bad_domain, &uop->status);
         return GNTST_bad_domain;
     }
-    DPRINTK("Unmapping grant ref (%hu) for domain (%hu) with handle (%hu)\n",
-            ref, dom, handle);
 
     act = &rd->grant_table->active[ref];
     sha = &rd->grant_table->shared[ref];
 
     spin_lock(&rd->grant_table->lock);
 
-    if ( frame != 0 )
+    if ( frame == 0 )
+        frame = act->frame;
+    else if ( frame == GNTUNMAP_DEV_FROM_VIRT )
+    {
+        if ( !( flags & GNTMAP_device_map ) )
+            PIN_FAIL(unmap_out, GNTST_bad_dev_addr,
+                     "Bad frame number: frame not mapped for device access.\n");
+        frame = act->frame;
+
+        /* frame will be unmapped for device access below if virt addr ok */
+    }
+    else
     {
         if ( unlikely(frame != act->frame) )
-            PIN_FAIL(GNTST_general_error,
+            PIN_FAIL(unmap_out, GNTST_general_error,
                      "Bad frame number doesn't match gntref.\n");
-        if ( map->ref_and_flags & GNTMAP_device_map )
-            act->pin -= (map->ref_and_flags & GNTMAP_readonly) ? 
-                GNTPIN_devr_inc : GNTPIN_devw_inc;
+        if ( flags & GNTMAP_device_map )
+            act->pin -= (flags & GNTMAP_readonly) ? GNTPIN_devr_inc
+                                                  : GNTPIN_devw_inc;
 
         map->ref_and_flags &= ~GNTMAP_device_map;
         (void)__put_user(0, &uop->dev_bus_addr);
-    }
-    else
-        frame = act->frame;
 
-    /* frame is now unmapped for device access */
+        /* frame is now unmapped for device access */
+    }
 
     if ( (virt != 0) &&
-         (map->ref_and_flags & GNTMAP_host_map) &&
+         (flags & GNTMAP_host_map) &&
          ((act->pin & (GNTPIN_hstw_mask | GNTPIN_hstr_mask)) > 0))
     {
         l1_pgentry_t   *pl1e;
@@ -462,7 +515,7 @@
         {
             DPRINTK("Could not find PTE entry for address %x\n", virt);
             rc = -EINVAL;
-            goto fail;
+            goto unmap_out;
         }
 
         /* check that the virtual address supplied is actually
@@ -473,7 +526,7 @@
             DPRINTK("PTE entry %x for address %x doesn't match frame %x\n",
                     _ol1e, virt, frame);
             rc = -EINVAL;
-            goto fail;
+            goto unmap_out;
         }
 
         /* Delete pagetable entry
@@ -483,35 +536,53 @@
             DPRINTK("Cannot delete PTE entry at %x for virtual address %x\n",
                     pl1e, virt);
             rc = -EINVAL;
-            goto fail;
+            goto unmap_out;
         }
 
         map->ref_and_flags &= ~GNTMAP_host_map;
 
-        act->pin -= (map->ref_and_flags & GNTMAP_readonly) ?
-                        GNTPIN_hstr_inc : GNTPIN_hstw_inc;
+        act->pin -= (flags & GNTMAP_readonly) ? GNTPIN_hstr_inc
+                                              : GNTPIN_hstw_inc;
+
+        if ( frame == GNTUNMAP_DEV_FROM_VIRT )
+        {
+            act->pin -= (flags & GNTMAP_readonly) ? GNTPIN_devr_inc
+                                                  : GNTPIN_devw_inc;
+
+            map->ref_and_flags &= ~GNTMAP_device_map;
+            (void)__put_user(0, &uop->dev_bus_addr);
+        }
+
         rc = 0;
         *va = virt;
     }
 
     if ( (map->ref_and_flags & (GNTMAP_device_map|GNTMAP_host_map)) == 0)
+    {
+        map->ref_and_flags = 0;
         put_maptrack_handle(ld->grant_table, handle);
+    }
+
+    /* If just unmapped a writable mapping, mark as dirtied */
+    if ( unlikely(shadow_mode_log_dirty(rd)) &&
+        !( flags & GNTMAP_readonly ) )
+         mark_dirty(rd, frame);
 
     /* If the last writable mapping has been removed, put_page_type */
-    if ( ((act->pin & (GNTPIN_devw_mask|GNTPIN_hstw_mask)) == 0) &&
-              !(map->ref_and_flags & GNTMAP_readonly) )
+    if ( ( (act->pin & (GNTPIN_devw_mask|GNTPIN_hstw_mask) ) == 0) &&
+         ( !( flags & GNTMAP_readonly ) ) )
     {
-        put_page_type(&frame_table[frame]);
         clear_bit(_GTF_writing, &sha->flags);
+        put_page_type(&frame_table[frame]);
     }
 
     if ( act->pin == 0 )
     {
-        put_page(&frame_table[frame]);
         clear_bit(_GTF_reading, &sha->flags);
+        put_page(&frame_table[frame]);
     }
 
- fail:
+ unmap_out:
     (void)__put_user(rc, &uop->status);
     spin_unlock(&rd->grant_table->lock);
     put_domain(rd);
@@ -544,6 +615,7 @@
 {
     gnttab_setup_table_t  op;
     struct domain        *d;
+    int                   i;
 
     if ( count != 1 )
         return -EINVAL;
@@ -554,9 +626,10 @@
         return -EFAULT;
     }
 
-    if ( unlikely(op.nr_frames > 1) )
+    if ( unlikely(op.nr_frames > NR_GRANT_FRAMES) )
     {
-        DPRINTK("Xen only supports one grant-table frame per domain.\n");
+        DPRINTK("Xen only supports at most %d grant-table frames per domain.\n",
+                NR_GRANT_FRAMES);
         (void)put_user(GNTST_general_error, &uop->status);
         return 0;
     }
@@ -578,12 +651,15 @@
         return 0;
     }
 
-    if ( op.nr_frames == 1 )
+    if ( op.nr_frames <= NR_GRANT_FRAMES )
     {
         ASSERT(d->grant_table != NULL);
         (void)put_user(GNTST_okay, &uop->status);
-        (void)put_user(virt_to_phys(d->grant_table->shared) >> PAGE_SHIFT,
-                       &uop->frame_list[0]);
+
+        for ( i = 0; i < op.nr_frames; i++ )
+            (void)put_user( (
+                virt_to_phys( (char*)(d->grant_table->shared)+(i*PAGE_SIZE) )
+                              >> PAGE_SHIFT ), &uop->frame_list[i]);
     }
 
     put_domain(d);
@@ -631,29 +707,33 @@
     DPRINTK("Grant table for dom (%hu) MFN (%x)\n",
             op.dom, shared_mfn);
 
-    spin_lock(&gt->lock);
-
     ASSERT(d->grant_table->active != NULL);
     ASSERT(d->grant_table->shared != NULL);
+    ASSERT(d->grant_table->maptrack != NULL);
 
     for ( i = 0; i < NR_GRANT_ENTRIES; i++ )
     {
-        act      = &gt->active[i];
         sha_copy =  gt->shared[i];
 
-        if ( act->pin || act->domid || act->frame ||
-             sha_copy.flags || sha_copy.domid || sha_copy.frame )
+        if ( sha_copy.flags )
         {
-            DPRINTK("Grant: dom (%hu) ACTIVE (%d) pin:(%x) dom:(%hu) frame:(%lx)\n",
-                    op.dom, i, act->pin, act->domid, act->frame);
             DPRINTK("Grant: dom (%hu) SHARED (%d) flags:(%hx) dom:(%hu) frame:(%lx)\n",
                     op.dom, i, sha_copy.flags, sha_copy.domid, sha_copy.frame);
-
         }
-
     }
 
-    ASSERT(d->grant_table->maptrack != NULL);
+    spin_lock(&gt->lock);
+
+    for ( i = 0; i < NR_GRANT_ENTRIES; i++ )
+    {
+        act = &gt->active[i];
+
+        if ( act->pin )
+        {
+            DPRINTK("Grant: dom (%hu) ACTIVE (%d) pin:(%x) dom:(%hu) frame:(%lx)\n",
+                    op.dom, i, act->pin, act->domid, act->frame);
+        }
+    }
 
     for ( i = 0; i < NR_MAPTRACK_ENTRIES; i++ )
     {
@@ -747,13 +827,6 @@
     if ( lgt->map_count == 0 )
         return 0;
 
-#ifdef GRANT_DEBUG
-    if ( ld->id != 0 ) {
-        DPRINTK("Foreign unref rd(%d) ld(%d) frm(%x) flgs(%x).\n",
-                rd->id, ld->id, frame, readonly);
-    }
-#endif
-
     if ( get_domain(rd) == 0 )
     {
         DPRINTK("gnttab_check_unmap: couldn't get_domain rd(%d)\n", rd->id);
@@ -806,15 +879,15 @@
                 /* any more granted writable mappings? */
                 if ( (act->pin & (GNTPIN_hstw_mask|GNTPIN_devw_mask)) == 0 )
                 {
-                    put_page_type(&frame_table[frame]);
                     clear_bit(_GTF_writing, &rgt->shared[ref].flags);
+                    put_page_type(&frame_table[frame]);
                 }
             }
 
             if ( act->pin == 0 )
             {
-                put_page(&frame_table[frame]);
                 clear_bit(_GTF_reading, &rgt->shared[ref].flags);
+                put_page(&frame_table[frame]);
             }
             spin_unlock(&rgt->lock);
 
@@ -836,29 +909,41 @@
 gnttab_prepare_for_transfer(
     struct domain *rd, struct domain *ld, grant_ref_t ref)
 {
-    grant_table_t *t;
-    grant_entry_t *e;
+    grant_table_t *rgt;
+    grant_entry_t *sha;
     domid_t        sdom;
     u16            sflags;
     u32            scombo, prev_scombo;
     int            retries = 0;
+    unsigned long  target_pfn;
+
+    DPRINTK("gnttab_prepare_for_transfer rd(%hu) ld(%hu) ref(%hu).\n",
+            rd->id, ld->id, ref);
 
-    if ( unlikely((t = rd->grant_table) == NULL) ||
+    if ( unlikely((rgt = rd->grant_table) == NULL) ||
          unlikely(ref >= NR_GRANT_ENTRIES) )
     {
         DPRINTK("Dom %d has no g.t., or ref is bad (%d).\n", rd->id, ref);
         return 0;
     }
 
-    spin_lock(&t->lock);
+    spin_lock(&rgt->lock);
 
-    e = &t->shared[ref];
+    sha = &rgt->shared[ref];
     
-    sflags = e->flags;
-    sdom   = e->domid;
+    sflags = sha->flags;
+    sdom   = sha->domid;
 
     for ( ; ; )
     {
+        target_pfn = sha->frame;
+
+        if ( unlikely(target_pfn >= max_page ) )
+        {
+            DPRINTK("Bad pfn (%x)\n", target_pfn);
+            goto fail;
+        }
+
         if ( unlikely(sflags != GTF_accept_transfer) ||
              unlikely(sdom != ld->id) )
         {
@@ -872,7 +957,7 @@
         prev_scombo = scombo = ((u32)sdom << 16) | (u32)sflags;
 
         /* NB. prev_scombo is updated in place to seen value. */
-        if ( unlikely(cmpxchg_user((u32 *)&e->flags, prev_scombo, 
+        if ( unlikely(cmpxchg_user((u32 *)&sha->flags, prev_scombo, 
                                    prev_scombo | GTF_transfer_committed)) )
         {
             DPRINTK("Fault while modifying shared flags and domid.\n");
@@ -895,29 +980,50 @@
         sdom   = (u16)(prev_scombo >> 16);
     }
 
-    spin_unlock(&t->lock);
+    spin_unlock(&rgt->lock);
     return 1;
 
  fail:
-    spin_unlock(&t->lock);
+    spin_unlock(&rgt->lock);
     return 0;
 }
 
 void 
 gnttab_notify_transfer(
-    struct domain *rd, grant_ref_t ref, unsigned long sframe)
+    struct domain *rd, struct domain *ld, grant_ref_t ref, unsigned long frame)
 {
-    unsigned long frame;
+    grant_entry_t  *sha;
+    unsigned long   pfn;
 
-    /* cwc22
-     * TODO: this requires that the machine_to_phys_mapping
-     *       has already been updated, so the accept_transfer hypercall
-     *       must do this.
-     */
-    frame = __mfn_to_gpfn(rd, sframe);
+    DPRINTK("gnttab_notify_transfer rd(%hu) ld(%hu) ref(%hu).\n",
+            rd->id, ld->id, ref);
+
+    sha = &rd->grant_table->shared[ref];
 
-    wmb(); /* Ensure that the reassignment is globally visible. */
-    rd->grant_table->shared[ref].frame = frame;
+    spin_lock(&rd->grant_table->lock);
+
+    pfn = sha->frame;
+
+    if ( unlikely(pfn >= max_page ) )
+        DPRINTK("Bad pfn (%x)\n", pfn);
+    else
+    {
+        machine_to_phys_mapping[frame] = pfn;
+
+        if ( unlikely(shadow_mode_log_dirty(ld)))
+             mark_dirty(ld, frame);
+
+        if (shadow_mode_translate(ld))
+            __phys_to_machine_mapping[pfn] = frame;
+    }
+    sha->frame = __mfn_to_gpfn(rd, frame);
+    sha->domid = rd->id;
+    wmb();
+    sha->flags = ( GTF_accept_transfer | GTF_transfer_completed );
+
+    spin_unlock(&rd->grant_table->lock);
+
+    return;
 }
 
 int 
@@ -947,10 +1053,16 @@
         t->maptrack[i].ref_and_flags = (i+1) << MAPTRACK_REF_SHIFT;
 
     /* Shared grant table. */
-    if ( (t->shared = (void *)alloc_xenheap_page()) == NULL )
+    if ( (t->shared = (void *)alloc_xenheap_pages(ORDER_GRANT_FRAMES)) == NULL )
         goto no_mem;
-    memset(t->shared, 0, PAGE_SIZE);
-    SHARE_PFN_WITH_DOMAIN(virt_to_page(t->shared), d);
+    memset(t->shared, 0, NR_GRANT_FRAMES * PAGE_SIZE);
+
+    for ( i = 0; i < NR_GRANT_FRAMES; i++ )
+    {
+        SHARE_PFN_WITH_DOMAIN(virt_to_page((char *)(t->shared)+(i*PAGE_SIZE)), d);
+        machine_to_phys_mapping[ (virt_to_phys((char*)(t->shared)+(i*PAGE_SIZE))
+                                 >> PAGE_SHIFT) ] = INVALID_M2P_ENTRY;
+    }
 
     /* Okay, install the structure. */
     wmb(); /* avoid races with lock-free access to d->grant_table */
@@ -1053,7 +1165,7 @@
         /* Free memory relating to this grant table. */
         d->grant_table = NULL;
-        free_xenheap_page((unsigned long)t->shared);
+        free_xenheap_pages((unsigned long)t->shared, ORDER_GRANT_FRAMES);
         free_xenheap_page((unsigned long)t->maptrack);
         xfree(t->active);
         xfree(t);
     }
diff -Nru a/xen/include/public/grant_table.h b/xen/include/public/grant_table.h
--- a/xen/include/public/grant_table.h	2005-03-25 20:24:42 +00:00
+++ b/xen/include/public/grant_table.h	2005-03-25 20:24:42 +00:00
@@ -185,6 +185,8 @@
     u32         __pad;
 } PACKED gnttab_unmap_grant_ref_t; /* 24 bytes */
 
+#define GNTUNMAP_DEV_FROM_VIRT (~0U)
+
 /*
  * GNTTABOP_setup_table: Set up a grant table for <dom> comprising at least
  * <nr_frames> pages. The frame addresses are written to the <frame_list>.
@@ -248,8 +250,9 @@
 #define GNTST_bad_gntref       (-3) /* Unrecognised or inappropriate gntref. */
 #define GNTST_bad_handle       (-4) /* Unrecognised or inappropriate handle. */
 #define GNTST_bad_virt_addr    (-5) /* Inappropriate virtual address to map. */
-#define GNTST_no_device_space  (-6) /* Out of space in I/O MMU.              */
-#define GNTST_permission_denied (-7) /* Not enough privilege for operation.  */
+#define GNTST_bad_dev_addr     (-6) /* Inappropriate device address to unmap.*/
+#define GNTST_no_device_space  (-7) /* Out of space in I/O MMU.              */
+#define GNTST_permission_denied (-8) /* Not enough privilege for operation.  */
 
 #define GNTTABOP_error_msgs {                   \
     "okay",                                     \
@@ -258,6 +261,7 @@
     "invalid grant reference",                  \
     "invalid mapping handle",                   \
     "invalid virtual address",                  \
+    "invalid device address",                   \
     "no spare translation slot in the I/O MMU", \
     "permission denied"                         \
 }
diff -Nru a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
--- a/xen/include/public/io/blkif.h	2005-03-25 20:24:42 +00:00
+++ b/xen/include/public/io/blkif.h	2005-03-25 20:24:42 +00:00
@@ -34,15 +34,23 @@
     blkif_vdev_t   device;       /*  2: only for read/write requests         */
     unsigned long  id;           /*  4: private guest value, echoed in resp  */
     blkif_sector_t sector_number;    /* start sector idx on disk (r/w only)  */
-    /* @f_a_s[2:0]=last_sect ; @f_a_s[5:3]=first_sect ; @f_a_s[:12]=frame.   */
+    /* @f_a_s[2:0]=last_sect ; @f_a_s[5:3]=first_sect                        */
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+    /* @f_a_s[:16]= grant reference (16 bits)                                */
+#else
+    /* @f_a_s[:12]=@frame: machine page frame number.                        */
+#endif
     /* @first_sect: first sector in frame to transfer (inclusive).           */
     /* @last_sect: last sector in frame to transfer (inclusive).             */
-    /* @frame: machine page frame number.                                    */
     unsigned long  frame_and_sects[BLKIF_MAX_SEGMENTS_PER_REQUEST];
 } PACKED blkif_request_t;
 
 #define blkif_first_sect(_fas) (((_fas)>>3)&7)
 #define blkif_last_sect(_fas)  ((_fas)&7)
+
+#ifdef CONFIG_XEN_BLKDEV_GRANT
+#define blkif_gref_from_fas(_fas) ((_fas)>>16)
+#endif
 
 typedef struct {
     unsigned long   id;              /* copied from request */
diff -Nru a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h
--- a/xen/include/xen/grant_table.h	2005-03-25 20:24:42 +00:00
+++ b/xen/include/xen/grant_table.h	2005-03-25 20:24:42 +00:00
@@ -51,7 +51,10 @@
 #define GNTPIN_devr_inc      (1 << GNTPIN_devr_shift)
 #define GNTPIN_devr_mask     (0xFFU << GNTPIN_devr_shift)
 
-#define NR_GRANT_ENTRIES     (PAGE_SIZE / sizeof(grant_entry_t))
+#define ORDER_GRANT_FRAMES   2
+#define NR_GRANT_FRAMES      (1U << ORDER_GRANT_FRAMES)
+#define NR_GRANT_ENTRIES     (NR_GRANT_FRAMES * PAGE_SIZE / sizeof(grant_entry_t))
+
 
 /*
  * Tracks a mapping of another domain's grant reference. Each domain has a
@@ -104,7 +107,7 @@
 /* Notify 'rd' of a completed transfer via an already-locked grant entry. */
 void 
 gnttab_notify_transfer(
-    struct domain *rd, grant_ref_t ref, unsigned long frame);
+    struct domain *rd, struct domain *ld, grant_ref_t ref, unsigned long frame);
 
 /* Pre-domain destruction release of granted device mappings of other domains.*/
 void
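
To make the frame_and_sects layout from the blkif.h hunk concrete, here is a
minimal sketch of packing and unpacking one segment word under
CONFIG_XEN_BLKDEV_GRANT. The three accessor macros are copied from the patch;
fas_pack() is a hypothetical helper that does not appear in the patch, and its
range checks are assumptions that follow from the 3-bit sector fields.

```c
#include <assert.h>
#include <stdint.h>

/* Accessors as they appear in the blkif.h hunk. */
#define blkif_first_sect(_fas)    (((_fas) >> 3) & 7)
#define blkif_last_sect(_fas)     ((_fas) & 7)
#define blkif_gref_from_fas(_fas) ((_fas) >> 16)

/*
 * Hypothetical helper, not part of the patch: build one frame_and_sects
 * word from a 16-bit grant reference and an inclusive sector range.
 * Bits [2:0] hold last_sect, bits [5:3] hold first_sect, and bits 16
 * and up hold the grant reference.
 */
static unsigned long fas_pack(uint16_t gref, unsigned first_sect,
                              unsigned last_sect)
{
    /* Each sector index has only 3 bits available. */
    assert(first_sect <= 7 && last_sect <= 7);
    /* Assumption: the range is inclusive and non-empty. */
    assert(first_sect <= last_sect);

    return ((unsigned long)gref << 16) | (first_sect << 3) | last_sect;
}
```

For example, fas_pack(0x1234, 2, 5) gives a word from which
blkif_gref_from_fas() recovers 0x1234 and the sector accessors recover 2 and 5.
Bits 6 to 15 are left zero in this sketch; the patch does not say what, if
anything, they hold.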


* RE: problem with netfront.c
@ 2005-04-03 15:00 Ling, Xiaofeng
  0 siblings, 0 replies; 11+ messages in thread
From: Ling, Xiaofeng @ 2005-04-03 15:00 UTC (permalink / raw)
  To: Ian Pratt, xen-devel



Ian Pratt <mailto:m+Ian.Pratt@cl.cam.ac.uk> wrote:
>> Yes, but does the hypervisor still need to modify the phys_to_machine
>> table? I just hope to save some page faults or TLB-flush VM exits to
>> get higher performance, and also to make fewer modifications to the
>> original netfront.c. That way only one multicall is still needed each
>> time a net event arrives.
> 
> The netfront/netback drivers need to be switched over to using the
> grant tables interface.
> 
> Chris Clark has already checked in the grant tables implementation,
> but no-one has got around to updating netfront/back to use it. There
> are patches to switch blkfront/back over to use grant tables, which
> will be checked in soon, but since the blk drivers use the foreign
> access rather than the page transfer mechanism it's probably not too
> much help.
> 
> Using grant tables, the front end doesn't need to know about machine
> addresses, and the whole thing ends up rather cleaner, particularly for
> domains running with virtualized VMs.
Yes, using machine addresses in netfront is a security problem.
I was originally thinking that we could pass virtual addresses to the backend
through the ring buffer and let the backend set up the mapping for the frontend.
I've already done a blkfront module driver for unmodified Linux, along with an
event channel and ctrlif module driver. I'll wait for the new blkfront with
grant tables. For netfront, if no one is working on the grant table update,
I can take that on as well.

* RE: problem with netfront.c
@ 2005-04-03 12:50 Ling, Xiaofeng
  0 siblings, 0 replies; 11+ messages in thread
From: Ling, Xiaofeng @ 2005-04-03 12:50 UTC (permalink / raw)
  To: Ian Pratt, xen-devel



Ian Pratt <mailto:m+Ian.Pratt@cl.cam.ac.uk> wrote:
>> Another main piece of code I wrote is for the do_update_va_mapping hypercall.
>> I wrote a new implementation for unmodified guest domains.
>> See the patch. Can anyone see the problem in it?
>> (Some of the commented-out code is what I've already tried, with the same result.)
> 
> Hmm, I think you're confused: if you're in a VT-x guest then you don't
> need to use update_va_mapping as you can update your own pagetables
> directly and shadow mode will take care of propagating the update.
> 
> Ian
Yes, but does the hypervisor still need to modify the phys_to_machine table?
I just hope to save some page faults or TLB-flush VM exits to get higher
performance, and also to make fewer modifications to the original netfront.c.
That way only one multicall is still needed each time a net event arrives.

* RE: problem with netfront.c
@ 2005-04-03  9:43 Ian Pratt
  0 siblings, 0 replies; 11+ messages in thread
From: Ian Pratt @ 2005-04-03  9:43 UTC (permalink / raw)
  To: Ling, Xiaofeng, xen-devel

> Another main piece of code I wrote is for the do_update_va_mapping hypercall.
> I wrote a new implementation for unmodified guest domains.
> See the patch. Can anyone see the problem in it?
> (Some of the commented-out code is what I've already tried, with the same result.)

Hmm, I think you're confused: if you're in a VT-x guest then you don't
need to use update_va_mapping as you can update your own pagetables
directly and shadow mode will take care of propagating the update.

Ian

* problem with netfront.c
@ 2005-04-03  3:24 Ling, Xiaofeng
  0 siblings, 0 replies; 11+ messages in thread
From: Ling, Xiaofeng @ 2005-04-03  3:24 UTC (permalink / raw)
  To: xen-devel

Hi,
     I'm trying to enable a VNIF driver module in an unmodified Red Hat
EL3 kernel.
It works at first, but after some network communication the guest kernel
crashes when running commands such as ifconfig, tcpdump, or ttcp.
The crash location differs between applications, but for a given
application it is the same each time.
For example, with ifconfig, it is in sock_create():
   if ((i = net_families[family]->create(sock, protocol)) < 0)
net_families[family] is an illegal pointer; net_families is at 0xc03180a0.

One problem is that I use kernel 2.6.10 as xen0, while Red Hat EL3 is a
2.4.21 kernel. In the backend net driver skb->head is page aligned, but in
netfront.c it is not (e.g. 0xcxxxx800), so I did a workaround in the
following patch. Can this cause a problem?

@@ -353,18 +359,20 @@
       * ourself and for other kernel subsystems.
       */
      batch_target = np->rx_target - (req_prod - np->rx_resp_cons);
-        if (unlikely((skb = alloc_xen_skb(dev->mtu + RX_HEADROOM)) == NULL))
+        if ( unlikely((skb = alloc_xen_skb(dev->mtu + RX_HEADROOM + 1500)) == NULL) )
              break;
          __skb_queue_tail(&np->rx_batch, skb);
      }
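
To make the alignment point concrete: an address of the form 0xcxxxx800 sits
0x800 bytes into its page. The following is a minimal sketch of that check,
assuming 4 KiB pages. The helper names and the sample address 0xc1234800 are
illustrative only and are not taken from netfront.c.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12                    /* assumption: 4 KiB pages (x86) */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Byte offset of addr within its page. */
static unsigned long page_offset(uintptr_t addr)
{
    return addr & ~PAGE_MASK;
}

/* Non-zero if addr is the first byte of a page. */
static int is_page_aligned(uintptr_t addr)
{
    return page_offset(addr) == 0;
}

/* Round addr up to the next page boundary (unchanged if already aligned). */
static uintptr_t page_align_up(uintptr_t addr)
{
    return (addr + PAGE_SIZE - 1) & PAGE_MASK;
}
```

With the sample address, page_offset(0xc1234800) is 0x800, so the buffer is not
page aligned, and page_align_up() returns 0xc1235000.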


Another main piece of code I wrote is for the do_update_va_mapping hypercall.
I wrote a new implementation for unmodified guest domains.
See the patch. Can anyone see the problem in it?
(Some of the commented-out code is what I've already tried, with the same result.)

Index: arch/x86/mm.c
===================================================================
--- arch/x86/mm.c       (revision 512)
+++ arch/x86/mm.c       (working copy)
@@ -1707,7 +1757,14 @@
              break;
          }

-        if ( unlikely(__copy_from_user(&req, ureqs, sizeof(req)) != 0) )
+        if(VMX_DOMAIN(current)){
+            vbdprintk("copy form guest\n");
+            rc = copy_from_guest(&req, ureqs, sizeof(req));
+        }
+        else {
+            rc = __copy_from_user(&req, ureqs, sizeof(req));
+        }
+        if ( unlikely((rc) != 0) )
          {
              MEM_LOG("Bad __copy_from_user");
              rc = -EFAULT;
@@ -1716,7 +1773,7 @@

          cmd = req.ptr & (sizeof(l1_pgentry_t)-1);
          pfn = req.ptr >> PAGE_SHIFT;
-
+        vbdprintk("req.ptr, cmd, pfn:%x, %x, %x\n", req.ptr, cmd, pfn);
          okay = 0;

          switch ( cmd )
@@ -1911,7 +1968,92 @@
      return rc;
  }

+void shadow_map_l1_into_current_l2(unsigned long va);

+int do_update_vmx_va_mapping(unsigned long va,
+                         unsigned long val,
+                         unsigned long flags)
+{
+    struct exec_domain *ed = current;
+    struct domain *d = ed->domain;
+    int err = 0;
+    unsigned int cpu = ed->processor;
+    unsigned long deferred_ops;
+    unsigned long gpa;
+    unsigned long sval = 0;
+    vnifprintk("do_update_va_mapping, va:%p, val:%p, flags:%d\n", va, val, flags);
+    vnifprintk("shadow_mode:%x\n", d->arch.shadow_mode);
+
+    LOCK_BIGLOCK(d);
+
+//    cleanup_writable_pagetable(d);
+
+    /*
+     * XXX When we make this support 4MB superpages we should also deal with
+     * the case of updating L2 entries.
+     */
+
+#if 0
+    gval = val ? ((va - KERNEL_PAGE_OFFSET) & PAGE_MASK) |
+        (val & ~PAGE_MASK) : 0;
+#endif
+
+    gpa = gva_to_gpa(va);
+    if(gpa) {
+        if(val)
+           set_phystomachine(gpa >> PAGE_SHIFT,
+                val >> PAGE_SHIFT);
+        else
+           set_phystomachine(gpa >> PAGE_SHIFT, ~0UL);
+    }
+
+#if 0
+    if ( unlikely(!mod_vmx_l1_entry(va, mk_l1_pgentry(gval))) ) {
+        printk("mod l1 error\n");
+        err = -EINVAL;
+    }
+#endif
+
+    sval = val;
+
+    if((va >> 16) == 0xc031)
+        printk("c031 va:%p\n", va);
+
+    vnifprintk("shadow_mode_enabled:%p, %p\n", val, sval);
+
+    if ( unlikely(__put_user(sval, ((unsigned long *)(
+                     &shadow_linear_pg_table[l1_linear_offset(va)])))) )
+    {
+        printk("put_user shadow error:%p %p\n", va, sval,val);
+        printk("spgd:%p\n", ed->arch.shadow_vtable[va>>L2_PAGETABLE_SHIFT]);
+
+        shadow_map_l1_into_current_l2(va);
+        if(__put_user(sval, ((unsigned long *)(
+                     &shadow_linear_pg_table[l1_linear_offset(va)]))))
+            printk("still put_user shadow error:%p %p\n", va, val);
+
+        check_pagetable(d, ed->arch.guest_table, "va"); /* debug */
+    }
+
+    deferred_ops = percpu_info[cpu].deferred_ops;
+    percpu_info[cpu].deferred_ops = 0;
+
+    if ( unlikely(deferred_ops & DOP_FLUSH_TLB) ||
+            unlikely(flags & UVMF_FLUSH_TLB) )
+        local_flush_tlb();
+    else if ( unlikely(flags & UVMF_INVLPG) )
+        __flush_tlb_one(va);
+
+    if ( unlikely(deferred_ops & DOP_RELOAD_LDT) )
+        (void)map_ldt_shadow_page(0);
+
+    UNLOCK_BIGLOCK(d);
+
+    vnifprintk("exit va mapping\n");
+    return err;
+
+}
+
  int do_update_va_mapping(unsigned long va,
                           unsigned long val,
                           unsigned long flags)
@@ -1924,6 +2066,11 @@

      perfc_incrc(calls_to_update_va);

+    if(unlikely(VMX_DOMAIN(ed))) {
+        return do_update_vmx_va_mapping(va, val, flags);
+    }
+
+
      if ( unlikely(!__addr_ok(va)) )
          return -EINVAL;

@@ -1949,6 +2096,7 @@

          l1pte_propagate_from_guest(d, &val, &sval);

+        vnifprintk("shadow_mode_enabled:%p, %p\n", val, sval);
          if ( unlikely(__put_user(sval, ((unsigned long *)(
              &shadow_linear_pg_table[l1_linear_offset(va)])))) )
          {
@@ -2975,3 +3123,5 @@
  }

  #endif
+
+
Index: include/asm-x86/mm.h
===================================================================
--- include/asm-x86/mm.h        (revision 512)
+++ include/asm-x86/mm.h        (working copy)
@@ -150,6 +150,9 @@
          free_domheap_page(page);
  }

+#ifndef vnifprintk
+#define vnifprintk(_a...)
+#endif

  static inline int get_page(struct pfn_info *page,
                             struct domain *domain)
@@ -252,6 +255,21 @@
     mfn = l1_pgentry_to_phys(pte) >> PAGE_SHIFT;
     return mfn;
  }
+
+static inline unsigned long set_phystomachine(unsigned long pfn, unsigned long ma)
+{
+    l1_pgentry_t pte;
+    if (__get_user(l1_pgentry_val(pte), (__phys_to_machine_mapping + pfn))) {
+        return 0;
+    }
+    l1_pgentry_val(pte) = (__pa(ma) & PAGE_MASK) | (l1_pgentry_val(pte) & ~PAGE_MASK);
+    l1_pgentry_val(pte) = ma << PAGE_SHIFT;
+    if(__put_user(l1_pgentry_val(pte), (__phys_to_machine_mapping + pfn))) {
+        return 0;
+    }
+    return ma;
+}
+
  #define set_machinetophys(_mfn, _pfn) machine_to_phys_mapping[(_mfn)] = (_pfn)

  #define DEFAULT_GDT_ENTRIES     (LAST_RESERVED_GDT_ENTRY+1)
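
The set_phystomachine()/set_machinetophys() pair above writes two tables that
are meant to be inverses of each other. As a rough, self-contained model of
that bookkeeping (not Xen code), the sketch below keeps a toy pfn-to-mfn table
and its reverse in step. Every name prefixed toy_ or TOY_ is hypothetical, the
table size is arbitrary, and ~0UL is used as the "no mapping" marker only
because the patch passes ~0UL when val is zero.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_FRAMES 16        /* hypothetical table size for this sketch */
#define TOY_INVALID   (~0UL)    /* "no mapping" marker */

static unsigned long toy_p2m[TOY_NR_FRAMES];  /* guest pfn -> machine frame */
static unsigned long toy_m2p[TOY_NR_FRAMES];  /* machine frame -> guest pfn */

/* Mark every entry in both tables as unmapped. */
static void toy_init(void)
{
    size_t i;
    for (i = 0; i < TOY_NR_FRAMES; i++)
        toy_p2m[i] = toy_m2p[i] = TOY_INVALID;
}

/* Remove any mapping for pfn. Returns 0 on success, -1 if out of range. */
static int toy_unmap(unsigned long pfn)
{
    if (pfn >= TOY_NR_FRAMES)
        return -1;
    if (toy_p2m[pfn] != TOY_INVALID)
        toy_m2p[toy_p2m[pfn]] = TOY_INVALID;  /* drop the reverse entry too */
    toy_p2m[pfn] = TOY_INVALID;
    return 0;
}

/* Map pfn to mfn in both directions, clearing stale entries first. */
static int toy_map(unsigned long pfn, unsigned long mfn)
{
    if (pfn >= TOY_NR_FRAMES || mfn >= TOY_NR_FRAMES)
        return -1;
    toy_unmap(pfn);                           /* pfn may have had a frame */
    if (toy_m2p[mfn] != TOY_INVALID)
        toy_p2m[toy_m2p[mfn]] = TOY_INVALID;  /* mfn may have had an owner */
    toy_p2m[pfn] = mfn;
    toy_m2p[mfn] = pfn;
    return 0;
}
```

The point of the model is only that a forward update without the matching
reverse update (or the other way round) leaves the two tables disagreeing
about which frame backs a given pfn.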




-------------------
Ling Xiaofeng(Daniel)
Intel China Software Center.
xfling@users.sourceforge.net
Opinions are my own and don’t represent those of my employer


Thread overview: 11+ messages
2005-04-03 13:33 problem with netfront.c Ian Pratt
2005-04-04  3:06 ` Jacob Gorm Hansen
2005-04-03 14:21   ` Mark Williamson
  -- strict thread matches above, loose matches on Subject: below --
2005-04-04 12:42 Ian Pratt
2005-04-04 10:06 Ling, Xiaofeng
2005-04-04  7:21 Ian Pratt
2005-04-03 15:33 Ian Pratt
2005-04-03 15:00 Ling, Xiaofeng
2005-04-03 12:50 Ling, Xiaofeng
2005-04-03  9:43 Ian Pratt
2005-04-03  3:24 Ling, Xiaofeng
