From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ross Philipson
Subject: Re: [PATCH (V9) 2/2] xen: Add V4V implementation
Date: Tue, 4 Jun 2013 14:01:39 -0400
Message-ID: <51AE2B83.1070808@citrix.com>
References: <1369770211-4509-1-git-send-email-ross.philipson@citrix.com> <1369770211-4509-3-git-send-email-ross.philipson@citrix.com> <20130530162009.GB64390@ocelot.phlegethon.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20130530162009.GB64390@ocelot.phlegethon.org>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: xen-devel@lists.xen.org
Cc: Tim Deegan
List-Id: xen-devel@lists.xenproject.org

On 05/30/2013 12:20 PM, Tim Deegan wrote:
> Hi,
>
> Nice to see this series again. :) Thanks for your work cleaning it up.

Thanks, and thanks for your feedback.

>
> At 15:43 -0400 on 28 May (1369755811), Ross Philipson wrote:
>> Setup of v4v when a domain gets created and cleanup
>> when a domain dies. Wire up the v4v hypercall.
>>
>> Include v4v internal and public headers.
>>
>> Signed-off-by: Ross Philipson
>
> I'm a bit worried about resource exhaustion attacks in this interface.
> In particular:
> - It looks like a malicious guest can register large numbers of rings
>   (like, 2^48 of them), each of which causes Xen to xmalloc bookkeeping
>   space (up to 2^22 bytes per ring).
> - A malicious guest that has enough memory can send a very large vector
>   (2^31 1-byte entries), which would cause Xen to read and check 2^37
>   bytes of control structures before even starting the copy of the data.
>   At a rough guess that could be >10 seconds without checking for
>   preemption. Maybe reduce the limits a bit (say, 32MB per hypercall,
>   maybe 100 elements per vector)?
> - Something similar for V4VOP_notify, I guess, though that could perhaps
>   be handled by the usual pending-events check + hypercall continuations.

Yes, we agree. I think the plan is to have some reasonable default limits
and allow the values to be adjusted during domain creation if that is
needed. This would be done through a new V4V op for the hypercall.

>
> And now, some detailed review:
>
>> +/*
>> + * Messages on the ring are padded to 128 bits
>> + * Len here refers to the exact length of the data not including the
>> + * 128 bit header. The message uses
>> + * ((len + 0xf) & ~0xf) + sizeof(v4v_ring_message_header) bytes
>> + */
>> +#define V4V_ROUNDUP(a) (((a) + 0xf) & ~0xf)
>
> Even though this is always used on values < 2^32 and so is safe,
> it would be nice to have it as (((a) + 0xf) & ~(typeof (a))0xf)
> so it's clear that it doesn't truncate any high-order bits.

Ack.

>
> [...]
>> +/*
>> + * lock to protect the filtering rules list: v4vtable_rules
>> + *
>> + * The write lock is held for viptables_del and viptables_add
>> + * The read lock is held for viptable_list
>
> s/viptables/v4vtables/g/

For all the minor ones I don't address directly, just assume I will fix
them per your suggestions.

>
> [...]
>> +static int
>> +v4v_ringbuf_get_rx_ptr(struct domain *d, struct v4v_ring_info *ring_info,
>> +                       uint32_t * rx_ptr)
>> +{
>> +    v4v_ring_t *ringp;
>> +
>> +    if ( ring_info->npage == 0 )
>> +        return -1;
>> +
>> +    ringp = map_domain_page(mfn_x(ring_info->mfns[0]));
>> +
>> +    v4v_dprintk("v4v_ringbuf_payload_space: mapped %p to %p\n",
>> +                (void *)mfn_x(ring_info->mfns[0]), ringp);
>
> The function name is wrong there, presumably after this code was
> refactored. You might add %s/__func__ to the definition of v4v_dprintk()
> instead?
>
>> +    if ( !ringp )
>> +        return -1;
>> +
>> +    write_atomic(rx_ptr, ringp->rx_ptr);
>
> AIUI ringp->rx_ptr is the shared variable here, so this should be
> *rx_ptr = read_atomic(ringp->rx_ptr);
>
> Also, maybe comment in the hypercall interface that the guest should
> also use atomic operations when accessing rx_ptr and tx_ptr. (Though I
> should say the hypercall comments are already pretty good compared to
> some things we already have!)

What you say sounds right. I will address this.

>
>> +    mb();
>
> I'm not 100% convinced this barrier is needed (since we're not going to
> read the 'free' memory in the ring and I don't think that writes to it
> can possibly be hoisted past this read since they're gated on the
> result). OTOH I'm happy to err on the side of caution. :)

This could be cargo-cult code. There have been quite a number of
modifications to this code since its birth in our product 3+ years ago.
But if I cannot be sure, I will probably leave it be.

>
>> +    unmap_domain_page(ringp);
>> +    return 0;
>> +}
>> +
>> +uint32_t
>> +v4v_ringbuf_payload_space(struct domain * d, struct v4v_ring_info * ring_info)
>> +{
>> +    v4v_ring_t ring;
>> +    int32_t ret;
>> +
>> +    ring.tx_ptr = ring_info->tx_ptr;
>> +    ring.len = ring_info->len;
>> +
>> +    if ( v4v_ringbuf_get_rx_ptr(d, ring_info, &ring.rx_ptr) )
>> +        return 0;
>> +
>> +    v4v_dprintk("v4v_ringbuf_payload_space:tx_ptr=%d rx_ptr=%d\n",
>> +                (int)ring.tx_ptr, (int)ring.rx_ptr);
>> +    if ( ring.rx_ptr == ring.tx_ptr )
>> +        return ring.len - sizeof(struct v4v_ring_message_header);
>> +
>> +    ret = ring.rx_ptr - ring.tx_ptr;
>> +    if ( ret < 0 )
>> +        ret += ring.len;
>> +
>> +    ret -= sizeof(struct v4v_ring_message_header);
>> +    ret -= V4V_ROUNDUP(1);
>> +
>> +    return (ret < 0) ? 0 : ret;
>
> Sanity check for ret > ring.len as well? I guess it's not clear what
> you'd do about that.
>
> [...]
>> +static long
>> +v4v_ringbuf_insertv(struct domain *d,
>> +                    struct v4v_ring_info *ring_info,
>> +                    v4v_ring_id_t *src_id, uint32_t message_type,
>> +                    XEN_GUEST_HANDLE_PARAM(v4v_iov_t) iovs,
>> +                    uint32_t niov, size_t len)
>> +{
>> +    v4v_ring_t ring;
>> +    struct v4v_ring_message_header mh = { 0 };
>> +    int32_t sp;
>> +    long happy_ret;
>> +    int32_t ret = 0;
>> +    XEN_GUEST_HANDLE(uint8_t) empty_hnd = { 0 };
>> +
>> +    ASSERT(spin_is_locked(&ring_info->lock));
>> +
>> +    happy_ret = len;
>> +
>> +    if ( (V4V_ROUNDUP(len) + sizeof(struct v4v_ring_message_header)) >=
>> +         ring_info->len )
>> +        return -EMSGSIZE;
>> +
>> +    do {
>> +        if ( (ret = v4v_memcpy_from_guest_ring(&ring, ring_info, 0,
>> +                                               sizeof(ring))) )
>> +            break;
>
> Should this be using an atomic operation to read ring.rx_ptr instead?
> That's the only field we actually use from the guest copy anyway.

Probably. I will look at this, along with the other uses of tx_ptr and
rx_ptr mentioned above.

>
>> +
>> +        ring.tx_ptr = ring_info->tx_ptr;
>> +        ring.len = ring_info->len;
>> +
>> +        v4v_dprintk("ring.tx_ptr=%d ring.rx_ptr=%d ring.len=%d ring_info->tx_ptr=%d\n",
>> +                    ring.tx_ptr, ring.rx_ptr, ring.len, ring_info->tx_ptr);
>> +
>> +        if ( ring.rx_ptr == ring.tx_ptr )
>> +            sp = ring_info->len;
>> +        else
>> +        {
>> +            sp = ring.rx_ptr - ring.tx_ptr;
>> +            if ( sp < 0 )
>> +                sp += ring.len;
>> +        }
>> +
>> +        if ( (V4V_ROUNDUP(len) + sizeof(struct v4v_ring_message_header)) >= sp )
>> +        {
>> +            v4v_dprintk("EAGAIN\n");
>> +            ret = -EAGAIN;
>> +            break;
>> +        }
>> +
>> +        mh.len = len + sizeof (struct v4v_ring_message_header);
>> +        mh.source.port = src_id->addr.port;
>> +        mh.source.domain = src_id->addr.domain;
>> +        mh.message_type = message_type;
>> +
>> +        if ( (ret = v4v_memcpy_to_guest_ring(ring_info,
>> +                                             ring.tx_ptr + sizeof(v4v_ring_t),
>> +                                             &mh, empty_hnd,
>> +                                             sizeof(mh))) )
>> +            break;
>
> OK, I think this is correct, because the tx_ptr is always 16-byte
> aligned and the message header is 16 bytes long, but it's not
> obvious.
> :) Maybe add a comment here, and a BUILD_BUG_ON() to assert
> that sizeof (struct v4v_ring_message_header) <= V4V_ROUNDUP(1)?

Ack.

>
>> +        ring.tx_ptr += sizeof(mh);
>> +        if ( ring.tx_ptr == ring_info->len )
>> +            ring.tx_ptr = 0;
>> +
>> +        while ( niov-- )
>> +        {
>> +            XEN_GUEST_HANDLE_PARAM(uint8_t) bufp_hnd;
>> +            XEN_GUEST_HANDLE(uint8_t) buf_hnd;
>> +            v4v_iov_t iov;
>> +
>> +            if ( copy_from_guest(&iov, iovs, 1) )
>
> Hrmn. So here we re-copy the iov from the guest, but the guest might
> have changed it since we looked at it last (when we calculated the total
> length). That means we need to keep a running count of how much we copy,
> to be sure the guest doesn't swap short iovs for long ones and make
> us overrun the ring's rx_ptr.

Yeah, I was starting to have concerns about just this right after I
posted patch set 9. I really don't like that we read it twice. Your
suggestion sounds reasonable.

>
>> +            {
>> +                ret = -EFAULT;
>> +                break;
>> +            }
>> +
>> +            bufp_hnd = guest_handle_from_ptr(iov.iov_base, uint8_t);
>> +            buf_hnd = guest_handle_from_param(bufp_hnd, uint8_t);
>> +            len = iov.iov_len;
>> +
>> +            if ( unlikely(!guest_handle_okay(buf_hnd, len)) )
>> +            {
>> +                ret = -EFAULT;
>> +                break;
>> +            }
>> +
>> +            sp = ring.len - ring.tx_ptr;
>> +
>> +            if ( len > sp )
>> +            {
>> +                ret = v4v_memcpy_to_guest_ring(ring_info,
>> +                                               ring.tx_ptr + sizeof(v4v_ring_t),
>> +                                               NULL, buf_hnd, sp);
>> +                if ( ret )
>> +                    break;
>> +
>> +                ring.tx_ptr = 0;
>> +                len -= sp;
>> +                guest_handle_add_offset(buf_hnd, sp);
>> +            }
>> +
>> +            ret = v4v_memcpy_to_guest_ring(ring_info,
>> +                                           ring.tx_ptr + sizeof(v4v_ring_t),
>> +                                           NULL, buf_hnd, len);
>> +            if ( ret )
>> +                break;
>> +
>> +            ring.tx_ptr += len;
>> +
>> +            if ( ring.tx_ptr == ring_info->len )
>> +                ring.tx_ptr = 0;
>> +
>> +            guest_handle_add_offset(iovs, 1);
>> +        }
>> +        if ( ret )
>> +            break;
>> +
>> +        ring.tx_ptr = V4V_ROUNDUP(ring.tx_ptr);
>> +
>> +        if ( ring.tx_ptr >= ring_info->len )
>> +            ring.tx_ptr -= ring_info->len;
>> +
>> +        mb();
>> +
>> +        ring_info->tx_ptr = ring.tx_ptr;
>> +        if ( (ret = v4v_update_tx_ptr(ring_info, ring.tx_ptr)) )
>> +            break;
>> +    } while ( 0 );
>> +
>> +    v4v_ring_unmap(ring_info);
>> +
>> +    return ret ? ret : happy_ret;
>> +}
> [...]
>> +static int
>> +v4v_pending_requeue(struct v4v_ring_info *ring_info, domid_t src_id, int len)
>> +{
>> +    struct hlist_node *node;
>> +    struct v4v_pending_ent *ent;
>> +
>> +    hlist_for_each_entry(ent, node, &ring_info->pending, node)
>> +    {
>> +        if ( ent->id == src_id )
>> +        {
>> +            if ( ent->len < len )
>> +                ent->len = len;
>
> Does this mean that if I call V4VOP_notify with some large space
> requirement, and then again with a smaller one, I won't get a
> notification until the original large amount is free? Seems like this
> should always overwrite the old length instead, and notify on the most
> recently requested amount.

It has been a while since I have looked closely at the queuing code, so
I will have to study this some. I believe it does the right thing, but I
will check.

>
>> +static int
>> +v4v_fill_ring_data(struct domain *src_d,
>> +                   XEN_GUEST_HANDLE(v4v_ring_data_ent_t) data_ent_hnd)
>> +{
>> +    v4v_ring_data_ent_t ent;
>> +    struct domain *dst_d;
>> +    struct v4v_ring_info *ring_info;
>> +
>> +    if ( copy_from_guest(&ent, data_ent_hnd, 1) )
>> +    {
>> +        v4v_dprintk("EFAULT\n");
>> +        return -EFAULT;
>> +    }
>> +
>> +    v4v_dprintk("v4v_fill_ring_data: ent.ring.domain=%d,ent.ring.port=%u\n",
>> +                (int)ent.ring.domain, (int)ent.ring.port);
>> +
>> +    ent.flags = 0;
>> +
>> +    dst_d = get_domain_by_id(ent.ring.domain);
>> +
>> +    if ( dst_d && dst_d->v4v )
>> +    {
>> +        read_lock(&dst_d->v4v->lock);
>> +        ring_info = v4v_ring_find_info_by_addr(dst_d, &ent.ring,
>> +                                               src_d->domain_id);
>> +
>
> Should there be a call to v4vtables_check() here? Otherwise we leak a
> little bit of information to a guest who's not supposed to see the ring,
> (and also it's a bit odd to say 'yes, you may send' and then bounce the
> sendv call).

Yeah, I think so.
Concerning all the v4vtables issues: we are now looking at reworking a
lot of this. We just started discussing options, so I don't have a lot
to reply with yet.

>
> [...]
>> +static void v4v_ring_remove_mfns(struct domain *d, struct v4v_ring_info *ring_info)
>> +{
>> +    int i;
>> +
>> +    ASSERT(rw_is_write_locked(&d->v4v->lock));
>> +
>> +    if ( ring_info->mfns )
>> +    {
>> +        for ( i = 0; i < ring_info->npage; ++i )
>> +            if ( mfn_x(ring_info->mfns[i]) != 0 )
>
> Here, and where the array is populated, it would be better to use
> INVALID_MFN in any empty slots. On x86, MFN 0 is never allocated to a
> guest, but that may not always be true on all architectures.

Ack.

>
> [...]
>> +static long
>> +v4v_ring_add(struct domain *d, XEN_GUEST_HANDLE_PARAM(v4v_ring_t) ring_hnd,
>> +             uint32_t npage, XEN_GUEST_HANDLE_PARAM(v4v_pfn_t) pfn_hnd)
>> +{
> [...]
>> +        /*
>> +         * no need for a lock yet, because only we know about this
>> +         * set the tx pointer if it looks bogus (we don't reset it
>> +         * because this might be a re-register after S4)
>> +         */
>> +        if ( (ring.tx_ptr >= ring.len)
>> +             || (V4V_ROUNDUP(ring.tx_ptr) != ring.tx_ptr) )
>> +        {
>> +            ring.tx_ptr = ring.rx_ptr;
>> +        }
>
> Maybe check here that the guest-supplied rx_ptr is correctly aligned?
> I don't think an unaligned one would be a security risk, but it might
> cause entertaining breakage on the first sendv().

Ack.

>
>> +        copy_field_to_guest(ring_hnd, &ring, tx_ptr);
>> +
>> +        read_lock(&d->v4v->lock);
>> +        ring_info = v4v_ring_find_info(d, &ring.id);
>> +
>> +        if ( !ring_info )
>> +        {
>> +            read_unlock(&d->v4v->lock);
>> +            ring_info = xmalloc(struct v4v_ring_info);
>> +            if ( !ring_info )
>> +            {
>> +                v4v_dprintk("ENOMEM\n");
>> +                ret = -ENOMEM;
>> +                break;
>> +            }
>> +            need_to_insert++;
>> +            spin_lock_init(&ring_info->lock);
>> +            INIT_HLIST_HEAD(&ring_info->pending);
>> +            ring_info->mfns = NULL;
>> +        }
>> +        else
>> +        {
>> +            /*
>> +             * Ring info already existed.
>> +             */
>> +            printk(KERN_INFO "v4v: dom%d ring already registered\n",
>> +                   current->domain->domain_id);
>> +            ret = -EEXIST;
>> +            break;
>> +        }
>
> If you reshuffle these so that the already-exists case is first, then
> the not-already-exists case doesn't need to be indented, i.e.
>
>     if ( ring_info )
>     {
>         printk();
>         ret = -EEXIST;
>         break;
>     }
>     /* Normal case goes here */
>
> Then I think it becomes clear that you don't need a 'need_to_insert'
> flag -- and that the read_unlock() below is dead code, so the error path
> doesn't unlock at all!
>
>> +
>> +        spin_lock(&ring_info->lock);
>> +        ring_info->id = ring.id;
>> +        ring_info->len = ring.len;
>> +        ring_info->tx_ptr = ring.tx_ptr;
>> +        ring_info->ring = ring_hnd;
>
> I can't see where this field is used, but AFAICT the ring_hnd isn't
> guaranteed to be valid after this hypercall finishes (and certainly not
> after, say, the next context switch), so storing it doesn't seem useful.

Yes, I don't understand that one either. It will go away.

>
>> +        if ( ring_info->mfns )
>> +            xfree(ring_info->mfns);
>> +        ret = v4v_find_ring_mfns(d, ring_info, npage, pfn_hnd);
>> +        spin_unlock(&ring_info->lock);
>> +        if ( ret )
>> +            break;
>> +
>> +        if ( !need_to_insert )
>> +        {
>> +            read_unlock(&d->v4v->lock);
>> +        }
>> +        else
>> +        {
>> +            uint16_t hash = v4v_hash_fn(&ring.id);
>> +
>> +            write_lock(&d->v4v->lock);
>> +            hlist_add_head(&ring_info->node, &d->v4v->ring_hash[hash]);
>> +            write_unlock(&d->v4v->lock);
>> +        }
>> +    } while ( 0 );
>> +
>> +    read_unlock(&v4v_lock);
>> +    return ret;
>> +}
> [...]
>> +void
>> +v4vtables_print_rule(struct v4vtables_rule_node *node)
>> +{
> [...]
>> +    if ( rule->accept == 1 )
>> +        printk("ACCEPT");
>
> The logic in the actual ACL is (effectively) 'if ( rule->accept != 0 )';
> so this probably ought to follow suit. :)

Ack.

>
> [...]
>
>> +int
>> +v4vtables_del(struct domain *src_d,
>> +              XEN_GUEST_HANDLE_PARAM(v4vtables_rule_t) rule_hnd,
>> +              int32_t position)
>> +{
>> +    struct list_head *tmp = NULL;
>> +    struct list_head *to_delete = NULL;
>> +    struct list_head *next = NULL;
>> +    struct v4vtables_rule_node *node;
>> +
>> +    ASSERT(rw_is_write_locked(&v4vtables_rules_lock));
>> +
>> +    v4v_dprintk("v4vtables_del position:%d\n", position);
>> +
>> +    if ( position != -1 )
>> +    {
>> +        /* We want to delete the rule number */
>> +        list_for_each(tmp, &v4vtables_rules)
>> +        {
>> +            to_delete = tmp;
>> +            if ( position == 0 )
>> +                break;
>> +            position--;
>> +        }
>
> Hmmm. If this is called with position = 1 with a multi-element list, it
> selects the second element (which conflicts with the semantics of the
> add operation, where '1' is the first element). And if it's called with
> '1' on a single-element list, it selects the first (and only) element.

This needs fixing, along with the other problems in the v4vtables.

>
>> +        /* Can't find the position */
>> +        if ( position != 0 )
>> +            to_delete = NULL;
>> +    }
>> +    else if ( !guest_handle_is_null(rule_hnd) )
>> +    {
>> +        struct v4vtables_rule r;
>> +
>> +        if ( copy_from_guest(&r, rule_hnd, 1) )
>> +            return -EFAULT;
>> +
>> +        list_for_each(tmp, &v4vtables_rules)
>> +        {
>> +            node = list_entry(tmp, struct v4vtables_rule_node, list);
>> +
>> +            if ( (node->rule.src.domain == r.src.domain) &&
>> +                 (node->rule.src.port == r.src.port) &&
>> +                 (node->rule.dst.domain == r.dst.domain) &&
>> +                 (node->rule.dst.port == r.dst.port) )
>
> Not matching by 'accept' too? That should be mentioned in the hypercall
> interface, I think.
>
>> +            {
>> +                to_delete = tmp;
>> +                break;
>> +            }
>> +        }
>> +    }
>> +    else
>> +    {
>> +        /* We want to flush the rules! */
>> +        printk(KERN_ERR "VIPTables: flushing rules\n");
>> +        list_for_each_safe(tmp, next, &v4vtables_rules)
>> +        {
>> +            node = list_entry(tmp, struct v4vtables_rule_node, list);
>> +            list_del(tmp);
>> +            xfree(node);
>> +        }
>> +    }
>> +
>> +    if ( to_delete )
>> +    {
>> +        node = list_entry(to_delete, struct v4vtables_rule_node, list);
>> +#ifdef V4V_DEBUG
>> +        printk(KERN_ERR "VIPTables: deleting rule: ");
>> +        v4vtables_print_rule(node);
>> +#endif /* V4V_DEBUG */
>> +        list_del(to_delete);
>> +        xfree(node);
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static size_t
>> +v4vtables_list(struct domain *src_d,
>> +               XEN_GUEST_HANDLE_PARAM(v4vtables_list_t) list_hnd)
>> +{
>> +    struct list_head *ptr;
>> +    struct v4vtables_rule_node *node;
>> +    struct v4vtables_list rules_list;
>> +    uint32_t nbrules;
>> +    XEN_GUEST_HANDLE(v4vtables_rule_t) guest_rules;
>> +
>> +    ASSERT(rw_is_locked(&v4vtables_rules_lock));
>> +
>> +    memset(&rules_list, 0, sizeof (rules_list));
>> +    if ( copy_from_guest(&rules_list, list_hnd, 1) )
>> +        return -EFAULT;
>> +
>> +    ptr = v4vtables_rules.next;
>> +    while ( rules_list.start_rule != 0 && ptr->next != &v4vtables_rules )
>> +    {
>> +        ptr = ptr->next;
>> +        rules_list.start_rule--;
>> +    }
>
> Again, that seems to conflict with the 'add' semantics that the first
> rule is rule 1.
>
> And what if the supplied start_rule was too large? As it stands I think
> this will return the last rule in the list rather than returning no
> rules (or an error).
>
>> +
>> +    if ( rules_list.nb_rules == 0 )
>> +        return -EINVAL;
>> +
>> +    guest_rules = guest_handle_for_field(list_hnd, v4vtables_rule_t, rules[0]);
>> +
>> +    nbrules = 0;
>> +    while ( nbrules < rules_list.nb_rules && ptr != &v4vtables_rules )
>> +    {
>> +        node = list_entry(ptr, struct v4vtables_rule_t, list);
>> +
>> +        if ( copy_to_guest(guest_rules, &node->rule, 1) )
>> +            break;
>
> Maybe also arrange to return -EFAULT?
>
>> +
>> +        guest_handle_add_offset(guest_rules, 1);
>> +
>> +        nbrules++;
>> +        ptr = ptr->next;
>> +    }
>> +
>> +    rules_list.nb_rules = nbrules;
>> +    if ( copy_field_to_guest(list_hnd, &rules_list, nb_rules) )
>> +        return -EFAULT;
>> +
>> +    return 0;
>> +}
>> +
>> +static size_t
>> +v4vtables_check(v4v_addr_t *src, v4v_addr_t *dst)
>> +{
>> +    struct list_head *ptr;
>> +    struct v4vtables_rule_node *node;
>> +    size_t ret = 0; /* Defaulting to ACCEPT */
>
> size_t? bool_t would be better. Also, having this function return the
> inverse of the rule encoding is just perverse.
>
> [...]
>> +    case V4VOP_tables_add:
>> +    {
>> +        uint32_t position = arg3;
>> +
>> +        XEN_GUEST_HANDLE_PARAM(v4vtables_rule_t) rule_hnd =
>> +            guest_handle_cast(arg1, v4vtables_rule_t);
>> +        rc = -EPERM;
>> +        if ( !is_hardware_domain(d) )
>> +            goto out;
>
> I think this ought to be an XSM check now, with XSM_PRIV to indicate
> that the default policy is to check IS_PRIV(). (Likewise for the
> other table ops.)

OK. Again, the v4vtables are high on our list of things needing an
overhaul.

>
> [...]
>> +struct keyhandler v4v_info_keyhandler =
>> +{
>> +    .diagnostic = 1,
>> +    .u.fn = dump_state,
>> +    .desc = "dump v4v states and interupt"
>
> s/interupt/interrupt/
>
> [...]
>> +/*
>> + * V4VOP_register_ring
>> + *
>> + * Registers a ring with Xen. If a ring with the same v4v_ring_id exists,
>> + * the hypercall will return -EEXIST.
>> + *
>> + * do_v4v_op(V4VOP_register_ring,
>> + *           XEN_GUEST_HANDLE(v4v_ring_t), XEN_GUEST_HANDLE(v4v_pfn_t),
>> + *           npage, 0)
>
> Maybe add a 'uint32_t ' before npage, to match the same style below?
>
> [...]
>> +/*
>> + * V4VOP_sendv
>> + *
>> + * Sends of list of buffer contained in iov.
>> + *
>> + * For each iov entry send iov_len bytes of iov_base to addr.dst, giving
>> + * src as the source address (xen will ignore src->domain and put your
>> + * domain in the actually message), xen first looks for a ring with id.addr==dst
>> + * and id.partner==sending_domain if that fails it looks for id.addr==dst and
>
> These lines are longer than 80 chars - can you wrap a little narrower please?
> Also s/actually/actual/.
>
>> + * id.partner==DOMID_ANY.
>> + * message_type is the 32 bit number used from the message
>> + * most likely V4V_MESSAGE_DGRAM or V4V_MESSAGE_STREAM. If insufficient space exists
>> + * it will return -EAGAIN and xen will twing the V4V_INTERRUPT when
>
> 'twing'? :)

Not my "verb". :)

>
> [...]
>> +/*
>> + * V4VOP_tables_add
>> + *
>> + * Insert a filtering rules after a given position.
>> + *
>
> Can you add a description of the semantics of the rules?
> E.g. does the first match or the last match or the tightest match dominate?
>
>> + * do_v4v_op(V4VOP_tables_add,
>> + *           XEN_GUEST_HANDLE(v4vtables_rule_t) rule,
>> + *           NULL,
>> + *           uint32_t position, 0)
>> + */
>> +#define V4VOP_tables_add 7
>> +
>> +/*
>> + * V4VOP_tables_del
>> + *
>> + * Delete a filtering rules at a position or the rule
>> + * that matches "rule".
>> + *
>> + * do_v4v_op(V4VOP_tables_del,
>> + *           XEN_GUEST_HANDLE(v4vtables_rule_t) rule,
>> + *           NULL,
>> + *           uint32_t position, 0)
>> + */
>> +#define V4VOP_tables_del 8
>
> Can you describe the calling convention a bit more fully? In particular
> that position should == -1 if you want to match by rule (though I think
> == 0 would make more sense), and that position == -1 && rule == NULL
> means to flush everything.

Yes, I will take all of your v4vtables concerns into account when we
work on it.

Again, thanks a bunch.

Ross

>
> Cheers,
>
> Tim.