From mboxrd@z Thu Jan 1 00:00:00 1970 From: Christopher Clark Subject: Re: possible grant table issue Date: Fri, 15 Jul 2005 11:58:47 -0700 Message-ID: References: Reply-To: cwc22@cam.ac.uk Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Return-path: In-Reply-To: Content-Disposition: inline List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Stefan Berger Cc: xen-devel@lists.xensource.com List-Id: xen-devel@lists.xenproject.org Hi Stefan Good catch. I vote apply. Christopher On 7/15/05, Stefan Berger wrote: > =20 > Hello Christopher,=20 > =20 > thanks for the explanation. I actually see the strange occurrence in a > user domain where I have grant table-enabled backend drivers. It happens = if > another domain connecting to the backend drivers is killed and possibly d= ue > to some timing issues with XEN-D the unmapping of the grant tables does n= ot > happen through the HYPERVISOR_grant_table_op first, but through some > cleaning up after the dying domain which places a call to=20 > =20 > int > gnttab_check_unmap( > struct domain *rd, struct domain *ld, unsigned long frame, int readon= ly) > =20 > [called from put_page_from_l1e()] and selects the wrong handle (=3D0) the= re. > Only after this happens does the backend receive the destroy message from > XEN-D and wants to clean up the grant table with the hypervisor call usin= g > the correct handle, but fails.=20 > =20 > The fix for this is attached. The problem is that the selected map is not > tested whether it was created for the (remote) domain 'rd'.=20 > =20 > Stefan=20 > =20 > Signed-off-by: Stefan Berger =20 > =20 >=20 > =20 > =20 > =20 > =20 > xen-devel-bounces@lists.xensource.com wrote on 07/13/2005 > 04:16:10 AM: >=20 > =20 > > Hi Stefan > >=20 > > Here's how it works: > >=20 > > maptrack_head is the index of the next free entry in the maptrack > > array. A handle is just an index into the array. > > = =20 > =20 > > Ignoring the shift, grant_table->maptrack array entries that are not i= n > use > > contain the index of the next free entry, forming a list and > > terminating with a sentinel value. When entries are freed, the array > > contents are overwritten in order to add the entry to the free list. > > = =20 > =20 > > Thus the grant_table->maptrack[index] entries are initialised to an > increasing > > series starting from 1 at index 0, and maptrack_head =3D 0. The sentin= el > > value is the number of elements in the array. > > = =20 > =20 > > So the sequence you describe will result in: > > = =20 > =20 > > get_maptrack_handle(t) > > =3D> maptrack_head is now 1 > > =3D> returns handle of 0 > > =3D> t->maptrack_entry[0] in use > > = =20 > =20 > > get_maptrack_handle(t) > > =3D> maptrack_head is now 2 > > =3D> returns handle of 1 > > =3D> t->maptrack_entry[1] in use > > = =20 > =20 > > get_maptrack_handle(t) > > =3D> maptrack_head is now 3 > > =3D> returns handle of 2 > > =3D> t->maptrack_entry[2] in use > > = =20 > =20 > > put_maptrack_handle(t, 2) > > =3D> t->maptrack_entry[2] points to next free index 3 > > =3D> maptrack_head is now 2 > > = =20 > =20 > > get_maptrack_handle(t) > > =3D> maptrack_head is now 3 > > =3D> returns handle of 2 > > =3D> t->maptrack_entry[2] in use > >=20 > > You shouldn't be returned a handle of 0 when getting a handle > > immediately after freeing handle 2. > >=20 > > Christopher > >=20 > >=20 > > On 7/12/05, Stefan Berger wrote: > > > Hello! > > >=20 > > > Attached is a patch that dumps some debugging output for the block > > > interface backend. The reason why I am posting this patch is due to = the > > > somewhat strange assignments of the handles that are returned from t= he > > > HYPERVISOR_grant_table_op. I am stopping short of saying it's a bug, > > > because I don't know the code well enough, but when looking at the > > > hypervisor code I see some place where I doubt that this is right. > > > Particularly one should try the following: > > >=20 > > > Create user domains that use the block interfaces. > > >=20 > > > 1st user domain witll be assigned handle 0x0. - should be ok > > > 2nd user domain will be assigned handle 0x1. - should be ok > > > 3rd user domain will be assigned handle 0x2. - should be ok > > >=20 > > > (handle numbers have obviously been increasing so far) > > >=20 > > > bring down 3rd user domain - free'ed handle will be 0x2 - should be = ok > > >=20 > > > create 3rd user domain again - will be assigned handle 0x0 - this is > not > > > what I would expect. > > >=20 > > > (the code that's causing this is called when handle 0x2 was free'ed > > > static inline void > > > put_maptrack_handle( > > > grant_table_t *t, int handle) > > > { > > > t->maptrack[handle].ref_and_flags =3D > t->maptrack_head << > > > MAPTRACK_REF_SHIFT; > > > t->maptrack_head =3D handle; > > > ^^^^^^ > > > t->map_count--; > > > } > > > ) > > >=20 > > >=20 > > >=20 > > > Now when I look at xen/common/grant_tables.c I see how the handles = are > > > used in : > > >=20 > > >=20 > > > static int > > > __gnttab_map_grant_ref( > > > gnttab_map_grant_ref_t *uop, > > > unsigned long *va) > > > { > > > [...] // much omitted > > >=20 > > > if ( 0 <=3D ( rc =3D __gnttab_activate_grant_ref( ld, led, rd, r= ef, > > > dev_hst_ro_flags, > > > host_virt_addr, > > > &frame))) > > > { > > > /* > > > * Only make the maptrack live _after_ writing the pte, in c= ase > we > > >=20 > > > * overwrite the same frame number, causing a maptrack walk = to > > > find it > > > */ > > > ld->grant_table->maptrack[handle].domid =3D dom; > > > ^^^^^^ > > > =20 > ld->grant_table->maptrack[handle].ref_and_flags > > > ^^^^^^ > > > =3D (ref << MAPTRACK_REF_SHIFT) | > > > (dev_hst_ro_flags & MAPTRACK_GNTMAP_MASK); > > >=20 > > > (void)__put_user(frame, &uop->dev_bus_addr); > > >=20 > > > if ( dev_hst_ro_flags & GNTMAP_host_map ) > > > *va =3D host_virt_addr; > > >=20 > > > (void)__put_user(handle, &uop->handle); > > >=20 > > >=20 > > > I think this newly assigned handle of '0' (for the re-created 3rd us= er > > > domain) is overwriting some previously assign array entry for the fi= rst > > > user domain. Please someone who knows have a look at this. All this = is > > > happening in the domain where the blockdevice backend is located. > > >=20 > > > Stefan > > >=20 > > >=20 > > > Signed-off-by : Stefan Berger > > >=20 > > >=20 > > >=20 > > > _______________________________________________ > > > Xen-devel mailing list > > > Xen-devel@lists.xensource.com > > > http://lists.xensource.com/xen-devel > > >=20 > > >=20 > > >=20 > > > > >=20 > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xensource.com > > http://lists.xensource.com/xen-devel > =20 >