From mboxrd@z Thu Jan  1 00:00:00 1970
From: Marek =?utf-8?Q?Marczykowski-G=C3=B3recki?=
 <marmarek@invisiblethingslab.com>
Subject: Re: Python 3 bindings
Date: Tue, 21 Feb 2017 14:03:00 +0100
Message-ID: <20170221130300.GF1146@mail-itl>
References: <20170217123601.GF12171@mail-itl>
 <20170220171844.ifeumcygyk4hglb6@citrix.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============5401155132867826320=="
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <20170220171844.ifeumcygyk4hglb6@citrix.com>
List-Unsubscribe: <https://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
 <mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <https://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
 <mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Errors-To: xen-devel-bounces@lists.xen.org
Sender: "Xen-devel" <xen-devel-bounces@lists.xen.org>
To: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>, xen-devel <xen-devel@lists.xen.org>
List-Id: xen-devel@lists.xenproject.org


--===============5401155132867826320==
Content-Type: multipart/signed; micalg=pgp-sha256;
	protocol="application/pgp-signature"; boundary="SWTRyWv/ijrBap1m"
Content-Disposition: inline


--SWTRyWv/ijrBap1m
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Feb 20, 2017 at 05:18:44PM +0000, Wei Liu wrote:
> On Fri, Feb 17, 2017 at 01:36:01PM +0100, Marek Marczykowski-G=C3=B3recki=
 wrote:
> > Hi,
> >=20
> > I'm adjusting python bindings to work on python3 too. This will require
> > few #if in the code (to compile for both python2 and python3), but it
> > isn't that bad. But there are some major changes in python3, which
> > require some decision about the bindings API:
> >=20
> > 1. Python3 has no longer separate 'int' and 'long' type - old 'long'
> > type was renamed to 'int' (but on C-API level, it uses PyLong_*). I see
> > two options:
> >   - switch to PyLong_* everywhere, including python2 bindings - this
> >     makes the code much cleaner, but it is an API change in python2
> >   - switch to PyLong_* only for python3 - this will introduce some
> >     #ifdefs, but python2 API will be unchanged
>=20
> Could you be more specific? Like, provide a code snippet?

Here is a version that has only been compile-tested:
https://github.com/marmarek/xen/tree/python3

It uses PyLong_* only for python3. Here is how it looks in the code
(I've skipped the s/PyInt_/PyLongOrInt_/ changes for readability):

-----8<-----
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -34,6 +34,17 @@
=20
 #define FLASK_CTX_LEN 1024
=20
+/* Python 2 compatibility */
+#if PY_VERSION_HEX >=3D 0x03000000
+#define PyLongOrInt_FromLong PyLong_FromLong
+#define PyLongOrInt_Check PyLong_Check
+#define PyLongOrInt_AsLong PyLong_AsLong
+#else
+#define PyLongOrInt_FromLong PyInt_FromLong
+#define PyLongOrInt_Check PyInt_Check
+#define PyLongOrInt_AsLong PyInt_AsLong
+#endif
+
 static PyObject *xc_error_obj, *zero;
=20
 typedef struct {
--- a/tools/python/xen/lowlevel/xs/xs.c
+++ b/tools/python/xen/lowlevel/xs/xs.c
@@ -43,6 +43,14 @@
 #define PKG "xen.lowlevel.xs"
 #define CLS "xs"
=20
+#if PY_VERSION_HEX < 0x03000000
+/* Python 2 compatibility */
+#define PyLong_FromLong PyInt_FromLong
+#undef PyLong_Check
+#define PyLong_Check PyInt_Check
+#define PyLong_AsLong PyInt_AsLong
+#endif
+
 static PyObject *xs_error;
=20
 /** Python wrapper round an xs handle.
-----8<-----

>=20
> >=20
> > 2. Python3 has no longer separate 'str' and 'unicode' type, new 'str' is
> > the same as 'unicode' (PyUnicode_* at C-API level). For things not
> > really unicode-aware, 'bytes' type should be used. On the other hand, in
> > python2 'bytes' type was the same as 'str'.
> > This affects various places, where in most cases 'bytes' type is
> > appropriate (for example cpuid). But I'm not sure about xenstore paths -
> > those should also be 'bytes', or maybe 'unicode' (which is implicitly
> > using 'utf-8' encoding)? I think the only reason to use 'unicode' is
>=20
> According to docs/txt/misc/xenstore.txt, paths should be ASCII
> alphanumerics plus four punctuation characters. Not sure if this is
> relevant to what you describe.

It's easy to make a function accept both 'bytes' and 'unicode'. The
question is what the return type should be (read_watch, ls etc). Given
the limited character set used there, I'm in favor of 'unicode': it is
easier to handle, and we shouldn't hit any unicode decoding problems.
Maybe the same should apply to path arguments (use 'unicode')? Most
file-handling methods in python3 use 'unicode' for paths, if that
matters.
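
To make this concrete, here is a rough sketch in plain Python of the
behaviour I have in mind. It is not the actual binding code (which is
C), and the helper names are made up for illustration; they don't
exist in xen.lowlevel.xs. It assumes paths stay within ASCII, as
xenstore.txt requires:

```python
# Sketch only: these helpers are hypothetical, not part of
# xen.lowlevel.xs. They mimic "accept 'bytes' or 'unicode',
# return 'unicode'" for xenstore paths.

def path_to_bytes(path):
    """Accept 'bytes' or 'unicode'; return 'bytes' for the C layer."""
    if isinstance(path, bytes):
        return path
    # Paths are assumed ASCII-only, so encoding cannot lose data;
    # anything else raises UnicodeEncodeError.
    return path.encode('ascii')


def path_to_text(raw):
    """What read_watch, ls etc. would return: always 'unicode'."""
    return raw.decode('ascii')
```

With something like this, a caller could pass either '/local/domain/0'
or b'/local/domain/0' and would always get 'unicode' back.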

> > convenience for API users - in python3 if you write 'some string' it
> > will be unicode type, to create bytes data you need to write b'some
> > string'.
> > As for python2, it should definitely be still 'str'/'bytes' type.
> >=20
> > There is one more little detail - build process. Here I'm going to
> > follow popular standard - use $(PYTHON) variable - if that points to
> > python3, build for python3. Actually this means no change in the current
> > makefile. If someone want to build for both python2 and python3, will
> > need to call the build twice - at packaging level.

--=20
Best Regards,
Marek Marczykowski-G=C3=B3recki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

--SWTRyWv/ijrBap1m
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQEcBAEBCAAGBQJYrDqFAAoJENuP0xzK19csXNcH/iR2khawcS4LN9kc+LYa+Yz3
IGKXcFTqDQaWO4ssJJpmU+17ywGYuwz0fnp1gwuCt+Wnt6Qk8NzAiji9XLvxrPET
ziC5HRm9GEH1qKjwzzdgyTASlocA8tC0xnZ0M6QSooib8VLQPn9WNQvA3v43ZQpK
aCjTTzUQMkpbnQ8Tevpa+V80yktacqm/pdawR7yzMdqDAXPl+ceCfWvPxv4v3zO3
m23iOtToscg7hBOMD7be7+GPKM5R2bHssaYmlEGPtMGLn6Y8Z+/+IdGXRLksbT5z
pcydZB7jpMVyb9cPWUMuq28Fv/Zud47Wg5m9F+OO1bB2qKyEGhzouOzE6HQcRkw=
=uvP5
-----END PGP SIGNATURE-----

--SWTRyWv/ijrBap1m--


--===============5401155132867826320==--