From: Stefan Hajnoczi <stefanha@redhat.com>
To: Joannah Nanjekye <nanjekyejoannah@gmail.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC PATCH] remove numpy dependency
Date: Tue, 7 Nov 2017 10:52:44 +0000
Message-ID: <20171107105244.GD6809@stefanha-x1.localdomain>
In-Reply-To: <1509986155-4735-1-git-send-email-nanjekyejoannah@gmail.com>
On Mon, Nov 06, 2017 at 07:35:55PM +0300, Joannah Nanjekye wrote:
> Users tend to hit an ImportError when running analyze-migration.py due
> to the numpy dependency. numpy functionality isn't actually used, just
> binary serialization that the standard library 'struct' module already
> provides. Removing the dependency allows the script to run
> out-of-the-box.
>
> Signed-off-by: Joannah Nanjekye <nanjekyejoannah@gmail.com>
> ---
> scripts/analyze-migration.py | 23 +++++++++++++++--------
> 1 file changed, 15 insertions(+), 8 deletions(-)
>
> diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
> index 1455387..6175c99 100755
> --- a/scripts/analyze-migration.py
> +++ b/scripts/analyze-migration.py
> @@ -17,7 +17,6 @@
> # You should have received a copy of the GNU Lesser General Public
> # License along with this library; if not, see <http://www.gnu.org/licenses/>.
>
> -import numpy as np
> import json
> import os
> import argparse
> @@ -36,23 +35,29 @@ class MigrationFile(object):
> self.file = open(self.filename, "rb")
>
> def read64(self):
> - return np.asscalar(np.fromfile(self.file, count=1, dtype='>i8')[0])
dtype='>i8' is a 64-bit (8 bytes) big-endian signed integer according to
https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.dtypes.html#arrays-dtypes-constructing.
> + buffer = file.read(64)
This reads 64 bytes (not bits!). It should read 8 bytes.
> + return struct.unpack('>i16', buffer)[0]
This should be '>q' according to
https://docs.python.org/2.7/library/struct.html#format-strings.
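To make the bytes-versus-bits point concrete, here is a minimal sketch of what read64() could look like. It is written as a free function taking a file object `f` (a hypothetical stand-in for self.file) and uses io.BytesIO in place of a real migration stream:

```python
import io
import struct

def read64(f):
    """Read one big-endian signed 64-bit integer from a binary file object."""
    buf = f.read(8)                      # 8 bytes == 64 bits
    return struct.unpack('>q', buf)[0]   # unpack() returns a tuple

# Round-trip a known value through an in-memory stream.
sample = io.BytesIO(struct.pack('>q', -2))
value = read64(sample)
```

struct.calcsize('>q') returns 8, which is a quick way to check that the read size and the format string agree.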
>
> def read32(self):
> - return np.asscalar(np.fromfile(self.file, count=1, dtype='>i4')[0])
> + buffer = file.read(32)
read(4)
> + return struct.unpack('>i8', buffer)[0]
'>i'
>
> def read16(self):
> - return np.asscalar(np.fromfile(self.file, count=1, dtype='>i2')[0])
> + buffer = file.read(16)
read(2)
> + return struct.unpack('>i4', buffer)[0]
'>h'
>
> def read8(self):
> - return np.asscalar(np.fromfile(self.file, count=1, dtype='>i1')[0])
> + buffer = file.read(8)
read(1)
> + return struct.unpack('>i2', buffer)[0]
'b' (no byte-order prefix is needed for a single byte)
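Putting the four corrections together, the helpers could share a single lookup table. This is only a sketch: the `Reader` class and the `INT_FORMATS` name are made up for illustration and stand in for MigrationFile, and the read size is derived from the format with struct.calcsize() so the two cannot drift apart:

```python
import io
import struct

# Width in bits -> struct format for a big-endian signed integer.
INT_FORMATS = {8: 'b', 16: '>h', 32: '>i', 64: '>q'}

class Reader(object):
    """Hypothetical stand-in for MigrationFile, wrapping any binary file object."""

    def __init__(self, fileobj):
        self.file = fileobj

    def _read_int(self, bits):
        fmt = INT_FORMATS[bits]
        buf = self.file.read(struct.calcsize(fmt))  # 1, 2, 4 or 8 bytes
        return struct.unpack(fmt, buf)[0]

    def read64(self):
        return self._read_int(64)

    def read32(self):
        return self._read_int(32)

    def read16(self):
        return self._read_int(16)

    def read8(self):
        return self._read_int(8)

# Example: decode four consecutive fields from an in-memory buffer.
reader = Reader(io.BytesIO(struct.pack('>qihb', 1, -2, 3, -4)))
values = [reader.read64(), reader.read32(), reader.read16(), reader.read8()]
```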
>
> def readstr(self, len = None):
> + read_format = str(len) + 'd'
> if len is None:
> len = self.read8()
> if len == 0:
> return ""
> - return np.fromfile(self.file, count=1, dtype=('S%d' % len))[0]
> + buffer = file.read(8)
> + return struct.unpack(read_format, buffer[0:(0 + struct.calcsize(read_format))])
According to the numpy documentation 'S' produces raw bytes (not a
unicode string). To get the raw bytes we just need self.file.read(len).
The struct module isn't needed.

Something like this should work:

    if len is None:
        len = self.read8()
    if len == 0:
        return ""
    return self.file.read(len).split(b'\0')[0]
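For reference, a self-contained version of that idea which can be tried outside the script. The function takes a file object `f` and a `length` argument (both names are illustrative, not from the script), and it returns b"" rather than "" for the empty case so the result is always bytes:

```python
import io
import struct

def readstr(f, length=None):
    """Read a byte string, optionally prefixed by a one-byte length."""
    if length is None:
        # Same encoding as read8(): a single signed byte.
        length = struct.unpack('b', f.read(1))[0]
    if length == 0:
        return b""
    # Keep only the bytes before the first NUL, if there is one.
    return f.read(length).split(b'\0')[0]

prefixed = readstr(io.BytesIO(b'\x03abcXYZ'))   # length taken from the stream
padded = readstr(io.BytesIO(b'ab\0cd'), 5)      # explicit length, NUL inside
empty = readstr(io.BytesIO(b'\x00'))            # zero-length string
```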
>
> def readvar(self, size = None):
> if size is None:
> @@ -303,8 +308,10 @@ class VMSDFieldInt(VMSDFieldGeneric):
>
> def read(self):
> super(VMSDFieldInt, self).read()
> - self.sdata = np.fromstring(self.data, count=1, dtype=(self.sdtype))[0]
> - self.udata = np.fromstring(self.data, count=1, dtype=(self.udtype))[0]
> + buffer = file.read(self.data)
This statement doesn't make sense. According to the Python
documentation:

  read([size]) -> read at most size bytes, returned as a string.

self.data *is* the buffer. There's no need to call file.read().
> + read_format = self.sdtype
Unused.
> + self.sdata = struct.unpack(self.sdtype, buffer[0:(0 + struct.calcsize(self.sdtype))])
struct.calcsize() is unnecessary since self.size already contains the
size in bytes.

struct.unpack returns a tuple and we need the first element, so it
should be:

    self.sdata = struct.unpack(...)[0]
> + self.sdata = struct.unpack(self.udtype, buffer[0:(0 + struct.calcsize(self.udtype))])
The data type specifiers are incorrect because they are still numpy
dtype strings, and the struct module uses a different syntax than numpy:

    self.sdtype = '>i%d' % self.size
    self.udtype = '>u%d' % self.size

The struct data type specifiers can be looked up like this:

    sdtypes = {1: 'b', 2: '>h', 4: '>i', 8: '>q'}
    udtypes = {1: 'B', 2: '>H', 4: '>I', 8: '>Q'}
    self.sdtype = sdtypes[self.size]
    self.udtype = udtypes[self.size]
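As a rough sketch of how VMSDFieldInt.read() could then decode its buffer, here is a standalone function. The `decode_int` name and its `data`/`size` arguments are hypothetical, standing in for self.data and self.size. Note that the signed and unsigned results are kept under separate names, as the original code did with sdata and udata:

```python
import struct

SDTYPES = {1: 'b', 2: '>h', 4: '>i', 8: '>q'}   # signed, big-endian
UDTYPES = {1: 'B', 2: '>H', 4: '>I', 8: '>Q'}   # unsigned, big-endian

def decode_int(data, size):
    """Interpret the first `size` bytes of `data` as signed and unsigned."""
    sdata = struct.unpack(SDTYPES[size], data[:size])[0]
    udata = struct.unpack(UDTYPES[size], data[:size])[0]
    return sdata, udata

# 0xfffe is -2 as a signed 16-bit value and 65534 as an unsigned one.
signed, unsigned = decode_int(b'\xff\xfe', 2)
```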