qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Joannah Nanjekye <nanjekyejoannah@gmail.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC PATCH] remove numpy dependency
Date: Tue, 7 Nov 2017 10:52:44 +0000	[thread overview]
Message-ID: <20171107105244.GD6809@stefanha-x1.localdomain> (raw)
In-Reply-To: <1509986155-4735-1-git-send-email-nanjekyejoannah@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4541 bytes --]

On Mon, Nov 06, 2017 at 07:35:55PM +0300, Joannah Nanjekye wrote:
> Users tend to hit an ImportError when running analyze-migration.py due
> to the numpy dependency.  numpy functionality isn't actually used, just
> binary serialization that the standard library 'struct' module already
> provides.  Removing the dependency allows the script to run
> out-of-the-box.
> 
> Signed-off-by: Joannah Nanjekye <nanjekyejoannah@gmail.com>
> ---
>  scripts/analyze-migration.py | 23 +++++++++++++++--------
>  1 file changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
> index 1455387..6175c99 100755
> --- a/scripts/analyze-migration.py
> +++ b/scripts/analyze-migration.py
> @@ -17,7 +17,6 @@
>  # You should have received a copy of the GNU Lesser General Public
>  # License along with this library; if not, see <http://www.gnu.org/licenses/>.
>  
> -import numpy as np
>  import json
>  import os
>  import argparse
> @@ -36,23 +35,29 @@ class MigrationFile(object):
>          self.file = open(self.filename, "rb")
>  
>      def read64(self):
> -        return np.asscalar(np.fromfile(self.file, count=1, dtype='>i8')[0])

dtype='>i8' is a 64-bit (8 bytes) big-endian signed integer according to
https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.dtypes.html#arrays-dtypes-constructing.

> +        buffer = file.read(64)

This reads 64 bytes (not bits!).  It should read 8 bytes.

> +        return struct.unpack('>i16', buffer)[0]

This should be '>q' according to
https://docs.python.org/2.7/library/struct.html#format-strings.

>  
>      def read32(self):
> -        return np.asscalar(np.fromfile(self.file, count=1, dtype='>i4')[0])
> +        buffer = file.read(32)

read(4)

> +        return struct.unpack('>i8', buffer)[0]

'>i'

>  
>      def read16(self):
> -        return np.asscalar(np.fromfile(self.file, count=1, dtype='>i2')[0])
> +        buffer = file.read(16)

read(2)

> +        return struct.unpack('>i4', buffer)[0]

'>h'

>  
>      def read8(self):
> -        return np.asscalar(np.fromfile(self.file, count=1, dtype='>i1')[0])
> +        buffer = file.read(8)

read(1)

> +        return struct.unpack('>i2', buffer)[0]

'b'

>  
>      def readstr(self, len = None):
> +        read_format = str(len) + 'd'
>          if len is None:
>              len = self.read8()
>          if len == 0:
>              return ""
> -        return np.fromfile(self.file, count=1, dtype=('S%d' % len))[0]
> +        buffer = file.read(8)
> +        return struct.unpack(read_format, buffer[0:(0 + struct.calcsize(read_format))])

According to the numpy documentation 'S' produces raw bytes (not a
unicode string).  To get the raw bytes we just need file.read(len).  The
struct module isn't needed.

Something like this should work:

  if len is None:
      len = self.read8()
  if len == 0:
      return ""
  return file.read(len).split(b'\0')[0]

>  
>      def readvar(self, size = None):
>          if size is None:
> @@ -303,8 +308,10 @@ class VMSDFieldInt(VMSDFieldGeneric):
>  
>      def read(self):
>          super(VMSDFieldInt, self).read()
> -        self.sdata = np.fromstring(self.data, count=1, dtype=(self.sdtype))[0]
> -        self.udata = np.fromstring(self.data, count=1, dtype=(self.udtype))[0]
> +        buffer = file.read(self.data)

This statement doesn't make sense.  According to the Python
documentation:

  read([size]) -> read at most size bytes, returned as a string.

self.data *is* the buffer.  There's no need to call file.read().

> +        read_format = self.sdtype

Unused.

> +        self.sdata = struct.unpack(self.sdtype, buffer[0:(0 + struct.calcsize(self.sdtype))])

struct.calcsize() is unnecessary since self.size already contains the
size in bytes.

struct.unpack returns a tuple and we need the first element, so it
should be:

  self.sdata = struct.unpack(...)[0]

> +        self.sdata = struct.unpack(self.udtype, buffer[0:(0 + struct.calcsize(self.udtype))])

The data type specifiers are incorrect because the struct module uses
different syntax than numpy:

  self.sdtype = '>i%d' % self.size
  self.udtype = '>u%d' % self.size

The struct data type specifiers can be looked up like this:

  sdtypes = {1: 'b', 2: '>h', 4: '>i', 8: '>q'}
  udtypes = {1: 'B', 2: '>H', 4: '>I', 8: '>Q'}
  self.sdtype = sdtypes[self.size]
  self.udtype = udtypes[self.size]

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

      parent reply	other threads:[~2017-11-07 10:53 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-06 16:35 [Qemu-devel] [RFC PATCH] remove numpy dependency Joannah Nanjekye
2017-11-06 16:47 ` no-reply
2017-11-07 10:52 ` Stefan Hajnoczi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171107105244.GD6809@stefanha-x1.localdomain \
    --to=stefanha@redhat.com \
    --cc=nanjekyejoannah@gmail.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).