From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yann E. MORIN
Date: Tue, 8 Jan 2019 20:27:34 +0100
Subject: [Buildroot] [PATCH 11/19] support: introduce new format for packages-file-list files
In-Reply-To:
References:
Message-ID: <20190108192734.GL19623@scaer>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: buildroot@busybox.net

Thomas, All,

On 2019-01-08 16:07 +0100, Thomas De Schampheleire spake thusly:
> On Mon, 7 Jan 2019 at 23:06, Yann E. MORIN
> () wrote:
> > The existing format for the packages-files lists has two drawbacks:
[--SNIP--]
> > AS such, introduce a new format for those files, that solves both
> > issues.
[--SNIP--]
> > +# Read the binary-opened file object f with \0\n separated records (aka lines).
> > +# Highly inspired by:
> > +# https://stackoverflow.com/questions/19600475/how-to-read-records-terminated-by-custom-separator-from-file-in-python
> > +def _readlines0n(f):
> > +    buf = b''
> > +    while True:
> > +        newbuf = f.read(1048576)
> I would find 1024 * 1024 more readable.

We definitely have different eyesight, as I find this easier to grok! ;-)

I should note that we absolutely do not care about the size of the
buffer. We could very well read the file whole, or use smaller chunks
(the original inspiration read 4 KiB blocks).

But I can switch to 1024*1024.

[--SNIP--]
> >  # Iterate on all records of the packages-file-list file passed as filename
> >  # Returns an iterator over a list of dictionaries. Each dictionary contains
> >  # these keys (others maybe added in the future):
> > @@ -12,11 +32,11 @@ import subprocess
> >  #  'pkg': the last package that installed that file
> >  def parse_pkg_file_list(path):
> >      with open(path, 'rb') as f:
> I now understand why you read as binary.

Here we really do need to read the file as binary, but we already wanted
binary mode in the previous patch, because filenames can be any byte
sequence.

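For readers following the thread, here is a minimal sketch of the kind of
chunked reader discussed above. It is not the code from the patch (whose
body is snipped in the quote): the name read_records and its sep and
chunk_size parameters are hypothetical, chosen only for illustration. It
assumes records are terminated by \0\n and shows that the chunk size only
affects how often we call read(), not the result.

```python
import io


def read_records(f, sep=b'\x00\n', chunk_size=1024 * 1024):
    """Yield sep-terminated records from the binary file object f.

    Hypothetical helper, for illustration only.
    """
    buf = b''
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            # End of file: emit a trailing record that lacks a separator.
            if buf:
                yield buf
            return
        buf += chunk
        parts = buf.split(sep)
        # The last element is either empty or an incomplete record (it
        # may even end with half a separator), so keep it buffered.
        buf = parts.pop()
        for rec in parts:
            yield rec


# Two records; the second filename contains a newline on purpose.
data = b'./usr/bin/foo\x00foo\x00\n./etc/b\nar\x00bar\x00\n'
small = list(read_records(io.BytesIO(data), chunk_size=3))
large = list(read_records(io.BytesIO(data)))
```

Splitting each record on b'\x00' then yields the individual fields, and
both chunk sizes give the same list of records, which is why the exact
value passed to read() is a matter of taste.
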
> > -        for rec in f.readlines():
> > -            l = rec.split(',0')
> I still think this ',0' was wrong.

Yes it was, I'll fix it (well, I already fixed it now).

> > +        for rec in _readlines0n(f):
> > +            srec = rec.split(b'\x00')
> >              d = {
> > -                'file': l[0],
> > -                'pkg': l[1],
> > +                'file': srec[0],
> > +                'pkg': srec[1],
> and I now see how the swap in a previous commit could go unnoticed in
> your testing :-)

Yeah. Thanks for spotting it. I'll still fix the interim patches to be
consistent.

Thank you! :-)

Regards,
Yann E. MORIN.

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'