Buildroot Archive on lore.kernel.org
* [Buildroot] autobuild: group similar failures
From: Yann E. MORIN @ 2012-05-14 21:22 UTC
  To: buildroot

Hello All!

Today, the autobuild results are displayed on http://autobuild.buildroot.org/
which provides access to all the build logs, whether the build succeeded
or failed.

Although successful builds are nice, the interesting logs are those that
relate to build failures. Having the ability to sort and categorise those
failed logs would hopefully help in solving the most common build failures.

For example, in the recent past, there were two related alsa-libs build
failures on PowerPC:
  http://autobuild.buildroot.net/results/a8471598a379e5dbecef539479b968947474ef22/
  http://autobuild.buildroot.net/results/05b306ee2fa909420e4bc1f9faa7a35d17d786a7/

I did not look in detail, but it may well be that there are other similar
build failures of alsa-libs on PowerPC. Solving the error in one of those
builds would also solve the other(s).

So, we need a heuristic to detect how similar two failed build logs are, and
eventually group those failures together. Below is one such simple heuristic:

 1- extract the last 100 lines-or-so from two build logs (already done)
 2- sort each extract
 3- diffstat the sorted extracts
 4- the lower the number of different lines, the higher the similarity

For example, for these two build logs:
  diff -du a847159-100.sorted 05b306e-100.sorted |diffstat 
  05b306e-100.sorted |   18 +++++++++---------
   1 file changed, 9 insertions(+), 9 deletions(-)

That is 18 changed lines out of 200, for a difference score of 9%. This is
a really rough and dumb metric, but one that is easy to compute, and we can
easily refine it a little bit (e.g. by using more lines in the extracts, or
by munging some well-known patterns).
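
The four steps above can be sketched in Python as follows. This is only a
rough illustration, not an existing autobuild script: the names `tail_lines`
and `difference_score` are made up for the example, and it counts differing
lines with a multiset comparison instead of calling diff and diffstat. For
sorted extracts the two approaches should give the same count, because the
lines common to two sorted files always appear in the same order.

```python
from collections import Counter


def tail_lines(log_text, n=100):
    """Return the last n lines of a build log (hypothetical helper)."""
    return log_text.splitlines()[-n:]


def difference_score(log_a, log_b, n=100):
    """Fraction of lines that differ between the tails of two logs.

    Line order is ignored, which mirrors sorting each extract before
    diffing it. 0.0 means identical extracts, 1.0 means nothing in common.
    """
    a = Counter(tail_lines(log_a, n))
    b = Counter(tail_lines(log_b, n))
    deletions = sum((a - b).values())   # lines only in the first extract
    insertions = sum((b - a).values())  # lines only in the second extract
    total = sum(a.values()) + sum(b.values())
    return (insertions + deletions) / total if total else 0.0
```

With 9 insertions and 9 deletions over two 100-line extracts, this returns
18 / 200 = 0.09, the same 9% as computed by hand above.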

Setting up the scripts that gather all failed logs, and compute those
similarity scores would be relatively easy (and I can do it!).
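
One possible way to turn pairwise scores into groups is sketched below, under
assumptions that are not part of the proposal itself: the 0.2 threshold is
arbitrary, `group_by_similarity` is a made-up name, and `score` stands for
any function that returns a difference between 0 and 1 for two extracts. Two
logs are linked when their score is at or below the threshold, and linked
logs end up in the same group (single-link grouping with a small union-find).

```python
from itertools import combinations


def group_by_similarity(extracts, score, threshold=0.2):
    """Group build ids whose extracts are similar enough.

    extracts:  dict mapping a build id to its log extract
    score:     callable(extract_a, extract_b) -> difference in [0, 1]
    threshold: maximum difference for two builds to be linked (assumed value)
    """
    parent = {build_id: build_id for build_id in extracts}

    def find(build_id):
        # Follow parent links up to the group representative,
        # shortening the path along the way.
        while parent[build_id] != build_id:
            parent[build_id] = parent[parent[build_id]]
            build_id = parent[build_id]
        return build_id

    # Compare every pair once; link the pair when it is similar enough.
    for a, b in combinations(extracts, 2):
        if score(extracts[a], extracts[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for build_id in extracts:
        groups.setdefault(find(build_id), []).append(build_id)
    return list(groups.values())
```

Comparing every pair is quadratic in the number of failed builds, which is
probably acceptable for a few hundred logs but would need rethinking for
many more.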

We could of course use other metrics, such as true text similarity as used
for plagiarism detection, but they would IMHO be much more involved.

Ready to read comments and suggestions! ;-)

Regards,
Yann E. MORIN.

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'


* [Buildroot] autobuild: group similar failures
From: Arnout Vandecappelle @ 2012-05-15  6:32 UTC
  To: buildroot

On 05/14/12 23:22, Yann E. MORIN wrote:
> Although successful builds are nice, the interesting logs are those that
> relate to build failures. Having the ability to sort and categorise those
> failed logs would hopefully help in solving the most common build failures.

  +1

> For example, in the recent past, there were two related alsa-libs build
> failures on PowerPC:
>    http://autobuild.buildroot.net/results/a8471598a379e5dbecef539479b968947474ef22/
>    http://autobuild.buildroot.net/results/05b306ee2fa909420e4bc1f9faa7a35d17d786a7/
>
> I did not look in detail, but it may well be that there are other similar
> build failures of alsa-libs on PowerPC. Solving the error in one of those
> builds would also solve the other(s).
>
> So, we need a heuristic to detect how similar two failed build logs are, and
> eventually group those failures together. Below is one such simple heuristic:
>
>   1- extract the last 100 lines-or-so from two build logs (already done)
>   2- sort each extract
>   3- diffstat the sorted extracts
>   4- the lower the number of different lines, the higher the similarity

  A much simpler heuristic would be to group failures of the same package
together. It's very likely that they have the same cause. The diffstat approach
has the disadvantage that two logs of the same failure can still differ a lot
if the compiler path differs (for packages that don't do a quiet make like
alsa-libs does).
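
  As a minimal sketch of the per-package grouping, assuming the name of the
failing package is already known for each failed build (how to extract it
from a log is not covered here, and `group_by_package` is a made-up name):

```python
from collections import defaultdict


def group_by_package(failures):
    """Group failed builds by the package that failed.

    failures: iterable of (build_id, package) pairs; the package name is
    assumed to have been determined beforehand for each failed build.
    Returns a dict mapping each package to the list of its failed builds.
    """
    groups = defaultdict(list)
    for build_id, package in failures:
        groups[package].append(build_id)
    return dict(groups)
```

  With the two builds mentioned above, both recorded as alsa-libs failures,
this would put a847159 and 05b306e in the same group without looking at the
log contents at all.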

  Regards,
  Arnout

-- 
Arnout Vandecappelle                               arnout at mind be
Senior Embedded Software Architect                 +32-16-286540
Essensium/Mind                                     http://www.mind.be
G.Geenslaan 9, 3001 Leuven, Belgium                BE 872 984 063 RPR Leuven
LinkedIn profile: http://www.linkedin.com/in/arnoutvandecappelle
GPG fingerprint:  7CB5 E4CC 6C2E EFD4 6E3D A754 F963 ECAB 2450 2F1F
