From: Magnus Damm <magnus.damm@gmail.com>
To: Chris Mason <mason@suse.com>
Cc: Linus Torvalds <torvalds@osdl.org>,
	Mike Taht <mike.taht@timesys.com>, Matt Mackall <mpm@selenic.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	git@vger.kernel.org
Subject: Re: Mercurial 0.3 vs git benchmarks
Date: Tue, 26 Apr 2005 18:23:11 +0200
Message-ID: <aec7e5c305042609231a5d3f0@mail.gmail.com>
In-Reply-To: <200504261138.46339.mason@suse.com>

On 4/26/05, Chris Mason <mason@suse.com> wrote:
> On Tuesday 26 April 2005 11:09, Magnus Damm wrote:
> > On 4/26/05, Chris Mason <mason@suse.com> wrote:
> > > This agrees with my tests here, the time to apply patches is somewhat
> > > disk bound, even for the small 100 or 200 patch series.  The io should be
> > > coming from data=ordered, since the commits are still every 5 seconds or
> > > so.
> >
> > Yes, as long as you apply the patches to disk that is. I've hacked up
> > a small backend tool that applies patches to files kept in memory and
> > uses a modified Rabin-Karp search to match hunks. So you basically read
> > once and write once per file instead of moving data around for each
> > applied patch. But it needs two passes.
> >
> > And no, the source code for the entire Linux kernel is not kept in
> > memory - you need a smart frontend to manage the file cache. Drop me a
> > line if you are interested.
> 
> Sorry, you've lost me.  Right now the cycle goes like this:

Ehrm, maybe I'm way off. =)

> 1) patch reads patch file, reads source file, writes source file
> 2) update-cache reads source file, writes git file

Ok.

> Which of those writes are you avoiding?  We have a smart way to manage the
> cache already for the source files...the vm does pretty well.  There's
> nothing to manage for the git files.  For the apply a bunch of patches
> workload, they are write once, read never (except for the index).

Well, maybe I misunderstood everything, but I thought you were
applying a lot of patches and complaining that it took a long time
because of data=ordered.

When I applied a lot of patches to the kernel recently, the CPU load
dropped to zero after a while, the disk worked hard for a second or
two, and then things picked up again. My primitive guess is that the
ext3 journal had filled up. To work around this I started hacking on
this in-memory patcher.

In the cycle above, I'm trying to speed up step 1.
If the patches modify a source file multiple times (through multiple
hunks or multiple ---/+++ sections), then the lines below each hunk in
that file are moved multiple times. And if the source file is written
to disk after each hunk or ---/+++ section is applied, that generates a
lot of writes. Those writes can be avoided by splitting the patch
procedure into a first pass that analyzes the patches and a second pass
that applies them while keeping the source files in memory.
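
As a rough illustration of the two-pass idea (not the actual tool),
here is a minimal Python sketch. It assumes each hunk is simplified to
a (path, old_lines, new_lines) triple with a non-empty old block, and a
naive block search stands in for the modified Rabin-Karp matching. The
names find_block, analyze, apply_all, read_file and write_file are made
up for this example.

```python
from collections import defaultdict


def find_block(lines, block):
    """Return the first index where `block` occurs in `lines`, or -1.

    Naive search; a rolling-hash (Rabin-Karp style) search could replace it.
    """
    n = len(block)
    for i in range(len(lines) - n + 1):
        if lines[i:i + n] == block:
            return i
    return -1


def analyze(patches):
    """Pass 1: group all hunks by target file, preserving patch order."""
    by_file = defaultdict(list)
    for patch in patches:
        for path, old, new in patch:
            by_file[path].append((old, new))
    return by_file


def apply_all(patches, read_file, write_file):
    """Pass 2: read each file once, apply its hunks in memory, write once."""
    for path, hunks in analyze(patches).items():
        lines = read_file(path)
        for old, new in hunks:
            at = find_block(lines, old)
            if at < 0:
                raise ValueError(f"hunk does not apply to {path}")
            # Splice the replacement in place; nothing touches the disk here.
            lines[at:at + len(old)] = new
        write_file(path, lines)


# Hypothetical in-memory "disk" so the number of writes can be observed.
store = {"a.c": ["int x;", "int y;", "int z;"]}
writes = []


def read_file(path):
    return list(store[path])


def write_file(path, lines):
    store[path] = lines
    writes.append(path)


patches = [
    [("a.c", ["int x;"], ["long x;"])],
    [("a.c", ["int z;"], ["long z;"]), ("a.c", ["long x;"], ["long long x;"])],
]
apply_all(patches, read_file, write_file)
```

Here two patches with three hunks touch a.c, but the file is read once
and written once. A real patcher would also have to honor hunk line
numbers, context lines and fuzz, which this sketch ignores.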

But my rather trivial observation above is of course only suitable if
you have a lot of patches that should be applied and you are only
interested in the final version of the patched source files. If you
apply one patch at a time and import each source file as a new
revision then my little hack is probably not for you.

/ magnus
