git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Thomas Rast <trast@student.ethz.ch>
Cc: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>,
	"Eric Herman" <eric@freesa.org>,
	git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>
Subject: Re: [PATCH v2 0/3] grep multithreading and scaling
Date: Mon, 5 Dec 2011 19:40:12 -0500	[thread overview]
Message-ID: <20111206004012.GA12760@sigill.intra.peff.net> (raw)
In-Reply-To: <201112051038.16423.trast@student.ethz.ch>

On Mon, Dec 05, 2011 at 10:38:16AM +0100, Thomas Rast wrote:

> I just found out that on Linux, there's mincore() that can tell us
> (racily, but who cares) whether a given file mapping is in memory.  If
> you would like to try it, see the source at the end, but I'm getting
> things such as

Neat, I didn't know about mincore.

> So that looks fairly promising, and the order would then be:
> 
> - if stat-clean, and we have mincore(), and it tells us we can do it
>   cheaply: grab file from tree
> 
> - if it's a loose object: decompress it
> 
> - if stat-clean: grab file from tree
> 
> - access packs as usual

I don't think your third one makes sense. If the working tree file isn't
stat clean, then either:

  1. the pack file is in cache, and it's way faster than faulting in the
     working tree file from disk

  2. the pack file is not in cache, and it's a toss-up whether it is
     faster to fault in the smaller compressed pack-file version and
     uncompress it, or to fault in the larger on-disk version. The
     exact result will depend on the ratio of CPU to disk speed, the
     quality of your filesystem, and the size and contents of your file.

     And possibly on the exact delta chains you have. Though this
     optimization only happens when the file is in the index, which
     usually means it's recent, which means it will tend to be at the
     head of the delta chain.

So it probably just makes sense to grab the working tree file only if
mincore() tells us we have all (or most) of it, and otherwise go to the
packfile.

> Ok, I see, I missed that part.  Perhaps the heuristic should then be
> "if the regex boils down to memmem, disable threading", but let's see
> what loose object decompression in parallel can give us.

Yeah. I'd really rather have parallel object decompression than some
complex Linux-only mincore optimization (even though that optimization
_could_ yield extra savings on top of properly threading, if the blob
retrieval is threaded, I think I'll care less about how much CPU time it
takes).

-Peff

  parent reply	other threads:[~2011-12-06  0:40 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-11-25 14:46 [PATCH] grep: load funcname patterns for -W Thomas Rast
2011-11-25 16:32 ` René Scharfe
2011-11-26 12:15   ` [PATCH] grep: enable multi-threading for -p and -W René Scharfe
2011-11-29  9:54     ` Thomas Rast
2011-11-29 13:49       ` René Scharfe
2011-11-29 14:07         ` Thomas Rast
2011-12-02 13:07           ` [PATCH v2 0/3] grep multithreading and scaling Thomas Rast
2011-12-02 13:07             ` [PATCH v2 1/3] grep: load funcname patterns for -W Thomas Rast
2011-12-02 13:07             ` [PATCH v2 2/3] grep: enable threading with -p and -W using lazy attribute lookup Thomas Rast
2011-12-02 13:07             ` [PATCH v2 3/3] grep: disable threading in all but worktree case Thomas Rast
2011-12-02 16:15               ` René Scharfe
2011-12-05  9:02                 ` Thomas Rast
2011-12-06 22:48                 ` René Scharfe
2011-12-06 23:01                   ` [PATCH 4/2] grep: turn off threading for non-worktree René Scharfe
2011-12-07  4:42                     ` Jeff King
2011-12-07 17:11                       ` René Scharfe
2011-12-07 18:28                         ` Jeff King
2011-12-07 20:11                       ` J. Bruce Fields
2011-12-07 20:45                         ` Jeff King
2011-12-07  8:12                     ` Thomas Rast
2011-12-07 17:00                       ` René Scharfe
2011-12-10 13:13                         ` Pete Wyckoff
2011-12-12 22:37                           ` René Scharfe
2011-12-07  4:24                   ` [PATCH v2 3/3] grep: disable threading in all but worktree case Jeff King
2011-12-07 16:52                     ` René Scharfe
2011-12-07 18:10                       ` Jeff King
2011-12-07  8:11                   ` Thomas Rast
2011-12-07 16:54                     ` René Scharfe
2011-12-12 21:16                 ` [PATCH v3 0/3] grep attributes and multithreading Thomas Rast
2011-12-12 21:16                   ` [PATCH v3 1/3] grep: load funcname patterns for -W Thomas Rast
2011-12-12 21:16                   ` [PATCH v3 2/3] grep: enable threading with -p and -W using lazy attribute lookup Thomas Rast
2011-12-16  8:22                     ` Johannes Sixt
2011-12-16 17:34                       ` Junio C Hamano
2011-12-12 21:16                   ` [PATCH v3 3/3] grep: disable threading in non-worktree case Thomas Rast
2011-12-12 22:37                   ` [PATCH v3 0/3] grep attributes and multithreading René Scharfe
2011-12-12 23:44                   ` Junio C Hamano
2011-12-13  8:44                     ` Thomas Rast
2011-12-23 22:37               ` [PATCH v2 3/3] grep: disable threading in all but worktree case Ævar Arnfjörð Bjarmason
2011-12-23 22:49                 ` Thomas Rast
2011-12-24  1:39                   ` Ævar Arnfjörð Bjarmason
2011-12-24  7:07                     ` Jeff King
2011-12-24 10:49                       ` Nguyen Thai Ngoc Duy
2011-12-24 10:55                       ` Nguyen Thai Ngoc Duy
2011-12-24 13:38                         ` Jeff King
2011-12-25  3:32                       ` Nguyen Thai Ngoc Duy
2011-12-02 17:34             ` [PATCH v2 0/3] grep multithreading and scaling Jeff King
2011-12-05  9:38               ` Thomas Rast
2011-12-05 20:16                 ` Thomas Rast
2011-12-06  0:40                 ` Jeff King [this message]
2011-12-02 20:02             ` Eric Herman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111206004012.GA12760@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=eric@freesa.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=rene.scharfe@lsrfire.ath.cx \
    --cc=trast@student.ethz.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).