git.vger.kernel.org archive mirror
From: Fredrik Kuivinen <frekui@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Git Mailing List <git@vger.kernel.org>,
	Johannes Sixt <j.sixt@viscovery.net>
Subject: Re: [PATCH v3] Threaded grep
Date: Mon, 25 Jan 2010 23:47:30 +0100
Message-ID: <4c8ef71001251447m18a0e7c0y71db3972d0ca1152@mail.gmail.com>
In-Reply-To: <4c8ef71001181612l72ec73ecmae709fbb475752b0@mail.gmail.com>

On Tue, Jan 19, 2010 at 01:12, Fredrik Kuivinen <frekui@gmail.com> wrote:
> On Mon, Jan 18, 2010 at 23:10, Junio C Hamano <gitster@pobox.com> wrote:
>>
>> I am wondering if this can somehow be made into a change with a lot
>> smaller impact, in spirit similar to "preloading".  The higher level
>> loop may walk the input (be it cache, tree, or directory), issuing one
>> grep_file() or grep_sha1() at a time just like the existing code, but
>> the thread-aware code adds a "priming" phase that (1) populates a "work
>> queue" with a very small memory footprint, and that (2) starts more
>> than one underlying grep on different files and blobs (up to the
>> number of threads you are allowed, like your "#define THREADS 8"), and
>> that (3) upon completion of one "work queue" item, the work queue item
>> is marked with its result.
>>
>> Then grep_file() and grep_sha1() _may_ notice that the work queue item
>> hasn't completed and wait in such a case, or else just emit the
>> output produced earlier (could be much earlier) by the worker bee.
>>
>> The low-memory-footprint work queue could be implemented as a lazy
>> one, and may be very different depending on how the input is created.
>> If we are grepping in the index, it could be a small array of <array
>> index in active_cache[], done-bit, result-bit, result-strbuf> with a
>> single integer that is a pointer into the index to indicate up to
>> which index entry the work queue has been populated.
>
> I will have to think about this a bit more. It's getting late here.

I will be sending v4 of the patch in a couple of minutes, but I want
to comment on this first.

Sure, it is probably possible to structure the code as you suggest.
However, I am not so sure that it will be a significantly smaller
change. I find the approach taken in the patch to be quite nice as the
threaded and non-threaded cases are fairly similar. There is a block
of code which deals with the threading, but that part is mostly
self-contained. In v4 I have not changed the overall approach to the
problem.
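
For reference, the work queue proposed in the quoted text could be
sketched roughly as follows. This is an illustration only, not code
from any version of the patch: struct work_item, struct work_queue,
worker() and grep_all() are hypothetical names, strstr() stands in for
the real grep, and the queue is filled eagerly rather than lazily.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define THREADS 4
#define NITEMS 8

/* One slot per input: <input index, done-bit, result-bit, result buffer>. */
struct work_item {
	int index;      /* position in the (hypothetical) input array */
	int done;       /* set by a worker once the item is finished */
	int hit;        /* did the pattern match? */
	char out[128];  /* output produced by the worker */
};

struct work_queue {
	pthread_mutex_t lock;
	pthread_cond_t item_done;
	const char **inputs;
	const char *pattern;
	struct work_item items[NITEMS];
	int nr;         /* number of items in the queue */
	int next;       /* next item for a worker to claim */
};

static void *worker(void *arg)
{
	struct work_queue *q = arg;
	struct work_item *it;
	int hit;

	for (;;) {
		pthread_mutex_lock(&q->lock);
		if (q->next >= q->nr) {
			pthread_mutex_unlock(&q->lock);
			return NULL;
		}
		it = &q->items[q->next++];
		pthread_mutex_unlock(&q->lock);

		/* The "grep" itself runs without the lock held. */
		hit = strstr(q->inputs[it->index], q->pattern) != NULL;
		if (hit)
			snprintf(it->out, sizeof(it->out), "%d:%s\n",
				 it->index, q->inputs[it->index]);

		/* Mark the item with its result and wake the consumer. */
		pthread_mutex_lock(&q->lock);
		it->hit = hit;
		it->done = 1;
		pthread_cond_broadcast(&q->item_done);
		pthread_mutex_unlock(&q->lock);
	}
}

/*
 * Greps all inputs in parallel, then appends the per-item output to
 * buf in input order. Returns the number of inputs that matched.
 */
static int grep_all(const char **inputs, int nr, const char *pattern,
		    char *buf, size_t buflen)
{
	struct work_queue q;
	pthread_t threads[THREADS];
	int i, err, hits = 0;

	assert(nr <= NITEMS && buflen > 0);
	memset(&q, 0, sizeof(q));
	pthread_mutex_init(&q.lock, NULL);
	pthread_cond_init(&q.item_done, NULL);
	q.inputs = inputs;
	q.pattern = pattern;
	q.nr = nr;
	for (i = 0; i < nr; i++)
		q.items[i].index = i;

	for (i = 0; i < THREADS; i++) {
		err = pthread_create(&threads[i], NULL, worker, &q);
		assert(!err);
		(void)err;
	}

	buf[0] = '\0';
	for (i = 0; i < nr; i++) {
		/* Wait for item i if it has not completed yet. */
		pthread_mutex_lock(&q.lock);
		while (!q.items[i].done)
			pthread_cond_wait(&q.item_done, &q.lock);
		pthread_mutex_unlock(&q.lock);
		if (q.items[i].hit) {
			hits++;
			strncat(buf, q.items[i].out, buflen - strlen(buf) - 1);
		}
	}

	for (i = 0; i < THREADS; i++)
		pthread_join(threads[i], NULL);
	pthread_cond_destroy(&q.item_done);
	pthread_mutex_destroy(&q.lock);
	return hits;
}
```

A caller would pass an array of strings, a pattern and an output
buffer to grep_all(). The output stays in input order because the
consumer loop waits on each item in turn, regardless of which worker
finishes first.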

- Fredrik

