git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jakub Narebski <jnareb@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCHv2 0/8] gitweb: Faster and improved project search
Date: Thu, 16 Feb 2012 22:53:47 +0100	[thread overview]
Message-ID: <201202162253.49000.jnareb@gmail.com> (raw)
In-Reply-To: <7vmx8iy0ix.fsf@alter.siamese.dyndns.org>

On Thu, 16 Feb 2012, Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
> > [Cc-ing Junio because of his involvement in discussion about first
> >  patch in previous version of this series.]
> >
> > First three patches in this series are mainly about speeding up
> > project search (and perhaps in the future also project pagination).
> > Well, first one is unification, refactoring and future-proofing.
> > The second and third patch could be squashed together; second adds
> > @fill_only, but third actually uses it.
> >
> > Next set of patches is about highlighting matched part, making it
> > easier to recognize why project was selected, what we were searching
> > for (though better page title would also help second issue).
> >
> > Well, fourth patch (first in set mentioned above) is here for the
> > commit message, otherwise it could have been squashed with next one.
> >
> > Last patch in this series is beginning of using esc_html_match_hl()
> > for other searches in gitweb -- the easiest part.
> 
> Notice that you never said anything about what you wanted to achieve with
> this entire series?  " -- the easiest part." does not mean anything.
> The easiest part of what?

Gitweb, besides project search, supports searching in a give repository
of a commit message ('commit', 'author', 'committer'), of files ('grep')
and of changes ('pickaxe').  From those 'grep' was easiest.  This is
mentioned in comments in 8/8.

What I'd like to achieve is unification of match highlighting code, which
reduces code duplication.  esc_html_match_hl() is also better than current
hand-written code: it highlights all matches, and not only last match.

> Where in the series do the "faster" and "improved" come from?  What do you
> exactly mean by "faster" and "improved"?  In which commit would we find
> the answers to the questions like:
> 
>  - What operation was slow and how you tackled that slowness?
>  - What are the benchmark results?

Those series make project search faster because they reduce number of
git commands that gitweb runs when generating project search results.
Before number of commands was 2*<number of projects> + 1, after 3/8
it is 2*<number of results> + 1.


For search that selects 2 repositories out of 12 total

* Before (warm cache):

  "This page took 0.867151 seconds  and 27 git commands to generate."

* After (warm cache):
 
  "This page took 0.673643 seconds  and 5 git commands to generate."

The improvement in performance is even better for larger number of
repositories, and for cold cache / evicted cache.


Should I do benchmark of git_project_list_body() subroutine before
and after?

> In general, "improve" is such a loaded word (after all, patches sent to
> the list are almost always meant to "improve" things) that it almost does
> not convey a single bit of information.  Are you fixing bugs?  Are you
> tidying up an unreadable piece of code?  Are you fixing styles?  Are you
> producing prettier output?  Are you refactoring cut-and-pasted repetition
> into a common helper function?  Are you adding a new feature?

I'm sorry about vague "improved".  What I meant is that I have added
highlighting of matched part in project search, just like e.g. Google
does unobtrusive highlighting of search phrase in page snippets.

This is especially important because project search matches either
project name or project description, and it might be not obvious why
the match... especially that match on description might be in shortened
chopped-out part.


I'll improve commit messages and cover letter in a re-roll.
-- 
Jakub Narebski
Poland

  reply	other threads:[~2012-02-16 21:53 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-15 20:38 [PATCHv2 0/8] gitweb: Faster and improved project search Jakub Narebski
2012-02-15 20:38 ` [PATCHv2 1/8] gitweb: Refactor checking if part of project info need filling Jakub Narebski
2012-02-15 20:38 ` [PATCHv2 2/8] gitweb: Option for filling only specified info in fill_project_list_info Jakub Narebski
2012-02-20  6:58   ` Junio C Hamano
2012-02-22 22:05     ` Jakub Narebski
2012-02-15 20:38 ` [PATCHv2 3/8] gitweb: Faster project search Jakub Narebski
2012-02-20  8:33   ` Junio C Hamano
2012-02-15 20:38 ` [PATCHv2 4/8] gitweb: Introduce esc_html_hl_regions Jakub Narebski
2012-02-15 20:38 ` [PATCHv2 5/8] gitweb: Highlight matched part of project name when searching projects Jakub Narebski
2012-02-15 20:38 ` [PATCHv2 6/8] gitweb: Highlight matched part of project description " Jakub Narebski
2012-02-15 20:38 ` [PATCHv3/RFC 7/8] gitweb: Highlight matched part of shortened project description Jakub Narebski
2012-02-15 20:38 ` [PATCHv2 8/8] gitweb: Use esc_html_match_hl() in 'grep' search Jakub Narebski
2012-02-16 20:40 ` [PATCHv2 0/8] gitweb: Faster and improved project search Junio C Hamano
2012-02-16 21:53   ` Jakub Narebski [this message]
2012-02-20  7:09 ` Junio C Hamano
2012-02-22 22:09   ` Jakub Narebski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201202162253.49000.jnareb@gmail.com \
    --to=jnareb@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).