git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Matheus Tavares <matheus.bernardino@usp.br>,
	Git Mailing List <git@vger.kernel.org>,
	Thomas Gummerer <t.gummerer@gmail.com>,
	Christian Couder <christian.couder@gmail.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [WIP RFC PATCH v2 5/5] clone: use dir-iterator to avoid explicit dir traversal
Date: Tue, 26 Feb 2019 13:50:40 +0100	[thread overview]
Message-ID: <87tvgqvhlr.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <CACsJy8B1asF3i+G-C1aZRw7QTW7jS+a4VkCbg-17YOTyYHuw5w@mail.gmail.com>


On Tue, Feb 26 2019, Duy Nguyen wrote:

> On Tue, Feb 26, 2019 at 12:18 PM Matheus Tavares
> <matheus.bernardino@usp.br> wrote:
>>
>> Replace usage of opendir/readdir/closedir API to traverse directories
>> recursively, at copy_or_link_directory function, by the dir-iterator
>> API. This simplifies the code and avoid recursive calls to
>> copy_or_link_directory.
>>
>> This process also makes copy_or_link_directory call die() in case of an
>> error on readdir or stat, inside dir_iterator_advance. Previously it
>> would just print a warning for errors on stat and ignore errors on
>> readdir, which isn't nice because a local git clone would end up
>> successfully even though the .git/objects copy didn't fully succeeded.
>>
>> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
>> ---
>> I can also make the change described in the last paragraph in a separate
>> patch before this one, but I would have to undo it in this patch because
>> dir-iterator already implements it. So, IMHO, it would be just noise
>> and not worthy.
>>
>>  builtin/clone.c | 45 +++++++++++++++++++++++----------------------
>>  1 file changed, 23 insertions(+), 22 deletions(-)
>>
>> diff --git a/builtin/clone.c b/builtin/clone.c
>> index fd580fa98d..b23ba64c94 100644
>> --- a/builtin/clone.c
>> +++ b/builtin/clone.c
>> @@ -23,6 +23,8 @@
>>  #include "transport.h"
>>  #include "strbuf.h"
>>  #include "dir.h"
>> +#include "dir-iterator.h"
>> +#include "iterator.h"
>>  #include "sigchain.h"
>>  #include "branch.h"
>>  #include "remote.h"
>> @@ -411,42 +413,37 @@ static void mkdir_if_missing(const char *pathname, mode_t mode)
>>  }
>>
>>  static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
>> -                                  const char *src_repo, int src_baselen)
>> +                                  const char *src_repo)
>>  {
>> -       struct dirent *de;
>> -       struct stat buf;
>>         int src_len, dest_len;
>> -       DIR *dir;
>> -
>> -       dir = opendir(src->buf);
>> -       if (!dir)
>> -               die_errno(_("failed to open '%s'"), src->buf);
>> +       struct dir_iterator *iter;
>> +       int iter_status;
>> +       struct stat st;
>> +       unsigned flags;
>>
>>         mkdir_if_missing(dest->buf, 0777);
>>
>> +       flags = DIR_ITERATOR_PEDANTIC | DIR_ITERATOR_FOLLOW_SYMLINKS;
>> +       iter = dir_iterator_begin(src->buf, flags);
>> +
>>         strbuf_addch(src, '/');
>>         src_len = src->len;
>>         strbuf_addch(dest, '/');
>>         dest_len = dest->len;
>>
>> -       while ((de = readdir(dir)) != NULL) {
>> +       while ((iter_status = dir_iterator_advance(iter)) == ITER_OK) {
>>                 strbuf_setlen(src, src_len);
>> -               strbuf_addstr(src, de->d_name);
>> +               strbuf_addstr(src, iter->relative_path);
>>                 strbuf_setlen(dest, dest_len);
>> -               strbuf_addstr(dest, de->d_name);
>> -               if (stat(src->buf, &buf)) {
>> -                       warning (_("failed to stat %s\n"), src->buf);
>> -                       continue;
>> -               }
>> -               if (S_ISDIR(buf.st_mode)) {
>> -                       if (!is_dot_or_dotdot(de->d_name))
>> -                               copy_or_link_directory(src, dest,
>> -                                                      src_repo, src_baselen);
>> +               strbuf_addstr(dest, iter->relative_path);
>> +
>> +               if (S_ISDIR(iter->st.st_mode)) {
>> +                       mkdir_if_missing(dest->buf, 0777);
>
> I wonder if this mkdir_if_missing is sufficient. What if you have to
> create multiple directories?
>
> Let's say the first advance, we hit "a". The the second advance we hit
> directory "b/b/b/b", we would need to mkdir recursively and something
> like safe_create_leading_directories() would be a better fit.
>
> I'm not sure if it can happen though. I haven't re-read dir-iterator
> code carefully.

This part isn't a problem. It iterates one level at a time. So given a
structure like a/b/c/d/e/f/g/h/i/j/k/some-l you'll find that if you
instrument the loop in clone.c you get:

    dir = a
    dir = a/b
    dir = a/b/c
    dir = a/b/c/d
    dir = a/b/c/d/e
    dir = a/b/c/d/e/f
    dir = a/b/c/d/e/f/g
    dir = a/b/c/d/e/f/g/h
    dir = a/b/c/d/e/f/g/h/i
    dir = a/b/c/d/e/f/g/h/i/j
    dir = a/b/c/d/e/f/g/h/i/j/k
    dir = a/b/c/d/e/f/g/h/i/j/k/some-l

So it's like the old implementation in that way. It readdir()'s and
walks directories one level at a time.

  reply	other threads:[~2019-02-26 12:50 UTC|newest]

Thread overview: 127+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-26  5:17 [WIP RFC PATCH v2 0/5] clone: dir iterator refactoring with tests Matheus Tavares
2019-02-26  5:18 ` [WIP RFC PATCH v2 1/5] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-02-26 12:01   ` Duy Nguyen
2019-02-27 13:59     ` Matheus Tavares Bernardino
2019-02-26  5:18 ` [WIP RFC PATCH v2 2/5] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-02-26  5:18 ` [WIP RFC PATCH v2 3/5] clone: copy hidden paths at local clone Matheus Tavares
2019-02-26 12:13   ` Duy Nguyen
2019-02-26  5:18 ` [WIP RFC PATCH v2 4/5] clone: extract function from copy_or_link_directory Matheus Tavares
2019-02-26 12:18   ` Duy Nguyen
2019-02-27 17:30     ` Matheus Tavares Bernardino
2019-02-27 22:45       ` Thomas Gummerer
2019-02-27 22:50         ` Matheus Tavares Bernardino
2019-02-26  5:18 ` [WIP RFC PATCH v2 5/5] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-02-26 11:35   ` Ævar Arnfjörð Bjarmason
2019-02-26 12:32   ` Duy Nguyen
2019-02-26 12:50     ` Ævar Arnfjörð Bjarmason [this message]
2019-02-27 17:40     ` Matheus Tavares Bernardino
2019-02-28  7:13       ` Duy Nguyen
2019-02-28  7:53       ` Ævar Arnfjörð Bjarmason
2019-02-26 11:36 ` [WIP RFC PATCH v2 0/5] clone: dir iterator refactoring with tests Ævar Arnfjörð Bjarmason
2019-02-26 12:20   ` Duy Nguyen
2019-02-26 12:28 ` [RFC PATCH v3 " Ævar Arnfjörð Bjarmason
2019-02-26 20:56   ` Matheus Tavares Bernardino
2019-03-22 23:22   ` [GSoC][PATCH v4 0/7] clone: dir-iterator " Matheus Tavares
2019-03-22 23:22     ` [GSoC][PATCH v4 1/7] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-03-24 18:09       ` Matheus Tavares Bernardino
2019-03-24 20:56       ` SZEDER Gábor
2019-03-26 19:43         ` Matheus Tavares Bernardino
2019-03-28 21:49       ` Thomas Gummerer
2019-03-29 14:06         ` Matheus Tavares Bernardino
2019-03-29 19:31           ` Thomas Gummerer
2019-03-29 19:42             ` SZEDER Gábor
2019-03-30  2:49             ` Matheus Tavares Bernardino
2019-03-22 23:22     ` [GSoC][PATCH v4 2/7] clone: better handle symlinked files at .git/objects/ Matheus Tavares
2019-03-28 22:10       ` Thomas Gummerer
2019-03-29  8:38         ` Ævar Arnfjörð Bjarmason
2019-03-29 20:15           ` Thomas Gummerer
2019-03-29 14:27         ` Matheus Tavares Bernardino
2019-03-29 20:05           ` Thomas Gummerer
2019-03-30  5:32             ` Matheus Tavares Bernardino
2019-03-30 19:27               ` Thomas Gummerer
2019-04-01  3:56                 ` Matheus Tavares Bernardino
2019-03-29 15:40         ` Johannes Schindelin
2019-03-22 23:22     ` [GSoC][PATCH v4 3/7] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-03-28 22:19       ` Thomas Gummerer
2019-03-29 13:16         ` Matheus Tavares Bernardino
2019-03-22 23:22     ` [GSoC][PATCH v4 4/7] clone: copy hidden paths at local clone Matheus Tavares
2019-03-22 23:22     ` [GSoC][PATCH v4 5/7] clone: extract function from copy_or_link_directory Matheus Tavares
2019-03-22 23:22     ` [GSoC][PATCH v4 6/7] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-03-22 23:22     ` [GSoC][PATCH v4 7/7] clone: Replace strcmp by fspathcmp Matheus Tavares
2019-03-30 22:49     ` [GSoC][PATCH v5 0/7] clone: dir-iterator refactoring with tests Matheus Tavares
2019-03-30 22:49       ` [GSoC][PATCH v5 1/7] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-03-30 22:49       ` [GSoC][PATCH v5 2/7] clone: better handle symlinked files at .git/objects/ Matheus Tavares
2019-03-31 17:40         ` Thomas Gummerer
2019-04-01  3:59           ` Matheus Tavares Bernardino
2019-03-30 22:49       ` [GSoC][PATCH v5 3/7] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-03-31 18:12         ` Thomas Gummerer
2019-04-10 20:24           ` Matheus Tavares Bernardino
2019-04-11 21:09             ` Thomas Gummerer
2019-04-23 17:07               ` Matheus Tavares Bernardino
2019-04-24 18:36                 ` Thomas Gummerer
2019-04-26  4:13                   ` Matheus Tavares Bernardino
2019-03-30 22:49       ` [GSoC][PATCH v5 4/7] clone: copy hidden paths at local clone Matheus Tavares
2019-03-30 22:49       ` [GSoC][PATCH v5 5/7] clone: extract function from copy_or_link_directory Matheus Tavares
2019-03-30 22:49       ` [GSoC][PATCH v5 6/7] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-03-30 22:49       ` [GSoC][PATCH v5 7/7] clone: replace strcmp by fspathcmp Matheus Tavares
2019-03-31 18:16       ` [GSoC][PATCH v5 0/7] clone: dir-iterator refactoring with tests Thomas Gummerer
2019-04-01 13:56         ` Matheus Tavares Bernardino
2019-05-02 14:48       ` [GSoC][PATCH v6 00/10] " Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 01/10] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 02/10] clone: better handle symlinked files at .git/objects/ Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 03/10] dir-iterator: add tests for dir-iterator API Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 04/10] dir-iterator: use warning_errno when possible Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 05/10] dir-iterator: refactor state machine model Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 06/10] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 07/10] clone: copy hidden paths at local clone Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 08/10] clone: extract function from copy_or_link_directory Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 09/10] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 10/10] clone: replace strcmp by fspathcmp Matheus Tavares
2019-06-18 23:27         ` [GSoC][PATCH v7 00/10] clone: dir-iterator refactoring with tests Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 01/10] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 02/10] clone: better handle symlinked files at .git/objects/ Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 03/10] dir-iterator: add tests for dir-iterator API Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 04/10] dir-iterator: use warning_errno when possible Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 05/10] dir-iterator: refactor state machine model Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 06/10] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-06-25 18:00             ` Junio C Hamano
2019-06-25 18:11               ` Matheus Tavares Bernardino
2019-06-26 13:34             ` Johannes Schindelin
2019-06-26 18:04               ` Junio C Hamano
2019-06-27  9:20                 ` Duy Nguyen
2019-06-27 17:23                 ` Matheus Tavares Bernardino
2019-06-27 18:48                   ` Johannes Schindelin
2019-06-27 19:33                     ` Matheus Tavares Bernardino
2019-06-28 12:51                       ` Johannes Schindelin
2019-06-28 14:16                         ` Matheus Tavares Bernardino
2019-07-01 12:15                           ` Johannes Schindelin
2019-07-03  8:57             ` SZEDER Gábor
2019-07-08 22:21               ` Matheus Tavares Bernardino
2019-06-18 23:27           ` [GSoC][PATCH v7 07/10] clone: copy hidden paths at local clone Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 08/10] clone: extract function from copy_or_link_directory Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 09/10] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 10/10] clone: replace strcmp by fspathcmp Matheus Tavares
2019-06-19  4:36           ` [GSoC][PATCH v7 00/10] clone: dir-iterator refactoring with tests Matheus Tavares Bernardino
2019-06-20 20:18           ` Junio C Hamano
2019-06-21 13:41             ` Matheus Tavares Bernardino
2019-07-10 23:58           ` [GSoC][PATCH v8 " Matheus Tavares
2019-07-10 23:58             ` [GSoC][PATCH v8 01/10] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-07-10 23:58             ` [GSoC][PATCH v8 02/10] clone: better handle symlinked files at .git/objects/ Matheus Tavares
2019-07-10 23:58             ` [GSoC][PATCH v8 03/10] dir-iterator: add tests for dir-iterator API Matheus Tavares
2019-07-10 23:58             ` [GSoC][PATCH v8 04/10] dir-iterator: use warning_errno when possible Matheus Tavares
2019-07-10 23:58             ` [GSoC][PATCH v8 05/10] dir-iterator: refactor state machine model Matheus Tavares
2019-07-10 23:59             ` [GSoC][PATCH v8 06/10] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-07-10 23:59             ` [GSoC][PATCH v8 07/10] clone: copy hidden paths at local clone Matheus Tavares
2019-07-10 23:59             ` [GSoC][PATCH v8 08/10] clone: extract function from copy_or_link_directory Matheus Tavares
2019-07-10 23:59             ` [GSoC][PATCH v8 09/10] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-07-10 23:59             ` [GSoC][PATCH v8 10/10] clone: replace strcmp by fspathcmp Matheus Tavares
2019-07-11 11:56             ` [GSoC][PATCH v8 00/10] clone: dir-iterator refactoring with tests Johannes Schindelin
2019-07-11 15:24               ` Matheus Tavares Bernardino
2019-02-26 12:28 ` [RFC PATCH v3 1/5] clone: test for our behavior on odd objects/* content Ævar Arnfjörð Bjarmason
2019-02-28 21:19   ` Matheus Tavares Bernardino
2019-03-01 13:49     ` Ævar Arnfjörð Bjarmason
2019-03-13  3:17       ` Matheus Tavares
2019-02-26 12:28 ` [RFC PATCH v3 2/5] dir-iterator: add flags parameter to dir_iterator_begin Ævar Arnfjörð Bjarmason
2019-02-26 12:28 ` [RFC PATCH v3 3/5] clone: copy hidden paths at local clone Ævar Arnfjörð Bjarmason
2019-02-26 12:28 ` [RFC PATCH v3 4/5] clone: extract function from copy_or_link_directory Ævar Arnfjörð Bjarmason
2019-02-26 12:28 ` [RFC PATCH v3 5/5] clone: use dir-iterator to avoid explicit dir traversal Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87tvgqvhlr.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=matheus.bernardino@usp.br \
    --cc=pclouds@gmail.com \
    --cc=t.gummerer@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).