Git development
* gitattributesLarge: .gitattributes too large to parse
@ 2023-02-28 16:09 Danny Smit
  2023-02-28 19:15 ` Taylor Blau
  0 siblings, 1 reply; 4+ messages in thread
From: Danny Smit @ 2023-02-28 16:09 UTC (permalink / raw)
  To: git

Hello everyone,

I'm running into a problem with git fsck and the .gitattributes file.
With more recent git versions, it reports the following error on my
bare git repository:

$ git --version
git version 2.39.2

$ git -C bare_repo fsck
Checking object directories: 100% (256/256), done.
error in blob 70dc06c1e2e79d8cfa4fb67007edcbb8c941d7e0:
gitattributesLarge: .gitattributes too large to parse
error in blob 7f2a61db90e023cc2a3b180203b7298cd971250d:
gitattributesLarge: .gitattributes too large to parse
Checking objects: 100% (33216/33216), done.
Verifying commits in commit graph: 100% (2024/2024), done.

The files seem to be around 1.5MB in size:

$ git -C bare_repo cat-file -s 70dc06c1e2e79d8cfa4fb67007edcbb8c941d7e0
1579407
$ git -C bare_repo cat-file -s 7f2a61db90e023cc2a3b180203b7298cd971250d
1579652

With a cloned repository, the error is not shown:

$ git -C cloned_repo fsck
Checking object directories: 100% (256/256), done.
Checking objects: 100% (33158/33158), done.

I couldn't find a lot of documentation about the size limitations of
the .gitattributes file, but I did find the change that seems to have
introduced it: https://github.com/git/git/commit/27ab4784d5c9e24345b9f5b443609cbe527c51f9
The change describes that the file needs to be smaller than 100MB, which it is.

Tested on archlinux (rolling release), but also with docker to quickly
verify different versions with a completely clean git configuration:
https://hub.docker.com/r/bitnami/git

The error seems to occur with git versions 2.39.2 and 2.38.3, but not with 2.38.2.

Why is git showing this message if the file isn't too big?
Is there a way to get rid of the message, without updating/deleting
the file and having to rewrite the history in git?

Regards,
Danny


* Re: gitattributesLarge: .gitattributes too large to parse
  2023-02-28 16:09 gitattributesLarge: .gitattributes too large to parse Danny Smit
@ 2023-02-28 19:15 ` Taylor Blau
  2023-02-28 21:30   ` Jeff King
  0 siblings, 1 reply; 4+ messages in thread
From: Taylor Blau @ 2023-02-28 19:15 UTC (permalink / raw)
  To: Danny Smit; +Cc: git

Hi Danny,

On Tue, Feb 28, 2023 at 05:09:01PM +0100, Danny Smit wrote:
> I couldn't find a lot of documentation about the size limitations of
> the .gitattributes file, but I did find the change that seems to have
> introduced it: https://github.com/git/git/commit/27ab4784d5c9e24345b9f5b443609cbe527c51f9
> The change describes that the file needs to be smaller than 100MB, which it is.

It's interesting that you can cause `fsck` to produce an error in the
bare repository but not in the non-bare one. Do you have
`fsck.gitattributesLarge` set to anything in the non-bare repository?
Are the affected objects in the `fsck.skipList`?

Looking at 27ab4784d5, the comment there says:

    if (!buf || size > ATTR_MAX_FILE_SIZE) {
      /*
       * A missing buffer here is a sign that the caller found the
       * blob too gigantic to load into memory. Let's just consider
       * that an error.
       */
      return report(options, oid, OBJ_BLOB,
                    FSCK_MSG_GITATTRIBUTES_LARGE,
                    ".gitattributes too large to parse");
    }

...so it's possible that the caller indeed found the blob too large to
load into memory, which would cause us to emit the ".gitattributes too
large to parse" fsck error without a .gitattributes file that actually
exceeds 100 MiB in size.

> The error seems to occur with git version 2.39.2 2.38.3, but not in
> 2.38.2.

Indeed, 27ab4784d5 (fsck: implement checks for gitattributes,
2022-12-01) exists in v2.38.3 and newer, as well as v2.39.1 and
newer. v2.38.2 was released before 27ab4784d5 existed.

> Is there a way to get rid of the message, without updating/deleting
> the file and having to rewrite the history in git?

You can suppress this message with:

    $ git config fsck.gitattributesLarge ignore

Thanks,
Taylor


* Re: gitattributesLarge: .gitattributes too large to parse
  2023-02-28 19:15 ` Taylor Blau
@ 2023-02-28 21:30   ` Jeff King
  2023-03-01  9:13     ` Danny Smit
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff King @ 2023-02-28 21:30 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Danny Smit, git

On Tue, Feb 28, 2023 at 02:15:20PM -0500, Taylor Blau wrote:

> On Tue, Feb 28, 2023 at 05:09:01PM +0100, Danny Smit wrote:
> > I couldn't find a lot of documentation about the size limitations of
> > the .gitattributes file, but I did find the change that seems to have
> > introduced it: https://github.com/git/git/commit/27ab4784d5c9e24345b9f5b443609cbe527c51f9
> > The change describes that the file needs to be smaller than 100MB, which it is.
> 
> It's interesting that you can cause `fsck` to produce an error in the
> bare repository but not in the non-bare one. Do you have
> `fsck.gitattributesLarge` set to anything in the non-bare repository?
> Are the affected objects in the `fsck.skipList`?
> 
> Looking at 27ab4784d5, the comment there says:
> 
>     if (!buf || size > ATTR_MAX_FILE_SIZE) {
>       /*
>        * A missing buffer here is a sign that the caller found the
>        * blob too gigantic to load into memory. Let's just consider
>        * that an error.
>        */
>       return report(options, oid, OBJ_BLOB,
>                     FSCK_MSG_GITATTRIBUTES_LARGE,
>                     ".gitattributes too large to parse");
>     }
> 
> ...so it's possible that the caller indeed found the blob too large to
> load into memory, which would cause us to emit the ".gitattributes too
> large to parse" fsck error without a .gitattributes file that actually
> exceeds 100 MiB in size.

I think that "!buf" case would also trigger if the size exceeded
core.bigFileThreshold. It might be worth checking for that, too.
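
One way to run that check, sketched on a throwaway bare repository so it
is safe to try. The scratch path and the 1m value are assumptions made up
for illustration; against a real repository only the `--show-origin --get`
line would be needed.

```shell
# Scratch bare repository; the path is invented for this sketch.
repo="$(mktemp -d)/scratch.git"
git init -q --bare "$repo"

# Assumed value, set only so the lookup below has something to find.
git -C "$repo" config core.bigFileThreshold 1m

# Print the effective value together with the config file it comes from.
git -C "$repo" config --show-origin --get core.bigFileThreshold

# Remove the repository-level override again. A --local lookup then finds
# nothing and exits non-zero (a global or system value could still apply).
git -C "$repo" config --unset core.bigFileThreshold
git -C "$repo" config --local --get core.bigFileThreshold \
    || echo "core.bigFileThreshold is not set in the repository config"
```

Using `--show-origin` matters here because the setting can live in the
repository, global, or system config, and only one of those may differ
between two copies of the same repository.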

-Peff


* Re: gitattributesLarge: .gitattributes too large to parse
  2023-02-28 21:30   ` Jeff King
@ 2023-03-01  9:13     ` Danny Smit
  0 siblings, 0 replies; 4+ messages in thread
From: Danny Smit @ 2023-03-01  9:13 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git

On Tue, Feb 28, 2023 at 10:30 PM Jeff King <peff@peff.net> wrote:
>
> On Tue, Feb 28, 2023 at 02:15:20PM -0500, Taylor Blau wrote:
> > You can suppress this message with:
> >
> >    $ git config fsck.gitattributesLarge ignore

Thanks, that is useful to know.

> >
> > It's interesting that you can cause `fsck` to produce an error in the
> > bare repository but not in the non-bare one. Do you have
> > `fsck.gitattributesLarge` set to anything in the non-bare repository?
> > Are the affected objects in the `fsck.skipList`?

The configuration in the non-bare repository is almost clean. Apart from
the remotes and branch, it contains only the following:

$ git config -l --local
core.repositoryformatversion=0
core.filemode=true
core.bare=false
core.logallrefupdates=true

and:

$ git config -l
color.ui=true
core.autocrlf=input
core.bare=false
core.filemode=true
core.logallrefupdates=true
core.repositoryformatversion=0
fsck.gitattributeslarge=ignore
pull.rebase=true
push.autosetupremote=true
status.showuntrackedfiles=normal

So there is nothing in `fsck.skipList`.

> > Looking at 27ab4784d5, the comment there says:
> >
> >     if (!buf || size > ATTR_MAX_FILE_SIZE) {
> >       /*
> >        * A missing buffer here is a sign that the caller found the
> >        * blob too gigantic to load into memory. Let's just consider
> >        * that an error.
> >        */
> >       return report(options, oid, OBJ_BLOB,
> >                     FSCK_MSG_GITATTRIBUTES_LARGE,
> >                     ".gitattributes too large to parse");
> >     }
> >
> > ...so it's possible that the caller indeed found the blob too large to
> > load into memory, which would cause us to emit the ".gitattributes too
> > large to parse" fsck error without a .gitattributes file that actually
> > exceeds 100 MiB in size.
>
> I think that "!buf" case would also trigger if the size exceeded
> core.bigFileThreshold. It might be worth checking for that, too.

As shown above, `core.bigFileThreshold` is unset in the local git configuration.

However, I just found that the `config` file in the bare repository
sets `core.bigFileThreshold` to 1m. That seems likely to be the cause:

    [core]
            repositoryformatversion = 0
            filemode = true
            bare = true
            bigFileThreshold = 1m
            logallrefupdates = true
            autocrlf = false
            eol = lf
            symlinks = true
    [gc]
            autodetach = false
            auto = 0

After changing the value here, the error is gone.
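
The arithmetic behind that conclusion can be sketched as follows. This is
only an illustration: `to_bytes` is a hypothetical helper written for this
sketch (not a git command), and it assumes the k/m/g suffixes mean powers
of 1024. The blob sizes and the 1m and 100 MiB figures are the ones from
this thread.

```shell
# Convert a git-style size value (optional k/m/g suffix) to bytes.
# Hypothetical helper for this sketch; suffixes are taken as powers of 1024.
to_bytes() {
    case "$1" in
        *[kK]) echo $(( ${1%?} * 1024 )) ;;
        *[mM]) echo $(( ${1%?} * 1024 * 1024 )) ;;
        *[gG]) echo $(( ${1%?} * 1024 * 1024 * 1024 )) ;;
        *)     echo $(( $1 )) ;;
    esac
}

threshold=$(to_bytes 1m)     # core.bigFileThreshold from the bare repository
attr_max=$(to_bytes 100m)    # the 100 MiB .gitattributes limit

# Blob sizes reported by `git cat-file -s` earlier in the thread.
for size in 1579407 1579652; do
    if [ "$size" -gt "$attr_max" ]; then
        echo "$size: over the 100 MiB limit"
    elif [ "$size" -gt "$threshold" ]; then
        echo "$size: under 100 MiB, but over the $threshold-byte threshold"
    else
        echo "$size: under both limits"
    fi
done
```

Both blobs land in the middle branch: far below the 100 MiB limit, yet
above the 1m threshold, which matches the "!buf" explanation above.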

Thanks for your help!
Danny

