* Re: Possible bug in .gitignore
From: KwonHyun Kim @ 2024-07-25 4:01 UTC
To: git

Hello,

I am experimenting with git and found something that does not work as
explained in the documentation.

When I place `text_[가나].txt` in `.gitignore`, it ignores neither
text_가.txt nor text_나.txt.

I experimented with `text_[ab].txt` and it works fine.

So I thought it might work bytewise, and I put
`text_[\200-\352][\200-\352][\200-\352].txt` in, with no effect. (가 is
"\352\260\200" when core.quotepath is set to true.)

So I think it must be a bug: the patterns [abc] and [a-z] do not
handle non-ASCII characters. But I am not sure.

Thank you for reading; I hope to hear from you soon.

KwH Kim.

# ==== Here is my spec

PRETTY_NAME="Ubuntu 24.04 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo

git version
git version 2.43.0

LANG=ko_KR.UTF-8
LANGUAGE=ko:en
LC_CTYPE="ko_KR.UTF-8"
LC_NUMERIC=ko_KR.UTF-8
LC_TIME=ko_KR.UTF-8
LC_COLLATE="ko_KR.UTF-8"
LC_MONETARY=ko_KR.UTF-8
LC_MESSAGES="ko_KR.UTF-8"
LC_PAPER=ko_KR.UTF-8
LC_NAME=ko_KR.UTF-8
LC_ADDRESS=ko_KR.UTF-8
LC_TELEPHONE=ko_KR.UTF-8
LC_MEASUREMENT=ko_KR.UTF-8
LC_IDENTIFICATION=ko_KR.UTF-8
LC_ALL=
* Re: Possible bug in .gitignore
From: Jeff King @ 2024-07-26 5:26 UTC
To: KwonHyun Kim; +Cc: git

On Thu, Jul 25, 2024 at 01:01:45PM +0900, KwonHyun Kim wrote:

> I am experimenting with git and found something that does not work as
> explained in the documentation.
>
> When I place `text_[가나].txt` in `.gitignore`, it ignores neither
> text_가.txt nor text_나.txt.
>
> I experimented with `text_[ab].txt` and it works fine.
>
> So I thought it might work bytewise, and I put
> `text_[\200-\352][\200-\352][\200-\352].txt` in, with no effect. (가 is
> "\352\260\200" when core.quotepath is set to true.)
>
> So I think it must be a bug: the patterns [abc] and [a-z] do not
> handle non-ASCII characters. But I am not sure.

The globbing in git is generally done by wildmatch.c, which was imported
from rsync. Looking in that file, it looks like it does not support
multi-byte characters at all inside brackets.

So I don't see a way to make it work except to place the _literal_ bytes
making up the utf8 sequence, each inside its own single-byte match.
Like:

  printf 'text_[\352\353][\260\202][\200\230].txt\n' >.gitignore

But then your .gitignore file is itself invalid utf8 (not to mention
that this is obviously something a user shouldn't have to do).

So I guess the fix would be to teach wildmatch.c to recognize and match
multi-byte sequences inside []. That probably requires that we assume
the pattern and the path are utf8, which will usually be true, but not
always. So we might need some kind of config switch there.

There is also probably a deep rabbit hole of corner cases there (e.g.,
NFD vs NFC, matching é versus "e" + combining accent). But I suspect
that even recognizing multi-byte sequences as a single char to match
would be a big improvement.

-Peff
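
The difference between bytewise and character-aware bracket matching can be reproduced outside git. The sketch below is only an illustration: it uses Python's `fnmatch` module on `bytes` as a stand-in for a matcher that treats each bracket class as one byte, and on `str` as a stand-in for one that treats it as one character. It does not call or reimplement wildmatch.c. The helper `bytewise_bracket_pattern` is a hypothetical name introduced here, and it assumes all the given characters are non-ASCII and encode to the same number of UTF-8 bytes.

```python
import fnmatch


def bytewise_bracket_pattern(prefix: str, chars: str, suffix: str) -> bytes:
    """Build a glob with one single-byte bracket class per UTF-8 byte position.

    Hypothetical helper for illustration. Assumes every character in
    `chars` is non-ASCII and encodes to the same number of UTF-8 bytes
    (true for Hangul syllables, which all take 3 bytes).
    """
    encoded = [c.encode("utf-8") for c in chars]
    width = len(encoded[0])
    if any(len(e) != width for e in encoded):
        raise ValueError("characters must share one UTF-8 length")
    classes = b""
    for i in range(width):
        # Collect the i-th byte of each character, dropping duplicates
        # while keeping first-seen order.
        position_bytes = bytes(dict.fromkeys(e[i] for e in encoded))
        classes += b"[" + position_bytes + b"]"
    return prefix.encode("utf-8") + classes + suffix.encode("utf-8")


name = "text_가.txt"
pattern = "text_[가나].txt"

# Bytewise view: the bracket holds six separate bytes and matches only one
# byte of the name, so the three-byte character cannot match.
bytewise_result = fnmatch.fnmatchcase(name.encode("utf-8"), pattern.encode("utf-8"))

# Character view: the bracket holds two code points and matches one of them.
charwise_result = fnmatch.fnmatchcase(name, pattern)

# Workaround pattern: the same bytes the printf command in the message writes
# (without the trailing newline).
workaround = bytewise_bracket_pattern("text_", "가나", ".txt")

if __name__ == "__main__":
    print("bytewise match:", bytewise_result)
    print("character match:", charwise_result)
    print("workaround pattern:", workaround)
    print("workaround match:", fnmatch.fnmatchcase(name.encode("utf-8"), workaround))
```

One caveat of the per-position classes: they also accept cross-combinations of the listed bytes, so the workaround pattern matches byte sequences such as \352\202\230 that are neither 가 nor 나.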