* git format-patch displays weird chars when filename includes non-ascii chars
From: Yongmin @ 2024-05-14 15:31 UTC
To: git

Hi everybody,

When the file name has non-ascii characters, the file name gets mangled somehow. Is this anything from my config side error or something gone weird with git?

Steps to reproduce;
$ git init
$ echo 'BlahBlah' > 테스트.txt
$ git add 테스트.txt
$ git commit -m 'test commit'
$ git format-patch --root
0001-test-commit.patch
$ cat 0001-test-commit.patch

From d2aa2b2f5aa290edec6a5fd141318a479ac9de8e Mon Sep 17 00:00:00 2001
From: Yongmin Hong <revi@omglol.email>
Date: Tue, 14 May 2024 15:15:52 +0000
Subject: [PATCH] test commit

---
 "\355\205\214\354\212\244\355\212\270.txt" | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 "\355\205\214\354\212\244\355\212\270.txt"

diff --git "a/\355\205\214\354\212\244\355\212\270.txt" "b/\355\205\214\354\212\244\355\212\270.txt"
new file mode 100644
index 0000000..86724be
--- /dev/null
+++ "b/\355\205\214\354\212\244\355\212\270.txt"
@@ -0,0 +1 @@
+BlahBlah
--
2.32.7

I searched a bit with the keyword 'format-patch ascii' but couldn't find anything useful.

Thanks in advance!

----
revi | 레비
- [he/him](https://en.pronouns.page/@revi)
- [What time is it in my timezone](https://issuetracker.revi.xyz/u/time)
https://revi.xyz

* Re: git format-patch displays weird chars when filename includes non-ascii chars
From: brian m. carlson @ 2024-05-14 20:44 UTC
To: Yongmin; +Cc: git

On 2024-05-14 at 15:31:43, Yongmin wrote:
> Hi everybody,
>
> When the file name has non-ascii characters, the file name gets mangled somehow. Is this anything from my config side error or something gone weird with git?
>
> Steps to reproduce;
> $ git init
> $ echo 'BlahBlah' > 테스트.txt
> $ git add 테스트.txt
> $ git commit -m 'test commit'
> $ git format-patch --root
> 0001-test-commit.patch
> $ cat 0001-test-commit.patch
>
> From d2aa2b2f5aa290edec6a5fd141318a479ac9de8e Mon Sep 17 00:00:00 2001
> From: Yongmin Hong <revi@omglol.email>
> Date: Tue, 14 May 2024 15:15:52 +0000
> Subject: [PATCH] test commit
>
> ---
>  "\355\205\214\354\212\244\355\212\270.txt" | 1 +
>  1 file changed, 1 insertion(+)
>  create mode 100644 "\355\205\214\354\212\244\355\212\270.txt"
>
> diff --git "a/\355\205\214\354\212\244\355\212\270.txt" "b/\355\205\214\354\212\244\355\212\270.txt"

In some cases, Git uses escaped strings (often octal[0]) to avoid
problems with encoding when sending patches over email or producing
unambiguous output. For example, the file name "\r\n.txt" would
definitely break sending over email.

In addition, while it appears that you're using UTF-8, which is great,
Git does not require file names to be in UTF-8, and it's valid to
specify 0xfe and 0xff (among other byte values) in file names in Git.
However, if we wrote those bytes in the body of an email, many users
would be upset when reviewing the patches, since they will usually want
to write their emails in UTF-8, and it's possible the patches might get
mishandled or mangled by a mail server or mail client.

Thus, Git prefers to encode names in a way that is unambiguous and
doesn't lead to mangling. It is inconvenient that legitimate UTF-8 file
names don't get rendered properly, though. I don't _believe_ there's an
option to show the regular UTF-8, but I could be wrong.

[0] I don't know why we chose octal and I'd much prefer hexadecimal, but
I wonder if it may have originally been to pipe to printf(1), which
POSIX requires to accept octal, but unfortunately not hexadecimal,
escapes.
--
brian m. carlson (they/them or he/him)
Toronto, Ontario, CA

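For readers who want to see what the escaped name stands for: each `\NNN` group in the quoted path is one byte written in octal, and the bytes together form the UTF-8 encoding of the original file name. The following is a minimal sketch of decoding such a path, not Git's own code. The function name `unquote_git_path` is hypothetical, and the sketch assumes the path uses three-digit octal escapes plus the usual C-style single-character escapes, and that the underlying bytes are UTF-8 (which, as noted above, Git does not require).

```python
# Hypothetical helper for illustration; this is not part of Git.
_SIMPLE_ESCAPES = {
    "a": 7, "b": 8, "t": 9, "n": 10, "v": 11, "f": 12, "r": 13,
    '"': 34, "\\": 92,
}


def unquote_git_path(quoted: str) -> str:
    """Decode a double-quoted, C-style escaped path into readable text."""
    # Paths without surrounding double quotes are printed verbatim.
    if len(quoted) < 2 or not (quoted.startswith('"') and quoted.endswith('"')):
        return quoted

    body = quoted[1:-1]
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out += ch.encode("utf-8")
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash in quoted path")
        nxt = body[i + 1]
        if nxt in "01234567":
            # Assumption: exactly three octal digits per escaped byte.
            out.append(int(body[i + 1:i + 4], 8))
            i += 4
        else:
            out.append(_SIMPLE_ESCAPES[nxt])
            i += 2
    # Assumption: the bytes are UTF-8; undecodable bytes are replaced.
    return out.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(unquote_git_path(r'"\355\205\214\354\212\244\355\212\270.txt"'))
```

Applied to the path from the patch in this thread, the nine escaped bytes decode to the three Hangul syllables of the original file name.
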
* Re: git format-patch displays weird chars when filename includes non-ascii chars
From: Junio C Hamano @ 2024-05-14 21:38 UTC
To: brian m. carlson; +Cc: Yongmin, git

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> Thus, Git prefers to encode names in a way that is unambiguous and
> doesn't lead to mangling. It is inconvenient that legitimate UTF-8 file
> names don't get rendered properly, though. I don't _believe_ there's an
> option to show the regular UTF-8, but I could be wrong.

    $ git config --global core.quotepath false

* Re: git format-patch displays weird chars when filename includes non-ascii chars
From: Yongmin @ 2024-05-15 5:40 UTC
To: Junio C Hamano, brian m. carlson; +Cc: git

On 2024-05-15 (Wed) 06:38:21+09:00, Junio C Hamano <gitster@pobox.com> wrote:
>
> $ git config --global core.quotepath false

Thanks! It now works as desired.

----
revi | 레비
- he/him <https://en.pronouns.page/@revi>
- What time is it in my timezone? <https://issuetracker.revi.xyz/u/time>
https://revi.xyz
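
The `core.quotepath` setting from this thread can also be applied to a single command instead of the global configuration, because `git -c name=value` overrides a configuration value for one invocation. Below is a minimal sketch under that assumption. The helper names `format_patch_argv` and `run_format_patch` are hypothetical, and the `format-patch --root` arguments simply mirror the reproduction steps above. Note that even with `core.quotepath` set to false, Git still escapes double quotes, backslashes, and control characters in path names.

```python
import subprocess


def format_patch_argv(quote_path: bool = False) -> list[str]:
    """Build a `git format-patch --root` command with a one-off core.quotepath."""
    value = "true" if quote_path else "false"
    # `git -c name=value` sets a configuration value for this invocation only.
    return ["git", "-c", f"core.quotepath={value}", "format-patch", "--root"]


def run_format_patch(repo_dir: str, quote_path: bool = False) -> str:
    """Run the command inside repo_dir and return what git prints on stdout."""
    result = subprocess.run(
        format_patch_argv(quote_path),
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `run_format_patch(".")` in a repository is intended to behave like the global setting for that one run, leaving `~/.gitconfig` untouched.
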