* [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI @ 2008-09-17 21:07 Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding Alexander Gavrilov ` (2 more replies) 0 siblings, 3 replies; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-17 21:07 UTC (permalink / raw) To: git; +Cc: Shawn O. Pearce, Paul Mackerras Currently GUI tools don't provide any support for viewing files that contain non-ASCII characters. This set of patches addresses that issue. The git-gui part of the series is based on patches that are currently in the 'pu' branch of the git-gui repository. File encoding can be specified in the following ways: 1) It defaults to the current locale encoding. 2) It can be overridden by setting the gui.encoding option. 3) It can be further set on per-file basis by specifying the 'encoding' attribute in gitattributes. 4) Finally, git-gui allows directly selecting encodings through a popup menu. The last item is a necessity to compensate for the fact that now staging/unstaging hunks would not work, if the selected encoding is incompatible with the actual data: While most single-byte character sets are mutually compatible, all multibyte encodings, including utf-8, have the concept of invalid byte sequences, and thus are not completely reversible. Because of this, git-gui won't be able to apply hunks that contain such sequences, and the user would have to specify the encoding correctly, or fallback to ISO-8859-1. Since git apparently cannot work with filenames in non-locale encodings anyway, I did not try to do anything about it apart from fixing some obvious bugs. There are also some bugs in handling of commit encodings in gitk, but they are out of the scope of this series. GIT-GUI: git-gui: Cleanup handling of the default encoding. --- git-gui.sh | 1 + lib/blame.tcl | 2 +- lib/diff.tcl | 11 ++++++----- lib/encoding.tcl | 14 ++++++++++++++ lib/option.tcl | 24 ++++++++++++++++++++++++ 5 files changed, 46 insertions(+), 6 deletions(-) git-gui: Add a menu of available encodings. --- lib/encoding.tcl | 133 ++++++++++++++++++++++++++++++++++++++++++++++++++++- lib/option.tcl | 13 +++++- 2 files changed, 142 insertions(+), 4 deletions(-) git-gui: Allow forcing display encoding for diffs using a submenu. --- git-gui.sh | 8 ++++++++ lib/diff.tcl | 9 +++++++++ lib/encoding.tcl | 29 +++++++++++++++++++++++++++-- 3 files changed, 44 insertions(+), 2 deletions(-) git-gui: Optimize encoding name resolution using a lookup table. --- lib/encoding.tcl | 82 +++++++++++++++++++++++++++++++++++------------------- 1 files changed, 53 insertions(+), 29 deletions(-) git-gui: Support the encoding menu in gui blame. --- lib/blame.tcl | 17 +++++++++++++++++ 1 files changed, 17 insertions(+), 0 deletions(-) GITK: gitk: Port new encoding logic from git-gui. --- gitk | 50 +++++++++++++++++++++++++++++++++++++++++++++++--- 1 files changed, 47 insertions(+), 3 deletions(-) gitk: Implement file contents encoding support. --- gitk | 19 ++++++++++++++++--- 1 files changed, 16 insertions(+), 3 deletions(-) gitk: Support filenames in the locale encoding. --- gitk | 16 +++++++++++----- 1 files changed, 11 insertions(+), 5 deletions(-) ^ permalink raw reply [flat|nested] 44+ messages in thread
* [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding. 2008-09-17 21:07 [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI Alexander Gavrilov @ 2008-09-17 21:07 ` Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 2/8] git-gui: Add a menu of available encodings Alexander Gavrilov 2008-09-18 15:02 ` [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding Dmitry Potapov 2008-09-17 21:45 ` [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI Paul Mackerras 2008-09-21 22:55 ` Paul Mackerras 2 siblings, 2 replies; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-17 21:07 UTC (permalink / raw) To: git; +Cc: Shawn O. Pearce, Paul Mackerras - Make diffs and blame default to the system (locale) encoding instead of hard-coding UTF-8. - Add a gui.encoding option to allow overriding it. - gitattributes still have the final word. The rationale for this is Windows support: 1) Windows people are accustomed to using legacy encodings for text files. For many of them defaulting to utf-8 will be counter-intuitive. 2) Windows doesn't support utf-8 locales, and switching the system encoding is a real pain. Thus the option. This patch also adds proper encoding conversion to Apply Hunk/Line. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> --- git-gui.sh | 1 + lib/blame.tcl | 2 +- lib/diff.tcl | 11 ++++++----- lib/encoding.tcl | 14 ++++++++++++++ lib/option.tcl | 24 ++++++++++++++++++++++++ 5 files changed, 46 insertions(+), 6 deletions(-) diff --git a/git-gui.sh b/git-gui.sh index 91457a2..444990b 100755 --- a/git-gui.sh +++ b/git-gui.sh @@ -678,6 +678,7 @@ set default_config(merge.verbosity) 2 set default_config(user.name) {} set default_config(user.email) {} +set default_config(gui.encoding) [encoding system] set default_config(gui.matchtrackingbranch) false set default_config(gui.pruneduringfetch) false set default_config(gui.trustmtime) false diff --git a/lib/blame.tcl b/lib/blame.tcl index 7535adb..84d55b5 100644 --- a/lib/blame.tcl +++ b/lib/blame.tcl @@ -402,7 +402,7 @@ method _load {jump} { fconfigure $fd \ -blocking 0 \ -translation lf \ - -encoding [tcl_encoding [gitattr $path encoding UTF-8]] + -encoding [get_path_encoding $path] fileevent $fd readable [cb _read_file $fd $jump] set current_fd $fd } diff --git a/lib/diff.tcl b/lib/diff.tcl index b0ecfbc..8fefc5d 100644 --- a/lib/diff.tcl +++ b/lib/diff.tcl @@ -164,11 +164,10 @@ proc show_other_diff {path w m scroll_pos} { set sz [string length $content] } file { - set enc [gitattr $path encoding UTF-8] set fd [open $path r] fconfigure $fd \ -eofchar {} \ - -encoding [tcl_encoding $enc] + -encoding [get_path_encoding $path] set content [read $fd $max_sz] close $fd set sz [file size $path] @@ -282,7 +281,7 @@ proc start_show_diff {scroll_pos {add_opts {}}} { set ::current_diff_inheader 1 fconfigure $fd \ -blocking 0 \ - -encoding [tcl_encoding [gitattr $path encoding UTF-8]] \ + -encoding [get_path_encoding $path] \ -translation lf fileevent $fd readable [list read_diff $fd $scroll_pos] } @@ -435,8 +434,9 @@ proc apply_hunk {x y} { } if {[catch { + set enc [get_path_encoding $current_diff_path] set p [eval git_write $apply_cmd] - fconfigure $p -translation binary -encoding binary + fconfigure $p -translation binary -encoding $enc puts -nonewline $p $current_diff_header puts -nonewline $p [$ui_diff get $s_lno $e_lno] close $p} err]} { @@ -604,8 +604,9 @@ proc apply_line {x y} { set patch "@@ -$hln,$n +$hln,[eval expr $n $sign 1] @@\n$patch" if {[catch { + set enc [get_path_encoding $current_diff_path] set p [eval git_write $apply_cmd] - fconfigure $p -translation binary -encoding binary + fconfigure $p -translation binary -encoding $enc puts -nonewline $p $current_diff_header puts -nonewline $p $patch close $p} err]} { diff --git a/lib/encoding.tcl b/lib/encoding.tcl index 7f06b0d..e186b0c 100644 --- a/lib/encoding.tcl +++ b/lib/encoding.tcl @@ -274,3 +274,17 @@ proc tcl_encoding {enc} { } return {} } + +proc get_path_encoding {path} { + set tcl_enc [tcl_encoding [get_config gui.encoding]] + if {$tcl_enc eq {}} { + set tcl_enc [encoding system] + } + if {$path ne {}} { + set enc2 [tcl_encoding [gitattr $path encoding $tcl_enc]] + if {$enc2 ne {}} { + set tcl_enc $enc2 + } + } + return $tcl_enc +} diff --git a/lib/option.tcl b/lib/option.tcl index 9b865f6..40af44e 100644 --- a/lib/option.tcl +++ b/lib/option.tcl @@ -1,6 +1,28 @@ # git-gui options editor # Copyright (C) 2006, 2007 Shawn Pearce +proc config_check_encodings {} { + global repo_config_new global_config_new + + set enc $global_config_new(gui.encoding) + if {$enc eq {}} { + set global_config_new(gui.encoding) [encoding system] + } elseif {[tcl_encoding $enc] eq {}} { + error_popup [mc "Invalid global encoding '%s'" $enc] + return 0 + } + + set enc $repo_config_new(gui.encoding) + if {$enc eq {}} { + set repo_config_new(gui.encoding) [encoding system] + } elseif {[tcl_encoding $enc] eq {}} { + error_popup [mc "Invalid repo encoding '%s'" $enc] + return 0 + } + + return 1 +} + proc save_config {} { global default_config font_descs global repo_config global_config @@ -130,6 +152,7 @@ proc do_options {} { {i-1..99 gui.diffcontext {mc "Number of Diff Context Lines"}} {i-0..99 gui.commitmsgwidth {mc "Commit Message Text Width"}} {t gui.newbranchtemplate {mc "New Branch Name Template"}} + {t gui.encoding {mc "Default File Contents Encoding"}} } { set type [lindex $option 0] set name [lindex $option 1] @@ -275,6 +298,7 @@ proc do_restore_defaults {} { } proc do_save_config {w} { + if {![config_check_encodings]} return if {[catch {save_config} err]} { error_popup [strcat [mc "Failed to completely save options:"] "\n\n$err"] } -- 1.6.0.20.g6148bc ^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH (GIT-GUI,GITK) 2/8] git-gui: Add a menu of available encodings. 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding Alexander Gavrilov @ 2008-09-17 21:07 ` Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 3/8] git-gui: Allow forcing display encoding for diffs using a submenu Alexander Gavrilov 2008-09-18 15:02 ` [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding Dmitry Potapov 1 sibling, 1 reply; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-17 21:07 UTC (permalink / raw) To: git; +Cc: Shawn O. Pearce, Paul Mackerras To make encoding selection easier, add a menu that lists available encodings to the Options window. Menu structure is borrowed from Firefox. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> --- lib/encoding.tcl | 133 ++++++++++++++++++++++++++++++++++++++++++++++++++++- lib/option.tcl | 13 +++++- 2 files changed, 142 insertions(+), 4 deletions(-) diff --git a/lib/encoding.tcl b/lib/encoding.tcl index e186b0c..2c1eda3 100644 --- a/lib/encoding.tcl +++ b/lib/encoding.tcl @@ -206,7 +206,7 @@ set encoding_aliases { { ISO-8859-16 iso-ir-226 ISO_8859-16:2001 ISO_8859-16 latin10 l10 } { GBK CP936 MS936 windows-936 } { JIS_Encoding csJISEncoding } - { Shift_JIS MS_Kanji csShiftJIS } + { Shift_JIS MS_Kanji csShiftJIS ShiftJIS Shift-JIS } { Extended_UNIX_Code_Packed_Format_for_Japanese csEUCPkdFmtJapanese EUC-JP } { Extended_UNIX_Code_Fixed_Width_for_Japanese csEUCFixWidJapanese } @@ -240,6 +240,52 @@ set encoding_aliases { { Big5 csBig5 } } +set encoding_groups { + {"" "" + {"Unicode" UTF-8} + {"Western" ISO-8859-1}} + {we "West European" + {"Western" ISO-8859-15 CP-437 CP-850 MacRoman CP-1252 Windows-1252} + {"Celtic" ISO-8859-14} + {"Greek" ISO-8859-14 ISO-8859-7 CP-737 CP-869 MacGreek CP-1253 Windows-1253} + {"Icelandic" MacIceland MacIcelandic CP-861} + {"Nordic" ISO-8859-10 CP-865} + {"Portuguese" CP-860} + {"South European" ISO-8859-3}} + {ee "East European" + {"Baltic" CP-775 ISO-8859-4 ISO-8859-13 CP-1257 Windows-1257} + {"Central European" CP-852 ISO-8859-2 MacCE CP-1250 Windows-1250} + {"Croatian" MacCroatian} + {"Cyrillic" CP-855 ISO-8859-5 ISO-IR-111 KOI8-R MacCyrillic CP-1251 Windows-1251} + {"Russian" CP-866} + {"Ukrainian" KOI8-U MacUkraine MacUkrainian} + {"Romanian" ISO-8859-16 MacRomania MacRomanian}} + {ea "East Asian" + {"Generic" ISO-2022} + {"Chinese Simplified" GB2312 GB1988 GB12345 GB2312-RAW GBK EUC-CN GB18030 HZ ISO-2022-CN} + {"Chinese Traditional" Big5 Big5-HKSCS EUC-TW CP-950} + {"Japanese" EUC-JP ISO-2022-JP Shift-JIS JIS-0212 JIS-0208 JIS-0201 CP-932 MacJapan} + {"Korean" EUC-KR UHC JOHAB ISO-2022-KR CP-949 KSC5601}} + {sa "SE & SW Asian" + {"Armenian" ARMSCII-8} + {"Georgian" GEOSTD8} + {"Thai" TIS-620 ISO-8859-11 CP-874 Windows-874 MacThai} + {"Turkish" CP-857 CP857 ISO-8859-9 MacTurkish CP-1254 Windows-1254} + {"Vietnamese" TCVN VISCII VPS CP-1258 Windows-1258} + {"Hindi" MacDevanagari} + {"Gujarati" MacGujarati} + {"Gurmukhi" MacGurmukhi}} + {me "Middle Eastern" + {"Arabic" ISO-8859-6 Windows-1256 CP-1256 CP-864 MacArabic} + {"Farsi" MacFarsi} + {"Hebrew" ISO-8859-8-I Windows-1255 CP-1255 ISO-8859-8 CP-862 MacHebrew}} + {mi "Misc" + {"7-bit" ASCII} + {"16-bit" Unicode} + {"Legacy" CP-863 EBCDIC} + {"Symbol" Symbol Dingbats MacDingbats MacCentEuro}} +} + proc tcl_encoding {enc} { global encoding_aliases set names [encoding names] @@ -248,7 +294,7 @@ proc tcl_encoding {enc} { set i [lsearch -exact $lcnames $enc] if {$i < 0} { # look for "isonnn" instead of "iso-nnn" or "iso_nnn" - if {[regsub {^iso[-_]} $enc iso encx]} { + if {[regsub {^(iso|cp|ibm|jis)[-_]} $enc {\1} encx]} { set i [lsearch -exact $lcnames $encx] } } @@ -260,7 +306,7 @@ proc tcl_encoding {enc} { foreach e $ll { set i [lsearch -exact $lcnames $e] if {$i < 0} { - if {[regsub {^iso[-_]} $e iso ex]} { + if {[regsub {^(iso|cp|ibm|jis)[-_]} $e {\1} ex]} { set i [lsearch -exact $lcnames $ex] } } @@ -288,3 +334,84 @@ proc get_path_encoding {path} { } return $tcl_enc } + +proc build_encoding_submenu {parent grp cmd} { + global used_encodings + + set mid [lindex $grp 0] + set gname [mc [lindex $grp 1]] + + set smenu {} + foreach subset [lrange $grp 2 end] { + set name [mc [lindex $subset 0]] + + foreach enc [lrange $subset 1 end] { + set tcl_enc [tcl_encoding $enc] + if {$tcl_enc eq {}} continue + + if {$smenu eq {}} { + if {$mid eq {}} { + set smenu $parent + } else { + set smenu "$parent.$mid" + menu $smenu + $parent add cascade \ + -label $gname \ + -menu $smenu + } + } + + if {$name ne {}} { + set lbl "$name ($enc)" + } else { + set lbl $enc + } + $smenu add command \ + -label $lbl \ + -command [concat $cmd [list $tcl_enc]] + + lappend used_encodings $tcl_enc + } + } +} + +proc popup_btn_menu {m b} { + tk_popup $m [winfo pointerx $b] [winfo pointery $b] +} + +proc build_encoding_menu {emenu cmd {nodef 0}} { + $emenu configure -postcommand \ + [list do_build_encoding_menu $emenu $cmd $nodef] +} + +proc do_build_encoding_menu {emenu cmd {nodef 0}} { + global used_encodings encoding_groups + + $emenu configure -postcommand {} + + if {!$nodef} { + $emenu add command \ + -label [mc "Default"] \ + -command [concat $cmd [list {}]] + } + set sysenc [encoding system] + $emenu add command \ + -label [mc "System (%s)" $sysenc] \ + -command [concat $cmd [list $sysenc]] + + # Main encoding tree + set used_encodings [list identity] + $emenu add separator + foreach grp $encoding_groups { + build_encoding_submenu $emenu $grp $cmd + } + + # Add unclassified encodings + set unused_grp [list [mc Other]] + foreach enc [encoding names] { + if {[lsearch -exact $used_encodings $enc] < 0} { + lappend unused_grp $enc + } + } + build_encoding_submenu $emenu [list other [mc Other] $unused_grp] $cmd +} diff --git a/lib/option.tcl b/lib/option.tcl index 40af44e..c80c939 100644 --- a/lib/option.tcl +++ b/lib/option.tcl @@ -152,7 +152,7 @@ proc do_options {} { {i-1..99 gui.diffcontext {mc "Number of Diff Context Lines"}} {i-0..99 gui.commitmsgwidth {mc "Commit Message Text Width"}} {t gui.newbranchtemplate {mc "New Branch Name Template"}} - {t gui.encoding {mc "Default File Contents Encoding"}} + {c gui.encoding {mc "Default File Contents Encoding"}} } { set type [lindex $option 0] set name [lindex $option 1] @@ -182,6 +182,7 @@ proc do_options {} { pack $w.$f.$optid.v -side right -anchor e -padx 5 pack $w.$f.$optid -side top -anchor w -fill x } + c - t { frame $w.$f.$optid label $w.$f.$optid.l -text "$text:" @@ -194,6 +195,16 @@ proc do_options {} { pack $w.$f.$optid.v -side left -anchor w \ -fill x -expand 1 \ -padx 5 + if {$type eq {c}} { + menu $w.$f.$optid.m + build_encoding_menu $w.$f.$optid.m \ + [list set ${f}_config_new($name)] 1 + button $w.$f.$optid.b \ + -text [mc "Change"] \ + -command [list popup_btn_menu \ + $w.$f.$optid.m $w.$f.$optid.b] + pack $w.$f.$optid.b -side left -anchor w + } pack $w.$f.$optid -side top -anchor w -fill x } } -- 1.6.0.20.g6148bc ^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH (GIT-GUI,GITK) 3/8] git-gui: Allow forcing display encoding for diffs using a submenu. 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 2/8] git-gui: Add a menu of available encodings Alexander Gavrilov @ 2008-09-17 21:07 ` Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 4/8] git-gui: Optimize encoding name resolution using a lookup table Alexander Gavrilov 0 siblings, 1 reply; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-17 21:07 UTC (permalink / raw) To: git; +Cc: Shawn O. Pearce, Paul Mackerras Add a submenu to allow dynamically changing the encoding to use for diffs. Encoding settings are remembered while git-gui runs. The rules are: 1) Encoding set for a specific file overrides gitattributes. 2) Last explicitly set value of the encoding overrides gui.encoding Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> --- git-gui.sh | 8 ++++++++ lib/diff.tcl | 9 +++++++++ lib/encoding.tcl | 29 +++++++++++++++++++++++++++-- 3 files changed, 44 insertions(+), 2 deletions(-) diff --git a/git-gui.sh b/git-gui.sh index 444990b..3bbb4f1 100755 --- a/git-gui.sh +++ b/git-gui.sh @@ -2937,6 +2937,14 @@ proc create_common_diff_popup {ctxm} { -command {incr_font_size font_diff 1} lappend diff_actions [list $ctxm entryconf [$ctxm index last] -state] $ctxm add separator + set emenu $ctxm.enc + menu $emenu + build_encoding_menu $emenu [list force_diff_encoding] + $ctxm add cascade \ + -label [mc "Encoding"] \ + -menu $emenu + lappend diff_actions [list $ctxm entryconf [$ctxm index last] -state] + $ctxm add separator $ctxm add command -label [mc "Options..."] \ -command do_options } diff --git a/lib/diff.tcl b/lib/diff.tcl index 8fefc5d..b616296 100644 --- a/lib/diff.tcl +++ b/lib/diff.tcl @@ -40,6 +40,15 @@ proc reshow_diff {} { } } +proc force_diff_encoding {enc} { + global current_diff_path + + if {$current_diff_path ne {}} { + force_path_encoding $current_diff_path $enc + reshow_diff + } +} + proc handle_empty_diff {} { global current_diff_path file_states file_lists diff --git a/lib/encoding.tcl b/lib/encoding.tcl index 2c1eda3..b2ee38c 100644 --- a/lib/encoding.tcl +++ b/lib/encoding.tcl @@ -321,13 +321,38 @@ proc tcl_encoding {enc} { return {} } +proc force_path_encoding {path enc} { + global path_encoding_overrides last_encoding_override + + set enc [tcl_encoding $enc] + if {$enc eq {}} { + catch { unset last_encoding_override } + catch { unset path_encoding_overrides($path) } + } else { + set last_encoding_override $enc + if {$path ne {}} { + set path_encoding_overrides($path) $enc + } + } +} + proc get_path_encoding {path} { - set tcl_enc [tcl_encoding [get_config gui.encoding]] + global path_encoding_overrides last_encoding_override + + if {[info exists last_encoding_override]} { + set tcl_enc $last_encoding_override + } else { + set tcl_enc [tcl_encoding [get_config gui.encoding]] + } if {$tcl_enc eq {}} { set tcl_enc [encoding system] } if {$path ne {}} { - set enc2 [tcl_encoding [gitattr $path encoding $tcl_enc]] + if {[info exists path_encoding_overrides($path)]} { + set enc2 $path_encoding_overrides($path) + } else { + set enc2 [tcl_encoding [gitattr $path encoding $tcl_enc]] + } if {$enc2 ne {}} { set tcl_enc $enc2 } -- 1.6.0.20.g6148bc ^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH (GIT-GUI,GITK) 4/8] git-gui: Optimize encoding name resolution using a lookup table. 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 3/8] git-gui: Allow forcing display encoding for diffs using a submenu Alexander Gavrilov @ 2008-09-17 21:07 ` Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 5/8] git-gui: Support the encoding menu in gui blame Alexander Gavrilov 0 siblings, 1 reply; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-17 21:07 UTC (permalink / raw) To: git; +Cc: Shawn O. Pearce, Paul Mackerras Encoding menu construction does almost a hundred of encoding resolutions, which with the old implementation led to a small but noticeable delay. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> --- lib/encoding.tcl | 82 +++++++++++++++++++++++++++++++++++------------------- 1 files changed, 53 insertions(+), 29 deletions(-) diff --git a/lib/encoding.tcl b/lib/encoding.tcl index b2ee38c..32668fc 100644 --- a/lib/encoding.tcl +++ b/lib/encoding.tcl @@ -286,39 +286,63 @@ set encoding_groups { {"Symbol" Symbol Dingbats MacDingbats MacCentEuro}} } +proc build_encoding_table {} { + global encoding_aliases encoding_lookup_table + + # Prepare the lookup list; cannot use lsort -nocase because + # of compatibility issues with older Tcl (e.g. in msysgit) + set names [list] + foreach item [encoding names] { + lappend names [list [string tolower $item] $item] + } + set names [lsort -ascii -index 0 $names] + # neither can we use lsearch -index + set lnames [list] + foreach item $names { + lappend lnames [lindex $item 0] + } + + foreach grp $encoding_aliases { + set target {} + foreach item $grp { + set i [lsearch -sorted -ascii $lnames \ + [string tolower $item]] + if {$i >= 0} { + set target [lindex $names $i 1] + break + } + } + if {$target eq {}} continue + foreach item $grp { + set encoding_lookup_table([string tolower $item]) $target + } + } + + foreach item $names { + set encoding_lookup_table([lindex $item 0]) [lindex $item 1] + } +} + proc tcl_encoding {enc} { - global encoding_aliases - set names [encoding names] - set lcnames [string tolower $names] - set enc [string tolower $enc] - set i [lsearch -exact $lcnames $enc] - if {$i < 0} { - # look for "isonnn" instead of "iso-nnn" or "iso_nnn" - if {[regsub {^(iso|cp|ibm|jis)[-_]} $enc {\1} encx]} { - set i [lsearch -exact $lcnames $encx] + global encoding_lookup_table + if {$enc eq {}} { + return {} + } + if {![info exists encoding_lookup_table]} { + build_encoding_table } - } - if {$i < 0} { - foreach l $encoding_aliases { - set ll [string tolower $l] - if {[lsearch -exact $ll $enc] < 0} continue - # look through the aliases for one that tcl knows about - foreach e $ll { - set i [lsearch -exact $lcnames $e] - if {$i < 0} { - if {[regsub {^(iso|cp|ibm|jis)[-_]} $e {\1} ex]} { - set i [lsearch -exact $lcnames $ex] - } + set enc [string tolower $enc] + if {![info exists encoding_lookup_table($enc)]} { + # look for "isonnn" instead of "iso-nnn" or "iso_nnn" + if {[regsub {^(iso|cp|ibm|jis)[-_]} $enc {\1} encx]} { + set enc $encx } - if {$i >= 0} break - } - break } - } - if {$i >= 0} { - return [lindex $names $i] - } - return {} + if {[info exists encoding_lookup_table($enc)]} { + return $encoding_lookup_table($enc) + } else { + return {} + } } proc force_path_encoding {path enc} { -- 1.6.0.20.g6148bc ^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH (GIT-GUI,GITK) 5/8] git-gui: Support the encoding menu in gui blame. 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 4/8] git-gui: Optimize encoding name resolution using a lookup table Alexander Gavrilov @ 2008-09-17 21:07 ` Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui Alexander Gavrilov 0 siblings, 1 reply; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-17 21:07 UTC (permalink / raw) To: git; +Cc: Shawn O. Pearce, Paul Mackerras Allow dynamically changing the encoding from the blame viewer as well. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> --- lib/blame.tcl | 17 +++++++++++++++++ 1 files changed, 17 insertions(+), 0 deletions(-) diff --git a/lib/blame.tcl b/lib/blame.tcl index 84d55b5..eb61374 100644 --- a/lib/blame.tcl +++ b/lib/blame.tcl @@ -256,9 +256,16 @@ constructor new {i_commit i_path i_jump} { $w.ctxm add command \ -label [mc "Copy Commit"] \ -command [cb _copycommit] + $w.ctxm add separator + menu $w.ctxm.enc + build_encoding_menu $w.ctxm.enc [cb _setencoding] + $w.ctxm add cascade \ + -label [mc "Encoding"] \ + -menu $w.ctxm.enc $w.ctxm add command \ -label [mc "Do Full Copy Detection"] \ -command [cb _fullcopyblame] + $w.ctxm add separator $w.ctxm add command \ -label [mc "Show History Context"] \ -command [cb _gitkcommit] @@ -791,6 +798,16 @@ method _click {cur_w pos} { _showcommit $this $cur_w $lno } +method _setencoding {enc} { + force_path_encoding $path $enc + _load $this [list \ + $highlight_column \ + $highlight_line \ + [lindex [$w_file xview] 0] \ + [lindex [$w_file yview] 0] \ + ] +} + method _load_commit {cur_w cur_d pos} { upvar #0 $cur_d line_data set lno [lindex [split [$cur_w index $pos] .] 0] -- 1.6.0.20.g6148bc ^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui. 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 5/8] git-gui: Support the encoding menu in gui blame Alexander Gavrilov @ 2008-09-17 21:07 ` Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 7/8] gitk: Implement file contents encoding support Alexander Gavrilov 2008-09-19 12:10 ` [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui Johannes Sixt 0 siblings, 2 replies; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-17 21:07 UTC (permalink / raw) To: git; +Cc: Shawn O. Pearce, Paul Mackerras Add functions that implement the same logic for file contents encoding as git-gui uses: - Defaults to the system encoding. - Overridden by setting the gui.encoding option. - Further overridden on per-file basis by gitattributes. Also extends the range of supported encoding names. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> --- gitk | 50 +++++++++++++++++++++++++++++++++++++++++++++++--- 1 files changed, 47 insertions(+), 3 deletions(-) diff --git a/gitk b/gitk index 546627f..b210f79 100755 --- a/gitk +++ b/gitk @@ -9726,7 +9726,7 @@ set encoding_aliases { { ISO-8859-16 iso-ir-226 ISO_8859-16:2001 ISO_8859-16 latin10 l10 } { GBK CP936 MS936 windows-936 } { JIS_Encoding csJISEncoding } - { Shift_JIS MS_Kanji csShiftJIS } + { Shift_JIS MS_Kanji csShiftJIS ShiftJIS Shift-JIS } { Extended_UNIX_Code_Packed_Format_for_Japanese csEUCPkdFmtJapanese EUC-JP } { Extended_UNIX_Code_Fixed_Width_for_Japanese csEUCFixWidJapanese } @@ -9768,7 +9768,7 @@ proc tcl_encoding {enc} { set i [lsearch -exact $lcnames $enc] if {$i < 0} { # look for "isonnn" instead of "iso-nnn" or "iso_nnn" - if {[regsub {^iso[-_]} $enc iso encx]} { + if {[regsub {^(iso|cp|ibm|jis)[-_]} $enc {\1} encx]} { set i [lsearch -exact $lcnames $encx] } } @@ -9780,7 +9780,7 @@ proc tcl_encoding {enc} { foreach e $ll { set i [lsearch -exact $lcnames $e] if {$i < 0} { - if {[regsub {^iso[-_]} $e iso ex]} { + if {[regsub {^(iso|cp|ibm|jis)[-_]} $e {\1} ex]} { set i [lsearch -exact $lcnames $ex] } } @@ -9795,6 +9795,45 @@ proc tcl_encoding {enc} { return {} } +proc gitattr {path attr default} { + if {[catch {set r [exec git check-attr $attr -- $path]}]} { + set r unspecified + } else { + set r [join [lrange [split $r :] 2 end] :] + regsub {^ } $r {} r + } + if {$r eq {unspecified}} { + return $default + } + return $r +} + +proc get_path_encoding {path} { + global gui_encoding + set tcl_enc [tcl_encoding $gui_encoding] + if {$tcl_enc eq {}} { + set tcl_enc [encoding system] + } + if {$path ne {}} { + set enc2 [tcl_encoding [gitattr $path encoding $tcl_enc]] + if {$enc2 ne {}} { + set tcl_enc $enc2 + } + } + return $tcl_enc +} + +proc get_cached_encoding {path} { + global path_encoding_cache + if {[info exists path_encoding_cache($path)]} { + return $path_encoding_cache($path) + } else { + set enc [get_path_encoding $path] + set path_encoding_cache($path) $enc + return $enc + } +} + # First check that Tcl/Tk is recent enough if {[catch {package require Tk 8.4} err]} { show_error {} . [mc "Sorry, gitk cannot run with this version of Tcl/Tk.\n\ @@ -9817,6 +9856,11 @@ if {$tclencoding == {}} { puts stderr "Warning: encoding $gitencoding is not supported by Tcl/Tk" } +set gui_encoding [encoding system] +catch { + set gui_encoding [exec git config --get gui.encoding] +} + set mainfont {Helvetica 9} set textfont {Courier 9} set uifont {Helvetica 9 bold} -- 1.6.0.20.g6148bc ^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH (GIT-GUI,GITK) 7/8] gitk: Implement file contents encoding support. 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui Alexander Gavrilov @ 2008-09-17 21:07 ` Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 8/8] gitk: Support filenames in the locale encoding Alexander Gavrilov 2008-09-19 12:10 ` [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui Johannes Sixt 1 sibling, 1 reply; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-17 21:07 UTC (permalink / raw) To: git; +Cc: Shawn O. Pearce, Paul Mackerras Use new functions to support using explicit encoding configuration for file contents. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> --- gitk | 19 ++++++++++++++++--- 1 files changed, 16 insertions(+), 3 deletions(-) diff --git a/gitk b/gitk index b210f79..98923e7 100755 --- a/gitk +++ b/gitk @@ -6295,7 +6295,7 @@ proc showfile {f} { return } } - fconfigure $bf -blocking 0 + fconfigure $bf -blocking 0 -encoding [get_cached_encoding $f] filerun $bf [list getblobline $bf $diffids] $ctext config -state normal clear_ctext $commentend @@ -6333,6 +6333,7 @@ proc mergediff {id} { global diffids global parents global diffcontext + global diffencoding global limitdiffs vfilelimit curview set diffmergeid $id @@ -6346,9 +6347,10 @@ proc mergediff {id} { error_popup "[mc "Error getting merge diffs:"] $err" return } - fconfigure $mdf -blocking 0 + fconfigure $mdf -blocking 0 -encoding binary set mdifffd($id) $mdf set np [llength $parents($curview,$id)] + set diffencoding [get_cached_encoding {}] settabs $np filerun $mdf [list getmergediffline $mdf $id $np] } @@ -6356,6 +6358,7 @@ proc mergediff {id} { proc getmergediffline {mdf id np} { global diffmergeid ctext cflist mergemax global difffilestart mdifffd + global diffencoding $ctext conf -state normal set nr 0 @@ -6371,14 +6374,17 @@ proc getmergediffline {mdf id np} { set here [$ctext index "end - 1c"] lappend difffilestart $here add_flist [list $fname] + set diffencoding [get_cached_encoding $fname] set l [expr {(78 - [string length $fname]) / 2}] set pad [string range "----------------------------------------" 1 $l] $ctext insert end "$pad $fname $pad\n" filesep } elseif {[regexp {^@@} $line]} { + set line [encoding convertfrom $diffencoding $line] $ctext insert end "$line\n" hunksep } elseif {[regexp {^[0-9a-f]{40}$} $line] || [regexp {^index} $line]} { # do nothing } else { + set line [encoding convertfrom $diffencoding $line] # parse the prefix - one ' ', '-' or '+' for each parent set spaces {} set minuses {} @@ -6586,6 +6592,7 @@ proc getblobdiffs {ids} { global diffcontext global ignorespace global limitdiffs vfilelimit curview + global diffencoding set cmd [diffcmd $ids "-p -C --no-commit-id -U$diffcontext"] if {$ignorespace} { @@ -6599,7 +6606,8 @@ proc getblobdiffs {ids} { return } set diffinhdr 0 - fconfigure $bdf -blocking 0 + set diffencoding [get_cached_encoding {}] + fconfigure $bdf -blocking 0 -encoding binary set blobdifffd($ids) $bdf filerun $bdf [list getblobdiffline $bdf $diffids] } @@ -6633,6 +6641,7 @@ proc getblobdiffline {bdf ids} { global diffids blobdifffd ctext curdiffstart global diffnexthead diffnextnote difffilestart global diffinhdr treediffs + global diffencoding set nr 0 $ctext conf -state normal @@ -6670,10 +6679,12 @@ proc getblobdiffline {bdf ids} { } else { set fname [string range $line 2 [expr {$i - 1}]] } + set diffencoding [get_cached_encoding $fname] makediffhdr $fname $ids } elseif {[regexp {^@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,[0-9]+)? @@(.*)} \ $line match f1l f1c f2l f2c rest]} { + set line [encoding convertfrom $diffencoding $line] $ctext insert end "$line\n" hunksep set diffinhdr 0 @@ -6693,6 +6704,7 @@ proc getblobdiffline {bdf ids} { if {[string index $fname 0] eq "\""} { set fname [lindex $fname 0] } + set diffencoding [get_cached_encoding $fname] makediffhdr $fname $ids } elseif {[string compare -length 3 $line "---"] == 0} { # do nothing @@ -6704,6 +6716,7 @@ proc getblobdiffline {bdf ids} { $ctext insert end "$line\n" filesep } else { + set line [encoding convertfrom $diffencoding $line] set x [string range $line 0 0] if {$x == "-" || $x == "+"} { set tag [expr {$x == "+"}] -- 1.6.0.20.g6148bc ^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH (GIT-GUI,GITK) 8/8] gitk: Support filenames in the locale encoding. 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 7/8] gitk: Implement file contents encoding support Alexander Gavrilov @ 2008-09-17 21:07 ` Alexander Gavrilov 0 siblings, 0 replies; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-17 21:07 UTC (permalink / raw) To: git; +Cc: Shawn O. Pearce, Paul Mackerras Ensure that gitk works properly with non-ASCII names in the system encoding. Apparently git does not support using file names encoded differently from the current locale. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> --- gitk | 16 +++++++++++----- 1 files changed, 11 insertions(+), 5 deletions(-) diff --git a/gitk b/gitk index 98923e7..254faa1 100755 --- a/gitk +++ b/gitk @@ -6228,7 +6228,7 @@ proc gettree {id} { set treepending $id set treefilelist($id) {} set treeidlist($id) {} - fconfigure $gtf -blocking 0 + fconfigure $gtf -blocking 0 -encoding binary filerun $gtf [list gettreeline $gtf $id] } } else { @@ -6250,11 +6250,12 @@ proc gettreeline {gtf id} { set line [string range $line 0 [expr {$i-1}]] if {$diffids ne $nullid2 && [lindex $line 1] ne "blob"} continue set sha1 [lindex $line 2] - if {[string index $fname 0] eq "\""} { - set fname [lindex $fname 0] - } lappend treeidlist($id) $sha1 } + if {[string index $fname 0] eq "\""} { + set fname [lindex $fname 0] + } + set fname [encoding convertfrom $fname] lappend treefilelist($id) $fname } if {![eof $gtf]} { @@ -6370,6 +6371,7 @@ proc getmergediffline {mdf id np} { } if {[regexp {^diff --cc (.*)} $line match fname]} { # start of a new file + set fname [encoding convertfrom $fname] $ctext insert end "\n" set here [$ctext index "end - 1c"] lappend difffilestart $here @@ -6519,7 +6521,7 @@ proc gettreediffs {ids} { set treepending $ids set treediff {} - fconfigure $gdtf -blocking 0 + fconfigure $gdtf -blocking 0 -encoding binary filerun $gdtf [list gettreediffline $gdtf $ids] } @@ -6535,6 +6537,7 @@ proc gettreediffline {gdtf ids} { if {[string index $file 0] eq "\""} { set file [lindex $file 0] } + set file [encoding convertfrom $file] lappend treediff $file } } @@ -6679,6 +6682,7 @@ proc getblobdiffline {bdf ids} { } else { set fname [string range $line 2 [expr {$i - 1}]] } + set fname [encoding convertfrom $fname] set diffencoding [get_cached_encoding $fname] makediffhdr $fname $ids @@ -6694,6 +6698,7 @@ proc getblobdiffline {bdf ids} { if {[string index $fname 0] eq "\""} { set fname [lindex $fname 0] } + set fname [encoding convertfrom $fname] set i [lsearch -exact $treediffs($ids) $fname] if {$i >= 0} { setinlist difffilestart $i $curdiffstart @@ -6704,6 +6709,7 @@ proc getblobdiffline {bdf ids} { if {[string index $fname 0] eq "\""} { set fname [lindex $fname 0] } + set fname [encoding convertfrom $fname] set diffencoding [get_cached_encoding $fname] makediffhdr $fname $ids } elseif {[string compare -length 3 $line "---"] == 0} { -- 1.6.0.20.g6148bc ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui. 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 7/8] gitk: Implement file contents encoding support Alexander Gavrilov @ 2008-09-19 12:10 ` Johannes Sixt 2008-09-19 12:38 ` Alexander Gavrilov 1 sibling, 1 reply; 44+ messages in thread From: Johannes Sixt @ 2008-09-19 12:10 UTC (permalink / raw) To: Alexander Gavrilov; +Cc: git, Shawn O. Pearce, Paul Mackerras Alexander Gavrilov schrieb: > Add functions that implement the same logic for file > contents encoding as git-gui uses: > > - Defaults to the system encoding. > - Overridden by setting the gui.encoding option. > - Further overridden on per-file basis by gitattributes. > > Also extends the range of supported encoding names. If I run $ LANG=C gitk 146ed90 with this series on the git-gui repository, then I hoped to see the text in the patches in the right encoding. But I understand that I expected too much - the patch text is just a stream of bytes that comes from different files, and the best you can do is to apply the system encoding. But if the view is switched to the tree view, and file contents are inspected, then this patch should help. But it doesn't. If you look at po/ja.po, it is appearent that the file was not read as UTF-8, which is dictated by .gitattributes. -- Hannes ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui. 2008-09-19 12:10 ` [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui Johannes Sixt @ 2008-09-19 12:38 ` Alexander Gavrilov 2008-09-19 13:04 ` Johannes Sixt 0 siblings, 1 reply; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-19 12:38 UTC (permalink / raw) To: Johannes Sixt; +Cc: git, Shawn O. Pearce, Paul Mackerras On Fri, Sep 19, 2008 at 4:10 PM, Johannes Sixt <j.sixt@viscovery.net> wrote: > If I run > > $ LANG=C gitk 146ed90 > > with this series on the git-gui repository, then I hoped to see the text > in the patches in the right encoding. But I understand that I expected too > much - the patch text is just a stream of bytes that comes from different > files, and the best you can do is to apply the system encoding. > > But if the view is switched to the tree view, and file contents are > inspected, then this patch should help. But it doesn't. If you look at > po/ja.po, it is appearent that the file was not read as UTF-8, which is > dictated by .gitattributes. On my system everything works. You must have made a mistake somewhere. Namely: 1) Did you apply ALL patches to gitk? The first one is absolutely useless without the other two. 2) Did you install it? I.e. do you actually run the patched version? 3) Do you have .gitattributes checked out? Also, it should work in the patch mode as well: it reads the patch as binary, and decodes each line separately, based on the encoding determined for the current file. Alexander ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui. 2008-09-19 12:38 ` Alexander Gavrilov @ 2008-09-19 13:04 ` Johannes Sixt 2008-09-21 18:52 ` Alexander Gavrilov 0 siblings, 1 reply; 44+ messages in thread From: Johannes Sixt @ 2008-09-19 13:04 UTC (permalink / raw) To: Alexander Gavrilov; +Cc: git, Shawn O. Pearce, Paul Mackerras Alexander Gavrilov schrieb: > On Fri, Sep 19, 2008 at 4:10 PM, Johannes Sixt <j.sixt@viscovery.net> wrote: >> If I run >> >> $ LANG=C gitk 146ed90 >> >> with this series on the git-gui repository, then I hoped to see the text >> in the patches in the right encoding. But I understand that I expected too >> much - the patch text is just a stream of bytes that comes from different >> files, and the best you can do is to apply the system encoding. >> >> But if the view is switched to the tree view, and file contents are >> inspected, then this patch should help. But it doesn't. If you look at >> po/ja.po, it is appearent that the file was not read as UTF-8, which is >> dictated by .gitattributes. > > On my system everything works. You must have made a mistake somewhere. Indeed. I was running above command form a git.git clone. Now, looking at anything that is in 146ed90 produces paths like po/de.po po/ja.po Yet, .gitattributes says this: /po/*.po encoding=UTF-8 which does not match any of the *.po files because the *.po are one level deeper inside git-gui/po/. The change below made a difference. Everything's fine now! diff --git a/git-gui/.gitattributes b/git-gui/.gitattributes index f96112d..8ad5766 100644 --- a/git-gui/.gitattributes +++ b/git-gui/.gitattributes @@ -1,3 +1,3 @@ * encoding=US-ASCII git-gui.sh encoding=UTF-8 -/po/*.po encoding=UTF-8 +*.po encoding=UTF-8 > Also, it should work in the patch mode as well: it reads the patch as > binary, and decodes each line separately, based on the encoding > determined for the current file. This is great. But we'll have to see how this works on a multi-file diff on Windows. (I sense slowness.) Thanks, -- Hannes ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui. 2008-09-19 13:04 ` Johannes Sixt @ 2008-09-21 18:52 ` Alexander Gavrilov 2008-09-22 7:25 ` Johannes Sixt 2008-09-22 9:01 ` Dmitry Potapov 0 siblings, 2 replies; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-21 18:52 UTC (permalink / raw) To: Johannes Sixt; +Cc: git, Shawn O. Pearce, Paul Mackerras On Friday 19 September 2008 17:04:54 Johannes Sixt wrote: > Alexander Gavrilov schrieb: > > Also, it should work in the patch mode as well: it reads the patch as > > binary, and decodes each line separately, based on the encoding > > determined for the current file. > > This is great. But we'll have to see how this works on a multi-file diff > on Windows. (I sense slowness.) > Yes, yesterday I noticed that when the patch is very big, like when the index has got out of sync, and gitk claims local changes in all of the git-gui repository (>1000 files), first view is slow even on Linux. I made a patch to optimize attribute lookup. I'm afraid that further optimization is impossible without interface changes in git-check-attr, or reimplementing the attribute parser in Tcl. --- >8 --- From: Alexander Gavrilov <angavrilov@gmail.com> Subject: [PATCH] gitk: Implement batch lookup and caching of encoding attrs. When the diff contains thousands of files, calling git-check-attr once per file is very slow. With this patch gitk does attribute lookup in batches of 30 files while reading the diff file list, which leads to a very noticeable speedup. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> --- gitk | 34 +++++++++++++++++++++++++++++++++- 1 files changed, 33 insertions(+), 1 deletions(-) diff --git a/gitk b/gitk index 1355aa2..6fc1e90 100755 --- a/gitk +++ b/gitk @@ -6530,6 +6530,7 @@ proc gettreediffline {gdtf ids} { global cmitmode vfilelimit curview limitdiffs set nr 0 + set sublist {} while {[incr nr] <= 1000 && [gets $gdtf line] >= 0} { set i [string first "\t" $line] if {$i >= 0} { @@ -6539,8 +6540,10 @@ proc gettreediffline {gdtf ids} { } set file [encoding convertfrom $file] lappend treediff $file + lappend sublist $file } } + cache_gitattr encoding $sublist if {![eof $gdtf]} { return [expr {$nr >= 1000? 2: 1}] } @@ -9839,18 +9842,47 @@ proc tcl_encoding {enc} { } proc gitattr {path attr default} { - if {[catch {set r [exec git check-attr $attr -- $path]}]} { + global path_attr_cache + if {[info exists path_attr_cache($attr,$path)]} { + set r $path_attr_cache($attr,$path) + } elseif {[catch {set r [exec git check-attr $attr -- $path]}]} { set r unspecified } else { set r [join [lrange [split $r :] 2 end] :] regsub {^ } $r {} r } + set path_attr_cache($attr,$path) $r if {$r eq {unspecified}} { return $default } return $r } +proc cache_gitattr {attr pathlist} { + global path_attr_cache + set newlist {} + foreach path $pathlist { + if {[info exists path_attr_cache($attr,$path)]} continue + lappend newlist $path + } + while {$newlist ne {}} { + set head [lrange $newlist 0 29] + set newlist [lrange $newlist 30 end] + if {![catch {set rlist [eval exec git check-attr $attr -- $head]}]} { + foreach row [split $rlist "\n"] { + set cols [split $row :] + set path [lindex $cols 0] + set value [join [lrange $cols 2 end] :] + if {[string index $path 0] eq "\""} { + set path [encoding convertfrom [lindex $path 0]] + } + set path_attr_cache($attr,$path) $value + } + } + update + } +} + proc get_path_encoding {path} { global gui_encoding set tcl_enc [tcl_encoding $gui_encoding] -- 1.6.0.20.g6148bc ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui. 2008-09-21 18:52 ` Alexander Gavrilov @ 2008-09-22 7:25 ` Johannes Sixt 2008-09-22 7:46 ` Johannes Sixt 2008-09-22 8:01 ` Alexander Gavrilov 2008-09-22 9:01 ` Dmitry Potapov 1 sibling, 2 replies; 44+ messages in thread From: Johannes Sixt @ 2008-09-22 7:25 UTC (permalink / raw) To: Alexander Gavrilov; +Cc: git, Shawn O. Pearce, Paul Mackerras Alexander Gavrilov schrieb: > Subject: [PATCH] gitk: Implement batch lookup and caching of encoding attrs. > > When the diff contains thousands of files, calling git-check-attr > once per file is very slow. With this patch gitk does attribute > lookup in batches of 30 files while reading the diff file list, > which leads to a very noticeable speedup. This one does not work for me: The correct is not picked up anymore, neither in Patch mode nor Tree mode. (It works as expected without this patch.) > +proc cache_gitattr {attr pathlist} { > + global path_attr_cache > + set newlist {} > + foreach path $pathlist { > + if {[info exists path_attr_cache($attr,$path)]} continue > + lappend newlist $path > + } > + while {$newlist ne {}} { > + set head [lrange $newlist 0 29] > + set newlist [lrange $newlist 30 end] > + if {![catch {set rlist [eval exec git check-attr $attr -- $head]}]} { > + foreach row [split $rlist "\n"] { > + set cols [split $row :] This colon made me nervous (because of the drive-colon combination on Windows), but as long as you feed relative paths into 'git check-attr', this should not matter (in my case). > + set path [lindex $cols 0] > + set value [join [lrange $cols 2 end] :] > + if {[string index $path 0] eq "\""} { > + set path [encoding convertfrom [lindex $path 0]] > + } > + set path_attr_cache($attr,$path) $value > + } > + } > + update > + } > +} ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui. 2008-09-22 7:25 ` Johannes Sixt @ 2008-09-22 7:46 ` Johannes Sixt 2008-09-22 8:01 ` Alexander Gavrilov 1 sibling, 0 replies; 44+ messages in thread From: Johannes Sixt @ 2008-09-22 7:46 UTC (permalink / raw) To: Alexander Gavrilov; +Cc: git, Shawn O. Pearce, Paul Mackerras Johannes Sixt schrieb: > Alexander Gavrilov schrieb: >> Subject: [PATCH] gitk: Implement batch lookup and caching of encoding attrs. >> >> When the diff contains thousands of files, calling git-check-attr >> once per file is very slow. With this patch gitk does attribute >> lookup in batches of 30 files while reading the diff file list, >> which leads to a very noticeable speedup. > > This one does not work for me: The correct is not picked up anymore, The correct _encoding_ is not picked up anymore... > neither in Patch mode nor Tree mode. (It works as expected without this > patch.) > >> +proc cache_gitattr {attr pathlist} { >> + global path_attr_cache >> + set newlist {} >> + foreach path $pathlist { >> + if {[info exists path_attr_cache($attr,$path)]} continue >> + lappend newlist $path >> + } >> + while {$newlist ne {}} { >> + set head [lrange $newlist 0 29] >> + set newlist [lrange $newlist 30 end] >> + if {![catch {set rlist [eval exec git check-attr $attr -- $head]}]} { >> + foreach row [split $rlist "\n"] { >> + set cols [split $row :] > > This colon made me nervous (because of the drive-colon combination on > Windows), but as long as you feed relative paths into 'git check-attr', > this should not matter (in my case). This comment does not imply that I debugged this code. I just noticed the colon while reading the patch. >> + set path [lindex $cols 0] >> + set value [join [lrange $cols 2 end] :] >> + if {[string index $path 0] eq "\""} { >> + set path [encoding convertfrom [lindex $path 0]] >> + } >> + set path_attr_cache($attr,$path) $value >> + } >> + } >> + update >> + } >> +} ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui. 2008-09-22 7:25 ` Johannes Sixt 2008-09-22 7:46 ` Johannes Sixt @ 2008-09-22 8:01 ` Alexander Gavrilov 2008-09-22 8:20 ` Johannes Sixt 1 sibling, 1 reply; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-22 8:01 UTC (permalink / raw) To: Johannes Sixt; +Cc: git, Shawn O. Pearce, Paul Mackerras On Monday 22 September 2008 11:25:43 Johannes Sixt wrote: > Alexander Gavrilov schrieb: > > Subject: [PATCH] gitk: Implement batch lookup and caching of encoding attrs. > > > > When the diff contains thousands of files, calling git-check-attr > > once per file is very slow. With this patch gitk does attribute > > lookup in batches of 30 files while reading the diff file list, > > which leads to a very noticeable speedup. > > This one does not work for me: The correct is not picked up anymore, > neither in Patch mode nor Tree mode. (It works as expected without this > patch.) > OOPS, I forgot to copy the line that removes leading whitespace: @@ -9876,6 +9876,7 @@ proc cache_gitattr {attr pathlist} { if {[string index $path 0] eq "\""} { set path [encoding convertfrom [lindex $path 0]] } + regsub {^ } $value {} value set path_attr_cache($attr,$path) $value } } > > + set cols [split $row :] > > + set path [lindex $cols 0] > > This colon made me nervous (because of the drive-colon combination on > Windows), but as long as you feed relative paths into 'git check-attr', > this should not matter (in my case). I'm afraid there is nothing to be done about it without changing git-check-attr. For example, it could quote ':' in the path in octal -- current interface allows it. But since the path is relative, it is not a big deal. --- >8 --- From: Alexander Gavrilov <angavrilov@gmail.com> Subject: [PATCH v2] gitk: Implement batch lookup and caching of encoding attrs. When the diff contains thousands of files, calling git-check-attr once per file is very slow. With this patch gitk does attribute lookup in batches of 30 files while reading the diff file list, which leads to a very noticeable speedup. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> --- gitk | 35 ++++++++++++++++++++++++++++++++++- 1 files changed, 34 insertions(+), 1 deletions(-) diff --git a/gitk b/gitk index 1355aa2..cf557c3 100755 --- a/gitk +++ b/gitk @@ -6530,6 +6530,7 @@ proc gettreediffline {gdtf ids} { global cmitmode vfilelimit curview limitdiffs set nr 0 + set sublist {} while {[incr nr] <= 1000 && [gets $gdtf line] >= 0} { set i [string first "\t" $line] if {$i >= 0} { @@ -6539,8 +6540,10 @@ proc gettreediffline {gdtf ids} { } set file [encoding convertfrom $file] lappend treediff $file + lappend sublist $file } } + cache_gitattr encoding $sublist if {![eof $gdtf]} { return [expr {$nr >= 1000? 2: 1}] } @@ -9839,18 +9842,48 @@ proc tcl_encoding {enc} { } proc gitattr {path attr default} { - if {[catch {set r [exec git check-attr $attr -- $path]}]} { + global path_attr_cache + if {[info exists path_attr_cache($attr,$path)]} { + set r $path_attr_cache($attr,$path) + } elseif {[catch {set r [exec git check-attr $attr -- $path]}]} { set r unspecified } else { set r [join [lrange [split $r :] 2 end] :] regsub {^ } $r {} r } + set path_attr_cache($attr,$path) $r if {$r eq {unspecified}} { return $default } return $r } +proc cache_gitattr {attr pathlist} { + global path_attr_cache + set newlist {} + foreach path $pathlist { + if {[info exists path_attr_cache($attr,$path)]} continue + lappend newlist $path + } + while {$newlist ne {}} { + set head [lrange $newlist 0 29] + set newlist [lrange $newlist 30 end] + if {![catch {set rlist [eval exec git check-attr $attr -- $head]}]} { + foreach row [split $rlist "\n"] { + set cols [split $row :] + set path [lindex $cols 0] + set value [join [lrange $cols 2 end] :] + if {[string index $path 0] eq "\""} { + set path [encoding convertfrom [lindex $path 0]] + } + regsub {^ } $value {} value + set path_attr_cache($attr,$path) $value + } + } + update + } +} + proc get_path_encoding {path} { global gui_encoding set tcl_enc [tcl_encoding $gui_encoding] -- 1.6.0.20.g6148bc ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui. 2008-09-22 8:01 ` Alexander Gavrilov @ 2008-09-22 8:20 ` Johannes Sixt 2008-09-22 9:02 ` Alexander Gavrilov 0 siblings, 1 reply; 44+ messages in thread From: Johannes Sixt @ 2008-09-22 8:20 UTC (permalink / raw) To: Alexander Gavrilov; +Cc: git, Shawn O. Pearce, Paul Mackerras Alexander Gavrilov schrieb: > On Monday 22 September 2008 11:25:43 Johannes Sixt wrote: >> Alexander Gavrilov schrieb: >>> Subject: [PATCH] gitk: Implement batch lookup and caching of encoding attrs. >>> >>> When the diff contains thousands of files, calling git-check-attr >>> once per file is very slow. With this patch gitk does attribute >>> lookup in batches of 30 files while reading the diff file list, >>> which leads to a very noticeable speedup. >> This one does not work for me: The correct is not picked up anymore, >> neither in Patch mode nor Tree mode. (It works as expected without this >> patch.) >> > > OOPS, I forgot to copy the line that removes leading whitespace: Thanks, with this it works now. The delay for a 1000 file patch is bearable (on Windows, but y'know, Windows types are masochists :-). Feel free to add: Tested-by: Johannes Sixt <johannes.sixt@telecom.at> to the patches in the series when you resend it - and I hope you do resend it after addressing Paul's concerns. -- Hannes ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui. 2008-09-22 8:20 ` Johannes Sixt @ 2008-09-22 9:02 ` Alexander Gavrilov 2008-09-22 9:18 ` Johannes Sixt 0 siblings, 1 reply; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-22 9:02 UTC (permalink / raw) To: Johannes Sixt; +Cc: git, Shawn O. Pearce, Paul Mackerras On Monday 22 September 2008 12:20:28 Johannes Sixt wrote: > Alexander Gavrilov schrieb: > > On Monday 22 September 2008 11:25:43 Johannes Sixt wrote: > >> Alexander Gavrilov schrieb: > >>> Subject: [PATCH] gitk: Implement batch lookup and caching of encoding attrs. > >>> > >>> When the diff contains thousands of files, calling git-check-attr > >>> once per file is very slow. With this patch gitk does attribute > >>> lookup in batches of 30 files while reading the diff file list, > >>> which leads to a very noticeable speedup. > >> This one does not work for me: The correct is not picked up anymore, > >> neither in Patch mode nor Tree mode. (It works as expected without this > >> patch.) > >> > > > > OOPS, I forgot to copy the line that removes leading whitespace: > > Thanks, with this it works now. The delay for a 1000 file patch is > bearable (on Windows, but y'know, Windows types are masochists :-). You can also try applying this patch (originally made for git-gui). It may save additional 0.3 sec, especially for obscure legacy encodings. P.S. I do believe there is a place for a library shared between gitk & git-gui. This code duplication is ugly and annoying; moreover, they have different indentation conventions, which get messed up... --- >8 --- From: Alexander Gavrilov <angavrilov@gmail.com> Subject: [PATCH] git-gui: Optimize encoding name resolution using a lookup table. Encoding menu construction does almost a hundred of encoding resolutions, which with the old implementation led to a small but noticeable delay. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> --- gitk | 84 ++++++++++++++++++++++++++++++++++++++++++----------------------- 1 files changed, 54 insertions(+), 30 deletions(-) diff --git a/gitk b/gitk index 254faa1..1355aa2 100755 --- a/gitk +++ b/gitk @@ -9779,39 +9779,63 @@ set encoding_aliases { { Big5 csBig5 } } -proc tcl_encoding {enc} { - global encoding_aliases - set names [encoding names] - set lcnames [string tolower $names] - set enc [string tolower $enc] - set i [lsearch -exact $lcnames $enc] - if {$i < 0} { - # look for "isonnn" instead of "iso-nnn" or "iso_nnn" - if {[regsub {^(iso|cp|ibm|jis)[-_]} $enc {\1} encx]} { - set i [lsearch -exact $lcnames $encx] +proc build_encoding_table {} { + global encoding_aliases encoding_lookup_table + + # Prepare the lookup list; cannot use lsort -nocase because + # of compatibility issues with older Tcl (e.g. in msysgit) + set names [list] + foreach item [encoding names] { + lappend names [list [string tolower $item] $item] + } + set names [lsort -ascii -index 0 $names] + # neither can we use lsearch -index + set lnames [list] + foreach item $names { + lappend lnames [lindex $item 0] + } + + foreach grp $encoding_aliases { + set target {} + foreach item $grp { + set i [lsearch -sorted -ascii $lnames \ + [string tolower $item]] + if {$i >= 0} { + set target [lindex $names $i 1] + break + } + } + if {$target eq {}} continue + foreach item $grp { + set encoding_lookup_table([string tolower $item]) $target + } } - } - if {$i < 0} { - foreach l $encoding_aliases { - set ll [string tolower $l] - if {[lsearch -exact $ll $enc] < 0} continue - # look through the aliases for one that tcl knows about - foreach e $ll { - set i [lsearch -exact $lcnames $e] - if {$i < 0} { - if {[regsub {^(iso|cp|ibm|jis)[-_]} $e {\1} ex]} { - set i [lsearch -exact $lcnames $ex] - } + + foreach item $names { + set encoding_lookup_table([lindex $item 0]) [lindex $item 1] + } +} + +proc tcl_encoding {enc} { + global encoding_lookup_table + if {$enc eq {}} { + return {} + } + if {![info exists encoding_lookup_table]} { + build_encoding_table + } + set enc [string tolower $enc] + if {![info exists encoding_lookup_table($enc)]} { + # look for "isonnn" instead of "iso-nnn" or "iso_nnn" + if {[regsub {^(iso|cp|ibm|jis)[-_]} $enc {\1} encx]} { + set enc $encx } - if {$i >= 0} break - } - break } - } - if {$i >= 0} { - return [lindex $names $i] - } - return {} + if {[info exists encoding_lookup_table($enc)]} { + return $encoding_lookup_table($enc) + } else { + return {} + } } proc gitattr {path attr default} { -- 1.6.0.20.g6148bc ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui. 2008-09-22 9:02 ` Alexander Gavrilov @ 2008-09-22 9:18 ` Johannes Sixt 2008-09-22 10:18 ` Alexander Gavrilov 0 siblings, 1 reply; 44+ messages in thread From: Johannes Sixt @ 2008-09-22 9:18 UTC (permalink / raw) To: Alexander Gavrilov; +Cc: git, Shawn O. Pearce, Paul Mackerras Alexander Gavrilov schrieb: > You can also try applying this patch (originally made for git-gui). It may save > additional 0.3 sec, especially for obscure legacy encodings. Is this about startup time? Personally, I don't care about 0.3 sec startup time. I close my primary gitk and git-gui Windows only once a week. ;-) -- Hannes ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui. 2008-09-22 9:18 ` Johannes Sixt @ 2008-09-22 10:18 ` Alexander Gavrilov 0 siblings, 0 replies; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-22 10:18 UTC (permalink / raw) To: Johannes Sixt; +Cc: git, Shawn O. Pearce, Paul Mackerras On Mon, Sep 22, 2008 at 1:18 PM, Johannes Sixt <j.sixt@viscovery.net> wrote: > Alexander Gavrilov schrieb: >> You can also try applying this patch (originally made for git-gui). It may save >> additional 0.3 sec, especially for obscure legacy encodings. > > Is this about startup time? Personally, I don't care about 0.3 sec startup > time. I close my primary gitk and git-gui Windows only once a week. ;-) As encoding lookups are currently cached, all this discussion is about the first viewing. This patch addresses the time necessary to convert an arbitrary encoding name to a name that is known to Tcl, or determine that it is not supported. Without the patch it is done using a linear search through a large table of aliases. In git-gui it caused a noticeable delay before the Encoding submenu, which lists all available encodings, was displayed. Alexander ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui. 2008-09-21 18:52 ` Alexander Gavrilov 2008-09-22 7:25 ` Johannes Sixt @ 2008-09-22 9:01 ` Dmitry Potapov 1 sibling, 0 replies; 44+ messages in thread From: Dmitry Potapov @ 2008-09-22 9:01 UTC (permalink / raw) To: Alexander Gavrilov; +Cc: Johannes Sixt, git, Shawn O. Pearce, Paul Mackerras On Sun, Sep 21, 2008 at 10:52:35PM +0400, Alexander Gavrilov wrote: > > I made a patch to optimize attribute lookup. I'm afraid that further optimization > is impossible without interface changes in git-check-attr, or reimplementing > the attribute parser in Tcl. I wonder would not make sense to add --stdin-paths option to git check-attr in the same way as we have for git hash-object? Dmitry ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding. 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 2/8] git-gui: Add a menu of available encodings Alexander Gavrilov @ 2008-09-18 15:02 ` Dmitry Potapov 2008-09-18 15:14 ` Alexander Gavrilov 2008-09-18 16:29 ` Johannes Sixt 1 sibling, 2 replies; 44+ messages in thread From: Dmitry Potapov @ 2008-09-18 15:02 UTC (permalink / raw) To: Alexander Gavrilov; +Cc: git, Shawn O. Pearce, Paul Mackerras On Thu, Sep 18, 2008 at 01:07:32AM +0400, Alexander Gavrilov wrote: > - Make diffs and blame default to the system (locale) > encoding instead of hard-coding UTF-8. > - Add a gui.encoding option to allow overriding it. > - gitattributes still have the final word. The subject line of this patch is a bit misleading. I would not expect from "clean up" to change the existing behavior and existing default. > The rationale for this is Windows support: > > 1) Windows people are accustomed to using legacy encodings > for text files. For many of them defaulting to utf-8 > will be counter-intuitive. > 2) Windows doesn't support utf-8 locales, and switching > the system encoding is a real pain. Thus the option. I don't care much what is the default for Windows, but I wonder whether this rationale is good enough to change the default for other platforms. If you have systems configured with utf-8 and others (usually old ones) with legacy encoding, you will store files in utf-8 in your repo, thus having utf-8 as the default makes sense for non-Windows platforms. BTW, when you said the system encoding above, what exactly encoding do you mean? AFAIK, Windows has two legacy encodings OEM-CP and ANSI-CP. If I write a console program and compile it using MS-VC then it should use OEM-CP. However, if you write a GUI program or a console program that is compiled using gcc from Cygwin, you have to use ANSI-CP. For instance, if you use the Russian locale on Windows, ASNI-CP is 1251 and OEM-CP is 866. So, my question is what exactly encoding do you call as "system" above? Dmitry ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding. 2008-09-18 15:02 ` [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding Dmitry Potapov @ 2008-09-18 15:14 ` Alexander Gavrilov 2008-09-18 16:29 ` Johannes Sixt 1 sibling, 0 replies; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-18 15:14 UTC (permalink / raw) To: Dmitry Potapov; +Cc: git, Shawn O. Pearce, Paul Mackerras On Thu, Sep 18, 2008 at 7:02 PM, Dmitry Potapov <dpotapov@gmail.com> wrote: > The subject line of this patch is a bit misleading. I would not expect > from "clean up" to change the existing behavior and existing default. Comments can be easily changed. >> The rationale for this is Windows support: >> >> 1) Windows people are accustomed to using legacy encodings >> for text files. For many of them defaulting to utf-8 >> will be counter-intuitive. >> 2) Windows doesn't support utf-8 locales, and switching >> the system encoding is a real pain. Thus the option. > > I don't care much what is the default for Windows, but I wonder whether > this rationale is good enough to change the default for other platforms. > If you have systems configured with utf-8 and others (usually old ones) > with legacy encoding, you will store files in utf-8 in your repo, thus > having utf-8 as the default makes sense for non-Windows platforms. In fact, I think that the only reasonable default is the locale encoding. If they want something different, they can do "git config --global gui.encoding utf-8", that's what the option is there for. > BTW, when you said the system encoding above, what exactly encoding do > you mean? AFAIK, Windows has two legacy encodings OEM-CP and ANSI-CP. > If I write a console program and compile it using MS-VC then it should > use OEM-CP. However, if you write a GUI program or a console program > that is compiled using gcc from Cygwin, you have to use ANSI-CP. For > instance, if you use the Russian locale on Windows, ASNI-CP is 1251 and > OEM-CP is 866. So, my question is what exactly encoding do you call as > "system" above? Whatever Tcl thinks the system encoding is. In this case it is cp1251. CP866 is for DOS. Alexander ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding. 2008-09-18 15:02 ` [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding Dmitry Potapov 2008-09-18 15:14 ` Alexander Gavrilov @ 2008-09-18 16:29 ` Johannes Sixt 2008-09-18 16:50 ` Dmitry Potapov 1 sibling, 1 reply; 44+ messages in thread From: Johannes Sixt @ 2008-09-18 16:29 UTC (permalink / raw) To: Dmitry Potapov; +Cc: Alexander Gavrilov, git, Shawn O. Pearce, Paul Mackerras Dmitry Potapov schrieb: > On Thu, Sep 18, 2008 at 01:07:32AM +0400, Alexander Gavrilov wrote: >> The rationale for this is Windows support: >> >> 1) Windows people are accustomed to using legacy encodings >> for text files. For many of them defaulting to utf-8 >> will be counter-intuitive. >> 2) Windows doesn't support utf-8 locales, and switching >> the system encoding is a real pain. Thus the option. > > I don't care much what is the default for Windows, but I wonder whether > this rationale is good enough to change the default for other platforms. "The default" should not be hardcoded in the tool. By setting the encoding to "system", "the default" is taken from whatever the system's current locale is. If you are on modern Linux, your locale is most likely set to UTF8, and everything is fine; you won't observe a change in behavior. But if you are on a system whose locale was not set to UTF8, then you very likely did *not* produce UTF8 data, and the display in git-gui was screwed because it assumed UTF8. With this change it uses the system's encoding, and it is an improvement. > If you have systems configured with utf-8 and others (usually old ones) > with legacy encoding, you will store files in utf-8 in your repo, thus > having utf-8 as the default makes sense for non-Windows platforms. How can you know? For example, I've to work with systems that use "legacy encodings", and I can't use UTF8 in my data. Hence, the default of UTF8 was not exactly useful. With this patch series there's now a mechanism that allows me to state the encoding per file, and all platforms should be able to show the data in the correct way. -- Hannes ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding. 2008-09-18 16:29 ` Johannes Sixt @ 2008-09-18 16:50 ` Dmitry Potapov 2008-09-18 17:00 ` Alexander Gavrilov 0 siblings, 1 reply; 44+ messages in thread From: Dmitry Potapov @ 2008-09-18 16:50 UTC (permalink / raw) To: Johannes Sixt; +Cc: Alexander Gavrilov, git, Shawn O. Pearce, Paul Mackerras On Thu, Sep 18, 2008 at 06:29:26PM +0200, Johannes Sixt wrote: > > By setting the encoding to "system", "the default" is taken from whatever > the system's current locale is. If you are on modern Linux, your locale is > most likely set to UTF8, and everything is fine; you won't observe a > change in behavior. That's right. > But if you are on a system whose locale was not set to UTF8, then you very > likely did *not* produce UTF8 data, and the display in git-gui was screwed > because it assumed UTF8. With this change it uses the system's encoding, > and it is an improvement. It is not about how data are stored locale but what is in repository. Even if you still have some Linux box with legacy encoding on it, you still want to see what in repository, which is mostly likely to be in UTF-8. Even if you do not have UTF-8 locale, all decent editors are capable to read and store files in UTF-8 (even if it is not your locale), and it is really make sense to store files in UTF-8, which makes sense because you are going then on a modern Linux, you want to have all data in the repository to be in a single encoding, and UTF-8 is the best choice for that. > > > If you have systems configured with utf-8 and others (usually old ones) > > with legacy encoding, you will store files in utf-8 in your repo, thus > > having utf-8 as the default makes sense for non-Windows platforms. > > How can you know? For example, I've to work with systems that use "legacy > encodings", and I can't use UTF8 in my data. Hence, the default of UTF8 > was not exactly useful. With this patch series there's now a mechanism > that allows me to state the encoding per file, and all platforms should be > able to show the data in the correct way. This patch is certainly a big improvement, as it allows to choose what encoding you want to see, but I was not sure that changing the default from UTF-8 to the system locale is really a good idea for anything but Windows specific projects. Anyway, I have converted all computers that I use regularly to UTF-8, so I don't really care... Dmitry ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding. 2008-09-18 16:50 ` Dmitry Potapov @ 2008-09-18 17:00 ` Alexander Gavrilov 2008-09-18 17:19 ` Dmitry Potapov 0 siblings, 1 reply; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-18 17:00 UTC (permalink / raw) To: Dmitry Potapov; +Cc: Johannes Sixt, git, Shawn O. Pearce, Paul Mackerras On Thu, Sep 18, 2008 at 8:50 PM, Dmitry Potapov <dpotapov@gmail.com> wrote: > It is not about how data are stored locale but what is in repository. > Even if you still have some Linux box with legacy encoding on it, you > still want to see what in repository, which is mostly likely to be in > UTF-8. Even if you do not have UTF-8 locale, all decent editors are > capable to read and store files in UTF-8 (even if it is not your locale), > and it is really make sense to store files in UTF-8, which makes sense > because you are going then on a modern Linux, you want to have all data > in the repository to be in a single encoding, and UTF-8 is the best > choice for that. A new user would expect to see his files properly, and they are likely to be in the locale encoding. And if you know about utf-8, you can open the Options dialog, and select it explicitly from a menu. And if you commit a .gitattributes file with encoding specifications to the repository, it will be used automatically wherever you check it out. > This patch is certainly a big improvement, as it allows to choose what > encoding you want to see, but I was not sure that changing the default > from UTF-8 to the system locale is really a good idea for anything but > Windows specific projects. Anyway, I have converted all computers that > I use regularly to UTF-8, so I don't really care... You are here missing the fact, that the actual current default for git-gui is not utf-8, but 'binary', essentially equivalent to ISO-8859-1. UTF-8 was suggested by a patch that has been around in the 'pu' branch since January, and which I took as a base for my series. Gitk on the other hand uses the locale encoding. Alexander ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding. 2008-09-18 17:00 ` Alexander Gavrilov @ 2008-09-18 17:19 ` Dmitry Potapov 0 siblings, 0 replies; 44+ messages in thread From: Dmitry Potapov @ 2008-09-18 17:19 UTC (permalink / raw) To: Alexander Gavrilov; +Cc: Johannes Sixt, git, Shawn O. Pearce, Paul Mackerras On Thu, Sep 18, 2008 at 09:00:07PM +0400, Alexander Gavrilov wrote: > > You are here missing the fact, that the actual current default for > git-gui is not utf-8, but 'binary', essentially equivalent to > ISO-8859-1. In this case, I was just confused by the comment to the patch, which was saying "Make diffs and blame default to the system (locale) encoding instead of hard-coding UTF-8." > UTF-8 was suggested by a patch that has been around in the > 'pu' branch since January, and which I took as a base for my series. > Gitk on the other hand uses the locale encoding. I see... Then your patch makes perfect sense regardless of Windows support, which was presented as the rationale for the patch. Thanks, Dmitry ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI 2008-09-17 21:07 [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding Alexander Gavrilov @ 2008-09-17 21:45 ` Paul Mackerras 2008-09-18 11:12 ` Alexander Gavrilov 2008-09-21 22:55 ` Paul Mackerras 2 siblings, 1 reply; 44+ messages in thread From: Paul Mackerras @ 2008-09-17 21:45 UTC (permalink / raw) To: Alexander Gavrilov; +Cc: git, Shawn O. Pearce Alexander Gavrilov writes: > Currently GUI tools don't provide any support for > viewing files that contain non-ASCII characters. Well, that's just not true, at least as far as gitk is concerned. If you feel there are deficiencies in how gitk handles encodings (and I'm quite willing to believe there are, since ASCII is sufficient for my needs), then please give us a detailed explanation of what you would like it to do or specifically what is wrong with what it does at the moment. I'd like to see several paragraphs, not just the one or two sentences you have put in the descriptions for patches 6-8. Thanks, Paul. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI 2008-09-17 21:45 ` [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI Paul Mackerras @ 2008-09-18 11:12 ` Alexander Gavrilov 0 siblings, 0 replies; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-18 11:12 UTC (permalink / raw) To: Paul Mackerras; +Cc: git, Shawn O. Pearce On Thu, Sep 18, 2008 at 1:45 AM, Paul Mackerras <paulus@samba.org> wrote: > Alexander Gavrilov writes: >> Currently GUI tools don't provide any support for >> viewing files that contain non-ASCII characters. > > Well, that's just not true, at least as far as gitk is concerned. Somehow being able to show files in the system encoding is not good enough for a tool that is supposed to be used for cross-platform projects. It is only marginally better than always using ISO-8859-1, as git-gui, in effect, did. > If you feel there are deficiencies in how gitk handles encodings (and > I'm quite willing to believe there are, since ASCII is sufficient for > my needs), then please give us a detailed explanation of what you > would like it to do or specifically what is wrong with what it does at > the moment. I'd like to see several paragraphs, not just the one or > two sentences you have put in the descriptions for patches 6-8. I did not combine this set of patches into a single group without a reason. This is a policy decision that spans the boundary of individual tools, although it is initially implemented and documented on the git-gui side. The gitk commits simply bring back changes to code originally copied from gitk, tie in new logic that supports per-file encoding, and fix some obvious breakage (of course, I can write longer descriptions for them). By the way, patch 4 will apply to gitk, if you replace 'lib/encoding.tcl' with 'gitk', and specify -C2. P.S. All changes are build on top of these two commits: http://repo.or.cz/w/git-gui.git?a=log;h=refs/heads/pu Alexander ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI 2008-09-17 21:07 [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding Alexander Gavrilov 2008-09-17 21:45 ` [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI Paul Mackerras @ 2008-09-21 22:55 ` Paul Mackerras 2008-09-22 10:12 ` Alexander Gavrilov 2008-10-01 11:35 ` [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI Johannes Sixt 2 siblings, 2 replies; 44+ messages in thread From: Paul Mackerras @ 2008-09-21 22:55 UTC (permalink / raw) To: Alexander Gavrilov; +Cc: git, Shawn O. Pearce Alexander Gavrilov writes: > File encoding can be specified in the following ways: > > 1) It defaults to the current locale encoding. > 2) It can be overridden by setting the gui.encoding option. I'm happy with providing a way to say what the default encoding of files in the repository is, but I wonder why it is seen as a property of the GUI. Is it just that there is an existing "gui" section that is convenient to use, or does git-gui already use gui.encoding (before this patch series), or is there some other reason? > 3) It can be further set on per-file basis by specifying > the 'encoding' attribute in gitattributes. I haven't used .gitattributes before, but I would expect that the .gitattributes files would be stored in the repository along with everything else. If that's the case, then for gitk at least there is the question of which version of a given .gitattributes file one should use when viewing the tree for a commit which isn't the currently checked-out commit - do you use the version from that tree, or the version in the working directory? We seem to be using the latter at present, and caching the results. Is there a philosophical reason to do that, other than speed? (Also it seems that we won't notice if the user changes .gitattributes after we've looked at it, or if they create one after we've looked for one and not found it.) > Since git apparently cannot work with filenames in non-locale > encodings anyway, I did not try to do anything about it apart > from fixing some obvious bugs. For Linux, filenames are sequences of non-null bytes, so using binary encoding to read them in Tcl sounds about right. > There are also some bugs in handling of commit encodings in gitk, > but they are out of the scope of this series. I'm interested to hear what they are. Paul. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI 2008-09-21 22:55 ` Paul Mackerras @ 2008-09-22 10:12 ` Alexander Gavrilov 2008-10-05 2:30 ` [PATCH 1/2] check-attr: add an internal check_attr() function Dmitry Potapov 2008-10-01 11:35 ` [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI Johannes Sixt 1 sibling, 1 reply; 44+ messages in thread From: Alexander Gavrilov @ 2008-09-22 10:12 UTC (permalink / raw) To: Paul Mackerras; +Cc: git, Shawn O. Pearce On Mon, Sep 22, 2008 at 2:55 AM, Paul Mackerras <paulus@samba.org> wrote: > I'm happy with providing a way to say what the default encoding of > files in the repository is, but I wonder why it is seen as a property > of the GUI. Is it just that there is an existing "gui" section that > is convenient to use, or does git-gui already use gui.encoding (before > this patch series), or is there some other reason? No particular reason, it can be easily renamed to i18n.encoding, or something else. I only recall seeing the name 'gui.encoding' in a discussion on this topic several months ago. > I haven't used .gitattributes before, but I would expect that the > .gitattributes files would be stored in the repository along with > everything else. If that's the case, then for gitk at least there is > the question of which version of a given .gitattributes file one > should use when viewing the tree for a commit which isn't the > currently checked-out commit - do you use the version from that tree, > or the version in the working directory? We seem to be using the > latter at present, and caching the results. Is there a philosophical > reason to do that, other than speed? (Also it seems that we won't > notice if the user changes .gitattributes after we've looked at it, or > if they create one after we've looked for one and not found it.) Core git does not provide any interface for reading attributes from older commits, so they are loaded from the working copy. And caching is necessary for performance when lookup involves calling an external application at least once per 30 files. This may change if git-check-attr is modified to support --stdin-paths. >> There are also some bugs in handling of commit encodings in gitk, >> but they are out of the scope of this series. > > I'm interested to hear what they are. When I tested it, I noticed that: 1. It works correctly only if all commits use the encoding specified in i18n.commitencoding 2. Even when they do, sometimes commits are loaded through cat-file, and in this case they are processed using the locale encoding. Alexander ^ permalink raw reply [flat|nested] 44+ messages in thread
* [PATCH 1/2] check-attr: add an internal check_attr() function 2008-09-22 10:12 ` Alexander Gavrilov @ 2008-10-05 2:30 ` Dmitry Potapov 2008-10-05 2:30 ` [PATCH 2/2] check-attr: Add --stdin-paths option Dmitry Potapov 0 siblings, 1 reply; 44+ messages in thread From: Dmitry Potapov @ 2008-10-05 2:30 UTC (permalink / raw) To: Alexander Gavrilov, git; +Cc: Shawn O. Pearce, Paul Mackerras, Dmitry Potapov This step is preparation to introducing --stdin-paths option Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> --- On Mon, Sep 22, 2008 at 02:12:33PM +0400, Alexander Gavrilov wrote: > > Core git does not provide any interface for reading attributes from > older commits, so they are loaded from the working copy. And caching > is necessary for performance when lookup involves calling an external > application at least once per 30 files. This may change if > git-check-attr is modified to support --stdin-paths. builtin-check-attr.c | 41 +++++++++++++++++++++++------------------ 1 files changed, 23 insertions(+), 18 deletions(-) diff --git a/builtin-check-attr.c b/builtin-check-attr.c index cb783fc..5a2e329 100644 --- a/builtin-check-attr.c +++ b/builtin-check-attr.c @@ -6,6 +6,27 @@ static const char check_attr_usage[] = "git check-attr attr... [--] pathname..."; +static void check_attr(int cnt, struct git_attr_check *check, + const char** name, const char *file) +{ + int j; + if (git_checkattr(file, cnt, check)) + die("git_checkattr died"); + for (j = 0; j < cnt; j++) { + const char *value = check[j].value; + + if (ATTR_TRUE(value)) + value = "set"; + else if (ATTR_FALSE(value)) + value = "unset"; + else if (ATTR_UNSET(value)) + value = "unspecified"; + + quote_c_style(file, NULL, stdout, 0); + printf(": %s: %s\n", name[j], value); + } +} + int cmd_check_attr(int argc, const char **argv, const char *prefix) { struct git_attr_check *check; @@ -42,23 +63,7 @@ int cmd_check_attr(int argc, const char **argv, const char *prefix) check[i].attr = a; } - for (i = doubledash; i < argc; i++) { - int j; - if (git_checkattr(argv[i], cnt, check)) - die("git_checkattr died"); - for (j = 0; j < cnt; j++) { - const char *value = check[j].value; - - if (ATTR_TRUE(value)) - value = "set"; - else if (ATTR_FALSE(value)) - value = "unset"; - else if (ATTR_UNSET(value)) - value = "unspecified"; - - quote_c_style(argv[i], NULL, stdout, 0); - printf(": %s: %s\n", argv[j+1], value); - } - } + for (i = doubledash; i < argc; i++) + check_attr(cnt, check, argv+1, argv[i]); return 0; } -- 1.6.0.2.447.g3befd ^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH 2/2] check-attr: Add --stdin-paths option 2008-10-05 2:30 ` [PATCH 1/2] check-attr: add an internal check_attr() function Dmitry Potapov @ 2008-10-05 2:30 ` Dmitry Potapov 2008-10-06 7:09 ` Johannes Sixt 0 siblings, 1 reply; 44+ messages in thread From: Dmitry Potapov @ 2008-10-05 2:30 UTC (permalink / raw) To: Alexander Gavrilov, git; +Cc: Shawn O. Pearce, Paul Mackerras, Dmitry Potapov This allows multiple paths to be specified on stdin. Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> --- Documentation/git-check-attr.txt | 4 ++ builtin-check-attr.c | 66 ++++++++++++++++++++++++++++++++------ t/t0003-attributes.sh | 17 ++++++++++ 3 files changed, 77 insertions(+), 10 deletions(-) diff --git a/Documentation/git-check-attr.txt b/Documentation/git-check-attr.txt index 2b821f2..0839a57 100644 --- a/Documentation/git-check-attr.txt +++ b/Documentation/git-check-attr.txt @@ -9,6 +9,7 @@ git-check-attr - Display gitattributes information. SYNOPSIS -------- 'git check-attr' attr... [--] pathname... +'git check-attr' --stdin-paths attr... < <list-of-paths DESCRIPTION ----------- @@ -17,6 +18,9 @@ For every pathname, this command will list if each attr is 'unspecified', OPTIONS ------- +--stdin-paths:: + Read file names from stdin instead of from the command-line. + \--:: Interpret all preceding arguments as attributes, and all following arguments as path names. If not supplied, only the first argument will diff --git a/builtin-check-attr.c b/builtin-check-attr.c index 5a2e329..821eb5e 100644 --- a/builtin-check-attr.c +++ b/builtin-check-attr.c @@ -2,9 +2,19 @@ #include "cache.h" #include "attr.h" #include "quote.h" +#include "parse-options.h" -static const char check_attr_usage[] = -"git check-attr attr... [--] pathname..."; +static int stdin_paths; +static const char * const check_attr_usage[] = { +"git check-attr attr... [--] pathname...", +"git check-attr --stdin-paths attr... < <list-of-paths>", +NULL +}; + +static const struct option check_attr_options[] = { + OPT_BOOLEAN(0 , "stdin-paths", &stdin_paths, "read file names from stdin"), + OPT_END() +}; static void check_attr(int cnt, struct git_attr_check *check, const char** name, const char *file) @@ -27,17 +37,43 @@ static void check_attr(int cnt, struct git_attr_check *check, } } +static void check_attr_stdin_paths(int cnt, struct git_attr_check *check, + const char** name) +{ + struct strbuf buf, nbuf; + + strbuf_init(&buf, 0); + strbuf_init(&nbuf, 0); + while (strbuf_getline(&buf, stdin, '\n') != EOF) { + if (buf.buf[0] == '"') { + strbuf_reset(&nbuf); + if (unquote_c_style(&nbuf, buf.buf, NULL)) + die("line is badly quoted"); + strbuf_swap(&buf, &nbuf); + } + check_attr(cnt, check, name, buf.buf); + } + strbuf_release(&buf); + strbuf_release(&nbuf); +} + int cmd_check_attr(int argc, const char **argv, const char *prefix) { struct git_attr_check *check; int cnt, i, doubledash; + const char *errstr = NULL; + + argc = parse_options(argc, argv, check_attr_options, check_attr_usage, + PARSE_OPT_KEEP_DASHDASH); + if (!argc) + usage_with_options(check_attr_usage, check_attr_options); if (read_cache() < 0) { die("invalid cache"); } doubledash = -1; - for (i = 1; doubledash < 0 && i < argc; i++) { + for (i = 0; doubledash < 0 && i < argc; i++) { if (!strcmp(argv[i], "--")) doubledash = i; } @@ -45,25 +81,35 @@ int cmd_check_attr(int argc, const char **argv, const char *prefix) /* If there is no double dash, we handle only one attribute */ if (doubledash < 0) { cnt = 1; - doubledash = 1; + doubledash = 0; } else - cnt = doubledash - 1; + cnt = doubledash; doubledash++; - if (cnt <= 0 || argc < doubledash) - usage(check_attr_usage); + if (cnt <= 0) + errstr = "No attribute specified"; + else if (stdin_paths && doubledash < argc) + errstr = "Can't specify files with --stdin-paths"; + if (errstr) { + error (errstr); + usage_with_options(check_attr_usage, check_attr_options); + } + check = xcalloc(cnt, sizeof(*check)); for (i = 0; i < cnt; i++) { const char *name; struct git_attr *a; - name = argv[i + 1]; + name = argv[i]; a = git_attr(name, strlen(name)); if (!a) return error("%s: not a valid attribute name", name); check[i].attr = a; } - for (i = doubledash; i < argc; i++) - check_attr(cnt, check, argv+1, argv[i]); + if (stdin_paths) + check_attr_stdin_paths(cnt, check, argv); + else + for (i = doubledash; i < argc; i++) + check_attr(cnt, check, argv, argv[i]); return 0; } diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh index 3d8e06a..f6901b4 100755 --- a/t/t0003-attributes.sh +++ b/t/t0003-attributes.sh @@ -47,6 +47,23 @@ test_expect_success 'attribute test' ' ' +test_expect_success 'attribute test: read paths from stdin' ' + + cat <<EOF > expect +f: test: f +a/f: test: f +a/c/f: test: f +a/g: test: a/g +a/b/g: test: a/b/g +b/g: test: unspecified +a/b/h: test: a/b/h +a/b/d/g: test: a/b/d/* +EOF + + sed -e "s/:.*//" < expect | git check-attr --stdin-paths test > actual && + test_cmp expect actual +' + test_expect_success 'root subdir attribute test' ' attr_check a/i a/i && -- 1.6.0.2.447.g3befd ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [PATCH 2/2] check-attr: Add --stdin-paths option 2008-10-05 2:30 ` [PATCH 2/2] check-attr: Add --stdin-paths option Dmitry Potapov @ 2008-10-06 7:09 ` Johannes Sixt 2008-10-07 0:14 ` [PATCH 1/2 v2] check-attr: add an internal check_attr() function Dmitry Potapov 2008-10-07 0:16 ` [PATCH 2/2 v2] check-attr: Add --stdin-paths option Dmitry Potapov 0 siblings, 2 replies; 44+ messages in thread From: Johannes Sixt @ 2008-10-06 7:09 UTC (permalink / raw) To: Dmitry Potapov; +Cc: Alexander Gavrilov, git, Shawn O. Pearce, Paul Mackerras Dmitry Potapov schrieb: > +static void check_attr_stdin_paths(int cnt, struct git_attr_check *check, > + const char** name) > +{ > + struct strbuf buf, nbuf; > + > + strbuf_init(&buf, 0); > + strbuf_init(&nbuf, 0); > + while (strbuf_getline(&buf, stdin, '\n') != EOF) { > + if (buf.buf[0] == '"') { > + strbuf_reset(&nbuf); > + if (unquote_c_style(&nbuf, buf.buf, NULL)) > + die("line is badly quoted"); > + strbuf_swap(&buf, &nbuf); > + } > + check_attr(cnt, check, name, buf.buf); > + } > + strbuf_release(&buf); > + strbuf_release(&nbuf); > +} > + We know that you will want to use this feature in gitk to reduce the number of fork()s. But you've a problem: gitk will first write a path to git-check-addr's stdin, and then wait for the result on its stdout. But this is a classic pitfall: You are not guaranteed that something will be returned from stdout right away due to buffering. The least that is needed is fflush(stdout) in this loop (after each iteration!) so that gitk sees some result and does not hang forever. -- Hannes ^ permalink raw reply [flat|nested] 44+ messages in thread
* [PATCH 1/2 v2] check-attr: add an internal check_attr() function 2008-10-06 7:09 ` Johannes Sixt @ 2008-10-07 0:14 ` Dmitry Potapov 2008-10-07 0:16 ` [PATCH 2/2 v2] check-attr: Add --stdin-paths option Dmitry Potapov 1 sibling, 0 replies; 44+ messages in thread From: Dmitry Potapov @ 2008-10-07 0:14 UTC (permalink / raw) To: Johannes Sixt; +Cc: Alexander Gavrilov, git, Shawn O. Pearce, Paul Mackerras This step is preparation to introducing --stdin-paths option. I have also added maybe_flush_or_die() at the end of main() to ensure that we exit with the zero code only when we flushed the output successfully. Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> --- maybe_flush_or_die() is added in this version of patch. builtin-check-attr.c | 42 ++++++++++++++++++++++++------------------ 1 files changed, 24 insertions(+), 18 deletions(-) diff --git a/builtin-check-attr.c b/builtin-check-attr.c index cb783fc..786256e 100644 --- a/builtin-check-attr.c +++ b/builtin-check-attr.c @@ -6,6 +6,27 @@ static const char check_attr_usage[] = "git check-attr attr... [--] pathname..."; +static void check_attr(int cnt, struct git_attr_check *check, + const char** name, const char *file) +{ + int j; + if (git_checkattr(file, cnt, check)) + die("git_checkattr died"); + for (j = 0; j < cnt; j++) { + const char *value = check[j].value; + + if (ATTR_TRUE(value)) + value = "set"; + else if (ATTR_FALSE(value)) + value = "unset"; + else if (ATTR_UNSET(value)) + value = "unspecified"; + + quote_c_style(file, NULL, stdout, 0); + printf(": %s: %s\n", name[j], value); + } +} + int cmd_check_attr(int argc, const char **argv, const char *prefix) { struct git_attr_check *check; @@ -42,23 +63,8 @@ int cmd_check_attr(int argc, const char **argv, const char *prefix) check[i].attr = a; } - for (i = doubledash; i < argc; i++) { - int j; - if (git_checkattr(argv[i], cnt, check)) - die("git_checkattr died"); - for (j = 0; j < cnt; j++) { - const char *value = check[j].value; - - if (ATTR_TRUE(value)) - value = "set"; - else if (ATTR_FALSE(value)) - value = "unset"; - else if (ATTR_UNSET(value)) - value = "unspecified"; - - quote_c_style(argv[i], NULL, stdout, 0); - printf(": %s: %s\n", argv[j+1], value); - } - } + for (i = doubledash; i < argc; i++) + check_attr(cnt, check, argv+1, argv[i]); + maybe_flush_or_die(stdout, "attribute to stdout"); return 0; } -- 1.6.0.2.447.g3befd ^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH 2/2 v2] check-attr: Add --stdin-paths option 2008-10-06 7:09 ` Johannes Sixt 2008-10-07 0:14 ` [PATCH 1/2 v2] check-attr: add an internal check_attr() function Dmitry Potapov @ 2008-10-07 0:16 ` Dmitry Potapov 2008-10-08 15:24 ` Shawn O. Pearce 2008-10-10 22:39 ` Paul Mackerras 1 sibling, 2 replies; 44+ messages in thread From: Dmitry Potapov @ 2008-10-07 0:16 UTC (permalink / raw) To: Johannes Sixt; +Cc: Alexander Gavrilov, git, Shawn O. Pearce, Paul Mackerras This allows multiple paths to be specified on stdin. Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> --- On Mon, Oct 06, 2008 at 09:09:11AM +0200, Johannes Sixt wrote: > > The least that is needed is fflush(stdout) in this loop (after each > iteration!) Thanks. Somehow, I forgot about it though it is quite obvious. I have added maybe_flush_or_die(). Documentation/git-check-attr.txt | 4 ++ builtin-check-attr.c | 70 ++++++++++++++++++++++++++++++++------ t/t0003-attributes.sh | 17 +++++++++ 3 files changed, 80 insertions(+), 11 deletions(-) diff --git a/Documentation/git-check-attr.txt b/Documentation/git-check-attr.txt index 2b821f2..0839a57 100644 --- a/Documentation/git-check-attr.txt +++ b/Documentation/git-check-attr.txt @@ -9,6 +9,7 @@ git-check-attr - Display gitattributes information. SYNOPSIS -------- 'git check-attr' attr... [--] pathname... +'git check-attr' --stdin-paths attr... < <list-of-paths DESCRIPTION ----------- @@ -17,6 +18,9 @@ For every pathname, this command will list if each attr is 'unspecified', OPTIONS ------- +--stdin-paths:: + Read file names from stdin instead of from the command-line. + \--:: Interpret all preceding arguments as attributes, and all following arguments as path names. If not supplied, only the first argument will diff --git a/builtin-check-attr.c b/builtin-check-attr.c index 786256e..fa1e4d5 100644 --- a/builtin-check-attr.c +++ b/builtin-check-attr.c @@ -2,9 +2,19 @@ #include "cache.h" #include "attr.h" #include "quote.h" +#include "parse-options.h" -static const char check_attr_usage[] = -"git check-attr attr... [--] pathname..."; +static int stdin_paths; +static const char * const check_attr_usage[] = { +"git check-attr attr... [--] pathname...", +"git check-attr --stdin-paths attr... < <list-of-paths>", +NULL +}; + +static const struct option check_attr_options[] = { + OPT_BOOLEAN(0 , "stdin-paths", &stdin_paths, "read file names from stdin"), + OPT_END() +}; static void check_attr(int cnt, struct git_attr_check *check, const char** name, const char *file) @@ -27,17 +37,44 @@ static void check_attr(int cnt, struct git_attr_check *check, } } +static void check_attr_stdin_paths(int cnt, struct git_attr_check *check, + const char** name) +{ + struct strbuf buf, nbuf; + + strbuf_init(&buf, 0); + strbuf_init(&nbuf, 0); + while (strbuf_getline(&buf, stdin, '\n') != EOF) { + if (buf.buf[0] == '"') { + strbuf_reset(&nbuf); + if (unquote_c_style(&nbuf, buf.buf, NULL)) + die("line is badly quoted"); + strbuf_swap(&buf, &nbuf); + } + check_attr(cnt, check, name, buf.buf); + maybe_flush_or_die(stdout, "attribute to stdout"); + } + strbuf_release(&buf); + strbuf_release(&nbuf); +} + int cmd_check_attr(int argc, const char **argv, const char *prefix) { struct git_attr_check *check; int cnt, i, doubledash; + const char *errstr = NULL; + + argc = parse_options(argc, argv, check_attr_options, check_attr_usage, + PARSE_OPT_KEEP_DASHDASH); + if (!argc) + usage_with_options(check_attr_usage, check_attr_options); if (read_cache() < 0) { die("invalid cache"); } doubledash = -1; - for (i = 1; doubledash < 0 && i < argc; i++) { + for (i = 0; doubledash < 0 && i < argc; i++) { if (!strcmp(argv[i], "--")) doubledash = i; } @@ -45,26 +82,37 @@ int cmd_check_attr(int argc, const char **argv, const char *prefix) /* If there is no double dash, we handle only one attribute */ if (doubledash < 0) { cnt = 1; - doubledash = 1; + doubledash = 0; } else - cnt = doubledash - 1; + cnt = doubledash; doubledash++; - if (cnt <= 0 || argc < doubledash) - usage(check_attr_usage); + if (cnt <= 0) + errstr = "No attribute specified"; + else if (stdin_paths && doubledash < argc) + errstr = "Can't specify files with --stdin-paths"; + if (errstr) { + error (errstr); + usage_with_options(check_attr_usage, check_attr_options); + } + check = xcalloc(cnt, sizeof(*check)); for (i = 0; i < cnt; i++) { const char *name; struct git_attr *a; - name = argv[i + 1]; + name = argv[i]; a = git_attr(name, strlen(name)); if (!a) return error("%s: not a valid attribute name", name); check[i].attr = a; } - for (i = doubledash; i < argc; i++) - check_attr(cnt, check, argv+1, argv[i]); - maybe_flush_or_die(stdout, "attribute to stdout"); + if (stdin_paths) + check_attr_stdin_paths(cnt, check, argv); + else { + for (i = doubledash; i < argc; i++) + check_attr(cnt, check, argv, argv[i]); + maybe_flush_or_die(stdout, "attribute to stdout"); + } return 0; } diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh index 3d8e06a..f6901b4 100755 --- a/t/t0003-attributes.sh +++ b/t/t0003-attributes.sh @@ -47,6 +47,23 @@ test_expect_success 'attribute test' ' ' +test_expect_success 'attribute test: read paths from stdin' ' + + cat <<EOF > expect +f: test: f +a/f: test: f +a/c/f: test: f +a/g: test: a/g +a/b/g: test: a/b/g +b/g: test: unspecified +a/b/h: test: a/b/h +a/b/d/g: test: a/b/d/* +EOF + + sed -e "s/:.*//" < expect | git check-attr --stdin-paths test > actual && + test_cmp expect actual +' + test_expect_success 'root subdir attribute test' ' attr_check a/i a/i && -- 1.6.0.2.447.g3befd ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [PATCH 2/2 v2] check-attr: Add --stdin-paths option 2008-10-07 0:16 ` [PATCH 2/2 v2] check-attr: Add --stdin-paths option Dmitry Potapov @ 2008-10-08 15:24 ` Shawn O. Pearce 2008-10-12 14:19 ` Dmitry Potapov 2008-10-10 22:39 ` Paul Mackerras 1 sibling, 1 reply; 44+ messages in thread From: Shawn O. Pearce @ 2008-10-08 15:24 UTC (permalink / raw) To: Dmitry Potapov; +Cc: Johannes Sixt, Alexander Gavrilov, git, Paul Mackerras Dmitry Potapov <dpotapov@gmail.com> wrote: > This allows multiple paths to be specified on stdin. > diff --git a/Documentation/git-check-attr.txt b/Documentation/git-check-attr.txt > index 2b821f2..0839a57 100644 > --- a/Documentation/git-check-attr.txt > +++ b/Documentation/git-check-attr.txt > @@ -9,6 +9,7 @@ git-check-attr - Display gitattributes information. > SYNOPSIS > -------- > 'git check-attr' attr... [--] pathname... > +'git check-attr' --stdin-paths attr... < <list-of-paths I wonder if the option should just be "--stdin". And since its being used mostly by automated tools (gitk/git-gui) I wonder if a -z should also be supported for input termination with NUL instead of LF. > diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh > index 3d8e06a..f6901b4 100755 > --- a/t/t0003-attributes.sh > +++ b/t/t0003-attributes.sh > @@ -47,6 +47,23 @@ test_expect_success 'attribute test' ' > > ' > > +test_expect_success 'attribute test: read paths from stdin' ' A test case for the quoting might also be good. > + > + cat <<EOF > expect > +f: test: f > +a/f: test: f > +a/c/f: test: f > +a/g: test: a/g > +a/b/g: test: a/b/g > +b/g: test: unspecified > +a/b/h: test: a/b/h > +a/b/d/g: test: a/b/d/* > +EOF > + > + sed -e "s/:.*//" < expect | git check-attr --stdin-paths test > actual && > + test_cmp expect actual -- Shawn. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 2/2 v2] check-attr: Add --stdin-paths option 2008-10-08 15:24 ` Shawn O. Pearce @ 2008-10-12 14:19 ` Dmitry Potapov 2008-10-12 15:04 ` Jakub Narebski 0 siblings, 1 reply; 44+ messages in thread From: Dmitry Potapov @ 2008-10-12 14:19 UTC (permalink / raw) To: Shawn O. Pearce; +Cc: Johannes Sixt, Alexander Gavrilov, git, Paul Mackerras On Wed, Oct 08, 2008 at 08:24:43AM -0700, Shawn O. Pearce wrote: > Dmitry Potapov <dpotapov@gmail.com> wrote: > > This allows multiple paths to be specified on stdin. > > > diff --git a/Documentation/git-check-attr.txt b/Documentation/git-check-attr.txt > > index 2b821f2..0839a57 100644 > > --- a/Documentation/git-check-attr.txt > > +++ b/Documentation/git-check-attr.txt > > @@ -9,6 +9,7 @@ git-check-attr - Display gitattributes information. > > SYNOPSIS > > -------- > > 'git check-attr' attr... [--] pathname... > > +'git check-attr' --stdin-paths attr... < <list-of-paths > > I wonder if the option should just be "--stdin". I used "--stdin-paths" because git hash-object uses it, while "--stdin" means to read the object from standard input. OTOH, we are never going to read the object from standard input in check-attr and some other git commands use "--stdin" to mean: read the list of paths from the standard input. So, I fully agree here. > And since its being > used mostly by automated tools (gitk/git-gui) I wonder if a -z should > also be supported for input termination with NUL instead of LF. I have added it, but after I did, I start to wonder whether it is the right thing to do to unquote NUL terminated input line? NUL terminated makes sense when you feed raw-bytes, and if the first byte happen to be a quote character, I suppose it should be treated just as any other byte, not as a sign that the string is quited. But then I looked at git checkout-index, and it unquotes string even if it is NUL terminated. I don't think it is the right thing to do, but just to be consistent, I have decided to leave as-is, i.e. to unquote a NUL terminated string. > > +test_expect_success 'attribute test: read paths from stdin' ' > > A test case for the quoting might also be good. As far as I can tell, there is no test case for special characters in filenames when these filenames are given as arguments. And there are a few problems with them. First, it is using colon as a separator in output, which breaks parsing of a filename containing colons. Second, I still have not figured out how to specify filenames with special characters in gitattributes. The documentation does not say anything and was lazy to study the code. Does gitattributes understand quote strings in filenames? Anyway, here is interdiff to my previous patch, which addresses two first points as I described above. (I can resend the full patch if necessary). -- >8 -- From: Dmitry Potapov <dpotapov@gmail.com> Date: Sun, 12 Oct 2008 18:08:43 +0400 Subject: [PATCH] check-attr: Add --stdin option --- Documentation/git-check-attr.txt | 8 ++++++-- builtin-check-attr.c | 13 +++++++++---- t/t0003-attributes.sh | 2 +- 3 files changed, 16 insertions(+), 7 deletions(-) diff --git a/Documentation/git-check-attr.txt b/Documentation/git-check-attr.txt index 0839a57..14e4374 100644 --- a/Documentation/git-check-attr.txt +++ b/Documentation/git-check-attr.txt @@ -9,7 +9,7 @@ git-check-attr - Display gitattributes information. SYNOPSIS -------- 'git check-attr' attr... [--] pathname... -'git check-attr' --stdin-paths attr... < <list-of-paths +'git check-attr' --stdin [-z] attr... < <list-of-paths DESCRIPTION ----------- @@ -18,9 +18,13 @@ For every pathname, this command will list if each attr is 'unspecified', OPTIONS ------- ---stdin-paths:: +--stdin:: Read file names from stdin instead of from the command-line. +-z:: + Only meaningful with `--stdin`; paths are separated with + NUL character instead of LF. + \--:: Interpret all preceding arguments as attributes, and all following arguments as path names. If not supplied, only the first argument will diff --git a/builtin-check-attr.c b/builtin-check-attr.c index fa1e4d5..02a8292 100644 --- a/builtin-check-attr.c +++ b/builtin-check-attr.c @@ -7,12 +7,16 @@ static int stdin_paths; static const char * const check_attr_usage[] = { "git check-attr attr... [--] pathname...", -"git check-attr --stdin-paths attr... < <list-of-paths>", +"git check-attr --stdin attr... < <list-of-paths>", NULL }; +static int null_term_line; + static const struct option check_attr_options[] = { - OPT_BOOLEAN(0 , "stdin-paths", &stdin_paths, "read file names from stdin"), + OPT_BOOLEAN(0 , "stdin", &stdin_paths, "read file names from stdin"), + OPT_BOOLEAN('z', NULL, &null_term_line, + "input paths are terminated by a null character"), OPT_END() }; @@ -41,10 +45,11 @@ static void check_attr_stdin_paths(int cnt, struct git_attr_check *check, const char** name) { struct strbuf buf, nbuf; + int line_termination = null_term_line ? 0 : '\n'; strbuf_init(&buf, 0); strbuf_init(&nbuf, 0); - while (strbuf_getline(&buf, stdin, '\n') != EOF) { + while (strbuf_getline(&buf, stdin, line_termination) != EOF) { if (buf.buf[0] == '"') { strbuf_reset(&nbuf); if (unquote_c_style(&nbuf, buf.buf, NULL)) @@ -90,7 +95,7 @@ int cmd_check_attr(int argc, const char **argv, const char *prefix) if (cnt <= 0) errstr = "No attribute specified"; else if (stdin_paths && doubledash < argc) - errstr = "Can't specify files with --stdin-paths"; + errstr = "Can't specify files with --stdin"; if (errstr) { error (errstr); usage_with_options(check_attr_usage, check_attr_options); diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh index f6901b4..1c77192 100755 --- a/t/t0003-attributes.sh +++ b/t/t0003-attributes.sh @@ -60,7 +60,7 @@ a/b/h: test: a/b/h a/b/d/g: test: a/b/d/* EOF - sed -e "s/:.*//" < expect | git check-attr --stdin-paths test > actual && + sed -e "s/:.*//" < expect | git check-attr --stdin test > actual && test_cmp expect actual ' -- 1.6.0.2.521.g49aa8.dirty -- >8 -- ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [PATCH 2/2 v2] check-attr: Add --stdin-paths option 2008-10-12 14:19 ` Dmitry Potapov @ 2008-10-12 15:04 ` Jakub Narebski 2008-10-12 16:35 ` Dmitry Potapov 0 siblings, 1 reply; 44+ messages in thread From: Jakub Narebski @ 2008-10-12 15:04 UTC (permalink / raw) To: git Dmitry Potapov wrote: >> And since its being >> used mostly by automated tools (gitk/git-gui) I wonder if a -z should >> also be supported for input termination with NUL instead of LF. > > I have added it, but after I did, I start to wonder whether it is the > right thing to do to unquote NUL terminated input line? I think that -z should be not quoted (like git-diff-tree or git-ls-tree _output_); if it is, then IMHO it is a bug. -- Jakub Narebski Warsaw, Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 2/2 v2] check-attr: Add --stdin-paths option 2008-10-12 15:04 ` Jakub Narebski @ 2008-10-12 16:35 ` Dmitry Potapov 0 siblings, 0 replies; 44+ messages in thread From: Dmitry Potapov @ 2008-10-12 16:35 UTC (permalink / raw) To: Jakub Narebski; +Cc: git, Shawn O. Pearce On Sun, Oct 12, 2008 at 05:04:21PM +0200, Jakub Narebski wrote: > Dmitry Potapov wrote: > > >> And since its being > >> used mostly by automated tools (gitk/git-gui) I wonder if a -z should > >> also be supported for input termination with NUL instead of LF. > > > > I have added it, but after I did, I start to wonder whether it is the > > right thing to do to unquote NUL terminated input line? > > I think that -z should be not quoted (like git-diff-tree or git-ls-tree > _output_); if it is, then IMHO it is a bug. I am sorry. I was obviously wrong about 'git checkout-index', as it does not try to unquote if '-z' is specified. So, my previous patch was wrong. The corrected patch is below. It still uses the same output format whether '-z' is specified or not. Maybe I should change that too to use '\0' as the separator if '-z' is specified? Dmitry PS Please, add me to 'To:' or 'Cc:' when you reply to my email. -- >8 -- From: Dmitry Potapov <dpotapov@gmail.com> Date: Sun, 12 Oct 2008 18:08:43 +0400 Subject: [PATCH] check-attr: Add --stdin option --- This patch is meant to be squash on top of dp/checkattr in pu. Documentation/git-check-attr.txt | 8 ++++++-- builtin-check-attr.c | 15 ++++++++++----- t/t0003-attributes.sh | 2 +- 3 files changed, 17 insertions(+), 8 deletions(-) diff --git a/Documentation/git-check-attr.txt b/Documentation/git-check-attr.txt index 0839a57..14e4374 100644 --- a/Documentation/git-check-attr.txt +++ b/Documentation/git-check-attr.txt @@ -9,7 +9,7 @@ git-check-attr - Display gitattributes information. SYNOPSIS -------- 'git check-attr' attr... [--] pathname... -'git check-attr' --stdin-paths attr... < <list-of-paths +'git check-attr' --stdin [-z] attr... < <list-of-paths DESCRIPTION ----------- @@ -18,9 +18,13 @@ For every pathname, this command will list if each attr is 'unspecified', OPTIONS ------- ---stdin-paths:: +--stdin:: Read file names from stdin instead of from the command-line. +-z:: + Only meaningful with `--stdin`; paths are separated with + NUL character instead of LF. + \--:: Interpret all preceding arguments as attributes, and all following arguments as path names. If not supplied, only the first argument will diff --git a/builtin-check-attr.c b/builtin-check-attr.c index fa1e4d5..4921341 100644 --- a/builtin-check-attr.c +++ b/builtin-check-attr.c @@ -7,12 +7,16 @@ static int stdin_paths; static const char * const check_attr_usage[] = { "git check-attr attr... [--] pathname...", -"git check-attr --stdin-paths attr... < <list-of-paths>", +"git check-attr --stdin attr... < <list-of-paths>", NULL }; +static int null_term_line; + static const struct option check_attr_options[] = { - OPT_BOOLEAN(0 , "stdin-paths", &stdin_paths, "read file names from stdin"), + OPT_BOOLEAN(0 , "stdin", &stdin_paths, "read file names from stdin"), + OPT_BOOLEAN('z', NULL, &null_term_line, + "input paths are terminated by a null character"), OPT_END() }; @@ -41,11 +45,12 @@ static void check_attr_stdin_paths(int cnt, struct git_attr_check *check, const char** name) { struct strbuf buf, nbuf; + int line_termination = null_term_line ? 0 : '\n'; strbuf_init(&buf, 0); strbuf_init(&nbuf, 0); - while (strbuf_getline(&buf, stdin, '\n') != EOF) { - if (buf.buf[0] == '"') { + while (strbuf_getline(&buf, stdin, line_termination) != EOF) { + if (line_termination && buf.buf[0] == '"') { strbuf_reset(&nbuf); if (unquote_c_style(&nbuf, buf.buf, NULL)) die("line is badly quoted"); @@ -90,7 +95,7 @@ int cmd_check_attr(int argc, const char **argv, const char *prefix) if (cnt <= 0) errstr = "No attribute specified"; else if (stdin_paths && doubledash < argc) - errstr = "Can't specify files with --stdin-paths"; + errstr = "Can't specify files with --stdin"; if (errstr) { error (errstr); usage_with_options(check_attr_usage, check_attr_options); diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh index f6901b4..1c77192 100755 --- a/t/t0003-attributes.sh +++ b/t/t0003-attributes.sh @@ -60,7 +60,7 @@ a/b/h: test: a/b/h a/b/d/g: test: a/b/d/* EOF - sed -e "s/:.*//" < expect | git check-attr --stdin-paths test > actual && + sed -e "s/:.*//" < expect | git check-attr --stdin test > actual && test_cmp expect actual ' -- 1.6.0.2.521.g739d3 -- >8 -- ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [PATCH 2/2 v2] check-attr: Add --stdin-paths option 2008-10-07 0:16 ` [PATCH 2/2 v2] check-attr: Add --stdin-paths option Dmitry Potapov 2008-10-08 15:24 ` Shawn O. Pearce @ 2008-10-10 22:39 ` Paul Mackerras 2008-10-12 14:30 ` Dmitry Potapov 1 sibling, 1 reply; 44+ messages in thread From: Paul Mackerras @ 2008-10-10 22:39 UTC (permalink / raw) To: Dmitry Potapov; +Cc: Johannes Sixt, Alexander Gavrilov, git, Shawn O. Pearce Dmitry Potapov writes: > This allows multiple paths to be specified on stdin. > > Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> > --- > On Mon, Oct 06, 2008 at 09:09:11AM +0200, Johannes Sixt wrote: > > > > The least that is needed is fflush(stdout) in this loop (after each > > iteration!) > > Thanks. Somehow, I forgot about it though it is quite obvious. > I have added maybe_flush_or_die(). Actually, what was done with git diff-tree --stdin was to have it do fflush(stdout) when it sees a blank line in the input. That gives the calling program a way to get the output up to that point without having to do a flush for every line of output. Paul. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 2/2 v2] check-attr: Add --stdin-paths option 2008-10-10 22:39 ` Paul Mackerras @ 2008-10-12 14:30 ` Dmitry Potapov 0 siblings, 0 replies; 44+ messages in thread From: Dmitry Potapov @ 2008-10-12 14:30 UTC (permalink / raw) To: Paul Mackerras; +Cc: Johannes Sixt, Alexander Gavrilov, git, Shawn O. Pearce On Sat, Oct 11, 2008 at 09:39:49AM +1100, Paul Mackerras wrote: > > Actually, what was done with git diff-tree --stdin was to have it do > fflush(stdout) when it sees a blank line in the input. That gives the > calling program a way to get the output up to that point without > having to do a flush for every line of output. It is an interesting idea, but I have not seen any other git command doing so, and my measurement in git repo on Linux have not offered any noticeable speed-up. So, I am not sure if it is worth it. Dmitry ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI 2008-09-21 22:55 ` Paul Mackerras 2008-09-22 10:12 ` Alexander Gavrilov @ 2008-10-01 11:35 ` Johannes Sixt 2008-10-10 10:46 ` Paul Mackerras 1 sibling, 1 reply; 44+ messages in thread From: Johannes Sixt @ 2008-10-01 11:35 UTC (permalink / raw) To: Paul Mackerras; +Cc: Alexander Gavrilov, git, Shawn O. Pearce Paul Mackerras schrieb: > Alexander Gavrilov writes: >> 3) It can be further set on per-file basis by specifying >> the 'encoding' attribute in gitattributes. > > I haven't used .gitattributes before, but I would expect that the > .gitattributes files would be stored in the repository along with > everything else. If that's the case, then for gitk at least there is > the question of which version of a given .gitattributes file one > should use when viewing the tree for a commit which isn't the > currently checked-out commit - do you use the version from that tree, > or the version in the working directory? We seem to be using the > latter at present, and caching the results. Is there a philosophical > reason to do that, other than speed? I understand your concerns that an encoding may be picked from the "wrong" .gitattributes file. But in practice it doesn't matter much, and picking the attribute from a past commit's tree would even be counter-productive: I'm about to add a .gitattributes file that specifies the encoding for some of my files *today* because I was not clever enough to anticipate the usefulness of an "encoding" attribute a year ago when those files were added to the repository. When I browse history, I *do* want that *today's* encoding is picked. > (Also it seems that we won't > notice if the user changes .gitattributes after we've looked at it, or > if they create one after we've looked for one and not found it.) This is not a show stopper, IMHO. The user will notice soon enough, and can restart gitk. Nobody sane will change the encoding attributes every hour. That said, a menu command to flush the attribute cache would be useful every now and then. -- Hannes ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI 2008-10-01 11:35 ` [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI Johannes Sixt @ 2008-10-10 10:46 ` Paul Mackerras 0 siblings, 0 replies; 44+ messages in thread From: Paul Mackerras @ 2008-10-10 10:46 UTC (permalink / raw) To: Johannes Sixt; +Cc: Alexander Gavrilov, git, Shawn O. Pearce Johannes Sixt writes: > I'm about to add a .gitattributes file that specifies the encoding for > some of my files *today* because I was not clever enough to anticipate the > usefulness of an "encoding" attribute a year ago when those files were > added to the repository. When I browse history, I *do* want that *today's* > encoding is picked. Fair enough. > > (Also it seems that we won't > > notice if the user changes .gitattributes after we've looked at it, or > > if they create one after we've looked for one and not found it.) > > This is not a show stopper, IMHO. The user will notice soon enough, and > can restart gitk. Nobody sane will change the encoding attributes every hour. The scenario I'm thinking of is this: a user clicks on a file to display it, sees that it isn't displayed using the encoding he wants, says "duh" and creates a .gitattributes file. User clicks on the file again, sees that it *still* isn't displayed using the right encoding, and says something worse than "duh". :) If it was just one .gitattributes file, then it wouldn't be hard to stat it each time we go to display some file, and throw away our cache if it has changed. But it looks like we would have to do N+1 stats for a file N levels deep in the tree (since it looks like we also need to stat $GIT_DIR/info/attributes). Still, it may be worth it. At least on Linux, stat is pretty fast. Paul. ^ permalink raw reply [flat|nested] 44+ messages in thread
end of thread, other threads:[~2008-10-12 16:36 UTC | newest] Thread overview: 44+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-09-17 21:07 [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 2/8] git-gui: Add a menu of available encodings Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 3/8] git-gui: Allow forcing display encoding for diffs using a submenu Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 4/8] git-gui: Optimize encoding name resolution using a lookup table Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 5/8] git-gui: Support the encoding menu in gui blame Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 7/8] gitk: Implement file contents encoding support Alexander Gavrilov 2008-09-17 21:07 ` [PATCH (GIT-GUI,GITK) 8/8] gitk: Support filenames in the locale encoding Alexander Gavrilov 2008-09-19 12:10 ` [PATCH (GIT-GUI,GITK) 6/8] gitk: Port new encoding logic from git-gui Johannes Sixt 2008-09-19 12:38 ` Alexander Gavrilov 2008-09-19 13:04 ` Johannes Sixt 2008-09-21 18:52 ` Alexander Gavrilov 2008-09-22 7:25 ` Johannes Sixt 2008-09-22 7:46 ` Johannes Sixt 2008-09-22 8:01 ` Alexander Gavrilov 2008-09-22 8:20 ` Johannes Sixt 2008-09-22 9:02 ` Alexander Gavrilov 2008-09-22 9:18 ` Johannes Sixt 2008-09-22 10:18 ` Alexander Gavrilov 2008-09-22 9:01 ` Dmitry Potapov 2008-09-18 15:02 ` [PATCH (GIT-GUI,GITK) 1/8] git-gui: Cleanup handling of the default encoding Dmitry Potapov 2008-09-18 15:14 ` Alexander Gavrilov 2008-09-18 16:29 ` Johannes Sixt 2008-09-18 16:50 ` Dmitry Potapov 2008-09-18 17:00 ` Alexander Gavrilov 2008-09-18 17:19 ` Dmitry Potapov 2008-09-17 21:45 ` [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI Paul Mackerras 2008-09-18 11:12 ` Alexander Gavrilov 2008-09-21 22:55 ` Paul Mackerras 2008-09-22 10:12 ` Alexander Gavrilov 2008-10-05 2:30 ` [PATCH 1/2] check-attr: add an internal check_attr() function Dmitry Potapov 2008-10-05 2:30 ` [PATCH 2/2] check-attr: Add --stdin-paths option Dmitry Potapov 2008-10-06 7:09 ` Johannes Sixt 2008-10-07 0:14 ` [PATCH 1/2 v2] check-attr: add an internal check_attr() function Dmitry Potapov 2008-10-07 0:16 ` [PATCH 2/2 v2] check-attr: Add --stdin-paths option Dmitry Potapov 2008-10-08 15:24 ` Shawn O. Pearce 2008-10-12 14:19 ` Dmitry Potapov 2008-10-12 15:04 ` Jakub Narebski 2008-10-12 16:35 ` Dmitry Potapov 2008-10-10 22:39 ` Paul Mackerras 2008-10-12 14:30 ` Dmitry Potapov 2008-10-01 11:35 ` [PATCH (GIT-GUI,GITK) 0/8] Encoding support in GUI Johannes Sixt 2008-10-10 10:46 ` Paul Mackerras
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).