* Performance of various compressors
@ 2005-04-21  5:06 Mike Taht
  2005-04-21  5:14 ` Mike Taht
  2005-04-22 20:38 ` Performance of various compressors Aaron Lehmann
  0 siblings, 2 replies; 11+ messages in thread
From: Mike Taht @ 2005-04-21  5:06 UTC (permalink / raw)
  To: git

I started rolling a tool to measure various aspects of git performance.
I will start looking at merge next, and at workloads different from the
kernel (gcc4 anyone?) ...

The only data points worth sharing at this point are:

That doing the compression at a level of 3, rather than the max of 9,
cuts the cpu time required for a big git commit by over half, and that
that actually translates into a win on the I/O to disk. (these tests
were performed on a dual opteron 842)

The benefits of compression aren't very much for git right now.

And: A big git commit is I/O bound. But we knew that. Maybe it's
possible to make it less I/O bound.

Git branch: 7a4c67965de68ae7bc7aa1fde33f8eb9d8114697
Tree: 2.6.11.7 source tree
Branch: N/a
Merge File: N/a
HW: dual opteron 242
Mem: 1GB
Disk: seagate barracuda
Filesystem: Reiser3
Git add: N/a
Cache: Hot
Git Commit: 44.97user 5.94system 1:45.24elapsed 48%CPU
Git Merge:
Options:
Feature: Test of compression=9 (std git)

du -s .git/objects                  110106 # du is probably not the right thing
du -s --apparent-size .git/objects   58979

Git branch: 9e272677621c91784cf2533123a41745178f0701
Tree: 2.6.11.7 source tree
Branch: N/a
Merge File: N/a
HW: dual opteron 242
Mem: 1GB
Disk: seagate barracuda
Disk mode: udma5
Filesystem: Reiser3
Git add: N/a
Cache: Hot
Git Commit: 16.79user 6.15system 1:21.92elapsed 28%CPU
Git Merge:
Options:
Feature: Test of compression=3 (std git)

du -s .git/objects                  115218
du -s --apparent-size .git/objects   64274

There's some variety in the best/worst case timings for I/O for the
compressor=3 case...

16.79user 6.15system 1:21.92elapsed 28%CPU
16.68user 5.71system 1:13.19elapsed 30%CPU

-- 
Mike Taht

lastly - Timings of git commit with tmpfs

(note, these were done with an ancient, 5 hour old version of git and
the script)

Hot cache, tmpfs .git, compression=9
44.97user 2.76system 0:47.72elapsed 100%CPU

Hot cache, tmpfs .git, compression=6
Wed Apr 20 20:18:11 PDT 2005
23.55user 2.83system 0:26.36elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+568680minor)pagefaults 0swaps
109620  .git/objects
58618   .git/objects

^ permalink raw reply	[flat|nested] 11+ messages in thread
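The "over half" claim can be checked directly from the time(1) figures in the message above. What follows is only a minimal sketch in C; the helper names elapsed_seconds() and percent_change() are made up for this illustration and are not part of git or of the measuring tool mentioned in the message.

```c
#include <stdio.h>

/*
 * Convert an elapsed-time field such as "1:45.24" (minutes:seconds, with
 * the trailing "elapsed" already stripped) into seconds.
 * Returns a negative value if the field does not parse.
 */
double elapsed_seconds(const char *field)
{
	int minutes;
	double seconds;

	if (sscanf(field, "%d:%lf", &minutes, &seconds) != 2)
		return -1.0;
	return minutes * 60.0 + seconds;
}

/* Relative change from `before` to `after` in percent (negative = smaller). */
double percent_change(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

With the numbers reported above, percent_change(44.97, 16.79) is roughly -62.7 (user CPU time), the elapsed time goes from 105.24 s to 81.92 s (roughly -22.2), and the apparent size of .git/objects grows from 58979 to 64274 (roughly +9.0).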
* Re: Performance of various compressors
  2005-04-21  5:06 Performance of various compressors Mike Taht
@ 2005-04-21  5:14 ` Mike Taht
  2005-04-21  5:22   ` [PATCH] experimental - " Mike Taht
  2005-04-22 20:38 ` Performance of various compressors Aaron Lehmann
  1 sibling, 1 reply; 11+ messages in thread
From: Mike Taht @ 2005-04-21  5:14 UTC (permalink / raw)
  To: git

Just to clarify this was a git add of the linux-2.6.11.7 sources (sorry,
untimed), and timing the git commit.

Mo betta data latah.

-- 
Mike Taht

   "The chief contribution of Protestantism to human thought is its
massive proof that God is a bore.
	-- H.L. Mencken, "The Aesthetic Recoil," American Mercury, July, 1931."

^ permalink raw reply	[flat|nested] 11+ messages in thread
* [PATCH] experimental - Performance of various compressors
  2005-04-21  5:14 ` Mike Taht
@ 2005-04-21  5:22   ` Mike Taht
  2005-04-21 10:23     ` HOWTO: PATCH: don't hardcode path-to-bash, use sys/limits.h Klaus Robert Suetterlin
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Taht @ 2005-04-21  5:22 UTC (permalink / raw)
  To: Mike Taht; +Cc: git

Don't apply this patch and change GIT_COMPRESSION unless you know what
you are doing and why you are doing it. You will break an older version
of git. You may break a newer version of git. You have been warned.

I also note that there's a bzlib out there.

cache.h: 828d660ab82bb35a1ca632a2ba4620dc483889bd
--- a/cache.h
+++ b/cache.h
@@ -16,6 +16,8 @@
 #include <openssl/sha.h>
 #include <zlib.h>

+#define GIT_COMPRESSION Z_BEST_COMPRESSION
+
 /*
  * Basic data structures for the directory cache
  *
sha1_file.c: 754e8b4e9ea8104df48152f875d6b874304e2a62
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -199,7 +199,7 @@ int write_sha1_file(char *buf, unsigned

 	/* Set it up */
 	memset(&stream, 0, sizeof(stream));
-	deflateInit(&stream, Z_BEST_COMPRESSION);
+	deflateInit(&stream, GIT_COMPRESSION);
 	size = deflateBound(&stream, len);
 	compressed = malloc(size);

update-cache.c: a09883541c745c76413c62109a80f40df4b7a7fb
--- a/update-cache.c
+++ b/update-cache.c
@@ -40,7 +40,7 @@ static int index_fd(unsigned char *sha1,
 	SHA1_Final(sha1, &c);

 	memset(&stream, 0, sizeof(stream));
-	deflateInit(&stream, Z_BEST_COMPRESSION);
+	deflateInit(&stream, GIT_COMPRESSION);

 	/*
 	 * ASCII size + nul byte

-- 
Mike Taht

   "His mind is like a steel trap -- full of mice."
	-- Foghorn Leghorn

^ permalink raw reply	[flat|nested] 11+ messages in thread
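The patch above hardwires GIT_COMPRESSION to Z_BEST_COMPRESSION in cache.h. One way to experiment with other levels without editing the header each time is an #ifndef guard, so the level can be passed on the compiler command line. This is a sketch under stated assumptions, not what the patch does: the SKETCH_* macros only mirror the numeric values of zlib's Z_DEFAULT_COMPRESSION (-1) and Z_BEST_COMPRESSION (9) so the fragment builds without <zlib.h>, and checked_compression_level() is a hypothetical helper.

```c
/*
 * Numeric values of zlib's level macros, repeated here so the sketch
 * builds without <zlib.h>. Real code would include <zlib.h> and use
 * Z_DEFAULT_COMPRESSION and Z_BEST_COMPRESSION directly.
 */
#define SKETCH_DEFAULT_COMPRESSION (-1)
#define SKETCH_BEST_COMPRESSION    9

/* Let "cc -DGIT_COMPRESSION=3 ..." override the built-in default. */
#ifndef GIT_COMPRESSION
#define GIT_COMPRESSION SKETCH_BEST_COMPRESSION
#endif

/*
 * Hypothetical helper: return `level` if it is a value deflateInit()
 * accepts (0..9, or the "library default" marker -1); otherwise fall
 * back to the best-compression level that stock git used at the time.
 */
int checked_compression_level(int level)
{
	if (level == SKETCH_DEFAULT_COMPRESSION)
		return level;
	if (level < 0 || level > SKETCH_BEST_COMPRESSION)
		return SKETCH_BEST_COMPRESSION;
	return level;
}
```

The warning in the message still applies: whatever level is chosen, the author cautions that changing it may break other versions of git.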
* HOWTO: PATCH: don't hardcode path-to-bash, use sys/limits.h
  2005-04-21  5:22   ` [PATCH] experimental - " Mike Taht
@ 2005-04-21 10:23     ` Klaus Robert Suetterlin
  2005-04-21 14:31       ` Alecs King
  0 siblings, 1 reply; 11+ messages in thread
From: Klaus Robert Suetterlin @ 2005-04-21 10:23 UTC (permalink / raw)
  To: Mike Taht; +Cc: git

Hi,

I supply a patch that dehardcodes the path to bash (which is not /bin
on all computers) and adds sys/limits.h to provide ULONG_MAX.

If this is not the right way to supply patches, or if this email misses
some crucial point please tell me so, and supply explanation.

-- 
Robert Suetterlin (robert@mpe.mpg.de)
phone: (+49)89 / 30000-3546   fax: (+49)89 / 30000-3950

commit 5f6caff82b1f3b5931d92aaff99be6d8dbad10ca
tree d7ea8aeefbbc2ab63cb5acd41b647b1b5f11fb83
parent cd1c034369b73da7503da365fa556aab27004814
author Klaus Robert Suetterlin <krs@xdt04.mpe-garching.mpg.de> 1114078431 +0200
committer Klaus Robert Suetterlin <krs@xdt04.mpe-garching.mpg.de> 1114078431 +0200

    Don't hardcode the path-to-bash please.

Index: commit.c
===================================================================
--- c0260bfb82da04aeff4e598ced5295d6ae2e262d/commit.c  (mode:100644 sha1:eda45d7e15358ed6f2cd0502de2a08987307fc98)
+++ d7ea8aeefbbc2ab63cb5acd41b647b1b5f11fb83/commit.c  (mode:100644 sha1:cfe9a8ddf6ee2702e3923cb22240f9f9ed1bd04c)
@@ -1,3 +1,4 @@
+#include <sys/limits.h>
 #include "commit.h"
 #include "cache.h"
 #include <string.h>
Index: gitdiff-do
===================================================================
--- c0260bfb82da04aeff4e598ced5295d6ae2e262d/gitdiff-do  (mode:100755 sha1:afed4e40b259a61b0f12979ba7326f26743bc553)
+++ d7ea8aeefbbc2ab63cb5acd41b647b1b5f11fb83/gitdiff-do  (mode:100755 sha1:218dfabeb4a5dcbd2cf58bd6f672f385690ec397)
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
 #
 # Make a diff between two GIT trees.
 # Copyright (c) Petr Baudis, 2005
Index: gitlog.sh
===================================================================
--- c0260bfb82da04aeff4e598ced5295d6ae2e262d/gitlog.sh  (mode:100755 sha1:a496a864f9586e47a4d7bd3ae0af0b3e07b7deb8)
+++ d7ea8aeefbbc2ab63cb5acd41b647b1b5f11fb83/gitlog.sh  (mode:100755 sha1:7b3aa8a89bc64273c648920ccd1686859754803e)
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
 #
 # Make a log of changes in a GIT branch.
 #

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: HOWTO: PATCH: don't hardcode path-to-bash, use sys/limits.h
  2005-04-21 10:23     ` HOWTO: PATCH: don't hardcode path-to-bash, use sys/limits.h Klaus Robert Suetterlin
@ 2005-04-21 14:31       ` Alecs King
  2005-04-21 19:42         ` [PATCH] #!/bin/sh --> #!/usr/bin/env bash Alecs King
  0 siblings, 1 reply; 11+ messages in thread
From: Alecs King @ 2005-04-21 14:31 UTC (permalink / raw)
  To: git

On Thu, Apr 21, 2005 at 12:23:26PM +0200, Klaus Robert Suetterlin wrote:
> Hi,
>
> I supply a patch that dehardcodes the path to bash (which is not /bin
> on all computers) and adds sys/limits.h to provide ULONG_MAX.

Hi, i did a similar patch a while back. As for ULONG_MAX, not every
system has <sys/limits.h>, i think <limits.h> is the right place to go.

The patch below was tested on both debian and fbsd.

commit 2deea74db72fb57a8b80e7945f23814112b22723
tree 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd
parent cd1c034369b73da7503da365fa556aab27004814
author Alecs King <alecsk ! gmail d@t com> 1114075114 +0800
committer Alecs King <alecsk ! gmail d@t com> 1114075114 +0800

    trivial fix for making it more portable

Index: commit-tree.c
===================================================================
--- c0260bfb82da04aeff4e598ced5295d6ae2e262d/commit-tree.c  (mode:100644 sha1:043c7aa371101a1ea8cfc467279abf6c8acc7fd1)
+++ 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/commit-tree.c  (mode:100644 sha1:8a1f12dca07041d203ce22442b8470d42d322ef5)
@@ -252,7 +252,7 @@

 	then -= offset;

-	snprintf(result, maxlen, "%lu %5.5s", then, p);
+	snprintf(result, maxlen, "%lu %5.5s", (unsigned long) then, p);
 }

 static void check_valid(unsigned char *sha1, const char *expect)
Index: commit.c
===================================================================
--- c0260bfb82da04aeff4e598ced5295d6ae2e262d/commit.c  (mode:100644 sha1:eda45d7e15358ed6f2cd0502de2a08987307fc98)
+++ 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/commit.c  (mode:100644 sha1:9f0668eb68cec56a738a58fe930ae0ae2960e2b2)
@@ -1,6 +1,7 @@
 #include "commit.h"
 #include "cache.h"
 #include <string.h>
+#include <limits.h>

 const char *commit_type = "commit";

Index: gitdiff-do
===================================================================
--- c0260bfb82da04aeff4e598ced5295d6ae2e262d/gitdiff-do  (mode:100755 sha1:afed4e40b259a61b0f12979ba7326f26743bc553)
+++ 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitdiff-do  (mode:100755 sha1:218dfabeb4a5dcbd2cf58bd6f672f385690ec397)
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
 #
 # Make a diff between two GIT trees.
 # Copyright (c) Petr Baudis, 2005
Index: gitlog.sh
===================================================================
--- c0260bfb82da04aeff4e598ced5295d6ae2e262d/gitlog.sh  (mode:100755 sha1:a496a864f9586e47a4d7bd3ae0af0b3e07b7deb8)
+++ 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitlog.sh  (mode:100755 sha1:7b3aa8a89bc64273c648920ccd1686859754803e)
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
 #
 # Make a log of changes in a GIT branch.
 #
Index: revision.h
===================================================================
--- c0260bfb82da04aeff4e598ced5295d6ae2e262d/revision.h  (mode:100644 sha1:46cc10440be781cea4993aca37ee35e251495084)
+++ 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/revision.h  (mode:100644 sha1:f0754f5d8ea3da52503b8ea8c16b34566e4ae6e0)
@@ -10,6 +10,7 @@
  * definition for this rev, and not just seen it as
  * a parent target.
  */
+#include <limits.h>
 #define marked(rev)	((rev)->flags & 0xffff)
 #define SEEN		0x10000
 #define USED		0x20000

-- 
Alecs King

^ permalink raw reply	[flat|nested] 11+ messages in thread
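The two C-level points in the patch above can be shown in isolation. First, time_t is not guaranteed to be unsigned long, so passing it to "%lu" needs the explicit cast. Second, ULONG_MAX is defined by the ISO C header <limits.h>, so no system-specific header is required. A minimal sketch; format_date() and parse_date() are illustrative names for this example and are not claimed to be functions in the git source.

```c
#include <limits.h>	/* ULONG_MAX (ISO C, available everywhere) */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/*
 * Write "<seconds> <tz>" into result, e.g. "1114075114 +0800".
 * The cast makes the argument match "%lu" whatever type time_t is.
 * Returns what snprintf() returns.
 */
int format_date(char *result, size_t maxlen, time_t then, const char *tz)
{
	return snprintf(result, maxlen, "%lu %5.5s", (unsigned long) then, tz);
}

/*
 * Parse a decimal timestamp. strtoul() reports overflow by returning
 * ULONG_MAX, which this sketch maps to 0 ("unknown date").
 */
unsigned long parse_date(const char *buf)
{
	unsigned long date = strtoul(buf, NULL, 10);

	if (date == ULONG_MAX)
		date = 0;
	return date;
}
```

For example, format_date(buf, sizeof(buf), (time_t) 1114075114, "+0800") fills buf with the author timestamp and timezone used in the commit header above.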
* [PATCH] #!/bin/sh --> #!/usr/bin/env bash
  2005-04-21 14:31       ` Alecs King
@ 2005-04-21 19:42         ` Alecs King
  2005-04-22  7:37           ` H. Peter Anvin
  0 siblings, 1 reply; 11+ messages in thread
From: Alecs King @ 2005-04-21 19:42 UTC (permalink / raw)
  To: git

On Thu, Apr 21, 2005 at 10:31:02PM +0800, Alecs King wrote:
> On Thu, Apr 21, 2005 at 12:23:26PM +0200, Klaus Robert Suetterlin wrote:
> > Hi,
> >
> > I supply a patch that dehardcodes the path to bash (which is not /bin
> > on all computers) and adds sys/limits.h to provide ULONG_MAX.
>
> Hi, i did a similar patch a while back. As for ULONG_MAX, not every
> system has <sys/limits.h>, i think <limits.h> is the right place to go.
>
> The patch below was tested on both debian and fbsd.
>
> [snip]

And as for bash, only gitdiff-do and gitlog.sh 'explicitly' use bash
instead of /bin/sh. On most Linux distros, /bin/sh is just a symbolic
link to bash. But not on some others. I found gitlsobj.sh could not
work using a plain /bin/sh on fbsd. To make life easier, i think it
might be better if we all explicitly use bash for all shell scripts.

The patch below assumes the patch above has been applied.

commit 341cd1241815178d567ce612c97c2bb5a663021a
tree abb16c39fe8354383b632f7fa9dd4611ff66e1d1
parent 2deea74db72fb57a8b80e7945f23814112b22723
author Alecs King <alecsk ! gmail d@t com> 1114107613 +0800
committer Alecs King <alecsk ! gmail d@t com> 1114107613 +0800

    Explicitly use bash

    #!/bin/sh ==> #!/usr/bin/env bash

Index: gitXlntree.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitXlntree.sh  (mode:100755 sha1:c474913d09906739d8175f1b430720a3ac67e798)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitXlntree.sh  (mode:100755 sha1:adc01eeb56f394a6168ae1f6f1fe4c40e1c2aecc)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Provide an independent view to the objects database.
 # Copyright (c) Petr Baudis, 2005
Index: gitXnormid.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitXnormid.sh  (mode:100755 sha1:c0d53afabe8662ebfc3c697faf08b0a2b43c93f7)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitXnormid.sh  (mode:100755 sha1:9b311aca57bd8b7012f45d730c6fd26d5fb5d2b2)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Internal: Normalize the given ID to a tree ID.
 # Copyright (c) Petr Baudis, 2005
Index: gitadd.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitadd.sh  (mode:100755 sha1:3f5e9a2d6b452d596cd853f1585113bdb356a2e3)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitadd.sh  (mode:100755 sha1:6feb7372e95be4546af17e0c6b55d10c9a1c441d)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Add new file to a GIT repository.
 # Copyright (c) Petr Baudis, 2005
Index: gitaddremote.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitaddremote.sh  (mode:100755 sha1:a117b9e8d14b977143caa48c26fc51794e8b7135)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitaddremote.sh  (mode:100755 sha1:bccaa9068063b07d13012477861c6706b7cd40a6)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Add new "remote" to the GIT repository.
 # Copyright (c) Petr Baudis, 2005
Index: gitapply.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitapply.sh  (mode:100755 sha1:7703809dc0743c6e4c1fa5b7d922a4efc16b4276)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitapply.sh  (mode:100755 sha1:794ea5ed6acdd34e34742a17cbd784dcbf738289)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Apply a diff generated by git diff.
 # Copyright (c) Petr Baudis, 2005
Index: gitcancel.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitcancel.sh  (mode:100755 sha1:74b4083d67eda87d88a6f92c6c66877bba8bda8a)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitcancel.sh  (mode:100755 sha1:c320ee98e2ed0b13a68de3b2ec4e4a8451b5189a)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Cancels current edits in the working tree.
 # Copyright (c) Petr Baudis, 2005
Index: gitcommit.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitcommit.sh  (mode:100755 sha1:a13bef2c84492ed75679d7d52bb710df35544f8a)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitcommit.sh  (mode:100755 sha1:0207f402cc5107de2a4685f6fcade081c41d91e9)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Commit into a GIT repository.
 # Copyright (c) Petr Baudis, 2005
Index: gitdiff.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitdiff.sh  (mode:100755 sha1:8e14a868f513f4ec524a2c8974c8d202c6824038)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitdiff.sh  (mode:100755 sha1:e27915d4172717ddd4d01269877312b08ed2acc4)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Make a diff between two GIT trees.
 # Copyright (c) Petr Baudis, 2005
Index: gitexport.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitexport.sh  (mode:100755 sha1:5b94424beca55ffe6b5535e4975e6e63c1bae672)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitexport.sh  (mode:100755 sha1:428cd9d845598e320556729b6098505132a4e7c4)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Exports a particular revision from a GIT repository.
 # Copyright (c) Johannes E. Schindelin, 2005
Index: gitfork.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitfork.sh  (mode:100755 sha1:b827c3037ac4f3cdfb6708bf8edb60944f59318a)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitfork.sh  (mode:100755 sha1:ce26f985ebb48b6a3127ac8afd427ba30ba5668a)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Create a branch sharing the objects database.
 # Copyright (c) Petr Baudis, 2005
Index: gitinit.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitinit.sh  (mode:100755 sha1:9905166859827893e326b01bdc3970ff6d51064d)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitinit.sh  (mode:100755 sha1:bc00e9ee709aabeb4764b77ac4e5a19212fa5857)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Initialize a GIT repository.
 # Copyright (c) Petr Baudis, 2005
Index: gitls.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitls.sh  (mode:100755 sha1:c8d2220eae66addd49493cdb32af21b6c0217b23)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitls.sh  (mode:100755 sha1:a05883b09512bd1d1fe31e1c6d43f01a395c58a1)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # List contents of a particular tree in a GIT repository.
 # Copyright (c) Petr Baudis, 2005
Index: gitlsobj.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitlsobj.sh  (mode:100755 sha1:423a1bc7476bad7bf40f1b3ddb03d83fdcf1f9cd)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitlsobj.sh  (mode:100755 sha1:3f4426eeac7cc5ad51a46632319814fbf62b2cc3)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # List objects of the GIT repository.
 # Copyright (c) Randy Dunlap, 2005
Index: gitlsremote.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitlsremote.sh  (mode:100755 sha1:2212be93aaa8a371e83cafb69fa21a7a1b24ed13)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitlsremote.sh  (mode:100755 sha1:29657d7a899ffb425a36ec04bf1c62aa1ecc14d7)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Lists remote GIT repositories
 # Copyright (c) Steven Cole 2005
Index: gitmerge-file.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitmerge-file.sh  (mode:100755 sha1:820de487babb76ce419b6823c8fe4c58608d0c8c)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitmerge-file.sh  (mode:100755 sha1:237186eaefc4a503c386e4a0e7c28818e6704db7)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Copyright (c) Linus Torvalds, 2005
 #
Index: gitmerge.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitmerge.sh  (mode:100755 sha1:bc68f6cda84cbf1165d71b17d6207b3c46a8cad4)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitmerge.sh  (mode:100755 sha1:92e552700a40c5e1f7339c9b1f261cb39206a3c3)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Merge a branch to the current tree.
 # Copyright (c) Petr Baudis, 2005
Index: gitpatch.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitpatch.sh  (mode:100755 sha1:580e3e6b0c23625abd2288be35ee33a787a1ba3c)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitpatch.sh  (mode:100755 sha1:fd00c88133c874ac71a90a045a313363f9f22350)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Make a patch from a given commit.
 # Copyright (c) Petr Baudis, 2005
Index: gitpull.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitpull.sh  (mode:100755 sha1:0cafc0270ea91aaf099f398b7e5cd360be9ea086)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitpull.sh  (mode:100755 sha1:7f847f39e0b2aa150fe195d8d4f6f0d62487ae72)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Pulls changes from "remote" to the local GIT repository.
 # Copyright (c) Petr Baudis, 2005
Index: gitrm.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitrm.sh  (mode:100755 sha1:3fa31f9a1ae843dcb184b8371ff60f626e8820b3)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitrm.sh  (mode:100755 sha1:e014b979ea7b8f7ae69eabc7dd146c8a7f286d19)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Remove a file from a GIT repository.
 # Copyright (c) Petr Baudis, 2005
Index: gitseek.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitseek.sh  (mode:100755 sha1:b80969a4ba040202827ea7532235abab15ca9392)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitseek.sh  (mode:100755 sha1:035b78a93307da8f67f7447ed3a182a6d17d2c50)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Seek the working tree to a given commit.
 # Copyright (c) Petr Baudis, 2005
Index: gitstatus.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitstatus.sh  (mode:100755 sha1:7d5209ea838106eb2ab5bde2704997508a22a4e8)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitstatus.sh  (mode:100755 sha1:9cfb2ce947082002cff3e5497ca2a994c4bbb101)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Show status of entries in your working tree.
 # Copyright (c) Petr Baudis, 2005
Index: gittag.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gittag.sh  (mode:100755 sha1:9e1e200deda54b2401d6d685f0d5305cfbfa38ca)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gittag.sh  (mode:100755 sha1:19c7f3ecffa55f117c27c9a1d8de67f65805f1c7)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Mark certain commit by a tag.
 # Copyright (c) Petr Baudis, 2005
Index: gittrack.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gittrack.sh  (mode:100755 sha1:7509d4adb2b2c50cd2acdf9126fc57cff79e6009)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gittrack.sh  (mode:100755 sha1:bdf33313f4ad0c4f7b6b235fbc68fc85f226a33a)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Makes your working tree track the given branch.
 # Copyright (c) Petr Baudis, 2005

-- 
Alecs King

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] #!/bin/sh --> #!/usr/bin/env bash
  2005-04-21 19:42         ` [PATCH] #!/bin/sh --> #!/usr/bin/env bash Alecs King
@ 2005-04-22  7:37           ` H. Peter Anvin
  2005-04-23  2:34             ` David A. Wheeler
  0 siblings, 1 reply; 11+ messages in thread
From: H. Peter Anvin @ 2005-04-22  7:37 UTC (permalink / raw)
  To: Alecs King; +Cc: git

Alecs King wrote:
>
> And as for bash, only gitdiff-do and gitlog.sh 'explicitly' use bash
> instead of /bin/sh. On most Linux distros, /bin/sh is just a symbolic
> link to bash. But not on some others. I found gitlsobj.sh could not
> work using a plain /bin/sh on fbsd. To make life easier, i think it
> might be better if we all explicitly use bash for all shell scripts.
>

How about #!/bin/bash (build from .in files if you feel it necessary to
support systems which don't have bash in /bin) instead of doubling the
number of execs?

	-hpa

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] #!/bin/sh --> #!/usr/bin/env bash
  2005-04-22  7:37           ` H. Peter Anvin
@ 2005-04-23  2:34             ` David A. Wheeler
  2005-04-23  6:16               ` H. Peter Anvin
  0 siblings, 1 reply; 11+ messages in thread
From: David A. Wheeler @ 2005-04-23  2:34 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Alecs King, git

> Alecs King wrote:
>
>>
>> And as for bash, only gitdiff-do and gitlog.sh 'explicitly' use bash
>> instead of /bin/sh. On most Linux distros, /bin/sh is just a symbolic
>> link to bash. But not on some others. I found gitlsobj.sh could not
>> work using a plain /bin/sh on fbsd. To make life easier, i think it
>> might be better if we all explicitly use bash for all shell scripts.

H. Peter Anvin wrote:
> How about #!/bin/bash (build from .in files if you feel it necessary to
> support systems which don't have bash in /bin) instead of doubling the
> number of execs?

If # of execs is that critical, it probably should not be in
bash anyway. OpenBSD (at least 3.1)'s bash appears to be in
/usr/local/bin/bash, NOT /bin/bash.
I'd go with the /usr/bin/env solution for now;
it maximizes the "it just works" factor, and
when it comes time for .in files much of the cogito code (at least)
will probably be rewritten in Perl, and anything performance-sensitive
will be in C.

--- David A. Wheeler

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] #!/bin/sh --> #!/usr/bin/env bash
  2005-04-23  2:34             ` David A. Wheeler
@ 2005-04-23  6:16               ` H. Peter Anvin
  0 siblings, 0 replies; 11+ messages in thread
From: H. Peter Anvin @ 2005-04-23  6:16 UTC (permalink / raw)
  To: dwheeler; +Cc: Alecs King, git

David A. Wheeler wrote:
>
> If # of execs is that critical, it probably should not be in
> bash anyway. OpenBSD (at least 3.1)'s bash appears to be in
> /usr/local/bin/bash, NOT /bin/bash.
> I'd go with the /usr/bin/env solution for now;
> it maximizes the "it just works" factor, and
> when it comes time for .in files much of the cogito code (at least)
> will probably be rewritten in Perl, and anything performance-sensitive
> will be in C.
>

Makes sense.

	-hpa

^ permalink raw reply	[flat|nested] 11+ messages in thread
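The "doubling the number of execs" remark in this subthread refers to what happens with a "#!/usr/bin/env bash" line: the kernel first executes /usr/bin/env, which then searches $PATH for bash and executes it in turn. The search step can be sketched in C roughly as below. This is a simplified illustration, not env's actual source; find_in_path() is a name made up for this example, an empty PATH entry is treated as the current directory, and a trailing empty entry is ignored.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Look for an executable called `name` in the colon-separated list `path`.
 * On success the full path is written to `out` and 1 is returned;
 * otherwise 0. Simplification: access(X_OK) also succeeds for searchable
 * directories, which a real implementation would rule out.
 */
int find_in_path(const char *name, const char *path, char *out, size_t outlen)
{
	const char *dir = path;

	while (*dir) {
		size_t len = strcspn(dir, ":");
		int n;

		if (len == 0)	/* empty entry means current directory */
			n = snprintf(out, outlen, "./%s", name);
		else
			n = snprintf(out, outlen, "%.*s/%s", (int) len, dir, name);

		if (n > 0 && (size_t) n < outlen && access(out, X_OK) == 0)
			return 1;

		dir += len;
		if (*dir == ':')
			dir++;
	}
	return 0;
}
```

Called as find_in_path("bash", getenv("PATH"), buf, sizeof(buf)) (with <stdlib.h> included for getenv), this would find /usr/local/bin/bash on the OpenBSD layout mentioned above, provided that directory is in $PATH, which is exactly the case a hardcoded #!/bin/bash misses.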
* Re: Performance of various compressors
  2005-04-21  5:06 Performance of various compressors Mike Taht
  2005-04-21  5:14 ` Mike Taht
@ 2005-04-22 20:38 ` Aaron Lehmann
  2005-04-25 12:17   ` git I/O performance (was: Performance of various compressors) Klaus Robert Suetterlin
  1 sibling, 1 reply; 11+ messages in thread
From: Aaron Lehmann @ 2005-04-22 20:38 UTC (permalink / raw)
  To: Mike Taht; +Cc: git

On Wed, Apr 20, 2005 at 10:06:38PM -0700, Mike Taht wrote:
> That doing the compression at a level of 3, rather than the max of 9,
> cuts the cpu time required for a big git commit by over half, and that
> that actually translates into a win on the I/O to disk. (these tests
> were performed on a dual opteron 842)

If (de)compression is slowing things down, you might want to check out
lzo (http://www.oberhumer.com/opensource/lzo/). I tested it on the
2.6.11 kernel source and found that lzo -7 output is only 2% larger
than gzip -3, but lzo decompression is almost 3 times faster. The
downside is that lzo took 5 times longer to perform the compression at
-7. Compression with lzo -3 is 3.5 times faster than gzip -3, but it
produces a file that's 37% bigger. Unfortunately, lzo has no settings
in between -3 and -7. I'd expect git to be more sensitive to
decompression speeds, though.

BTW, lzo decompression speed is not affected by the compression level.

^ permalink raw reply	[flat|nested] 11+ messages in thread
* git I/O performance (was: Performance of various compressors) 2005-04-22 20:38 ` Performance of various compressors Aaron Lehmann @ 2005-04-25 12:17 ` Klaus Robert Suetterlin 0 siblings, 0 replies; 11+ messages in thread From: Klaus Robert Suetterlin @ 2005-04-25 12:17 UTC (permalink / raw) To: Aaron Lehmann; +Cc: Mike Taht, git I did some statistics on the freebsd /usr/src/sys directory, as I did not have access to the linux kernel sources. This is 5435 Files, of about 81MB (according to du -sh). I did find sys/ -type f -exec gzip -9 {} + find sys/ -type f -exec gzip -d {} + and similar calls to get an impression how different compression levels and compressors will act on the data most likely handled by git backend storage. On a 700MHz p3, UDMA33, freebsd 5.3, ffs (soft updates) I get: compressor | levels (size, time to compress, time to uncompress) -----------+------------------------------------------------------------------- gzip | 9 (28M, 1:19, 30), 6 (28M, 31.7, 30), 3 (30M, 26.1, 28.7) | 1 (31M, 23.6, 29.8) bzip2 | 9 (27M, 2:14, 37.4) 6 (27M, 2:11, 38.8) 3 (27M, 2:10, 38.3) lzop | 9 (32M, 2:15, 35.4) 7 (32M, 57.9, 40.3) 3 (39M, 36.0, 44.4) These speeds are for the case that our work set fits into filesystem caches. This will be the most common case --- as normal commits will not checkin the whole tree. That is. We should really use gzip -6. It results in the best compression at a reasonable time. bzip2 can't really compress those tiny files efficiently. lzop is limited by open/close (see below). BTW. I also did this for the whole /usr/src of freebsd (which is 35000 files and 350MB, du -sh gives 398MB). The numbers look best for gzip -6. The files we work with seem to have an average uncompressed size of 10-15kB and seem to compress by about a factor of three. So I did a test in C: open("file%d"), write(file, buf, 10000), close(file). I repeated this for 35000 files as in the freebsd src case, to get some statistics. 
The gprof output tells me that open+close take the same amount of time
as the write. (You should really try to do rm test[0-9]* on those 35000
files :))

I wrote the full 10000 bytes to check the case where we have no
compression at all. As compression gets better, we become more and more
open/close limited.

This means we are open/close limited in git. Even if we compress the
files to zero size, we cannot get faster than by a factor of two!

Earlier messages in this thread showed that we are also limited by the
filesystem cache, so we should use compression and efficient prefetch to
get the best performance out of it. Because even if we cannot get faster
than by a factor of two through compression (even delta compression
won't help! It would make things worse IMHO), we can get a lot worse
(like ten times slower) for large sets on slow machines with little
memory.

I also tried to get a better ratio by using the standard db.h btree
database, so I wouldn't have to open and close all those files.
Unfortunately the btree is about twice as large as the files, so I had
to write twice as much data to disk (800MB). Also, db->put is much more
complicated than write. So the test ended up taking about 10% more time
than the open/write/close case. Maybe in the case of a smaller work set
(i.e. 1000 files instead of 35000) this might provide faster backend
speeds. One could also optimise speed by tweaking the access method
parameters.

On Fri, Apr 22, 2005 at 01:38:01PM -0700, Aaron Lehmann wrote:
> On Wed, Apr 20, 2005 at 10:06:38PM -0700, Mike Taht wrote:
> > That doing the compression at a level of 3, rather than the max of 9,
> > cuts the cpu time required for a big git commit by over half, and that
> > that actually translates into a win on the I/O to disk. (these tests
> > were performed on a dual opteron 842)
>
> If (de)compression is slowing things down, you might want to check out
> lzo (http://www.oberhumer.com/opensource/lzo/). I tested it on the
> 2.6.11 kernel source and found that lzo -7 output is only 2% larger
> than gzip -3, but lzo decompression is almost 3 times faster. The
> downside is that lzo took 5 times longer to perform the compression at
> -7. Compression with lzo -3 is 3.5 times faster than gzip -3, but it
> produces a file that's 37% bigger. Unfortunately, lzo has no settings
> in between -3 and -7. I'd expect git to be more sensitive to
> decompression speeds, though.
>
> BTW, lzo decompression speed is not affected by the compression level.

-- 
Robert Suetterlin (robert@mpe.mpg.de)
phone: (+49)89 / 30000-3546
fax: (+49)89 / 30000-3950
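The factor-of-two limit argued in the message above can be checked with
a small cost model. This is a sketch under simplifying assumptions that
are not from the thread: write time is taken to scale linearly with the
number of bytes written, open/close time is taken as fixed per file, and
the CPU time spent compressing is ignored. The function name max_speedup
is made up for this example.

```python
def max_speedup(open_close_s, write_s, ratio):
    """Speedup over uncompressed writes when data shrinks by `ratio`.

    Assumes write time scales linearly with bytes written, and that
    open/close time is unchanged and compression CPU time is ignored.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    baseline = open_close_s + write_s
    compressed = open_close_s + write_s / ratio
    return baseline / compressed


if __name__ == "__main__":
    # Equal open+close and write cost, as the gprof result suggests.
    for ratio in (1, 3, 10, 1000):
        print(f"ratio {ratio}: speedup {max_speedup(1.0, 1.0, ratio):.3f}")
```

With equal open+close and write cost, a compression factor of three
gives a speedup of 1.5, and the value approaches but never reaches 2 as
the ratio grows, which is the bound stated in the message.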