From: <rsbecker@nexbridge.com>
To: "'Sean Allred'" <allred.sean@gmail.com>
Cc: "'Junio C Hamano'" <gitster@pobox.com>, <git@vger.kernel.org>,
<sallred@epic.com>, <grmason@epic.com>, <sconrad@epic.com>
Subject: RE: Dealing with corporate email recycling
Date: Sun, 13 Mar 2022 11:02:24 -0400 [thread overview]
Message-ID: <01f301d836eb$5c7a6810$156f3830$@nexbridge.com> (raw)
In-Reply-To: <87v8whap0b.fsf@gmail.com>
On March 13, 2022 10:41 AM, Sean Allred wrote:
><rsbecker@nexbridge.com> writes:
>> (I am a little nervous about this advice, hoping others will chime in
>> and correct anything wrong here)
>>
>> While this will change the commit hashes, AFAIK, the other metadata is
>> preserved, including date, author, and committer. Set up the specific
>> keys/settings in ssh-agent and the user.signingKey value, then:
>>
>> git filter-branch --commit-filter 'git commit-tree -S "$@";'
>> <FROM-COMMIT>..<TO-COMMIT>
>>
>> Others might have a better way of doing this or may tell me this will
>> not work. Test this before you do it. I have not done this operation
>> before. You do need to start from the oldest commit going forward
>> otherwise I think that filter-branch will (should!) invalidate child
>> commits. I suspect this is going to be a rather lengthy script to build and run.
>
>Given the size of our history (several orders of magnitude larger than linux.git),
>using git-filter-branch after the fact is certainly not ideal. The replay already takes
>a week to run (we're IO-bound). We'd rather want to extend git-fast-import to
>allow signing commits instead
>-- which comes back to our shared 'nervousness' about this approach in
>general: I don't know that Git should endorse this as a standard option.
>
>But yes -- hoping others can chime in with more thoughts :-)
I have another reluctant suggestion, but it depends on your industry, regulations, and other factors. In some sectors, there is a requirement to keep only some period of time worth of history. In fact, in some settings, keeping user identifying information beyond, say 7 years, actually is problematic. Pruning your history may be not only an option but required. An alternative is to use filter-branch to essentially tokenize the identities of past authors and keep those in a electronic vault somewhere. I have customers who are interpreting GDPR-like rules just such as situation, where employees gone 7 years ago and cannot be retained, by name, in the repos. I am not personally happy about that, because my own repo-OCD demands that I know exactly who did what until the end of time, but according to them, it actually violates the local regulations. I'm sure you have had conversations with lawyers, yes? ☹
next prev parent reply other threads:[~2022-03-13 15:02 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-12 22:38 Dealing with corporate email recycling Sean Allred
2022-03-13 0:03 ` Junio C Hamano
2022-03-13 0:26 ` rsbecker
2022-03-13 14:01 ` Sean Allred
2022-03-13 14:20 ` rsbecker
2022-03-13 14:41 ` Sean Allred
2022-03-13 15:02 ` rsbecker [this message]
2022-03-13 15:21 ` Sean Allred
2022-03-13 19:57 ` Philip Oakley
2022-03-13 22:40 ` Sean Allred
2022-03-13 23:16 ` Junio C Hamano
2022-03-13 23:23 ` rsbecker
2022-03-14 0:19 ` Junio C Hamano
2022-03-14 11:56 ` Philip Oakley
2022-03-14 21:24 ` Junio C Hamano
2022-03-14 22:25 ` Philip Oakley
2022-03-15 1:23 ` Sean Allred
2022-03-15 11:15 ` Philip Oakley
2022-03-13 12:20 ` Philip Oakley
2022-03-13 13:35 ` Sean Allred
2022-03-14 11:59 ` Philip Oakley
2022-03-13 15:51 ` Ævar Arnfjörð Bjarmason
2022-03-13 17:22 ` brian m. carlson
2022-03-13 17:52 ` rsbecker
2022-03-13 19:47 ` rsbecker
2022-03-13 22:23 ` Sean Allred
2022-03-15 1:27 ` Sean Allred
2022-03-18 21:22 ` Peter Krefting
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='01f301d836eb$5c7a6810$156f3830$@nexbridge.com' \
--to=rsbecker@nexbridge.com \
--cc=allred.sean@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=grmason@epic.com \
--cc=sallred@epic.com \
--cc=sconrad@epic.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).