* Is git compliant with GDPR? @ 2020-07-02 15:58 Jakub Trzebiatowski 2020-07-02 16:28 ` Jason Pyeron 0 siblings, 1 reply; 10+ messages in thread From: Jakub Trzebiatowski @ 2020-07-02 15:58 UTC (permalink / raw) To: git Hello, I've been using git for years, but I've never before taken part in the discussion on the mailing list. I have a simple question, which probably isn't easy to answer. Is git compliant with GDPR, the EU data protection law? Before I'm able to commit with git, I'm asked for my first and last name. That is personal data. GDPR, Article 4, point (1): ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); [...] That data is handled by the git utility. It's sent to other parties operating remote git servers (as a result of my commands, but as far as I know that's not relevant). It sounds like it's being processed. GDPR, Article 4, point (2): ‘processing’ means any operation or set of operations which is performed on personal data or on sets of personal data, whether or not by automated means, such as collection, recording, organisation, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction; This data is processed with a compatible computer owned by the end user for the purpose of identification of git commits. It's sent to other parties only when specific commands are given. All this was defined by git authors/contributors (from all around the world). GDPR, Article 4, point (7): ‘controller’ means the natural or legal person, public authority, agency or other body which, alone or jointly with others, determines the purposes and means of the processing of personal data; [...] Git authors can be considered joint controllers. If we'd assume the above interpretations, there would be many, many consequences. 
I'm not a lawyer, and I have no idea if this interpretation is reasonable. I don't even know if I'd like it to be. But here are some facts: GDPR does focus on protecting the end user. Possibly, it's the strictest data protection law in the world. It doesn't care how difficult it is to adjust the organisation for compliance and it doesn't care where the controller is located, as long as it processes personal data of EU citizens (if I understand it correctly). Are there any lawyers in the git community? Could The Linux Foundation help with legal support? It's a very non-trivial issue. It's non-obvious how local software relates to GDPR, and it's even more difficult with Free/Open Source software with many, many authors. But if the aforementioned interpretation were assumed, the git authors could be held responsible for non-compliance. Best, Jakub ^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: Is git compliant with GDPR? 2020-07-02 15:58 Is git compliant with GDPR? Jakub Trzebiatowski @ 2020-07-02 16:28 ` Jason Pyeron 2020-07-02 16:40 ` Randall S. Becker 2020-07-02 17:06 ` Jakub Trzebiatowski 0 siblings, 2 replies; 10+ messages in thread From: Jason Pyeron @ 2020-07-02 16:28 UTC (permalink / raw) To: git; +Cc: Matthew Horowitz, 'Jakub Trzebiatowski' > -----Original Message----- > From: Jakub Trzebiatowski > Sent: Thursday, July 2, 2020 11:58 AM > > Hello, First: I am not a lawyer, and even if I were, neither I nor anyone else on this list would be your lawyer - get a lawyer. Second: This thread is likely borderline off topic because for Git and GDPR to meet, it would be in the context of SaaS or your internal organization. There is almost nothing pure Git about these issues, see below. Discussion for the sake of it follows. > > I've been using git for years, but I've never before taken part in the > discussion on the mailing list. I have a simple question, which > probably isn't easy to answer. > > Is git compliant with GDPR, the EU data protection law? > > Before I'm able to commit with git, I'm asked for my first and last > name. That is personal data. > > GDPR, Article 4, point (1): > ‘personal data’ means any information relating to an identified or > identifiable natural person (‘data subject’); [...] > > That data is handled by the git utility. It's sent to other parties > operating remote git servers (as a result of my commands, but as far > as I know that's not relevant). It sounds like it's being processed. Git is like a hard drive or database in your organization. It does nothing other than store the information. Exception 1: IF you configure it to do so. Exception 2: You are using a SaaS provider (e.g. github.com, gitlab.com, etc.) Note: this is no different than any other SCM (e.g. CVS, Subversion, file shares, etc.). 
> > GDPR, Article 4, point (2): > ‘processing’ means any operation or set of operations which is > performed on personal data or on sets of personal data, whether or not > by automated means, such as collection, recording, organisation, > structuring, storage, adaptation or alteration, retrieval, > consultation, use, disclosure by transmission, dissemination or > otherwise making available, alignment or combination, restriction, > erasure or destruction; > > This data is processed with a compatible computer owned by the end > user for the purpose of identification of git commits. It's sent to > other parties only when specific commands are given. All this was > defined by git authors/contributors (from all around the world). > Again, like any database, you can query it for its contents. What you put in it is what it has. If you put personal data in, then it is there. Where can data reside in Git? 1. The blobs - e.g. your source code 2. The commit messages. #2 is your most likely candidate for GDPR-related activities. Do you use the developers' names and email addresses in the message? Almost certainly. Note: this is no different than any other SCM (e.g. CVS, Subversion, file shares, etc.). > GDPR, Article 4, point (7): > ‘controller’ means the natural or legal person, public authority, > agency or other body which, alone or jointly with others, determines > the purposes and means of the processing of personal data; [...] > > Git authors can be considered joint controllers. > The Git distributed model means that COPIES of all of the data are on each Git server and developer environment. You (and I mean your organization) must address this in your IT plans. Note: this is no different than many other SCMs, although some other SCM technologies only have the most recent version locally. > If we'd assume the above interpretations, there would be many, many > consequences. > > I'm not a lawyer, and I have no idea if this interpretation is > reasonable. 
I don't even know if I'd like it to be. But here are some > facts: GDPR does focus on protecting the end user. Possibly, it's the > most strict data protection law in the world. It doesn't care how > difficult it is to adjust the organisation for compliance and it > doesn't care where the controller is located, as long as it processes > personal data of EU citizens (if I understand it correctly). > > Are there any lawyers in the git community? Could The Linux Foundation > help with legal support? It's a very non-trivial issue. It's non > obvious how local software relates to GDPR, and it's even more > difficult with Free/Open Source software with many, many authors. But > if the aforementioned interpretation was assumed, the git authors > could be held responsible for non-compliance. I have copied our Policy SME, maybe he will have opinions. -Jason ^ permalink raw reply [flat|nested] 10+ messages in thread
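To make the "where can data reside in Git?" list above a bit more concrete: in a commit object, the name and email normally sit in the "author" and "committer" header lines rather than in the free-text message, although trailers such as Signed-off-by can repeat them inside the message. The following is only a minimal sketch, assuming the raw commit text has already been obtained (for example with `git cat-file -p <commit>`); the function name `personal_fields` and the sample commit are hypothetical and exist only for illustration.

```python
from typing import Dict, List

# Hypothetical raw commit text, in the layout printed by `git cat-file -p`.
SAMPLE_COMMIT = (
    "tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    "author Jane Doe <jane@example.com> 1593705480 +0200\n"
    "committer Jane Doe <jane@example.com> 1593705480 +0200\n"
    "\n"
    "Fix typo in README\n"
    "\n"
    "Signed-off-by: Jane Doe <jane@example.com>\n"
)


def personal_fields(raw_commit: str) -> Dict[str, List[str]]:
    """Collect the lines of a raw commit that usually carry a name and email."""
    # Headers and message are separated by the first blank line.
    headers, _, message = raw_commit.partition("\n\n")
    found: Dict[str, List[str]] = {"headers": [], "trailers": []}
    for line in headers.splitlines():
        if line.startswith(("author ", "committer ")):
            found["headers"].append(line)
    for line in message.splitlines():
        # Only one common trailer is checked here; real messages may carry others.
        if line.lower().startswith("signed-off-by:"):
            found["trailers"].append(line)
    return found


if __name__ == "__main__":
    for kind, lines in personal_fields(SAMPLE_COMMIT).items():
        for line in lines:
            print(kind, "->", line)
```

Blob contents (item 1 in the list) are not inspected here at all; whether they contain personal data depends entirely on what was committed.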
* RE: Is git compliant with GDPR? 2020-07-02 16:28 ` Jason Pyeron @ 2020-07-02 16:40 ` Randall S. Becker 2020-07-03 6:22 ` demerphq 2020-07-02 17:06 ` Jakub Trzebiatowski 1 sibling, 1 reply; 10+ messages in thread From: Randall S. Becker @ 2020-07-02 16:40 UTC (permalink / raw) To: 'Jason Pyeron', git Cc: 'Matthew Horowitz', 'Jakub Trzebiatowski' On July 2, 2020 12:28 PM, Jason Pyeron wrote: > Subject: RE: Is git compliant with GDPR? > > -----Original Message----- > > From: Jakub Trzebiatowski > > Sent: Thursday, July 2, 2020 11:58 AM > > > First: I am not a lawyer, and even if I were, I (nor anyone else on this list) > would not be your lawyer - get a lawyer. > > Second: This thread is likely borderline off topic because for Git and GPDR to > meet, it would be in the context of SaaS or your internal organization. There > is almost nothing pure Git about these issues, see below. Discussion for the > sake of it follows. > > > > > I've been using git for years, but I've never before taken part in the > > discussion on the mailing list. I have a simple question, which > > probably isn't easy to answer. > > > > Is git compliant with GDPR, the EU data protection law? > > > > Before I'm able to commit with git, I'm asked for my first and last > > name. That is personal data. > > > > GDPR, Article 4, point (1): > > ‘personal data’ means any information relating to an identified or > > identifiable natural person (‘data subject’); [...] > > > > That data is handled by the git utility. It's sent to other parties > > operating remote git servers (as a result of my commands, but as far > > as I know that's not relevant). It sounds like it's being processed. > > Git is like a hard drive or database in your organization. It does not do > anything else than store the information. > > Exception 1: IF you configure it to do so. > > Exception 2: You are using a SaaS provider (e.g. github.com, gitlab.com, etc.) > > Note: this is no different than any other SCM (e.g. 
CVS, Subversion, file > shares, etc.). > > > > > GDPR, Article 4, point (2): > > ‘processing’ means any operation or set of operations which is > > performed on personal data or on sets of personal data, whether or not > > by automated means, such as collection, recording, organisation, > > structuring, storage, adaptation or alteration, retrieval, > > consultation, use, disclosure by transmission, dissemination or > > otherwise making available, alignment or combination, restriction, > > erasure or destruction; > > > > This data is processed with a compatible computer owned by the end > > user for the purpose of identification of git commits. It's sent to > > other parties only when specific commands are given. All this was > > defined by git authors/contributors (from all around the world). > > > > Again, like any database, you can query it for its contents. What you put in it > is what it has. If you put personal data in, then it is there. > > Where can data reside in Git? > > 1. The blobs - e.g. your source code > > 2. The commit messages. > > #2 is your most likely candidate of GDPR related activities. > > Do you use the developers names and email addresses in the message? > Almost certainly. > > Note: this is no different than any other SCM (e.g. CVS, Subversion, file > shares, etc.). > > > GDPR, Article 4, point (7): > > ‘controller’ means the natural or legal person, public authority, > > agency or other body which, alone or jointly with others, determines > > the purposes and means of the processing of personal data; [...] > > > > Git authors can be considered joint controllers. > > > > The Git distributed model means that COPIES of all of the data are on each > Git server and developer environment. You (and I mean your organization) > must address this in your IT plans. > > Note: this is no different than many other SCMs although some others SCM > technologies only have the most recent version locally.. 
> > > If we'd assume the above interpretations, there would be many, many > > consequences. > > > > I'm not a lawyer, and I have no idea if this interpretation is > > reasonable. I don't even know if I'd like it to be. But here are some > > facts: GDPR does focus on protecting the end user. Possibly, it's the > > most strict data protection law in the world. It doesn't care how > > difficult it is to adjust the organisation for compliance and it > > doesn't care where the controller is located, as long as it processes > > personal data of EU citizens (if I understand it correctly). > > > > Are there any lawyers in the git community? Could The Linux Foundation > > help with legal support? It's a very non-trivial issue. It's non > > obvious how local software relates to GDPR, and it's even more > > difficult with Free/Open Source software with many, many authors. But > > if the aforementioned interpretation was assumed, the git authors > > could be held responsible for non-compliance. > > > I have copied our Policy SME, maybe he will have opinions. I am not speaking for the Git Foundation here, nor am I a lawyer; However, to use some practices from some of my customers who have this concern, the team members are directed to use tokenized names and email addresses that can be resolved by their security teams during an audit. Obviously the team members recognize the tokens so they know who is making what change. This means that externally, any names/emails that might get pushed upstream are non-identifying. The problem with this approach is that it is not global. As a result, if you want to contribute to a public project you have to self-identify, which may imply consent under GDPR. This is for the protection of the project itself as a project cannot take code from anonymous sources. If you are unwilling to share that information, do not contribute to a project. 
Randall -- Brief whoami: NonStop developer since approximately 211288444200000000 UNIX developer since approximately 421664400 -- In my real life, I talk too much. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Is git compliant with GDPR? 2020-07-02 16:40 ` Randall S. Becker @ 2020-07-03 6:22 ` demerphq 2020-07-03 13:52 ` Randall S. Becker 0 siblings, 1 reply; 10+ messages in thread From: demerphq @ 2020-07-03 6:22 UTC (permalink / raw) To: Randall S. Becker Cc: Jason Pyeron, Git, Matthew Horowitz, Jakub Trzebiatowski On Thu, 2 Jul 2020 at 18:42, Randall S. Becker <rsbecker@nexbridge.com> wrote: > I am not speaking for the Git Foundation here, nor am I a lawyer; However, to use some practices from some of my customers who have this concern, the team members are directed to use tokenized names and email addresses that can be resolved by their security teams during an audit. Obviously the team members recognize the tokens so they know who is making what change. This means that externally, any names/emails that might get pushed upstream are non-identifying. I think this is a really good point. I think git could make itself much more GDPR friendly by having some support for this type of idea built in. Not sure how it could work; maybe some kind of object that can be deleted after the fact, which maps an identifier used for the author to a name and email. If that name and email change, the object can be updated, and if there is a need to "forget" the author, the object can be deleted. The object would not be shared on clone, so it would stay private to the repo that held it. I guess you can argue that this isn't git's problem. But at a corporate level, it will be seen as git's fault regardless if it causes a big disruption. It could/would also be a reason that European companies might decide not to use git. cheers, Yves -- perl -Mre=debug -e "/just|another|perl|hacker/" ^ permalink raw reply [flat|nested] 10+ messages in thread
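The deletable mapping idea floated above can be sketched in a few lines. This is not an existing git feature or object type; `IdentityMap` and `render_author` are hypothetical names, and the sketch only assumes that commits carry an opaque token while a private, per-repository table resolves it to a name and email until the entry is removed.

```python
from typing import Dict, Optional, Tuple


class IdentityMap:
    """Private token -> (name, email) table; assumed never to be sent on clone."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[str, str]] = {}

    def register(self, token: str, name: str, email: str) -> None:
        # Registering an existing token again updates the name and email.
        self._entries[token] = (name, email)

    def resolve(self, token: str) -> Optional[Tuple[str, str]]:
        return self._entries.get(token)

    def forget(self, token: str) -> bool:
        # Returns True if an entry existed and was removed.
        return self._entries.pop(token, None) is not None


def render_author(token: str, identities: IdentityMap) -> str:
    """Show the real identity if it is still known, otherwise only the token."""
    entry = identities.resolve(token)
    if entry is None:
        return f"{token} <unknown>"
    name, email = entry
    return f"{name} <{email}>"


if __name__ == "__main__":
    identities = IdentityMap()
    identities.register("dev-01", "Jane Doe", "jane@example.com")
    print(render_author("dev-01", identities))
    identities.forget("dev-01")
    print(render_author("dev-01", identities))
```

The point of the design is that history stays intact: the token recorded in commits never changes, and "forgetting" an author only removes the private entry that links the token to a person.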
* RE: Is git compliant with GDPR? 2020-07-03 6:22 ` demerphq @ 2020-07-03 13:52 ` Randall S. Becker 0 siblings, 0 replies; 10+ messages in thread From: Randall S. Becker @ 2020-07-03 13:52 UTC (permalink / raw) To: 'demerphq' Cc: 'Jason Pyeron', 'Git', 'Matthew Horowitz', 'Jakub Trzebiatowski' On July 3, 2020 2:23 AM, demerphq wrote: > On Thu, 2 Jul 2020 at 18:42, Randall S. Becker <rsbecker@nexbridge.com> > wrote: > > I am not speaking for the Git Foundation here, nor am I a lawyer; However, > to use some practices from some of my customers who have this concern, > the team members are directed to use tokenized names and email addresses > that can be resolved by their security teams during an audit. Obviously the > team members recognize the tokens so they know who is making what > change. This means that externally, any names/emails that might get pushed > upstream are non-identifying. > > I think this is a really good point. I think git could make itself much more > GDPR friendly by having some support for this type of idea built in. > > Not sure how it could work, maybe some kind of object that can be deleted > after the fact which maps an identifier used for the author with name and > email. If that name and email change the object can be updated, and if there > is a need to "forget" the author, the object can be deleted. The object would > not be shared on clone, so it would stay private to the repo that held it. > > I guess you can argue that this isnt git's problem. But at a corporate level, it > will be seen as git's fault regardless if it cause a big disruption. It could/would > also be a reason that european companies might decide not to use git. How you choose to identify yourself to git is entirely arbitrary. There are SSO solutions used by GitHub that have the personal information stripped out. I contend that this is not git's problem because anyone can use anything to self-identify. Git does not care. 
Policies can be implemented (commit-hooks) to automatically tokenize but that's up to what the corporation wants to do. In fact, git is less subject to GDPR issues than other VCS systems, which use logon credentials that are personally identifying in many locations and could represent a security vulnerability. So while a corporation can choose to find fault with git, the fault is in their own credential management policies. It might be worth some documentation to explain this. ^ permalink raw reply [flat|nested] 10+ messages in thread
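As a rough illustration of the tokenization practice described above, one way to derive stable, non-identifying names and addresses is a keyed hash over the real identity, with the key held by the security team so that they can recompute the token for any known employee during an audit. This is only a sketch under that assumption, not something from the thread: the function name `tokenize_identity`, the "dev-" prefix, the secret, and the `tokens.example.invalid` domain are all hypothetical.

```python
import hashlib
import hmac
from typing import Tuple


def tokenize_identity(name: str, email: str, secret: bytes) -> Tuple[str, str]:
    """Derive a (token_name, token_email) pair from a real name and email."""
    # Normalise case and surrounding whitespace so that trivial variations
    # of the same identity map to the same token.
    canonical = f"{name.strip().lower()}|{email.strip().lower()}".encode("utf-8")
    digest = hmac.new(secret, canonical, hashlib.sha256).hexdigest()[:12]
    token = f"dev-{digest}"
    return token, f"{token}@tokens.example.invalid"


if __name__ == "__main__":
    # Hypothetical secret; in practice it would be managed by the security team.
    secret = b"hypothetical-secret-held-by-the-security-team"
    token_name, token_email = tokenize_identity("Jane Doe", "jane@example.com", secret)
    print(token_name, token_email)
```

The resulting pair could then be set as `user.name` and `user.email` in the repository's git configuration; how that is enforced (hooks, onboarding scripts, server-side checks) is a policy decision for the organization.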
* Re: Is git compliant with GDPR? 2020-07-02 16:28 ` Jason Pyeron 2020-07-02 16:40 ` Randall S. Becker @ 2020-07-02 17:06 ` Jakub Trzebiatowski 2020-07-02 18:38 ` Paul Smith 2020-07-02 18:47 ` Jason Pyeron 1 sibling, 2 replies; 10+ messages in thread From: Jakub Trzebiatowski @ 2020-07-02 17:06 UTC (permalink / raw) To: Jason Pyeron; +Cc: git, Matthew Horowitz czw., 2 lip 2020 o 18:27 Jason Pyeron <jpyeron@pdinc.us> napisał(a): > > > -----Original Message----- > > From: Jakub Trzebiatowski > > Sent: Thursday, July 2, 2020 11:58 AM > > > > Hello, > > First: I am not a lawyer, and even if I were, I (nor anyone else on this list) would not be your lawyer - get a lawyer. I don't think I'm in need of a lawyer. I wanted to start a discussion on a topic that in my opinion deserves being discussed, because I'm a git user and I believe it's interesting. > > Second: This thread is likely borderline off topic because for Git and GPDR to meet, it would be in the context of SaaS or your internal organization. There is almost nothing pure Git about these issues, see below. Discussion for the sake of it follows. I do agree that that sounds reasonable. But could I ask you why do you assume that there needs to be a service (or Software as a Service) to make software fall under GDPR? The GDPR definitions don't seem to mention that. > > > > I've been using git for years, but I've never before taken part in the > > discussion on the mailing list. I have a simple question, which > > probably isn't easy to answer. > > > > Is git compliant with GDPR, the EU data protection law? > > > > Before I'm able to commit with git, I'm asked for my first and last > > name. That is personal data. > > > > GDPR, Article 4, point (1): > > ‘personal data’ means any information relating to an identified or > > identifiable natural person (‘data subject’); [...] > > > > That data is handled by the git utility. 
It's sent to other parties > > operating remote git servers (as a result of my commands, but as far > > as I know that's not relevant). It sounds like it's being processed. > > Git is like a hard drive or database in your organization. It does not do anything else than store the information. Storing is processing. I'm not saying that git is evil or wrong, I'm saying that it might be the case that it processes personal data (both understood as in GDPR). git is also a software created by people and used by people. > > Exception 1: IF you configure it to do so. Sure, it doesn't change much. Processing data initiated by the user isn't any kind of distinguished processing, as far as I know. > > Exception 2: You are using a SaaS provider (e.g. github.com, gitlab.com, etc.) > > Note: this is no different than any other SCM (e.g. CVS, Subversion, file shares, etc.). I'm totally aware. I know how git works, including some of the internals, and I'm in general aware of standard solutions in the IT industry. Probably if git would be considered non-compliant, then so would be other SCMs. > > > > > GDPR, Article 4, point (2): > > ‘processing’ means any operation or set of operations which is > > performed on personal data or on sets of personal data, whether or not > > by automated means, such as collection, recording, organisation, > > structuring, storage, adaptation or alteration, retrieval, > > consultation, use, disclosure by transmission, dissemination or > > otherwise making available, alignment or combination, restriction, > > erasure or destruction; > > > > This data is processed with a compatible computer owned by the end > > user for the purpose of identification of git commits. It's sent to > > other parties only when specific commands are given. All this was > > defined by git authors/contributors (from all around the world). > > > > Again, like any database, you can query it for its contents. What you put in it is what it has. 
If you put personal data in, then it is there. It's not a general purpose database, it's a structured database and a software that operates on that database. That database has a field for personal data, and that data is processed by the software. > Where can data reside in Git? > > 1. The blobs - e.g. your source code > > 2. The commit messages. > > #2 is your most likely candidate of GDPR related activities. > > Do you use the developers names and email addresses in the message? Almost certainly. > > Note: this is no different than any other SCM (e.g. CVS, Subversion, file shares, etc.). > > > GDPR, Article 4, point (7): > > ‘controller’ means the natural or legal person, public authority, > > agency or other body which, alone or jointly with others, determines > > the purposes and means of the processing of personal data; [...] > > > > Git authors can be considered joint controllers. > > > > The Git distributed model means that COPIES of all of the data are on each Git server and developer environment. You (and I mean your organization) must address this in your IT plans. > > Note: this is no different than many other SCMs although some others SCM technologies only have the most recent version locally.. > > > If we'd assume the above interpretations, there would be many, many > > consequences. > > > > I'm not a lawyer, and I have no idea if this interpretation is > > reasonable. I don't even know if I'd like it to be. But here are some > > facts: GDPR does focus on protecting the end user. Possibly, it's the > > most strict data protection law in the world. It doesn't care how > > difficult it is to adjust the organisation for compliance and it > > doesn't care where the controller is located, as long as it processes > > personal data of EU citizens (if I understand it correctly). > > > > Are there any lawyers in the git community? Could The Linux Foundation > > help with legal support? It's a very non-trivial issue. 
It's non > > obvious how local software relates to GDPR, and it's even more > > difficult with Free/Open Source software with many, many authors. But > > if the aforementioned interpretation was assumed, the git authors > > could be held responsible for non-compliance. > > > I have copied our Policy SME, maybe he will have opinions. > > -Jason > In general, I totally agree with everything you said. But you said that git itself (as software) doesn't fall under GDPR, and that's the only thing I'm not sure about. I was wondering if someone with a deeper understanding of GDPR would tell me _why_. Because when interpreting the law literally, it sounds like it does. Also, to clarify, I'm not seeking legal advice for myself or my organization. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Is git compliant with GDPR? 2020-07-02 17:06 ` Jakub Trzebiatowski @ 2020-07-02 18:38 ` Paul Smith 2020-07-02 19:25 ` Jason Pyeron 2020-07-02 18:47 ` Jason Pyeron 1 sibling, 1 reply; 10+ messages in thread From: Paul Smith @ 2020-07-02 18:38 UTC (permalink / raw) To: Jakub Trzebiatowski, Jason Pyeron; +Cc: git, Matthew Horowitz On Thu, 2020-07-02 at 19:06 +0200, Jakub Trzebiatowski wrote: > But you said that git itself (as software) doesn't fall under GDPR, > and that's the only thing I'm not sure about. I was wondering if > someone with a deeper understanding of GDPR would tell me _why_. > Because when interpreting the law literally, it sounds like it does. You might be interested in reading the conversation that was had on this list the last time this subject was raised, in 2018: https://public-inbox.org/git/5587534.o6tcmYBVvN@mfick-lnx/T/ I can't say whether it will satisfy you or not. ^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: Is git compliant with GDPR? 2020-07-02 18:38 ` Paul Smith @ 2020-07-02 19:25 ` Jason Pyeron 2020-07-03 6:29 ` demerphq 0 siblings, 1 reply; 10+ messages in thread From: Jason Pyeron @ 2020-07-02 19:25 UTC (permalink / raw) To: git; +Cc: 'Matthew Horowitz', 'Jakub Trzebiatowski', paul > -----Original Message----- > From: Paul Smith > Sent: Thursday, July 2, 2020 2:38 PM > > On Thu, 2020-07-02 at 19:06 +0200, Jakub Trzebiatowski wrote: > > But you said that git itself (as a software) doesn't fall under GDPR, > > and that's the only thing I'm not sure about. I was wondering if > > someone with a deeper understanding of GDPR would tell my _why_. > > Because when interpreting the law literally, it sounds like it does. > > You might be interested in reading the conversation that was had on > this list the last time this subject was raised, in 2018: > > https://public-inbox.org/git/5587534.o6tcmYBVvN@mfick-lnx/T/ > > I can't say whether it will satisfy you or not. IMHO the most valuable bits were (I left out the discussion of changes to Git): 1: From: David Lang Date: Wed, 6 Jun 2018 18:38:55 -0700 (PDT) Message-ID: <alpine.DEB.2.02.1806061831340.7659@nftneq.ynat.uz> (raw) https://public-inbox.org/git/alpine.DEB.2.02.1806061831340.7659@nftneq.ynat.uz/#t I'm going to take the risk of inserting actual real-world data into the mix rather than just speculation :-) Here is an example of that the Rsyslog project is doing (main developers based in Germany). I'll say as someone who's day job has been very involved with GDPR stuff recently, this looks like a very reasonable statement to me. But I am not a lawyer. I will also say that I think it would be very reasonable for projects to not accept code from someone who doesn't give them any way to contact them later in case there is a question about authorship or licensing. 
David Lang https://github.com/rsyslog/rsyslog/pull/2746/files LEGAL GDPR NOTICE: According to the European data protection laws (GDPR), we would like to make you aware that contributing to rsyslog via git will permanently store the name and email address you provide as well as the actual commit and the time and date you made it inside git's version history. This is inevitable, because it is a main feature git. If you are concerned about your privacy, we strongly recommend to use --author "anonymous <gdpr@example.com>" together with your commit. Also please do NOT sign your commit in this case, as that potentially could lead back to you. Please note that if you use your real identity, the GDPR grants you the right to have this information removed later. However, we have valid reasons why we cannot remove that information later on. The reasons are: * this would break git history and make future merges unworkable * the rsyslog projects has legitimate interest to keep a permanent record of the contributor identity, once given, for - copyright verification - being able to provide proof should a malicious commit be made Please also note that your commit is public and as such will potentially be processed by many third-parties. Git's distributed nature makes it impossible to track where exactly your commit, and thus your personal data, will be stored and be processed. If you would not like to accept this risk, please do either commit anonymously or refrain from contributing to the rsyslog project. 
2: From: "Philip Oakley" Date: Sun, 3 Jun 2018 23:28:43 +0100 Message-ID: <5F80881E35F941E88D9C84565C437607@PhilipOakley> (raw) https://public-inbox.org/git/5F80881E35F941E88D9C84565C437607@PhilipOakley/#t > On Sun, Jun 03, 2018 at 04:28:31PM +0100, Philip Oakley wrote: <snip/> > You provide a lot of arguments about why it is not a necessity to have > this, but let's assume it is; is there any actual problem you see with > the proposal, except that someone would have to implement it? It's the strawman problem. If it was a real 'real issue' then it would have already shown up with companies clamouring to pay folk to fix our (git's) latest problem. But they haven't, so I think it's a much more balanced issue. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Is git compliant with GDPR? 2020-07-02 19:25 ` Jason Pyeron @ 2020-07-03 6:29 ` demerphq 0 siblings, 0 replies; 10+ messages in thread From: demerphq @ 2020-07-03 6:29 UTC (permalink / raw) To: Jason Pyeron; +Cc: Git, Matthew Horowitz, Jakub Trzebiatowski, paul On Thu, 2 Jul 2020 at 21:27, Jason Pyeron <jpyeron@pdinc.us> wrote: > > On Sun, Jun 03, 2018 at 04:28:31PM +0100, Philip Oakley wrote: > <snip/> > > You provide a lot of arguments about why it is not a necessity to have > > this, but let's assume it is; is there any actual problem you see with > > the proposal, except that someone would have to implement it? > > It's the strawman problem. If it was a real 'real issue' then it would have > already shown up with companies clamouring to pay folk to fix our (git's) > latest problem. But they haven't, so I think it's a much more balanced issue. > I don't agree. These things tend to come in waves. Just because the first wave hasn't hit yet doesn't mean it won't come. GDPR is still super new; people are still coming to understand it. Over time this understanding will lead to more people exercising the right to be forgotten. cheers, Yves -- perl -Mre=debug -e "/just|another|perl|hacker/" ^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: Is git compliant with GDPR? 2020-07-02 17:06 ` Jakub Trzebiatowski 2020-07-02 18:38 ` Paul Smith @ 2020-07-02 18:47 ` Jason Pyeron 1 sibling, 0 replies; 10+ messages in thread From: Jason Pyeron @ 2020-07-02 18:47 UTC (permalink / raw) To: git; +Cc: 'Matthew Horowitz', 'Jakub Trzebiatowski' > -----Original Message----- > From: Jakub Trzebiatowski > Sent: Thursday, July 2, 2020 1:06 PM > > czw., 2 lip 2020 o 18:27 Jason Pyeron napisał(a): > > > > > -----Original Message----- > > > From: Jakub Trzebiatowski > > > Sent: Thursday, July 2, 2020 11:58 AM > > > > > > Hello, > > > > First: I am not a lawyer, and even if I were, I (nor anyone else on this list) would not be your > lawyer - get a lawyer. > I don't think I'm in need of a lawyer. I wanted to start a discussion > on a topic that in my opinion deserves being discussed, because I'm a > git user and I believe it's interesting. > > > > Second: This thread is likely borderline off topic because for Git and GPDR to meet, it would be in > the context of SaaS or your internal organization. There is almost nothing pure Git about these > issues, see below. Discussion for the sake of it follows. > > I do agree that that sounds reasonable. But could I ask you why do you > assume that there needs to be a service (or Software as a Service) to > make software fall under GDPR? The GDPR definitions don't seem to > mention that. You will need to read the whole GDPR, and understand it which is no small task. 
I feel it does; the GDPR says:

  ‘controller’ means the natural or legal person, public authority, agency or other body which, alone or jointly with others, determines the purposes and means of the processing of personal data; where the purposes and means of such processing are determined by Union or Member State law, the controller or the specific criteria for its nomination may be provided for by Union or Member State law;

  ‘processor’ means a natural or legal person, public authority, agency or other body which processes personal data on behalf of the controller;

Here your question seems to extend "legal person" from the organization to its systems, and further to the software (e.g. Git) running on those systems, whereas a SaaS provider is a legal person subject to GDPR or is a "Third Party".

> > >
> > > I've been using git for years, but I've never before taken part in the
> > > discussion on the mailing list. I have a simple question, which
> > > probably isn't easy to answer.
> > >
> > > Is git compliant with GDPR, the EU data protection law?
> > >
> > > Before I'm able to commit with git, I'm asked for my first and last
> > > name. That is personal data.
> > >
> > > GDPR, Article 4, point (1):
> > > ‘personal data’ means any information relating to an identified or
> > > identifiable natural person (‘data subject’); [...]
> > >
> > > That data is handled by the git utility. It's sent to other parties
> > > operating remote git servers (as a result of my commands, but as far
> > > as I know that's not relevant). It sounds like it's being processed.
> >
> > Git is like a hard drive or database in your organization. It does not do anything else than store
> > the information.
>
> Storing is processing. I'm not saying that git is evil or wrong, I'm
> saying that it might be the case that it processes personal data (both
> understood as in GDPR).
>
> git is also a software created by people and used by people.

Again, the relevance is on the organization.

> >
> > Exception 1: IF you configure it to do so.
>
> Sure, it doesn't change much. Processing data initiated by the user
> isn't any kind of distinguished processing, as far as I know.
>
> >
> > Exception 2: You are using a SaaS provider (e.g. github.com, gitlab.com, etc.)
> >
> > Note: this is no different than any other SCM (e.g. CVS, Subversion, file shares, etc.).
>
> I'm totally aware. I know how git works, including some of the
> internals, and I'm in general aware of standard solutions in the IT
> industry. Probably if git would be considered non-compliant, then so
> would be other SCMs.

I am referring to configurations that are following organization policies, which in themselves are causing the GDPR concerns. E.g. commit data is tweeted. Or as Randall S. Becker said on Thursday, July 2, 2020 12:41 PM:

> some practices from some of my customers who have this concern, the team members are directed to use
> tokenized names and email addresses that can be resolved by their security teams during an audit. Obviously
> the team members recognize the tokens so they know who is making what change. This means that externally,
> any names/emails that might get pushed upstream are non-identifying.

The organization explicitly added GDPR covered information (see European Parliament question E-007174/2017).

> > >
> > > GDPR, Article 4, point (2):
> > > ‘processing’ means any operation or set of operations which is
> > > performed on personal data or on sets of personal data, whether or not
> > > by automated means, such as collection, recording, organisation,
> > > structuring, storage, adaptation or alteration, retrieval,
> > > consultation, use, disclosure by transmission, dissemination or
> > > otherwise making available, alignment or combination, restriction,
> > > erasure or destruction;
> > >
> > > This data is processed with a compatible computer owned by the end
> > > user for the purpose of identification of git commits.
> > > It's sent to other parties only when specific commands are given. All this was
> > > defined by git authors/contributors (from all around the world).
> > >
> >
> > Again, like any database, you can query it for its contents. What you put in it is what it has. If
> > you put personal data in, then it is there.
>
> It's not a general purpose database, it's a structured database and a
> software that operates on that database. That database has a field for
> personal data, and that data is processed by the software.
>

I disagree, but see https://blog.sqlauthority.com/2018/01/19/sql-server-make-sql-server-gdpr-compliance/ . I think we can all agree that if software could be compliant/noncompliant, then a SQL server is a perfect candidate. That article addresses the issues of how to configure it and the business procedures to align with GDPR obligations. That (and only that) discussion I think is very on topic here.

> > Where can data reside in Git?
> >
> > 1. The blobs - e.g. your source code
> >
> > 2. The commit messages.
> >
> > #2 is your most likely candidate of GDPR related activities.
> >
> > Do you use the developers' names and email addresses in the message? Almost certainly.
> >
> > Note: this is no different than any other SCM (e.g. CVS, Subversion, file shares, etc.).
> >
> > > GDPR, Article 4, point (7):
> > > ‘controller’ means the natural or legal person, public authority,
> > > agency or other body which, alone or jointly with others, determines
> > > the purposes and means of the processing of personal data; [...]
> > >
> > > Git authors can be considered joint controllers.
> > >
> >
> > The Git distributed model means that COPIES of all of the data are on each Git server and developer
> > environment. You (and I mean your organization) must address this in your IT plans.
> >
> > Note: this is no different than many other SCMs although some other SCM technologies only have the
> > most recent version locally.
> >
> > > If we'd assume the above interpretations, there would be many, many
> > > consequences.
> > >
> > > I'm not a lawyer, and I have no idea if this interpretation is
> > > reasonable. I don't even know if I'd like it to be. But here are some
> > > facts: GDPR does focus on protecting the end user. Possibly, it's the
> > > most strict data protection law in the world. It doesn't care how
> > > difficult it is to adjust the organisation for compliance and it
> > > doesn't care where the controller is located, as long as it processes
> > > personal data of EU citizens (if I understand it correctly).
> > >
> > > Are there any lawyers in the git community? Could The Linux Foundation
> > > help with legal support? It's a very non-trivial issue. It's
> > > non-obvious how local software relates to GDPR, and it's even more
> > > difficult with Free/Open Source software with many, many authors. But
> > > if the aforementioned interpretation was assumed, the git authors
> > > could be held responsible for non-compliance.
> > >
> >
> > I have copied our Policy SME, maybe he will have opinions.
> >
> > -Jason
> >
>
> In general, I totally agree with everything you said.
>
> But you said that git itself (as a software) doesn't fall under GDPR,
> and that's the only thing I'm not sure about. I was wondering if
> someone with a deeper understanding of GDPR would tell me _why_.
> Because when interpreting the law literally, it sounds like it does.
>
> Also, to clarify, I'm not seeking legal advice for myself or my organization.

-Jason

end of thread, other threads:[~2020-07-03 13:52 UTC | newest]

Thread overview: 10+ messages
2020-07-02 15:58 Is git compliant with GDPR? Jakub Trzebiatowski
2020-07-02 16:28 ` Jason Pyeron
2020-07-02 16:40 ` Randall S. Becker
2020-07-03  6:22 ` demerphq
2020-07-03 13:52 ` Randall S. Becker
2020-07-02 17:06 ` Jakub Trzebiatowski
2020-07-02 18:38 ` Paul Smith
2020-07-02 19:25 ` Jason Pyeron
2020-07-03  6:29 ` demerphq
2020-07-02 18:47 ` Jason Pyeron