From: Marc Sune
Subject: Re: Beyond DPDK 2.0
Date: Mon, 27 Apr 2015 17:34:46 +0200
Message-ID: <553E5716.9050309@bisdn.de>
References: <26FA93C7ED1EAA44AB77D62FBE1D27BA54D1A917@IRSMSX102.ger.corp.intel.com>
 <20150424175124.GA30624@mhcomputing.net> <553B9706.1060904@bisdn.de>
 <20150426215644.GA9021@neilslaptop.think-freely.org>
 <553E06D8.2060604@bisdn.de>
To: "Wiles, Keith", Neil Horman
Cc: "dev-VfR2kkLFssw@public.gmane.org"
List-Id: patches and discussions about DPDK

On 27/04/15 15:39, Wiles, Keith wrote:
>
> On 4/27/15, 4:52 AM, "Marc Sune" wrote:
>
>>
>> On 27/04/15 03:41, Wiles, Keith wrote:
>>> On 4/26/15, 4:56 PM, "Neil Horman" wrote:
>>>
>>>> On Sat, Apr 25, 2015 at 04:08:23PM +0000, Wiles, Keith wrote:
>>>>> On 4/25/15, 8:30 AM, "Marc Sune" wrote:
>>>>>
>>>>>> On 24/04/15 19:51, Matthew Hall wrote:
>>>>>>> On Fri, Apr 24, 2015 at 12:39:47PM -0500, Jay Rolette wrote:
>>>>>>>> I can tell you that if DPDK were GPL-based, my company wouldn't
>>>>>>>> be using it. I suspect we wouldn't be the only ones...
>>>>>>>>
>>>>>>>> Jay
>>>>>>> I could second this, from the past employer where I used it.
>>>>>>> Right now I am using it in an open source app, I have a bit of
>>>>>>> GPL here and there but I'm trying to get rid of it or confine it
>>>>>>> to separate address spaces, where it won't impact the core code
>>>>>>> written around DPDK, as I don't want to cause headaches for any
>>>>>>> downstream users I attract someday.
>>>>>>>
>>>>>>> Hard-core GPL would not be possible for most.
>>>>>>> LGPL could be possible, but I don't think it could be worth the
>>>>>>> relicensing headache for that small change.
>>>>>>>
>>>>>>> Instead we should make the patch process as easy as humanly
>>>>>>> possible so people are encouraged to send us the fixes and not
>>>>>>> cart them around their companies constantly.
>>>>> +1 and besides the GPL or LGPL ship has sailed IMHO and we cannot
>>>>> go back.
>>>> Actually, IANAL, but I think we can. The BSD license allows us to
>>>> fork and relicense the code I think, under GPL or any other license.
>>>> I'm not advocating for that mind you, just suggesting that it's
>>>> possible should it ever become needed.
>>>>
>>>>>> I agree. My feeling is that as the number of patches in the
>>>>>> mailing list grows, keeping track of them gets more and more
>>>>>> complicated. Patchwork website was a way to try to address this
>>>>>> issue. I think it was an improvement, but to be honest, patchwork
>>>>>> lacks a lot of functionality, such as properly tracking multiple
>>>>>> versions of the patch (superseding them automatically), and it
>>>>>> lacks some filtering capabilities e.g. per user, per tag/label or
>>>>>> library, automatically track if it has been merged, give an
>>>>>> overall status of the pending vs merged patches, set milestones...
>>>>>> Is there any alternative tool or improved version for that?
>>>>>
>>>> Agreed, this has come up before, off list unfortunately. The volume
>>>> of patches seems to be increasing at such a rate that a single
>>>> maintainer has difficulty keeping up. I proposed that the workload
>>>> be split out to multiple subtrees, with prefixes being added to
>>>> patch subjects on the list for local filtering to stem the tide.
>>>> Specifically I had proposed that the PMDs be split into a separate
>>>> subtree, but that received pushback in favor of having each library
>>>> having its own separate subtree, with a pilot program being made out
>>>> of the I40e driver (which you might note sends pull requests to the
>>>> list now). I'd still like to see all PMDs come under a single
>>>> subtree, but that's likely an argument for later.
>>>>
>>>> That said, do you think that this patch latency is really a
>>>> contributor to low project participation? It's definitely a problem,
>>>> but it seems to me that this sort of issue would lead to people
>>>> trying to participate, then giving up (i.e. we would see 1-2 emails
>>>> from an individual, then not see them again). I'd need to look
>>>> through the mailing list for such a pattern, but anecdotally I've
>>>> not seen that happen. The problem you describe above is definitely a
>>>> problem, but it's one for those individuals who are participating,
>>>> not for those who are simply choosing not to. And I think we need to
>>>> address both.
>>>>
>>>>> I agree patchwork has some limitations, but I think the biggest
>>>>> issue is keeping up with the patches. Getting patches introduced
>>>>> into the main line is very slow. A patch submitted today may not
>>>>> get applied for weeks or months, then when another person submits a
>>>>> patch he is starting to run a very high risk of having to redo that
>>>>> patch, because a previous patch makes his fail weeks/months later.
>>>>> I would love to see a better tool than patchwork, but the biggest
>>>>> issue is we have a huge backlog of patches. Personally I am not
>>>>> sure how Thomas or anyone is able to keep up with the patches.
>>>>>
>>>> This is absolutely a problem.
>>>> I'd like to think, more than a tool like patchwork, a subtree
>>>> organization to allow some modicum of parallel review and
>>>> integration would really be a benefit here.
>>> Subtrees could work, but the real problem I think is the number of
>>> committers must be higher than one. Something like GitHub (and I
>>> assume Linux Foundation) has a method to add committers to a project.
>>> In the case of GitHub they just have to have a free GitHub account
>>> and they can become committers of the project provided the owner of
>>> the project enables them.
>>>
>>> On GitHub they have personal accounts and organization accounts. I
>>> know only about the personal accounts, but they allow for 5 private
>>> repos and any number of public repos. The organization account has a
>>> lot of extra features that seem better for a DPDK community IMO and
>>> should be the one we use if we decide it is the right direction. We
>>> can always give it a shot for a while and keep the dpdk.org and use
>>> dev-VfR2kkLFssw@public.gmane.org and its repo mirrored from GitHub as
>>> a transition phase. This way we can fall back to dpdk.org or move on
>>> to something else if we like.
>>>
>>> https://help.github.com/categories/organizations/
>>>
>>> The developers could still send patches via email list, but creating
>>> a repo and forking dpdk is easy, then send a pull request.
>> For the github "community" or free service, organization accounts just
>> allow you to set teams, where each team can be assigned to one or more
>> repositories.
>> The differences are summarized here:
>>
>> https://help.github.com/articles/what-s-the-difference-between-user-and-organization-accounts/
>>
>> And the permission schema, per team, is summarized here:
>>
>> https://help.github.com/articles/permission-levels-for-an-organization-repository/
>>
>> Some limitations: i) only if the team has write permissions (IOW push
>> permissions) you can manage issues ii) there cannot be per-branch ACLs.
> I was assuming the organization GitHub is just to allow more than one
> admin/maintainer along with teams if needed. I would assume the repos
> are still public and others are allowed to fork or pull the repos. I
> think of the org version as just extra controls on top of a personal
> repo like design. The org/personal one should appear to the
> non-maintainers/admins/owner as a normal repo on GitHub, correct?

Right

> The GitHub organization is built for open-source and you can still have
> private repos, but then you start to have a cost depending on the
> number of private repos you want. If you do not have a lot of private
> repos then you should have no cost (I think). I do not see any reason
> for private repos, but I guess we could have some and we get 5 free and
> 10 is $25 per month.

I don't see the reason either, and I don't know why private repos would
be useful here.

>>>
>>>>> The other problem I see is how patches are agreed on to be included
>>>>> in the mainline. Today it is just an ACK or a NAK on the mailing
>>>>> list. Then I see what I think to be only a few people ACKing or
>>>>> NAKing patches. This process has a lot of problems, from a patch
>>>>> being ignored for some reason, or someone having negative feedback
>>>>> on a very minor detail, or no way to push a patch forward past a
>>>>> single NAK or comment.
>>>>>
>>>> So, this is an interesting issue in ideal meritocracies.
>>>> Currently is/should be looking for ACKs/NAKs from the individuals
>>>> listed in the MAINTAINER files, and those people should be the
>>>> definitive subject matter experts on the code they cover. As such, I
>>>> would argue that they should be entitled to a modicum of
>>>> stylistic/trivial leeway. That is to say, if they choose to block a
>>>> patch around a very minor detail, then between them changing their
>>>> position, and the patch author changing the code, the latter is
>>>> likely the easier course of action, especially if the author can't
>>>> make an argument for their position. That said, if such patch
>>>> blockage becomes so egregious that individuals stop contributing,
>>>> that needs to be known as well. If you as a patch author:
>>>>
>>>> 1) Have tried to submit patches
>>>> 2) Had them blocked for what you consider trivial reasons
>>>> 3) Plan to not contribute further because of this
>>>> 4) Still rely on the DPDK for your product
>>>>
>>>> Please, say something. People in charge need to know when they're
>>>> pushing contributors away.
>>>>
>>>> FWIW, I've tried to do some correlation between the git history and
>>>> the mailing list. I need to do more searches, but I have a feeling
>>>> that early on, the majority of people who stopped contributing did
>>>> so not because their patches were expressly blocked, but rather
>>>> because they were simply ignored. No one working on DPDK bothered to
>>>> review those patches, and so they never got merged. Hopefully that
>>>> problem has been addressed somewhat now.
>> I agree 100%
>>>>> I would like to see some type of layering process to allow patches
>>>>> to be applied in a timely manner, a few weeks, not months or
>>>>> completely ignored. Maybe some type of voting is reasonable, but we
>>>>> need to do something to turn around the patches in a clean,
>>>>> reasonable manner.
>>>>>
>>>>> Think we need some type of group meeting every week to look at the
>>>>> patches and determine which ones get applied; this gives quick
>>>>> feedback to the submitter as to the status of the patch.
>>>>>
>>>> I think a group meeting is going to be way too much overhead to
>>>> manage properly. You'll get different people every week with agendas
>>>> that may not line up with code quality, which is really what the
>>>> review is meant to provide. I think
>>> I was only suggesting the maintainers attend the meeting. Of course
>>> they have to attend or have someone attend for them, just to get the
>>> voting done. If you do not attend then you do not get to vote, or
>>> something like that is reasonable. Not that we should try and define
>>> the process here.
>>>
>>>> perhaps a better approach would be to require that code owners from
>>>> the maintainer file provide an ACK/NAK on their patches within 3-4
>>>> days, and require a corresponding tree maintainer to apply the patch
>>>> within 7 or so. That would cap our patch latency. Likewise, if a
>>>> patch slips in creating a regression, the author needs to be alerted
>>>> and given a time window in which to fix the problem before the
>>>> offending patch is reverted during the QE cycle.
>>>>
>>>>
>>>>>> On the other side, since user questions, community discussions and
>>>>>> development happen in the same mailing list, things get really
>>>>>> complicated, especially for users seeking help. Even though I
>>>>>> think the average skills of the users of DPDK are generally higher
>>>>>> than in other software projects, if DPDK wants to attract more
>>>>>> users, having better user support is key, IMHO.
>>>>>>
>>>>>> So I would see with good eyes a separation between, at least,
>>>>>> dpdk-user and dpdk-dev.
>>>> I wouldn't argue with this separation, seems like a reasonable
>>>> approach.
>>>>
>>>>> I do not remember seeing too many users on the list and making a
>>>>> list just for them is OK if everyone is fine with a list that has
>>>>> very few emails.
>>>>>> If the number of patches keeps growing, splitting the "dev"
>>>>>> mailing list into different categories (eal and common, pmds,
>>>>>> higher level abstractions...) could be an option. However, this
>>>>>> last point opens a lot of questions on how to minimize
>>>>>> interference between the different parts and API/ABI compatibility
>>>>>> during the development.
>>>>> I believe if we just make sure we use tags in the subject line then
>>>>> we can have our email clients do the splitting of the emails
>>>>> instead of adding more email lists.
>>>>>
>>>> Agreed
>> I think it is a good idea too. Maybe we can standardize some format
>> e.g. [TAG][PATCH vX], or something like that.
>>
>>>>>>> Perhaps it means having some ReviewBoard type of tools, a clone
>>>>>>> in Github or Bitbucket where the less hardcore kernel-workflow
>>>>>>> types could send back their small bug fixes a bit more easily,
>>>>>>> this kind of stuff. Google has been getting good uptake since
>>>>>>> they moved most of their open source across to Github, because
>>>>>>> the contribution workflow was more convenient than Google Code
>>>>>>> was.
>>>>> I like GitHub, it is a much better designed tool than patchwork,
>>>>> plus it could get more eyes as it is very well known to the
>>>>> developer community in general. I feel GitHub has many advantages
>>>>> over the current systems in place, but it does not solve all the
>>>>> patch issues.
>>>>>
>>>> Github is actually a bit irritating for this sort of thing, as it
>>>> presumes a web based interface for discussion. They have some
>>>> modicum of email forwarding enabled, but it has never quite worked
>>>> right, or integrated properly.
>> An alternative to githubs and bitbuckets is a self-hosted forge, like
>> gitlab:
>>
>> https://about.gitlab.com/
>>
>> To be honest, I mostly work on open-source repositories, and in our
>> organization we use only gitlab for private repositories, so I haven't
>> played that much with it. But it seems to do its job and has almost
>> all of the features of the "community" github, if not more. I don't
>> know if you can even integrate it with github's accounts somehow, to
>> avoid having to register.
>>
>> However, one of the important points of using github/bitbucket is
>> visibility and easing the contribution process. By using a self-hosted
>> solution, even if it is similar to github and well advertised in
>> DPDK's website, you kind of lose part of that advantage.
> I would suggest we use GitHub rather than pick yet another not as well
> known Git repo system, if we decide to change.

I agree. I was just pointing out this as an option instead of
github/bitbucket. Basically to (still) self-host the repository and
tools.

>>> Email forwarding has seemed to work for me and in one case it took a
>>> bit to have GitHub stop sending me emails on a repo I did not want
>>> anymore :-)
>>>>> The only way we can get patch issues resolved is to put a bit more
>>>>> process in place.
>>>>>> Although I agree, we have to be careful on how github or bitbucket
>>>>>> is used. Having issues or even (e.g. github) pull requests *in
>>>>>> addition* to the normal contribution workflow can be a nightmare
>>>>>> to deal with, in terms of synchronization and preventing double
>>>>>> work. So I guess setting up an official github or bitbucket mirror
>>>>>> would be fine, via some simple cronjob, but I guess it would end
>>>>>> up not using PRs or issues in github, like the Linux kernel does.
>>>> 100% agree, we can't be split about this.
>>>> Allowing contributions from n channels just means most developers
>>>> will only see/review 1/nth of the patches of interest to them.
>>> If we set up a GitHub or some other site, we would need to make
>>> GitHub the primary site to remove this type of problem IMO.
>> You mean changing the workflow from email based to issues and pull-req
>> or github pull req? Do you really think this is possible?
> Yes, I think pull-req is the standard GitHub method as everyone needs a
> repo anyway. If we can figure out how to integrate the email patches
> that would be great.

I think it is quite complicated. It needs to be completely seamless or
it won't work, and we will have part of the discussions in the mailing
list, and part in the pull-req issues.

I would think it the other way around => pull requests are "echoed" to
the mailing list to be discussed there, and always CCed (how) to the
issue to capture the discussion there too. Not trivial at all.

marc

>>>>> From what I can tell GitHub seems to be a better solution for a
>>>>> free open environment. Bitbucket I have never used and GitHub seems
>>>>> more popular from one article I read.
>>>>>
>>>>> https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=bitbucket%20vs%20github
>>>>>
>>>>>> Btw, is this github organization already registered by Intel or
>>>>>> some other company of the community?
>>>>>>
>>>>>> https://github.com/dpdk
>>>>>>
>>> I was hoping someone would own up to the GitHub dpdk site.
>> Just wanted to know if this was the case. But, even if that would not
>> be the case, I *guess* that, as it happens with other services like
>> twitter, facebook..., Intel could claim the user, since it has the
>> registered trademark.
>>
>> marc
>>
>>>>>> Marc
>>>>> If we can use the above that would be great, but a name like
>>>>> 'dpdk-community' or something could work too.
>>>>>
>>>>> We can host the web site here and have many sub-projects like
>>>>> Pktgen-DPDK :-) under the same page. Not to say anything bad about
>>>>> our current web pages, but I find them difficult to use sometimes,
>>>>> and it is hard to find things like the patchwork link. Maintaining
>>>>> a web site is a full time job and GitHub does maintain the site,
>>>>> plus we can collaborate on a hosted web page on the GitHub site
>>>>> more easily.
>>>>>
>>>>> Moving to the Linux Foundation is an option as well, as it is very
>>>>> well known and has some nice ways to get your project promoted. It
>>>>> does have a few drawbacks in process handling and cost, to state a
>>>>> few. The process model is already defined, which is good and bad;
>>>>> it just depends on your needs IMO.
>>>>>
>>>>> Regards,
>>>>> ++Keith
>>>>>
>>>>>>> Matthew.