* Next steps from GTI TAC meeting on 2023-03-08 - Evaluate cost of glibc migration.
From: Carlos O'Donell @ 2023-04-03 17:24 UTC
  To: Konstantin Ryabitsev, gti-tac, Khahil White

Konstantin,

The GTI TAC met on 2023-03-08 (please ignore the date typo 06-08):
https://lore.kernel.org/gti-tac/b72a0c4f-fed3-7418-e9d6-7a277bf37b3e@redhat.com/

There was consensus that we should start working through the process steps one
project at a time and pipe-clean the process and learn as we go.

I volunteered glibc as the first project because I know the most about the project
and the requirements.

I am providing the full list of glibc services for review and pricing by LF IT.

When I say "review" I am looking for a critical eye from LF IT where you might
recommend a better solution that you can support in the long term e.g. wiki,
bugzilla, mailing lists, mailing list archives, etc. Likewise for things we should
drop as legacy and not convert them over.

I expect that pricing depends on the final solution and the exact work to be done
so this will take some iteration. Likewise I expect pricing is split between NRE
for the shift vs. ongoing.

That means the next steps likely look like this (needs further discussion):
* Iterate over the services below and discuss how and what to migrate them to.
* Start socializing with the glibc project community.
* Finalize on the approximate details of the migration and ongoing services.
* Finalize on the cost and timeline.
* Get the GTI TAC to review and approve.
* Get the GTI Board to review and approve.
* Take it to the glibc project for approval.
* Execute on timeline.

Complete set of known glibc services:

* mailing lists
  * Mailman 2 mailing lists: https://sourceware.org/mailman/listinfo/*
    * libc-announce
    * libc-alpha
    * libc-stable
    * libc-help
    * libc-locales
    * libc-testresults
    * glibc-cvs
    * glibc-bugs
    * glibc-bugs-regex (limited bugs just for regex).
    * Closed legacy mailing lists:
      * libc-ports: https://sourceware.org/mailman/listinfo/libc-ports
      * libc-hacker: https://sourceware.org/mailman/listinfo/libc-hacker
      * Also the older MHonArc archives (/legacy-ml/) - no longer updated, but the 
        /legacy-ml/ URLs and /ml/ redirects to them need to keep working.
        E.g. https://sourceware.org/legacy-ml/libc-alpha/2020-01/
        E.g. https://sourceware.org/legacy-ml/glibc-bugs-regex/2020-03/
    * Mailing lists accept non-HTML email only.
    * Run through SpamAssassin.
    * Run through ClamAV.
  * Pipermail archives:
    * https://sourceware.org/pipermail/*
    * e.g. https://sourceware.org/pipermail/libc-alpha/
  * public-inbox archives:
    * https://inbox.sourceware.org/*
    * e.g. https://inbox.sourceware.org/libc-alpha/
    * Not all inboxes work correctly e.g. glibc-bugs-regex doesn't work.
  * Migration notes:
    * Only supporting public-inbox for ml archives.

* bugzilla 5.0.4+
  * Uses backend SQL database of MariaDB 10.3
  * Must be able to send email to glibc-bugs mailing list.
    * Don't know how email is routed to this list.
  * Must also send email to the glibc-bugs-regex mailing list.
    * Don't know how email is routed to this list.
  * Must be able to send email to all users on the bug.
  * Must be able to receive email when someone responds to a glibc-bugs
    email e.g. sourceware-bugzilla@sourceware.org.
  * Custom Administration->Groups settings for User RegExp.
    * canconfirm: Allow certain domains to always be able to confirm bugs.
    * editbugs: Likewise but for editbugs.
  * Must have REST API enabled to allow RM to generate release list
    of fixed bugs using the glibc/scripts/list-fixed-bugs.py script
    e.g. https://sourceware.org/bugzilla/rest.cgi/
    * Implies that non-logged-in users can list and view all bugs
      that were fixed for the release.
  * Must have account creation disabled due to spamming.
  * Must have someone with Bugzilla admin access to:
    * Add new users to bugzilla.
    * Add new Product components, versions, and milestones.
    * Add new keywords.
    * Remove users.
  * Migration notes:
    * https://sourceware.org/bugzilla/
    * Consider starting fresh in new BZ instance and freeze old product.
    * glibc in old instance marked "Not open for new bugs."
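
As a rough illustration of the REST API requirement above, the sketch below
builds an anonymous search URL for bugs fixed in one release and formats the
JSON reply. It is not the actual glibc/scripts/list-fixed-bugs.py script; the
query fields (product, resolution, target_milestone, include_fields) and the
helper names are assumptions chosen for the example.

```python
import json
from urllib.parse import urlencode

# Base URL taken from the service list above; the query fields are assumptions.
REST_BASE = "https://sourceware.org/bugzilla/rest.cgi"


def fixed_bugs_url(milestone, product="glibc"):
    """Build an unauthenticated search URL for bugs fixed in one release."""
    query = urlencode({
        "product": product,
        "resolution": "FIXED",
        "target_milestone": milestone,
        "include_fields": "id,summary",
    })
    return f"{REST_BASE}/bug?{query}"


def format_fixed_bugs(payload):
    """Turn a Bugzilla search reply (JSON text) into sorted release-note lines."""
    bugs = json.loads(payload)["bugs"]
    ordered = sorted(bugs, key=lambda bug: bug["id"])
    return [f"[{bug['id']}] {bug['summary']}" for bug in ordered]
```

Because the URL carries no credentials, this only works if non-logged-in users
can list and view the fixed bugs, which is the implication noted above.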

* git 2.31
  * Allows per-user access to commit to the glibc repo.
  * Allows per-user access to commit to the legacy glibc-ports repo.
  * Uses group access to control repository access.
  * Must be able to send email to glibc-cvs mailing list with one
    email for each commit made by a developer to any branch of the repository.
  * AdaCore hooks need more thorough audit for required services.
    * Must be able to send email to bugzilla to update bugs.
      * Done by AdaCore hook 'file-commit-cmd'
      * Configured to use email-to-bugzilla-filtered command.
        * Uses connection to SQL database to determine if bug exists.
  * Currently uses shared AdaCore hooks configured via origin/meta/config 
    * Active hooks:
      * post-receive
        * AdaCore post_receive
        * /git/glibc.git/hooks-bin/post-receive
          * Triggers irkerhook.py (see notes below).
          * Does not work today, likely due to requirement to register OFTC user.
      * post-update
        * Standard git-update-server-info.
      * pre-receive
        * AdaCore pre_receive
    * AdaCore config:
      * No max line lengths.
      * Allow UTF-8 in commit messages.
      * 5MiB max email size.
      * Max 500 commit messages for larger commit series sent to glibc-cvs.
      * Reject merge commits to master and release branches.
      * Allow rebasing only private branches (non master and non release).
      * Run minimal style checker, nominally for whitespace issue rejection.
        * Run extra commit checking to avoid source address for author being wrong.
          * /git/glibc.git/hooks-bin/commit_checker
            * From email format checker. No special requirements.
        * /git/glibc.git/hooks-bin/style_checker
          * Style checker. No special requirements.
      * Send email to bugzilla if a commit mentions a bug.
        * /git/glibc.git/hooks-bin/email-to-bugzilla-filtered
          * Uses /sourceware/infra/bin/email-to-bugzilla
          * Must be able to connect to bugzilla SQL database.
          * Does not appear to work today. We don't get emails for commits with bugs.
      * Send IRC message to per-project configured IRC channel.
        * Involves irkerhook.py and git config information for project.
        * Hook must be able to connect to external IRC networks to post IRC notices.
  * Migration notes:
    * Allow community to manage access?

* wiki
  * Uses MoinMoin 1.9.10
  * Must have account creation disabled due to spamming.
    * Uses EditorGroup permissions to allow any community member to add a new
      community member to the wiki e.g. human vetting another human.
  * Must be able to send notification emails.
  * Cron run to purge users not in EditorGroup to prevent wiki slowdown.
  * Migration notes:
    * Preference for something git based.

* patch management.
  * Uses patchwork v3.1.1.post18-g11cf1f3
  * Must be able to receive email (as part of collecting patch data)
  * Must be able to send emails as part of account verification.
  * Uses django for administration
  * Must allow authenticated REST API access for patchwork.
    * Currently rate limited.
    * Used by SLI tools (Carlos O'Donell)
      * Run manually on developer systems.
    * Auto-close on commit patchwork bot (Siddhesh Poyarekar)
      * Run on sourceware.org via cron.
  * Used for weekly patch management meetings.
  * git-pw integration used to access patchwork directly using REST API and API token.
  * Migration notes:
    * Patchwork a strong requirement for upstream CI/CD.
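
For reference, authenticated REST access of the kind git-pw and the bots rely
on looks roughly like the sketch below. It only prepares the request and sends
nothing. The base URL passed in, the "glibc" project slug, and the "new" state
filter are assumptions for illustration; the token is whatever the user
generated in their patchwork profile.

```python
from urllib.parse import urlencode
from urllib.request import Request


def patchwork_request(base_url, token, project="glibc", state="new"):
    """Prepare (but do not send) a token-authenticated patch list request."""
    query = urlencode({"project": project, "state": state})
    request = Request(f"{base_url.rstrip('/')}/api/patches/?{query}")
    # Patchwork accepts API tokens in the Authorization header.
    request.add_header("Authorization", f"Token {token}")
    request.add_header("Accept", "application/json")
    return request
```

A caller would hand the result to urllib.request.urlopen and should expect
rate-limit errors, since the current instance is rate limited.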

* Red Hat Bluejeans remote meeting system.
  * Must allow remote video and audio for participants around the world.
  * Allows weekly glibc patch review meetings for patch review collaboration.
  * Meetings must operate without host needing to be present so community can host.
    * Delegating host is difficult in bluejeans.
  * Managed by Bluejeans/Verizon.
  * The glibc community has switched to LF BBB instance for the last 24 meetings.
    * Since 2022-12-12 we have been using LF BBB instance successfully to host weekly meetings.
  * Migration notes:
    * Already migrated to LF BBB. NOP.

* pre-commit CI system.
  https://gitlab.com/djdelorie/glibc-cicd
  * Run inside a VM.
  * Uses networkless containers for further build isolation.
  * Highest risk system because it runs mailing list posted patches.
  * Event curation system (curator):
    * Must have network access to patchwork REST API.
    * Must have access to SQL database for storing state.
      * Currently using MariaDB.
    * Must allow runners to access curator REST API URL.
    * One curator currently hosted by DJ Delorie.
  * Event running system (runner + trybots):
    * Must have network access to curator REST API.
    * Must have local network access to rabbitmq queue (job delegation)
    * trybots must have local network access to rabbitmq.
      * Must have network access to patchwork REST API to post results.
      * Must have network access to container registries to pull modern containers.
      * Must have network git access to pull updated glibc git repo.
    * Generally the runner and trybots are on one site together.
      * Avoid passing rabbitmq traffic beyond the local network.
      * Eventual emailing of results to the mailing list will happen via another bot
        that is distinct from this system to avoid the runners needing anything but
        restricted network access.
    * One runner hosted by DJ Delorie	
    * One i686 trybot hosted by DJ Delorie
    * One "patch applies" trybot hosted by DJ Delorie
  * Migration notes:
    * Could argue no migration required. LF hosts patchwork. Community hosts trybots.
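
To make the "SQL database for storing state" requirement concrete, here is a
minimal sketch of the kind of bookkeeping a curator needs: remember which
patch series have been seen so each one is dispatched once. It is not taken
from the glibc-cicd code; the table layout and function names are
hypothetical, and sqlite3 stands in for MariaDB so the example is
self-contained.

```python
import sqlite3


def open_state(path=":memory:"):
    """Open the (hypothetical) curator state store."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS seen (series_id INTEGER PRIMARY KEY)")
    return db


def new_series(db, series_ids):
    """Record series IDs and return only those not seen before."""
    fresh = []
    for series_id in series_ids:
        row = db.execute(
            "SELECT 1 FROM seen WHERE series_id = ?", (series_id,)
        ).fetchone()
        if row is None:
            db.execute("INSERT INTO seen (series_id) VALUES (?)", (series_id,))
            fresh.append(series_id)
    db.commit()
    return fresh
```

Only the IDs returned by new_series would be turned into jobs for the runner,
so restarting the curator does not re-run old series.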

* Website (sourceware.org)
  * CVS hosted website.
  * Static redirect to gnu.org website.
  * Migration notes:
    * Need something for a static site.

* Website (gnu.org)
  https://www.gnu.org/software/libc/
  * CVS hosted website uploads along with manual.
    * Manuals are generated with scripts in the CVS repo and generated files committed.
  * All static content.
  * Website automatically updated after CVS commits.
  * Managed by the GNU Project/FSF.
  * Migration notes:
    * No migration required; see previous "Website (sourceware.org)" entry.

* Release tarballs (ftp upload of gpg-signed release tarballs)
  https://ftp.gnu.org/gnu/libc/
  * Use gnupload script to gpg sign uploaded tarballs.
    * Uses ncftpput to place files into /incoming directories.
    * Network ftp access required.
  * Managed by the GNU Project/FSF.
  * Migration notes:
    * No migration required, would continue to upload to FSF.
    * Longer term discussion to use something more advanced.
    * Though good to have a backup following kernel best practice.

* Translation project services
  https://translationproject.org/html/welcome.html
  * https network access to TP servers to fetch uploaded translation files.
  * Managed by the Translation Project.
  * Migration notes:
    * Don't expect to have a replacement.

* IRC services on OFTC and Libera.Chat
  * Using #glibc on both networks for community interaction.
  * Migration notes:
    * Don't expect to have a replacement.

-- 
Cheers,
Carlos.


* Re: Next steps from GTI TAC meeting on 2023-03-08 - Evaluate cost of glibc migration.
From: Joseph Myers @ 2023-05-19 22:47 UTC
  To: cti-tac

On Fri, 19 May 2023, Konstantin Ryabitsev wrote:

> Our mailing list infrastructure is uniquely tailored to public-inbox users,
> with messages being written to the archives *before* they are even sent out to
> subscribers, which helps speed up access for users who aren't using the

There is a certain advantage there regarding the problem we currently have 
where messages sent as HTML are missing from public-inbox archives, if the 
process that strips HTML parts takes place before feeding to public-inbox 
(whereas at present it takes place after that, I think).

(I consider it important that we do *not* insist on contributors meeting a 
shibboleth of sending plain-text email before they can interact with our 
mailing lists, especially for user lists such as libc-help - we should be 
friendly to people sending email in the ordinary form they are used to 
using today even if we might rather they didn't send HTML email - we 
shouldn't impose that view of proper email on them.)
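
A sketch of the ordering described above, where HTML is reduced to plain text
before a message reaches the archive, might look like the following. It is an
illustration using Python's standard email package and not the actual list
pipeline; the function name and the set of copied headers are choices made
for the example, and the HTML fallback is deliberately crude.

```python
from email import message_from_string, policy
from email.message import EmailMessage
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document, ignoring markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def to_plain_text(raw):
    """Return a text/plain-only copy of a message given as a string."""
    msg = message_from_string(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    if body is not None:
        text = body.get_content()
    else:
        # HTML-only mail: keep the sender's words, drop the tags.
        html_part = msg.get_body(preferencelist=("html",))
        text = ""
        if html_part is not None:
            extractor = _TextExtractor()
            extractor.feed(html_part.get_content())
            text = "".join(extractor.chunks)
    out = EmailMessage()
    for name in ("From", "To", "Subject", "Date", "Message-ID"):
        if msg[name] is not None:
            out[name] = msg[name]
    out.set_content(text)
    return out
```

Run ahead of archiving, a step like this would let HTML mail from users reach
both the archive and subscribers instead of being dropped from one of them.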

> The major improvement of this vs. bugzilla's native email interface is that
> bugs can be created from mailing list discussions, not just the other way
> around. The bugbot will need to be slightly adjusted for CTI needs, but
> otherwise I think it would be a better solution than Bugzilla's native email
> integration. For example, when bugmail is sent to the mailing list and someone

Note that people can send email to Bugzilla in reply to a message they 
received directly from Bugzilla - not just in reply to a message on 
glibc-bugs.  I think it's important that both cases of replying should 
work automatically, however that's implemented.

(If someone replies to a glibc-bugs message and forgets to remove 
glibc-bugs from their reply, the list gets two copies of the message at 
present - one that the user sent directly, one forwarded by Bugzilla.  
Avoiding that duplication somehow would be nice to have, but entirely 
optional.)

Note also that Bugzilla is one of the hardest migration issues (for 
non-GCC projects), because (as I mentioned in one of the meetings) 
Sourceware Bugzilla is used by many different projects, most of which are 
not part of CTI.

> * Sending per-commit email to mailing lists -- this seems like a vestigial
>   feature from the pre-git past. Does it really serve any purpose? I am
>   fighting to kill a similar feature used on the kernel.org side, because I do
>   not see any remaining legitimate use for it -- it just creates a lot of mail
>   traffic that nobody really reviews.

I think it's extremely useful to have those messages for watching out for 
commits that shouldn't have gone in or went in in an unintended form (this 
is for human-determined "shouldn't" or "unintended" - not anything 
expected to be covered by automated checks).  And also to have a message 
ready-made to reply to when something was committed without being posted 
to the mailing list.

> * We do not currently allow direct hooks on the server side, because running
>   arbitrary scripts with the permissions of the git server is a very bad
>   anti-pattern when it comes to repo security. It also tends to make pushes
>   super slow and frustrating to developers, especially those working on slow
>   or unstable connections that can go rapidly stale, resulting in failed
>   pushes.
> 
>   So, we will need to wrap our collective heads around what your hooks are
>   doing now and implement them via alternative means. For example:

Certainly some kind of system for proxying / containing checks to limit 
what they can do if buggy makes sense - they don't need to run with the 
permissions of the git server, they can run in some kind of isolation, as 
long as they prevent bad commits from getting into the history.

>   * denying force-pushes to specific branches (rebase/non-fast-forward
>     updates) is a native feature of Gitolite

What about denying merge commits on master and release branches?
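
For concreteness, the merge-commit rule could be checked from the output of
"git rev-list --parents OLD..NEW", where each line lists a commit followed by
its parents. The sketch below is an assumption about how such a check might
be written, not the AdaCore hook; in particular the protected-branch pattern
(master plus anything under release/) is a guess at the naming scheme.

```python
import re

# Assumed naming: "master" and "release/..." branches keep linear history.
PROTECTED = re.compile(r"^refs/heads/(?:master|release/.+)$")


def merge_commits(rev_list_parents_output):
    """Return commits with two or more parents.

    Input is the text printed by `git rev-list --parents OLD..NEW`:
    one line per commit, the commit hash first, then its parent hashes.
    """
    merges = []
    for line in rev_list_parents_output.splitlines():
        fields = line.split()
        if len(fields) > 2:  # commit hash plus at least two parents
            merges.append(fields[0])
    return merges


def rejected_merges(ref, rev_list_parents_output):
    """Return the merge commits that should block a push to this ref."""
    if not PROTECTED.match(ref):
        return []  # private branches may contain merges
    return merge_commits(rev_list_parents_output)
```

A non-empty result from rejected_merges would cause the push to be refused.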

Note that GCC has its own custom set of namespaces for branches (under 
refs/users/ and refs/vendors/) and rules about what branch names may be 
created at all etc.  (Unfortunately there seem to be some missing checks 
for the case of lightweight tag creation, resulting in some improperly 
named tags being present.)

>   The only thing that would require more work is style checkers and other CI
>   functionality that really should not be happening on-push. I suggest that
>   this moves entirely to patchwork CI, discussed below.

I think those things *should* be happening on push - to prevent obviously 
bad history getting onto master or release branches (at least) in the 
first place.  For glibc, that's (apart from the non-fast-forward checks 
and rejecting merge commits on branches meant to have linear history):

* Disallowing lines with trailing whitespace in certain files.

* Disallowing commits with a mailing list address as the author email 
(even if your mailing list setup is designed to be DKIM-safe so author 
emails don't need rewriting, people might still use "git am" with old 
messages, so this remains relevant).

* Disallowing commit subject lines that look like a ChangeLog header.

* Disallowing single-word (or empty) commit subject lines.
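
The four checks above are small enough to sketch in a few lines. This is an
illustration and not the commit_checker or style_checker scripts: the list
addresses, the ChangeLog header pattern, and the function signature are
assumptions, and the whitespace check here ignores the "certain files"
restriction.

```python
import re

# ChangeLog headers look like "2023-05-19  Jane Doe  <jane@example.org>".
CHANGELOG_HEADER = re.compile(r"^\d{4}-\d{2}-\d{2}\s+.+\s+<[^>]+>\s*$")

# Assumed list addresses, built from list names mentioned in this thread.
LIST_ADDRESSES = {
    "libc-alpha@sourceware.org",
    "libc-help@sourceware.org",
    "libc-stable@sourceware.org",
}


def check_commit(author_email, message, added_lines):
    """Return a list of human-readable problems; empty means acceptable."""
    problems = []
    lines = message.splitlines()
    subject = lines[0].strip() if lines else ""
    if len(subject.split()) < 2:
        problems.append("subject line is empty or a single word")
    if CHANGELOG_HEADER.match(subject):
        problems.append("subject line looks like a ChangeLog header")
    if author_email.strip().lower() in LIST_ADDRESSES:
        problems.append("author email is a mailing list address")
    for number, line in enumerate(added_lines, start=1):
        text = line.rstrip("\n")
        if text != text.rstrip():
            problems.append(f"added line {number} has trailing whitespace")
    return problems
```

None of these checks needs more than the pushed commit data itself, so they
could run in an isolated process rather than with the git server's
permissions.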

For GCC there are several extra checks that I enumerated - including, in 
particular, the fairly complicated checks for ChangeLog format to avoid 
the nightly cron job updating ChangeLogs falling over.  (The check to 
avoid From-SVN: lines is similarly to prevent a commit that would cause 
subsequent problems for automation that relies on those lines only being 
present in commits converted from SVN in order to look up such a commit by 
SVN commit number.)

> ## Release tarballs
> 
> There are multiple ways of getting this done, including completely automated.
> For example, stable kernel releases are generated server-side by using and
> verifying the PGP signature found in a git note attached to the release tag.
> This doesn't work if the tarball is not directly generated from a git
> repository (for example, if config scripts must run first).

Is "git archive" output or similar actually stable enough that the server 
can reliably generate (at any time in the future, with a future git 
version) a tarball matching the tarball signature the release manager 
generated at release time, or is that not what you meant?

I'd certainly expect the tarballs served as release tarballs to be 
byte-for-byte identical to the ones at https://ftp.gnu.org/gnu/glibc/ (and 
the signatures likewise to be identical).
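
Whether a regenerated tarball is byte-for-byte identical to the published one
can be tested by comparing digests, as in the sketch below. The helper names
are made up for the example; the point is only that a detached signature made
at release time verifies against a regenerated file exactly when the bytes
match.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(path_a, path_b):
    """True when both files have identical contents (by SHA-256)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

If same_bytes is ever false for a server-generated tarball and the one the
release manager signed, the original signature cannot be reused for it.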

-- 
Joseph S. Myers
joseph@codesourcery.com

