Subject: Re: Discussion: Future-Proofing Git for Massive AI Parallelism
From: Tanish Desai #TD
Date: Tue, 29 Jul 2025 10:51:18 +0530
To: Skybuck Flying
Cc: "git@vger.kernel.org"
X-Mailing-List: git@vger.kernel.org

> On 20 Jul 2025, at 6:11 PM, Skybuck Flying wrote:
>
> Dear Git Community,
>
> I’d like to spark a conversation about the evolving demands on version control systems in the age of AI -
> specifically, massive parallel processing and collaboration among swarms of autonomous AI agents.
>
> Git’s architecture is rock solid for human developers, but when scaled to the synthetic masses, some limitations start to bite.

Yes, this is true, and I am also conservative about it.
I was building a product in which multiple AI agents collaborated, and
Git failed miserably.

>
> Challenges We’re Facing:
>
> - Human-Centric Workflows:
> Commits, branches, merges - great for humans. But when thousands of AI agents try to play ball,
> Git feels like it’s hosting a developer convention inside a phone booth.
>
> - Large Binary Assets:
> AI projects sling around multi-gigabyte models and datasets like frisbees. Git LFS helps, but it’s struggling in the big leagues.
>
> - Conflict Resolution at Scale:
> With thousands of agents updating stuff 24/7, merge conflicts become a cosmic horror. Human-driven resolution? Not scalable.

Many AI agents nowadays can resolve merge conflicts automatically using
an LLM, so why not add native support for this in Git itself?

>
> - Authentication Overload:
> Static credentials and manual account setups don't scale when every AI agent needs dynamic, role-based access.
>
> - Semantic Blindness:
> Git tracks text, not meaning. AI changes like hyperparameters or architecture tweaks need smarter, semantic versioning.
>
> Potential Paths Forward:
>
> Short-Term:
>
> Supercharge Git via smart tooling:
>
> - Tighten integration with MLOps systems like DVC, MLflow, LakeFS:
>
> These tools specialize in handling the chaotic realities of AI development - massive datasets, frequent experiments, and ever-evolving model versions.
> By deeply integrating Git with them, we can:
> --- Offload Large File Management: Let DVC or LakeFS handle model binaries and datasets with scalable storage backends, while Git focuses on code.
> --- Track Experiments Natively: MLflow records hyperparameters, metrics, and artifacts - linking them directly to Git commits provides rich reproducibility.
> --- Enable Smarter Merges: AI-native tools can inform merge decisions based on model performance metrics or semantic changes, not just line-by-line diffs.

A 3-way diff merge is not a solution for AI-generated workloads, mainly
because agents generate code so fast.
3-way diff merges assume that changes happen at human speed: you fetch
the latest remote state, make your edits, and then merge and push
before anyone else has moved the branch.
But AI agents operate orders of magnitude faster.
By the time one agent fetches, modifies, merges, and pushes, another
agent has already updated the same files, so every “merge” either
conflicts, is rejected at push time because the branch has moved, or
(if force-pushed) silently overwrites prior work.

> --- Facilitate Parallel Agent Workflows: These platforms already support multi-run and multi-agent tracking. Git can lean on them to orchestrate agent commits
> without bottlenecks.

A simple strategy many multi-agent systems (and I) use is file-level
locking, which Git doesn’t support natively: on Linux, wrap your merge
command with

    flock /path/to/file.lock -c "git merge origin/main"

(only for the files involved in the merge), and require all
agents/people to take the same flock on /path/to/file.lock when editing
that file. If it’s locked by the merge, they’ll automatically wait,
then resume against the updated version.
The downside is that this strategy slows down the AI agents.

> --- Unify Dev & Ops Pipelines: A tighter link between version control and operational tools helps automate everything from data prep to deployment.
> --- If Git becomes more than just a file versioning tool and evolves into a smart orchestration layer, integrating these systems could turn it into the
> central nervous system of AI development.
>
> - Create orchestration layers for automated agent commits and batching:
>
> When thousands of AI agents are making changes simultaneously - whether to code, models, or config files - it’s chaos unless there’s a system coordinating
> those contributions. Orchestration layers act like traffic controllers, guiding when, how, and what agents commit.

I share that vision too: a Git redesigned for AI fleets.
The final goal would be to develop Git to the point where fleets of AI
agents can use it seamlessly, so that projects which earlier took
decades to complete could be built in weeks by agents.
Also, thanks for mailing this :)

>
>
>
> What These Layers Would Do:
> --- Batch Commits: Instead of every agent making atomic commits constantly (leading to performance overload and conflict central), the system groups related
> changes together and pushes them as unified commits.
> --- Schedule and Prioritize: Not all agents are equal. Some are more critical or trusted. An orchestration layer can schedule their commits based on priority,
> timing, or dependencies.
> --- Conflict Mitigation: Before committing, the system checks for overlaps and intelligently merges or staggers updates to reduce merge hell.
> --- Audit and Rollback: It can log which agent did what, allowing transparency and reversibility if something breaks.
> --- Meta-Agent Oversight: You could even create supervisor AI agents whose job is to monitor and optimize commit behavior across the fleet.

The fleet is the key!

>
> Why It's Important:
> --- Without orchestration, it's like 10,000 bots trying to edit a document at once. Git wasn't built for that kind of speed or concurrency.
> --- This layer turns AI collaboration into a harmonized symphony, instead of a noisy code stampede.
>
> If Git had built-in support for this kind of orchestration - or if a wrapper system implemented it - you could revolutionize how synthetic intelligence collaborates at scale.
> Want to brainstorm what these meta-agents or orchestration rules would look like?
> I’m loaded with ideas.
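
To make the batching, prioritizing, and staggering ideas quoted above a
bit more concrete, here is a minimal Python sketch of the queue such an
orchestration layer might keep. Everything in it (CommitBatcher, Change,
submit, next_batch, and the "lower number wins" priority rule) is
hypothetical and invented for illustration; it is not an existing Git
feature or tool, and it only groups changes in memory rather than
creating real commits.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(order=True)
class Change:
    """One proposed edit from one agent (hypothetical record type)."""
    # Only priority and seq take part in ordering: lower priority number
    # first, then submission order as the tie-breaker.
    priority: int
    seq: int
    agent: str = field(compare=False)
    path: str = field(compare=False)
    content: str = field(compare=False)


class CommitBatcher:
    """Hypothetical orchestration queue: collects changes, emits batches."""

    def __init__(self) -> None:
        self._pending: List[Change] = []
        self._seq = 0

    def submit(self, agent: str, path: str, content: str,
               priority: int = 10) -> None:
        """Queue a change instead of letting the agent commit directly."""
        self._pending.append(Change(priority, self._seq, agent, path, content))
        self._seq += 1

    def next_batch(self) -> Dict[str, Change]:
        """Return at most one change per path for the next unified commit.

        When several agents touched the same path, the best-ranked change
        goes into this batch and the rest are staggered into later ones.
        """
        batch: Dict[str, Change] = {}
        deferred: List[Change] = []
        for change in sorted(self._pending):
            if change.path in batch:
                deferred.append(change)
            else:
                batch[change.path] = change
        self._pending = deferred
        return batch
```

If two agents submit changes to the same path and a third touches a
different file, the first batch holds the higher-priority change plus
the unrelated one, and the deferred change comes out in the following
batch. A real layer would then turn each batch into a single commit and
record the agent names for auditing.
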
>
> - Improve tracking/versioning of AI-native assets: configs, metrics, logs
>
> Long-Term: Consider an “AI-Native” versioning system
> - Semantic conflict resolution powered by AI
> - Native support for large models and datasets
> - Dynamic permissions for AI agents without static user accounts
> - Graph-based, event-driven change tracking beyond linear commit history
>
> Let’s explore what’s possible. Whether it’s evolving Git or drafting a next-gen system, your expertise could help shape how AI collaborates at scale.
>
> Thanks for reading - and yes, no rogue AI has committed rm -rf /… yet.
>
> Sincerely,
> Skybuck Flying
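
As a footnote to the file-level locking strategy described earlier in
this reply, here is a rough Python equivalent of wrapping edits and
merges in flock, assuming a Unix system (the fcntl module is not
available on Windows). On Linux, fcntl.flock takes the same kind of
advisory lock as the flock command, so Python agents and shell wrappers
should be able to share one lock file. The helper names (file_lock,
locked_update, merge_under_lock) and the ".lock" suffix are assumptions
made for this sketch, not part of Git.

```python
import fcntl
import os
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def file_lock(lock_path: str) -> Iterator[None]:
    """Hold an exclusive advisory lock on lock_path while the block runs."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until other holders release
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def locked_update(path: str, transform: Callable[[str], str]) -> None:
    """Read path, apply transform, and write the result back under the lock.

    Every agent must go through this helper (or take the same lock file);
    advisory locks do nothing against writers that ignore them.
    """
    with file_lock(path + ".lock"):
        try:
            with open(path, "r", encoding="utf-8") as fh:
                current = fh.read()
        except FileNotFoundError:
            current = ""
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(transform(current))


def merge_under_lock(lock_path: str, ref: str = "origin/main") -> int:
    """Run `git merge <ref>` while holding the lock; returns git's exit code."""
    with file_lock(lock_path):
        return subprocess.run(["git", "merge", ref]).returncode
```

An agent editing a file would call locked_update(path, some_function),
and the merge wrapper would call merge_under_lock(path + ".lock"), so an
edit waits while a merge holds the lock and then proceeds against the
updated file. The cost is the same as with the shell version: agents
queue up behind each other.
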