GitHub AI Slop Meets The `--author` Loophole
The New Open Source Spam Pattern
Open source maintainers have always dealt with low-effort contributions. The older version was familiar: drive-by typo fixes, vague issues, dependency bumps nobody tested, and pull requests that copied an existing change with a different branch name.
The newer version is faster and more convincing. A bot can watch a repository, find an issue, ask a coding agent to generate a patch, open a pull request, and repeat that loop across dozens of accounts. The output often looks plausible at first glance. It has a normal branch name, a polite description, and code that compiles in simple cases. The hidden cost is review time.
Archestra ran into that pattern in its public repository. The project describes itself as an enterprise AI platform with guardrails, an MCP registry, a gateway, and an orchestrator. That made it exactly the kind of repository likely to attract AI-tool users: visible, active, and close to the agent tooling ecosystem.
The maintainers noticed waves of pull requests that were not just weak, but strangely similar. One example was repeated support for the same x.ai / Grok provider. GitHub search shows many closed pull requests with near-identical titles such as “add x.ai (Grok) LLM provider support.” The article that triggered the discussion says the maintainers saw the same issue solved again and again with minimal original understanding behind the submissions.
This is not only a code quality problem. It is a queue integrity problem.
Why Review Queues Break Before Code Does
A repository can survive bad code if maintainers can reject it quickly. The real damage starts when every submission requires careful inspection because it might be valid.
AI-generated pull requests create several review traps:
- They can be syntactically clean while missing product context.
- They can satisfy a narrow issue title while ignoring acceptance criteria.
- They can duplicate work already done in another pull request.
- They can look friendly and human enough to deserve a response.
- They can arrive faster than maintainers can triage them.
That last point changes the economics. A maintainer who spends five minutes rejecting one weak pull request has not lost much. A maintainer who spends five minutes each on 50 weak pull requests has lost 250 minutes, a little over four hours and roughly half a working day. If the project is small, that can consume the available maintenance budget for the week.
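The arithmetic behind that comparison is easy to check. The sketch below is only an illustration: the function names and the five-hour weekly budget are assumptions chosen for the example, not figures from any project.

```python
def review_cost_hours(pull_requests: int, minutes_each: float) -> float:
    """Total maintainer time, in hours, spent triaging a batch of pull requests."""
    return pull_requests * minutes_each / 60


def budget_share(cost_hours: float, weekly_budget_hours: float) -> float:
    """Fraction of a weekly maintenance budget consumed by that triage time."""
    return cost_hours / weekly_budget_hours


single = review_cost_hours(1, 5)   # one weak pull request
wave = review_cost_hours(50, 5)    # 50 * 5 = 250 minutes, a little over 4 hours

# Hypothetical budget: a small project with 5 maintainer hours per week.
print(f"one PR: {single:.2f} h, wave of 50: {wave:.2f} h")
print(f"share of a 5-hour weekly budget: {budget_share(wave, 5):.0%}")
```

The per-item cost never changes in this model. Only the volume does, which is why the defenses later in this article focus on making each rejection cheaper.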
The natural response is to put a gate in front of the repository.
GitHub’s Prior-Contributor Gate
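GitHub has an interaction limit called “Limit to prior contributors.” When enabled, it restricts commenting, opening issues, and opening pull requests to people who have previously contributed to the repository, for a selected time window. GitHub's documentation describes prior contributors as users who have previously contributed to the repository's default branch, and collaborators are also exempt from the limit. GitHub presents the feature as a way to temporarily restrict activity to users with a known contribution history.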
For a maintainer dealing with sudden automated spam, this is attractive. It does not make the repository private. It does not block known contributors. It gives the maintainer a pressure valve while the spam wave passes.
Archestra enabled this gate and expected the flood to slow down. It did, but only briefly.
The surprising part was the bypass: Git attribution.
The --author Loophole
Git records a commit's author separately from its committer, and neither field is tied to the account that pushes the commit to a hosting platform. That is a useful Git feature. Maintainers regularly apply patches on behalf of others, import historical commits, or preserve authorship across migrations.
The command is simple:
git commit --author="Name <email@example.com>"
In normal development, this is a provenance feature. In a moderation system, it can become a trust confusion bug if the platform treats authored commits as contribution history without enough separation from account identity.
According to Archestra’s write-up, spam accounts were able to set commit authorship to an existing contributor and then pass GitHub’s prior-contributor interaction limit. In practice, the repository setting was trying to answer one question: “Has this GitHub user contributed before?” The commit metadata supplied an answer to a different question: “Does this commit claim a known author?”
Those are not the same question.
The distinction matters because Git author fields are intentionally user-controlled metadata. They are not a login session. They are not proof that the named person pushed the commit. They are not proof that the GitHub account opening the pull request is trusted by the project.
That makes the gate weaker than many maintainers would assume.
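The gap between those two questions can be shown with a toy model. This is a minimal sketch under stated assumptions: it is not GitHub's implementation, and every class, field, and function name here is hypothetical. It only illustrates why a check keyed on claimed author metadata and a check keyed on the authenticated account can disagree.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    author_email: str   # free-form metadata, settable with `git commit --author`
    pusher_login: str   # the authenticated account that pushed the commit


@dataclass(frozen=True)
class RepoHistory:
    author_emails: frozenset[str]    # emails seen in commit author fields
    account_logins: frozenset[str]   # accounts with previously merged work


def gate_by_author_metadata(commit: Commit, history: RepoHistory) -> bool:
    """Weak check: does the commit *claim* a known author?"""
    return commit.author_email in history.author_emails


def gate_by_account(commit: Commit, history: RepoHistory) -> bool:
    """Stronger check: has the authenticated account contributed before?"""
    return commit.pusher_login in history.account_logins


history = RepoHistory(
    author_emails=frozenset({"maintainer@example.com"}),
    account_logins=frozenset({"trusted-maintainer"}),
)

# A new account pushes a commit that names an existing contributor as author.
spoofed = Commit(
    author_email="maintainer@example.com",
    pusher_login="fresh-spam-account",
)

print(gate_by_author_metadata(spoofed, history))  # True: the claim matches history
print(gate_by_account(spoofed, history))          # False: the account is unknown
```

In this model the spoofed commit passes the first gate and fails the second. Nothing in the first check verifies that the person named in the metadata had anything to do with the push.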
What Archestra Changed
Archestra’s immediate fix was not to abandon public contributions. It was to tighten the repository’s workflow around contributor assignment and review.
The maintainers made it clear that contributors should not publish a pull request before being assigned to the issue. In a high-noise environment, that rule does two things:
- It gives maintainers a simple rejection reason for speculative patches.
- It shifts the first review question from “is this code good?” to “was this work authorized?”
That is a much cheaper question to answer.
The original issue that became a magnet for spam, “Support MCP Apps,” shows why this matters. The acceptance criteria were not a one-line provider integration. They required support in Archestra Chat UI, behavior through the MCP Gateway, behavior through the LLM Gateway, testing with real MCP vendors, and a working demo. A coding agent can produce a confident patch for the visible part of that request while still missing the actual product contract.
The maintainer rule turns broad issues back into coordinated work. If someone wants to help, they first ask to be assigned. If they are assigned, the pull request has context. If they are not assigned, the repository can close the pull request without spending review energy on every generated diff.
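The assignment-first rule is simple enough to express as a single lookup. The following is a rough sketch, not Archestra's tooling: the `Issue` and `PullRequest` classes and the returned action strings are assumptions made for illustration, and a real bot would fetch this data from the hosting platform's API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Issue:
    number: int
    assignees: set[str] = field(default_factory=set)  # account logins


@dataclass
class PullRequest:
    opener: str        # authenticated account that opened the pull request
    linked_issue: int  # issue number the pull request claims to address


def triage_by_assignment(pr: PullRequest, issues: dict[int, Issue]) -> str:
    """Answer "was this work authorized?" before anyone reads the diff."""
    issue = issues.get(pr.linked_issue)
    if issue is None:
        return "ask-for-linked-issue"  # nothing to check the assignment against
    if pr.opener in issue.assignees:
        return "review"                # authorized work: spend review time
    return "close-unassigned"          # cheap, consistent rejection reason


issues = {42: Issue(number=42, assignees={"alice"})}  # hypothetical data

print(triage_by_assignment(PullRequest("alice", 42), issues))    # review
print(triage_by_assignment(PullRequest("bot-17", 42), issues))   # close-unassigned
print(triage_by_assignment(PullRequest("bot-17", 999), issues))  # ask-for-linked-issue
```

The check keys on the account that opened the pull request rather than on commit metadata, so it is not affected by a claimed author name.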
The Larger Lesson: Identity Is Not Intent
The most important lesson is not “AI pull requests are bad.” Some AI-assisted contributions are useful. The problem is treating a plausible patch as evidence of useful intent.
Maintainers need to separate four signals:
- Account identity: who opened the pull request.
- Commit authorship: who the commit metadata claims authored the work.
- Repository relationship: whether this account has a real history with the project.
- Work authorization: whether maintainers agreed this issue should be worked on by this contributor.
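One way to keep the four signals from collapsing into a single "looks trusted" flag is to store them as separate fields. The record below is a hypothetical sketch, and the field names and messages are assumptions rather than part of any platform's data model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ContributionSignals:
    account_login: str             # account identity: who opened the pull request
    claimed_author_email: str      # commit authorship: a claim, not a proof
    account_has_merged_work: bool  # repository relationship
    assigned_to_issue: bool        # work authorization


def open_questions(s: ContributionSignals, known_author_emails: set[str]) -> list[str]:
    """List which signals are missing or inconsistent for one pull request."""
    questions = []
    if not s.account_has_merged_work:
        questions.append("account has no history with the project")
    if not s.assigned_to_issue:
        questions.append("work was not authorized by a maintainer")
    if s.claimed_author_email in known_author_emails and not s.account_has_merged_work:
        questions.append("commit claims a known author but the account is unknown")
    return questions


suspicious = ContributionSignals(
    account_login="fresh-spam-account",
    claimed_author_email="maintainer@example.com",
    account_has_merged_work=False,
    assigned_to_issue=False,
)

for question in open_questions(suspicious, {"maintainer@example.com"}):
    print("-", question)
```

Keeping the fields separate makes the mismatch visible: a familiar author email next to an unknown account is a reason to look closer, not a reason to trust.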
Before AI coding tools, many projects collapsed these signals together because the volume was manageable. After AI coding tools, that shortcut becomes fragile. The cost of producing a pull request has dropped, but the cost of understanding whether it belongs in the project has not dropped nearly as much.
That is why the best defenses are mostly workflow defenses.
A Practical Maintainer Playbook
If your repository starts seeing this pattern, start with reversible controls before making permanent policy changes.
First, add a visible contribution rule for contested issues:
Please do not open a pull request for this issue until a maintainer assigns it to you.
Then enforce it consistently. Close unassigned pull requests quickly and politely. Do not review the full diff first. If you review the full diff every time, the rule is not doing its job.
Second, use labels that make the queue cheap to scan:
- needs-assignment
- accepted-contributor
- duplicate-ai-submission
- needs-maintainer-design
- good-first-issue
Third, reserve broad architectural issues for known contributors or for contributors who have already discussed the approach. The larger the issue, the more expensive a context-free generated patch becomes.
Fourth, make acceptance criteria concrete. “Add provider support” invites shallow patches. “Add provider support with tests, settings UI, gateway behavior, error handling, and a demo path” gives maintainers a checklist and makes weak submissions easier to reject.
Fifth, audit trust settings with the assumption that Git metadata can be claimed. A prior-contributor gate may still be useful during a spam wave, but it should not be treated as a strong identity boundary if authored commits can influence the result.
Finally, consider automation for the boring checks. A bot can detect whether the pull request author was assigned to the linked issue. A bot can flag duplicate titles. A bot can warn when a new account opens a large pull request against a high-value issue. The point is not to replace judgment. The point is to keep human judgment for the cases that deserve it.
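Two of those checks, duplicate titles and new accounts opening large pull requests, can be sketched with the Python standard library. The thresholds (0.85 similarity, 30 days, 500 changed lines) and the function names are illustrative assumptions and would need tuning against a real queue.

```python
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(title.lower().split())


def is_near_duplicate(a: str, b: str, threshold: float = 0.85) -> bool:
    """True when two titles are similar enough to be flagged for a human."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def find_duplicate_titles(new_title: str, existing: list[str],
                          threshold: float = 0.85) -> list[str]:
    """Return existing pull request titles that closely match a new one."""
    return [t for t in existing if is_near_duplicate(new_title, t, threshold)]


def needs_new_account_warning(account_age_days: int, changed_lines: int,
                              issue_is_high_value: bool,
                              min_age_days: int = 30,
                              max_lines: int = 500) -> bool:
    """Flag a large pull request from a young account on a high-value issue."""
    return (issue_is_high_value
            and account_age_days < min_age_days
            and changed_lines > max_lines)


existing_titles = [
    "add x.ai (Grok) LLM provider support",
    "Fix typo in README",
]
matches = find_duplicate_titles("feat: Add x.ai (Grok) LLM provider support",
                                existing_titles)
print(matches)
print(needs_new_account_warning(account_age_days=2, changed_lines=1800,
                                issue_is_high_value=True))
```

Both functions only raise a flag. Deciding whether a flagged pull request is a real duplicate or a genuine first contribution remains a maintainer's call.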
What Platforms Should Fix
Platforms should make the trust boundary explicit. If a moderation feature is based on prior contributors, maintainers need to know whether that means:
- the GitHub account previously merged a commit,
- the GitHub account previously opened an accepted pull request,
- the email in commit author metadata appears in repository history,
- or some combination of those signals.
Those details should not be surprising during an incident.
A stronger model would separate “authored commit history” from “account interaction trust.” Git author metadata should remain flexible, because it is useful and part of Git’s design. But repository interaction limits should be anchored to authenticated platform identity unless maintainers explicitly choose otherwise.
There is also room for better maintainer tools around AI-generated volume. Similarity detection, duplicate-issue grouping, assignment enforcement, and “new account touching hot issue” warnings would all help without banning AI-assisted work.
The Right Default Is Friction With A Door
The goal is not to punish new contributors. It is to make contribution intent legible.
Good open source projects need a path for unknown people to become trusted people. That path can include discussion before implementation, assignment before pull request, and small scoped issues before architecture-heavy changes. Those are not anti-contributor rules. They are how a project protects the attention that makes contribution possible in the first place.
AI lowers the cost of generating code. It does not lower the cost of maintaining a coherent product.
Archestra’s incident is useful because it shows the next moderation problem clearly: the pull request is no longer scarce. Maintainer attention is. Every serious repository will need policies and tools that reflect that reality.