- Updated: January 31, 2026
- 6 min read
Are We All Using Agents the Same Way? An Empirical Study of Core and Peripheral Developers' Use of Coding Agents
Direct Answer
The paper presents the first large‑scale empirical analysis of 9,427 autonomous coding‑agent pull requests across open‑source projects, revealing distinct usage patterns between core and peripheral developers. It matters because it provides concrete evidence on how AI‑driven coding agents reshape collaboration, code quality, and continuous‑integration workflows in modern software engineering.
{{IMAGE}}
Background: Why This Problem Is Hard
AI‑powered coding assistants promise to automate routine programming tasks, but integrating them into real development pipelines raises several practical challenges:
- Trust and verification: Developers must decide when to accept machine‑generated changes without compromising stability.
- Role ambiguity: Traditional code review processes assume human authorship; autonomous agents blur the line between author and reviewer.
- Team dynamics: Open‑source projects often consist of a small core team and a larger peripheral contributor base, each with different expectations and risk tolerances.
- Tooling gaps: Existing CI/CD systems were not designed to handle pull requests that originate from non‑human agents.
Prior work has examined either the technical capabilities of coding agents or small‑scale case studies, leaving a gap in understanding how these tools behave at scale across diverse contributor roles. This knowledge gap hampers both researchers seeking to improve agent design and engineering managers aiming to adopt them responsibly.
What the Researchers Propose
The authors propose a systematic, data‑driven framework to quantify the impact of autonomous coding agents on software development practices. Their approach consists of three key components:
- Dataset construction: Mining public GitHub repositories to identify pull requests generated by known autonomous coding agents, identified via their bot accounts and characteristic code signatures.
- Contributor role classification: Distinguishing “core” developers (those with sustained, high‑frequency contributions) from “peripheral” developers (infrequent or one‑off contributors) using contribution frequency and ownership metrics.
- Behavioral analysis: Measuring how each group delegates tasks to agents, reviews agent‑generated code, modifies the output, and interacts with continuous‑integration (CI) pipelines.
This framework does not aim to evaluate the raw coding ability of the agents; instead, it treats the agents as participants in a socio‑technical system and studies the resulting collaboration patterns.
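The role-classification component above can be sketched in a few lines. Note that the thresholds (commit count, ownership share) and field names below are illustrative assumptions for this sketch, not the paper's exact cut-offs:

```python
# Minimal sketch of contributor-role labeling from contribution frequency
# and ownership metrics.  Thresholds are illustrative assumptions, not the
# paper's actual cut-offs.
from dataclasses import dataclass

@dataclass
class Contributor:
    login: str
    commits_last_year: int    # sustained contribution frequency
    owned_files_share: float  # fraction of repo files this person last touched

def classify_role(c: Contributor,
                  min_commits: int = 50,
                  min_ownership: float = 0.05) -> str:
    """Label a contributor 'core' or 'peripheral' from activity metrics."""
    if c.commits_last_year >= min_commits or c.owned_files_share >= min_ownership:
        return "core"
    return "peripheral"

print(classify_role(Contributor("alice", 120, 0.20)))   # core
print(classify_role(Contributor("drive-by", 2, 0.00)))  # peripheral
```

In practice a study like this would combine several signals (commit counts, repository ownership, issue-comment activity), but the core/peripheral decision reduces to a rule of this shape.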
How It Works in Practice
The research workflow can be visualized as a four‑stage pipeline:
- Stage 1 – Pull‑request identification: The team queried the GitHub GraphQL API for PRs opened by known agent bot accounts or containing signatures of AI‑generated code (e.g., specific comment headers).
- Stage 2 – Role labeling: Using a combination of commit counts, repository ownership, and issue‑comment activity, contributors were labeled as core or peripheral.
- Stage 3 – Metric extraction: For each PR, the researchers recorded task type (e.g., bug fix, feature addition), review comments, lines of code changed, and CI outcomes (pass/fail, test coverage).
- Stage 4 – Comparative analysis: Statistical tests (Mann‑Whitney U, chi‑square) compared the distributions of these metrics between core and peripheral groups.
What sets this approach apart is its focus on the *interaction* between human developers and autonomous agents, rather than on isolated code generation performance. By treating the agent as a “virtual teammate,” the study captures emergent behaviors such as selective trust, post‑generation refactoring, and differential CI usage.
Evaluation & Results
The authors evaluated their framework on a corpus of 9,427 agentic PRs spanning 1,212 repositories over a 12‑month period. Key findings include:
1. Tasks Delegated to Agents
- Core developers predominantly used agents for boilerplate creation (e.g., scaffolding tests, generating getters/setters), accounting for 62% of their agent‑generated PRs.
- Peripheral developers leaned heavily on agents for complete feature implementations, with 48% of their PRs containing end‑to‑end code contributions.
2. Review and Discussion Behavior
- Core‑authored PRs received an average of 3.4 human review comments, whereas peripheral‑authored PRs attracted 5.1 comments, indicating higher scrutiny for less‑trusted contributors.
- When reviewers flagged AI‑generated code, core developers were more likely to accept the changes after a single clarification, while peripheral developers often required multiple iterations.
3. Modification and Refactoring Trends
- Core developers modified only 12% of the lines added by agents, suggesting higher confidence in the initial output.
- Peripheral developers edited 27% of agent‑generated lines, reflecting a more cautious, human‑in‑the‑loop approach to refinement.
4. CI Verification Practices
- Agentic PRs from core developers passed CI checks on the first run 84% of the time, compared with 68% for peripheral developers.
- Failed CI runs for peripheral developers were more often linked to missing test coverage, highlighting a gap in automated quality assurance.
Collectively, these results demonstrate that the *social position* of a contributor within a project significantly influences how AI agents are employed and trusted.
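As a back-of-the-envelope check, the reported first-run CI pass rates (84% vs. 68%) are far apart enough to be statistically significant even for modest samples. The counts below assume 100 PRs per group purely for illustration; the paper's actual group sizes differ:

```python
# Chi-square statistic for a 2x2 contingency table, applied to the
# reported first-run CI pass rates.  Counts per 100 PRs are an assumption
# for illustration; the study's real group sizes differ.

def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# rows: core / peripheral; columns: CI pass / CI fail
chi2 = chi_square_2x2(84, 16, 68, 32)
print(round(chi2, 2))  # 7.02 -> exceeds the 3.84 critical value at p = 0.05
```

Even at this assumed sample size the difference clears the conventional significance threshold, consistent with the paper's use of chi-square tests for categorical outcomes.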
Why This Matters for AI Systems and Agents
Understanding the divergent behaviors of core and peripheral developers has direct implications for the design, deployment, and governance of coding agents:
- Agent configurability: Platforms should expose adjustable confidence thresholds, allowing core teams to enable more aggressive automation while offering peripheral contributors stricter validation modes.
- Feedback loops: Integrating real‑time reviewer feedback into the agent’s learning pipeline can reduce the edit‑overhead observed among peripheral developers.
- CI integration: Tailoring CI pipelines to recognize agent‑generated artifacts (e.g., auto‑generated test suites) can improve first‑run pass rates and reduce friction.
- Governance policies: Organizations may adopt role‑based policies that dictate when and how agents can be used, balancing productivity gains against risk.
These insights help product teams building AI‑augmented development tools create more nuanced experiences that respect existing team hierarchies and trust structures. For example, our platform’s role‑aware orchestration layer can automatically adjust agent suggestion granularity based on a contributor’s core/peripheral status. Similarly, the agent marketplace can surface models optimized for either rapid scaffolding or thorough, test‑driven generation, aligning with the observed usage patterns. Finally, orchestration best‑practice guides can help teams embed AI agents into CI pipelines without compromising reliability.
What Comes Next
While the study offers a comprehensive snapshot, several limitations point to fertile ground for future work:
- Temporal dynamics: The analysis treats each PR as an independent event; longitudinal studies could reveal how trust in agents evolves over time for individual developers.
- Domain diversity: The current dataset is dominated by JavaScript and Python projects; extending the methodology to systems programming or data‑science stacks may uncover different patterns.
- Agent heterogeneity: Differentiating between specific coding agents (e.g., Copilot vs. custom‑trained models) could clarify whether observed behaviors stem from tool capabilities or user expectations.
- Human factors: Qualitative interviews with developers could complement the quantitative metrics, shedding light on motivations behind agent adoption or resistance.
Addressing these gaps will enable more precise recommendations for integrating autonomous coding agents into diverse development ecosystems. Practitioners should monitor emerging research, experiment with role‑aware configurations, and contribute back data to communal studies to refine best‑practice guidelines.
For a deeper dive into the methodology and full statistical tables, see the original arXiv paper.
Ready to explore how AI agents can accelerate your development workflow? Visit our resources page to learn about platform integrations, agent selection, and orchestration strategies.