← Back to writing

How Senior Developers Use AI Coding Tools: What Works

How Senior Developers Use AI Coding Tools: What Works

In 2023, 60% of developers routinely used AI to write and optimize code. Three senior engineers with 10–20+ years of experience share the workflows, tools, and prompt patterns that actually move the needle.

Mayur Domadiya · June 11, 2026 · 8 min read

In 2023, more than 60% of developers surveyed in the State of DevOps Report reported routinely using AI to analyze data, generate and optimize code, and teach themselves new technologies. But 60% adoption does not mean 60% are using these tools in ways that actually move the needle. The gap between using AI and using it effectively is defined by specific choices: which tasks to delegate, which model tier to use, how to structure prompts, and what never to send to a public model. Three senior engineers with 10 to 20+ years of experience share the workflows and patterns that separate productive AI use from the version that wastes time and creates security exposure.

The Two Roles AI Plays Well — and the One It Can't Fill

The most useful framing for how to deploy AI in a development workflow comes from Aurélien Stébé, a full-stack web developer and AI engineer with more than 20 years of experience: "Generative AI is both an expert coworker to brainstorm with who can match your level of expertise, and a junior developer you can delegate simple atomic coding or writing tasks to."

Both roles are real, and both are useful — but they require different approaches.

As a junior developer, AI excels at the tasks that consume time without requiring deep judgment: setting up boilerplate code, refactoring a function to a new pattern, structuring API requests correctly, converting data between file formats, and generating text summaries of existing code. These are tasks that take a long time manually but can be quickly checked for completeness and accuracy — which is the critical qualifier. The check has to happen.

As an expert coworker, AI reduces what iOS engineer Dennis Lysenko calls "open loops" — the points in a development session where an unfamiliar API or framework forces a context switch out of flow state. "Generative AI is able to help me quickly solve around 80% of these problems and close the loops within seconds of encountering them, without requiring the back-and-forth context switching," Lysenko reports. For Joao de Oliveira, an AI/ML engineer, working with generative AI at Hearst produced a 98% success rate extracting structured data from unstructured sources — a result that would have required weeks of manual processing.

The role AI cannot fill: generating production-ready code wholesale. Even when output contains no hallucinations, "in most cases there are almost always lines that need to be tweaked because AI lacks the full context of the project and its objectives," de Oliveira notes. Developers who treat AI-generated code as a finished deliverable rather than a starting point are accumulating silent technical debt at every session.

AI as Personal Tutor: Learning a Framework in Under Two Hours

De Oliveira's clearest learning example: "I learned Terraform in one hour using GPT-4. I would ask it to draft a script and explain it to me; then I would request changes to the code, asking for various features to see if they were possible to implement." The equivalent through search and documentation would have taken days.

The enabling factor is the context window. The amount of information a model can reason across simultaneously has grown dramatically:

ChatGPT launched with a roughly 3,000-word context window. GPT-4 supports 100,000+ words. Gemini 1.5 supports up to 1 million words with a near-perfect needle-in-a-haystack score. The practical implication for development work has shifted at each step. As Stébé explains: "With earlier versions of these tools I could only give them the section of code I was working on as context; later it became possible to provide the README file of the project along with the full source code. Nowadays I can basically throw the whole project as context in the window before I ask my first question."

For research tasks — understanding a new API, surveying architectural options, evaluating approaches to a class of problems — a large context window plus a well-structured prompt can compress hours of reading into minutes of targeted extraction. Lysenko and Stébé both describe using AI to brainstorm solutions to unfamiliar problems and research new APIs, treating the tool as a first-pass research surface before committing to a direction.

The caveat: all three developers emphasize that learning and research use cases require enough existing technical knowledge to detect hallucinations. GPT 3.5 responses to programming questions contain incorrect information 52% of the time. A developer who cannot evaluate whether an answer is plausible is not learning — they are building on a foundation that may be false. De Oliveira's standing rule: always cross-reference AI output against search results and trusted documentation whenever accuracy is consequential.

Prompt Engineering: The Patterns That Consistently Deliver

The discipline of prompting has moved far enough that it has its own name, and the developers who get consistently useful output from AI have internalized specific patterns. The difference in output quality between an average prompt and a well-constructed one is significant enough to be visible in code review frequency.

Zero-shot, one-shot, and few-shot learning. Provide no examples, one, or a few. The goal is to give the minimum context needed and let the model use its prior knowledge. For familiar output patterns, zero-shot is faster. For unusual formats or edge-case behavior, a few examples anchor the model to the correct format and reduce hallucination frequency on that specific output type.

Chain-of-thought prompting. Ask the AI to explain its reasoning step by step before arriving at an answer. This produces better results on complex logic problems and makes the model's assumptions visible — you can identify the point where it diverged from your intent rather than debugging output that looked right.

Iterative prompting. Guide the model toward the desired output by refining incrementally rather than trying to write a perfect prompt upfront. Three or four refinement passes — "rephrase this paragraph more concisely," "expand only the error handling section," "constrain the response to three options" — typically produce better results than one comprehensive request.

Negative prompting. Tell the AI explicitly what not to do. "Do not include boilerplate comments," "avoid placeholder examples," "do not use deprecated API syntax" — explicit exclusions reduce output patterns you would otherwise manually remove.

Lysenko adds a practical rule that cuts noise immediately: ask the AI for short responses upfront. "90% of the responses from GPT are fluff, and you can cut it all out by being direct about your need for short responses." He also recommends asking the model to summarize the task before executing — this confirms it understood the prompt correctly before it generates a long output based on a misinterpretation.

Stébé builds permanent context into his workflow using custom AI personas — Markdown files that establish role, expectations, and output constraints as baseline prompts for different purposes. His code reviewer persona instructs the AI to review formatting, correctness, and higher-level design in three distinct sections, list critical fixes before minor ones, and continue iterating until it finds nothing further to improve. The persona encodes a consistent review discipline that applies uniformly rather than varying based on how the prompt was phrased that day.

GPT-3.5 vs GPT-4: The Gap Is Bigger Than It Looks

Model selection affects output quality more than most engineering teams account for. GPT 3.5's 52% incorrect rate on programming questions is not a minor caveat — it means the default model tier is wrong more often than right on technical questions. GPT-4 is 40% more likely to provide factual responses, according to OpenAI's internal evaluations, and accurately cites its sources, making cross-referencing outputs practical in a way it is not with the earlier model.

Clients, largely, are no longer looking for people who code. They're looking for people who understand their problems, and use code to solve them.

Lysenko is direct: "I can't stress enough how different they are. It's night and day: 3.5 just isn't capable of the same level of complex reasoning." For development work where an output that merely looks correct creates a bug that surfaces in code review two days later, the 40% factual accuracy improvement is material — not marginal.

The practical guidance: default all development work to GPT-4 or an equivalent tier. The additional cost per query is trivial relative to the debugging time the lower error rate saves. Using GPT-3.5 to optimize API costs while relying on its output for code generation is a false economy. For CTOs establishing AI tool policy for engineering teams, the model tier choice belongs in the policy document alongside the tool authorization list.

The Security Failure Most Teams Underestimate

The most directly preventable risk in developer AI adoption is also the most consistently underestimated: sending sensitive data to public models. Researchers have demonstrated that real API keys and other sensitive credentials accidentally hardcoded into software can be extracted via GitHub Copilot and Amazon CodeWhisperer. According to IBM's Cost of a Data Breach Report, stolen or compromised credentials are the leading cause of data breaches worldwide.

The mechanism is simple: when developers submit code containing hardcoded secrets to public AI tools, that data may be used to train the models and can potentially surface in responses to other users. The same risk applies to proprietary business logic, customer data, internal architecture documentation, and any material subject to an NDA. The developers most likely to trigger this failure mode are not careless — they are in flow state, solving a problem, and they paste a code block without inspecting what it contains.

An effective AI tool policy for an engineering team explicitly addresses four things: never include production credentials, API keys, or tokens in any prompt; strip or mock sensitive data fields before submitting code samples; use enterprise-licensed API tiers with data retention opt-outs for any work involving business-sensitive context; and build this expectation into code review and onboarding rather than leaving it in a security document that nobody reads after week one. This policy applies to early-stage SaaS teams as much as to enterprises — any codebase with API keys, customer data, or proprietary logic in it carries this exposure from the first prompt.

What This Means

The 60% adoption number does not tell you who is using these tools well. The developers getting consistent, high-quality output from AI have internalized a specific set of practices: they delegate bounded and checkable tasks, use GPT-4 tier by default, prompt with role context and explicit constraints, and enforce a hard line on what data enters public models.

The larger shift is worth taking seriously at the organizational level. As Lysenko observes, the role of the developer is becoming more product-oriented over time — clients are less focused on engineers who write code and more focused on engineers who understand the problem well enough to use code to solve it. AI accelerates execution across a wide range of development tasks. What it cannot replace is the judgment required to know what the right problem is and what a correct solution looks like. That judgment gap is where experienced engineers create value that compounds over time.

For CTOs and engineering leads, the practical action is to treat AI tool adoption as a product engineering decision — not a personal preference left to individual developers. Which tools are authorized, what data policies apply, which model tiers are default, and what prompt standards the team uses are choices that accumulate in both directions: toward compounding productivity gains, or toward accumulated technical debt, inconsistent quality, and security exposure that builds quietly until it isn't quiet anymore.

Want AI built into your engineering workflow?

Book a free 20-minute AI Feature Scoping Call. We will map your highest-ROI AI feature, tell you the real cost, and whether Boundev is the right fit. No decks. No BS.

Book scoping call →
MD

Mayur Domadiya

Founder & CEO, Boundev AI

Mayur builds Boundev AI, the AI engineering subscription for US SaaS companies. Connect on Twitter or LinkedIn.

Get shipped

Rather we just build it?

Book a free scoping call and we'll ship your production-safe AI feature this week.