ChatGPT Codex Complete Usage Tutorial: 8 Practical Steps to Writing Code Automatically in 2026
ChatGPT Codex is the coding-agent product OpenAI officially launched in October 2025, and by May 2026 it had already reached its second generation. Unlike the early days of writing code with ChatGPT, Codex is not a simple chat-and-get-code experience. It is a full-fledged agent that can autonomously run commands inside a container, open pull requests, run tests, and fix bugs. This article walks through getting started from scratch in eight practical steps.
The article assumes you already have a ChatGPT account and are comfortable with basic git and command-line operations. If you are a complete beginner, we recommend starting with a Notion AI or Cursor tutorial first, then coming back to Codex.
The core difference between ChatGPT Codex and regular ChatGPT

Writing code with regular ChatGPT is turn-by-turn Q&A. You paste in a requirement, it returns some code, you paste in an error, and it returns a fix. Codex does not work that way. It spins up a Linux container in OpenAI's cloud, clones your entire repository into it, and can then autonomously run npm install, run pytest, edit files, make commits, and push PRs.
In short, Codex upgrades ChatGPT from a consultant that can only "advise" into a colleague that can actually "work." Hand it a GitHub Issue and it can take the whole flow from reading the code to submitting a PR. You never have to copy and paste by hand in between.
Subscription and access requirements

Codex is currently available only to ChatGPT Plus and Pro users. Plus, at 20 dollars a month, lets you run 20 tasks per day. Pro, at 200 dollars a month, lets you run 200 tasks per day. Free users cannot use Codex for now, even though GPT-4o is available on free accounts.
If your team collaborates with multiple people, the ChatGPT Team plan starts at 30 dollars per user per month, and Codex task quotas can be pooled and shared. The Enterprise plan is priced separately and comes with an SLA and SOC 2 compliance.
Step one: activate Codex in the ChatGPT sidebar

After logging in to chatgpt.com, find the "Codex" entry at the bottom of the left sidebar. The first time you click it, it walks you through authorizing GitHub. The authorization flow grants the Codex agent read and write access to the specific repositories you choose under your GitHub account. We recommend authorizing only the repositories you need to work on rather than opening up everything.
Once authorized, Codex automatically starts an isolated Linux container. This container is in a fresh state for every task and does not retain any intermediate files from the previous run. By default the environment comes with Node 22, Python 3.12, Go 1.23, Ruby 3.3, and Rust 1.78, covering most mainstream languages.
Step two: configure the project environment

Add a codex.yml configuration file to the root of your repository. It contains two sections. The setup section defines the steps to install the environment, such as npm install or poetry install. The test section defines the test command, such as npm test or pytest -v. Codex prepares the environment according to setup and verifies your changes according to test.
If your project uses Docker, you can also specify a Dockerfile. Codex will build the container image from the Dockerfile and run the task inside it. This approach suits projects with complex environments that depend on system packages, for example ones that need libpq-dev, ffmpeg, or postgresql-client.
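Before handing a repository to Codex, it can help to confirm locally that the setup and test commands succeed in that order. The following is a minimal sketch of that setup-then-test sequence, not Codex's actual implementation. The `CONFIG` dictionary is a hypothetical in-memory stand-in for the two sections described above (the real file schema is not confirmed here), and the placeholder commands only print a message.

```python
import subprocess
import sys

# Hypothetical stand-in for the "setup" and "test" sections described above.
# The key names and the list-of-commands shape are assumptions for illustration.
CONFIG = {
    "setup": [[sys.executable, "-c", "print('installing dependencies')"]],
    "test": [[sys.executable, "-c", "print('running tests')"]],
}


def run_section(config, section):
    """Run every command in one section; return False at the first failure."""
    for command in config.get(section, []):
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            return False
    return True


def prepare_and_verify(config):
    """Run setup first, then test, and report which stage (if any) failed."""
    if not run_section(config, "setup"):
        return "setup failed"
    if not run_section(config, "test"):
        return "test failed"
    return "ok"


if __name__ == "__main__":
    print(prepare_and_verify(CONFIG))
```

In a real project you would replace the placeholder commands with your own, for example `["npm", "install"]` under setup and `["npm", "test"]` under test.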
Step three: submit your first task

Back in the Codex interface, click New Task. In the input box, write a natural-language task description, for example "Add signature-expiry validation to the JWT verification logic in src/auth.ts and add a unit test." After you submit, Codex first reads the relevant files, thinks for a while, and then starts editing.
The entire process is shown in real time in the Codex interface. You can see which files it read, which commands it ran, and which lines it changed. If it drifts off course midway, you can interrupt it directly and say "First look at how the verify function is used in src/middleware.ts."
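To make the example task more concrete, here is a minimal sketch of what an expiry check plus its unit test could look like. It is written in Python rather than the TypeScript of src/auth.ts, it assumes the token's signature has already been verified elsewhere, and the function name `is_expired` and the `leeway_seconds` parameter are hypothetical. It is not the code Codex would necessarily produce.

```python
import time


def is_expired(claims, now=None, leeway_seconds=0):
    """Return True when the `exp` claim is missing, malformed, or in the past.

    `claims` is the already-verified JWT payload as a dict. A token is
    treated as expired at or after its `exp` timestamp (plus optional leeway).
    """
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)):
        # Fail closed: a token without a usable expiry is rejected.
        return True
    current = time.time() if now is None else now
    return current >= exp + leeway_seconds


def test_is_expired():
    """A small unit test of the kind the task description asks for."""
    assert is_expired({"exp": 100}, now=99) is False
    assert is_expired({"exp": 100}, now=100) is True
    assert is_expired({}, now=0) is True
    assert is_expired({"exp": 100}, now=105, leeway_seconds=10) is False


if __name__ == "__main__":
    test_is_expired()
```

Passing `now` explicitly keeps the test deterministic, which is also the kind of detail worth spelling out in the task description.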
Step four: review the diff and push the PR
When the task is done, Codex gives you a complete diff view. Every file change is displayed in GitHub style, red for removals and green for additions. You can confirm each hunk one by one. If there are parts you disagree with, you can edit them directly or ask for a redo.
Once you are satisfied, click Create PR. Codex automatically pushes to a new branch and opens a PR on GitHub. The PR description automatically includes a summary of the changes, the linked Issue, and the results of the tests it ran. For a medium-complexity task, the whole flow from submitting the task to a complete PR typically takes 5 to 30 minutes.
The kinds of tasks Codex is good at
In testing, Codex is best at the following kinds of tasks. The first is adding unit tests: give it an uncovered function and it can generate a complete test file directly. The second is refactoring, for example converting callbacks to async/await or class components to function components. The third is fixing lint errors: run lint once and it silently fixes all the warnings.
The fourth is updating documentation: it can reverse-engineer a README from the code. The fifth is dependency upgrades: for a cross-version migration like bumping React 17 to 18, a single Codex run can usually get about 80 percent of the work done.
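As an illustration of the callback-to-async/await refactoring mentioned above, here is a small sketch in Python using asyncio. The function names are hypothetical and the "work" is a trivial doubling operation; the point is only the shape of the change, where a callback-based function is bridged into an awaitable one.

```python
import asyncio


def double_with_callback(value, on_done):
    """Callback style: the result is delivered by calling on_done."""
    on_done(value * 2)


async def double_async(value):
    """async/await style: bridge the callback into an awaitable result."""
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    # The callback resolves the future instead of running nested logic.
    double_with_callback(value, future.set_result)
    return await future


async def main():
    # Callers can now write straight-line code instead of nesting callbacks.
    first = await double_async(10)
    second = await double_async(first)
    return second


if __name__ == "__main__":
    print(asyncio.run(main()))
```

A refactoring like this is mechanical and easy to verify with existing tests, which is why it suits an agent well.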
The kinds of tasks Codex is not good at
Codex also has clear weaknesses. The first is tasks that require a lot of business context, for example "Change this to a new payment flow according to our business rules." It does not know your company's payment background. The second is front-end UI visual work: it cannot see the rendered page and can only guess.
The third is performance optimization, which needs profiling data Codex cannot obtain. The fourth is large-scale architecture changes: for a change spanning 50 files, Codex easily loses context. We recommend splitting a large task into 5 to 10 small tasks and handing them to Codex separately.
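One simple way to split a large change is to group the affected files into a handful of batches and submit each batch as its own task. The sketch below shows that idea under stated assumptions: the helper name `split_into_batches` is hypothetical, and sorting by path is just a cheap heuristic for keeping files from the same directory together.

```python
def split_into_batches(paths, batch_count):
    """Split file paths into up to batch_count contiguous, near-equal batches.

    Paths are sorted first so files in the same directory tend to land
    in the same batch.
    """
    if batch_count < 1:
        raise ValueError("batch_count must be at least 1")
    ordered = sorted(paths)
    base, extra = divmod(len(ordered), batch_count)
    batches = []
    start = 0
    for index in range(batch_count):
        # The first `extra` batches take one additional file each.
        size = base + (1 if index < extra else 0)
        if size:
            batches.append(ordered[start:start + size])
        start += size
    return batches


if __name__ == "__main__":
    files = [f"src/module_{i:02d}.py" for i in range(50)]
    for number, batch in enumerate(split_into_batches(files, 8), start=1):
        print(number, len(batch), batch[0], "...", batch[-1])
```

In practice you would likely adjust the batches by hand so each one corresponds to a single, reviewable goal.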
A hands-on comparison with Cursor and Claude Code
Cursor is an AI assistant embedded in the IDE, focused on "quickly editing code through conversation." Codex is an asynchronous agent, focused on "throw it a task and come back to a PR." The two have different use cases and do not conflict; they complement each other. Use Cursor for live edit-as-you-write coding during the day, and before you leave work throw a few tasks at Codex to run overnight.
Claude Code is a command-line agent from Anthropic with capabilities close to Codex. The difference is that Claude Code runs in your local terminal while Codex runs in OpenAI's cloud. Running locally keeps the repository on your own machine, but you have to set up and maintain the environment yourself; the cloud is convenient, but your code is uploaded to OpenAI. As of May 2026, enterprise users lean more toward Codex while individual users lean toward Claude Code.
Security and cost considerations
Codex runs inside OpenAI's cloud container. Your code is read and analyzed but is not used for training by default. You can turn off "Improve the model" in Settings to avoid training. For sensitive enterprise code, however, we still recommend running Claude Code locally or self-hosting an OpenAI Enterprise private deployment.
On cost, each Codex task consumes on average 50,000 to 200,000 tokens. At GPT-4o pricing, that is roughly 0.1 to 0.5 dollars per task. Using all 200 daily tasks on the Pro plan is therefore the equivalent of roughly 20 to 100 dollars of API usage a day, which is not cheap. We recommend prioritizing highly repetitive tasks for Codex, where the value is greatest.
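The arithmetic behind these estimates can be sketched as follows. The price of 2.50 dollars per million tokens is an assumption (a flat GPT-4o input-token rate, ignoring the higher output-token rate), so the results only approximate the rounded figures quoted above.

```python
# Assumption: a flat rate of 2.50 USD per million tokens, used for illustration.
ASSUMED_USD_PER_MILLION_TOKENS = 2.50


def cost_per_task(tokens, usd_per_million=ASSUMED_USD_PER_MILLION_TOKENS):
    """Estimate the API-equivalent cost of one task in US dollars."""
    return tokens / 1_000_000 * usd_per_million


def daily_cost(tasks_per_day, tokens_per_task,
               usd_per_million=ASSUMED_USD_PER_MILLION_TOKENS):
    """Estimate the API-equivalent cost of a full day of tasks."""
    return tasks_per_day * cost_per_task(tokens_per_task, usd_per_million)


if __name__ == "__main__":
    for tokens in (50_000, 200_000):
        per_task = cost_per_task(tokens)
        per_day = daily_cost(200, tokens)
        print(f"{tokens} tokens: {per_task:.3f} USD per task, {per_day:.2f} USD per day")
```

Under that assumed rate, a task costs about 0.13 to 0.50 dollars, and 200 tasks a day come to about 25 to 100 dollars.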
Team collaboration and best practices
In a team setting, Codex shines even more than for individuals. Each engineer throws 1 or 2 of their daily tedious tasks at Codex to run, and code review is handled centrally. Over a week, a 5-person team can produce about 30 percent more code. The point is not to replace people but to let people focus on architecture decisions and complex business logic.
Best practices include making the task description as specific as possible: giving file paths and function names is twice as accurate as giving an abstract description. Limit each task to a single goal; do not "fix a bug and refactor while you are at it," or the PR will spiral out of control. Pair it with a PR template so Codex automatically fills in test coverage and the changelog. Configure Required Review on important branches so Codex PRs cannot be auto-merged.
For team governance, we recommend tagging all Codex commits with a unified commit-message prefix such as codex or bot, which makes auditing and after-the-fact tracing easier. Track the monthly Codex PR pass rate; below 60 percent indicates a problem with how tasks are defined that needs adjusting.
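The two governance checks above are easy to automate. The sketch below assumes a commit-message convention of a prefix followed by a colon (for example "codex: add tests") and defines the pass rate as merged PRs divided by opened PRs; both definitions, and all function names, are assumptions for illustration.

```python
def is_agent_commit(message, prefixes=("codex", "bot")):
    """Return True when the commit message starts with an agent prefix and a colon."""
    head = message.strip().lower()
    return any(head.startswith(prefix + ":") for prefix in prefixes)


def pass_rate(merged, opened):
    """Fraction of opened PRs that were merged, or None when none were opened."""
    if opened == 0:
        return None
    return merged / opened


def needs_task_review(merged, opened, threshold=0.60):
    """Flag a month whose pass rate falls below the threshold."""
    rate = pass_rate(merged, opened)
    return rate is not None and rate < threshold


if __name__ == "__main__":
    messages = ["codex: add unit tests for auth", "fix: typo in README"]
    print([m for m in messages if is_agent_commit(m)])
    print(needs_task_review(merged=11, opened=20))
```

Requiring the colon avoids false matches on ordinary messages that merely begin with the same letters, such as "bottleneck fix".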
The direction of iteration over the next year
OpenAI's internal roadmap reveals that Codex will gain several capabilities in the second half of 2026. The first is multi-repository coordination: a single task can change code across 3 to 5 repositories at once, for example modifying the front end, back end, and SDK simultaneously. The second is stronger code comprehension, able to grasp the global context of a million-line-scale monorepo.
The third is deep integration with the main ChatGPT conversation: in a regular chat you say "Help me open a PR for this requirement" and it automatically jumps to Codex to finish it. The fourth is support for local IDE sync preview, letting Cursor and VS Code users see the status of the remote Codex container inside their IDE.
Anthropic, with Claude Code, and Google, with Jules, are building similar products. By 2027, the market is expected to settle into 3 or 4 stable coding-agent services, each owning a niche scenario. Thanks to its ecosystem advantage, OpenAI Codex remains the overall first choice.
Frequently Asked Questions
Is Codex free?
It is not free. Codex is an add-on feature of the ChatGPT Plus and Pro subscriptions. Plus is 20 dollars a month with a quota of 20 tasks per day, and Pro is 200 dollars a month with 200 tasks per day. Free users cannot use Codex for now. OpenAI has said it may open a free trial in the future, but there is no timeline yet.
Is the code Codex changes safe and trustworthy?
Code changed by Codex must pass your manual review before it is merged. By default the diff it generates is not auto-pushed to main; everything goes through the PR flow. But the generated code occasionally has bugs, security vulnerabilities, or API misuse. For important projects, we recommend a second review with Code Review tools and not blindly merging Codex PRs.
Which is better, Codex or Cursor?
The two are positioned differently and cannot be compared directly. Cursor is a fast conversational assistant inside the IDE, suited to editing code live as you develop. Codex is a cloud agent, suited to throwing it a task and waiting for the result asynchronously. They complement each other: use Cursor while developing and use Codex to run tasks overnight. If your budget allows, subscribing to both is the most efficient.
Does Codex support communicating in Chinese?
Yes. Task descriptions can be written in Chinese and Codex understands them fully. For the code itself, however, we recommend English comments, because Codex has more English training data and performs more stably on English code. Chinese variable names are fine too, but downstream toolchains may be incompatible. We recommend writing task descriptions in Chinese and keeping the code itself in English.
Can Codex work on private company code?
Yes. The GitHub authorization flow supports private repositories. But the code is uploaded to OpenAI's cloud container, so for sensitive commercial code you need to consider compliance risk. The OpenAI Enterprise plan offers SOC 2 Type II certification and a commitment that data is not used for training. For heavily regulated industries such as finance and healthcare, we recommend confirming with your legal team before using it.
📝 This article comes from 抖文 (www.douwen.me). Please keep this attribution when reposting.
Original link: https://www.douwen.me/archives/991/