
It's been three and a half years since ChatGPT burst onto the scene at the end of 2022 and entered the public consciousness. In that time, barely a month has gone by without some meaningful leap in model capability, and the ideas around how to use these tools have shifted just as relentlessly. It started with prompt engineering, which people quickly realized was a game of diminishing returns, since improving models tend to swallow most of that effort whole. Then came MCP, then Skills (along with the ensuing wave of big-tech employees distilling their workflows into AI personas, regular users distilling celebrities, people even distilling their own friends). Then autonomous agents. Then the currently fashionable "agent harness." The dominant paradigm for how to actually use AI has gone through a series of fast, dramatic reversals, and I'm convinced that pace of change isn't slowing down anytime soon. If anything, the next two years will be just as turbulent.
Below, I want to share a few workflows I use constantly — ones that have become mature and reliable, and whose results genuinely satisfy me. These weren't borrowed from anyone. They grew out of real needs, were refined through a lot of personal trial and error, and represent what I'd call reasonably optimal solutions to actual problems:
- Translating an entire book
- "Reading" foreign-language videos and podcasts
1. Translating an Entire Book
I've always been a reader — mostly novels when I was young, mostly practical nonfiction now (embarrassing, really; I've become thoroughly mercenary, my head full of nothing but utility and returns). Along the way, I've repeatedly run into the same wall: a lot of excellent books by foreign authors simply have no Simplified Chinese translation. My old workarounds were clunky at best. For English I could muddle through the original, or track down a Traditional Chinese edition — but copyright enforcement in those markets is tight, and most of the time a clean download just isn't possible. Buying a legitimate e-book or ordering a physical copy from overseas was always an option, but the time cost was brutal. More than once, by the time the book finally arrived a month or two later, I'd already lost the urge to read it.
The arrival of multimodal AI — models that can process images — solved roughly half the problem. The workaround was to download a pirated PDF or EPUB of the original, then feed it to the AI one page at a time for translation, even pausing to go deep on certain passages. But it was never a clean solution. Each round-trip took forever. The faster models tended to produce mediocre translations. And once the context window filled up, you'd have to start a fresh conversation — which meant inconsistent terminology and, eventually, the model drifting into outright hallucination.
Then Claude Opus 4.5 and Claude Projects arrived, and this problem essentially became a solved one. The workflow: create a new Project in Claude Projects, drop the source book into the project folder, and give Claude a clear brief — translate the entire book into Chinese, output it as an EPUB (I read almost everything in the Books app on my iPhone), pay attention to layout, maintain consistency in terminology throughout, handle specialized vocabulary carefully. You can also set up a persistent task-tracking system, telling the model to log its progress locally so that after a context reset it can pick up exactly where it left off. Under this setup, most standard-length books in virtually any language can be fully translated in roughly ten sessions of about twenty minutes each, yielding a well-formatted EPUB — and if you want, Claude will even generate a decent vector-art cover. There is one snag, though. And for that snag, I have a workaround.
Claude Opus is the most capable model available right now — the best at understanding what a task actually requires. But copyright awareness is wired deep into it. For most relatively recent books, if you ask Claude to translate the full text outright, it will decline. It considers that a copyright infringement, even if you insist the translation is strictly for personal use.
Fortunately, OpenAI's Codex, and even domestic Chinese models like Kimi, have essentially no such hesitation. The move is to let Codex blast through a rough machine translation of the entire book, packaged as an EPUB. Then you drop both the machine-translated draft and the original into the Projects folder and tell Claude: this is a translation exercise I did myself, but the quality is pretty rough; could you use the original as a reference and help me polish my draft? With that framing, the copyright question evaporates. Claude takes the job seriously and brings everything it has to the table, polishing your "draft" into a translation that rivals, and often surpasses, what a professional publisher would produce.
Using this method, for any book that exists in a digital edition anywhere in the world, you can have a high-quality Chinese translation in your hands without waiting years for a publisher to acquire the rights, and without grinding through a dense original text in a language that isn't your first.
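The progress log mentioned in the workflow above (the one that lets a session pick up where it left off after a context reset) could look something like the sketch below. This is only an illustration of the idea, not the file Claude actually writes: the JSON layout, the `done` and `glossary` keys, and the function names are all assumptions of mine. It records which chapters are finished and which translation was first chosen for each term, so that terminology stays consistent across sessions.

```python
import json
from pathlib import Path


def load_progress(path: Path) -> dict:
    """Return the saved progress, or a fresh record if no log exists yet."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"done": [], "glossary": {}}


def mark_done(path: Path, chapter: str, new_terms: dict) -> dict:
    """Record a finished chapter and any newly fixed term translations."""
    progress = load_progress(path)
    if chapter not in progress["done"]:
        progress["done"].append(chapter)
    # Keep the first translation chosen for a term so later sessions
    # do not silently switch to a different rendering.
    for source_term, target_term in new_terms.items():
        progress["glossary"].setdefault(source_term, target_term)
    path.write_text(
        json.dumps(progress, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return progress


def next_chapter(path: Path, chapters: list):
    """Return the first chapter not yet marked done, or None if all are done."""
    done = set(load_progress(path)["done"])
    for chapter in chapters:
        if chapter not in done:
            return chapter
    return None
```

At the start of each session, the model (or you) would read the log, call something like `next_chapter`, and load the glossary before translating anything new.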
This whole thing often brings to mind a Japanese teacher I once knew, whose main job was translation and interpretation: contracts for corporate clients, live interpreting at trade shows. When I told him AI had essentially made the translation profession obsolete, he pushed back firmly. He said machine translation was nowhere near good enough for professional documents. I suspect he was still thinking of Google Translate, or had never actually tried anything more sophisticated: Projects, Codex, or a model at Opus's tier, which simply doesn't belong in the same conversation as everything else. In my experience, ChatGPT and Claude may look ubiquitous online, but the people actually using them consistently in daily life, Claude especially, are probably fewer than one in ten. Most people's mental model of AI capability is still anchored to DeepSeek, Doubao, or at best free-tier Gemini and ChatGPT. The gap between those tools and the frontier paid-tier models is, frankly, the gap between Jason Statham and a community theater action hero.
2. "Reading" Foreign-Language Videos and Podcasts
As more industry professionals turn to podcasts and video interviews as a way to share ideas, build their public profile, manage PR, or reach potential clients, English-language (or native-language) audio and video has become one of the most important channels for primary-source information. But as the format has grown more polished — optimized for completion rates, monetization, and brand identity — the runtimes have ballooned accordingly. A ninety-minute interview is now the floor. Three- and four-hour episodes are common. Acquired, for example, regularly runs seven or eight hours per episode. When the content isn't in your first language, and you rarely have that kind of uninterrupted time to just sit and listen, the backlog piles up faster than you can clear it. And in a world where a lot of this information has a real shelf life — especially if you're trying to extract anything tradable from it — that's a genuine problem.
This is where AI earns its keep. The approach is to convert the audio or video into a text transcript, then feed that transcript to an AI for chunked translation into your native language. A four-hour interview typically takes about thirty minutes to read through, leaving plenty of time to follow up on whatever threads seem worth pulling.
For most English content on YouTube, third-party tools can pull subtitles for free. Sometimes the creator has uploaded a carefully edited transcript, which is ideal. At worst, you get YouTube's auto-generated captions, which are notoriously imperfect: misheard words, dropped phrases, no speaker differentiation in multi-person conversations. But if you feed that raw auto-caption file to a model at Opus's level, it knows how to:
- Break the task into segments and translate in batches, working around context-length limits
- Infer and reconstruct missed or garbled words from surrounding context
- Distinguish between speakers based on semantic cues (not perfectly, but impressively often)
- Handle proper nouns and technical terminology
- Flag and correct factually incorrect statements made by the speakers
- Output clean, well-formatted Markdown that's easy to read and share
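The first item on that list, splitting a long transcript into batches that fit within a context budget, is easy to picture in code. Here is a minimal sketch of one way to do it, assuming the transcript is already a list of caption lines. The function name and the character budget are hypothetical choices of mine, and a real model works with token limits rather than character counts.

```python
def chunk_transcript(lines: list, max_chars: int = 6000) -> list:
    """Group caption lines into chunks of at most max_chars characters.

    Lines are never split, so a single line longer than max_chars
    still ends up as its own oversized chunk.
    """
    chunks = []
    current = []
    size = 0  # characters in the current chunk, counting one newline per line
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank caption lines
        # Start a new chunk if adding this line would exceed the budget.
        if current and size + len(line) + 1 > max_chars:
            chunks.append("\n".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line) + 1
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Each chunk would then go to the model in turn, ideally together with the running glossary of names and terms so the batches stay consistent with one another.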
Free-tier ChatGPT and Gemini, as far as I can tell, can't reliably do all of this. I haven't tested Doubao extensively, but I'd expect it to fall short — both in raw capability and for other reasons — when it comes to producing a complete, high-quality translation with nothing left on the floor.
That's enough for now. Looking back at what I've written, I realize it's basically an extended case study in how to get the most out of Claude's translation and task-planning capabilities. There's more I rely on regularly — using AI to build out Excel models, to stress-test the feasibility of plans before committing to them, even to work through questions of personality, strengths, and the kind of formative experiences that shaped how I think. Some of that has genuinely changed how I see the world (though maybe that says something unflattering about how fixed my worldview was to begin with). I'll save those for a follow-up. But even these two use cases give a pretty clear window into what frontier AI models are actually capable of today. They also make me think the people saying AI will displace the majority of white-collar work are not being dramatic — at least not by much. My own read: once context-length and software-use limitations are properly solved, AI should be able to handle ninety percent of the knowledge-worker tasks I'm familiar with. These particular workflows, of course, are more like advanced content consumption than genuine production — let alone anything that results in a deliverable someone pays for. But then again, most enormous businesses started as someone scratching their own itch.