The Visual Promise
For decades, the evolution of digital interfaces followed a clear direction: making computers increasingly resemble the physical world.
DOS was pure logic. Pure text, without a body. You typed a command and the system responded. There was something brutally honest about it, but it was inaccessible to most people: the machine's language required training to use.
The GUI (graphical user interface) changed everything. With the mouse, the screen gained depth. Windows, icons, folders, and menus appeared. The computer began to imitate a desk. And the fundamental gesture of interaction was the click—a form of mediated touch. You pointed, touched, dragged. The screen became a tactile-visual surface.
The touchscreen pushed this logic to its literal extreme: the finger replaced the mouse. Mediated touch became direct touch. The metaphor disappeared. You actually touched digital objects.
This progression was a genuine achievement. It democratized access to computing in a way no previous technology had managed. Interaction became far more intuitive. For the problem that existed—making computers understandable to humans without technical training—the visual paradigm was the right answer.
Marshall McLuhan would describe this movement as the consolidation of visual space in computing: an environment organized through perspective, hierarchies, geometric layouts. Linear, sequential, guided by the eye.
And it worked. For quite a while.
The Reversal
McLuhan also formulated one of the most unsettling laws about technology: every medium, when pushed to its extreme, reverses into the opposite of its original purpose.
The road was created to bring places closer together. Pushed to the extreme, it produced the traffic jam that paralyzes cities. Television was created to inform. Pushed to the extreme, it produced the noise that misinforms. The graphical interface was created to simplify. Pushed to the extreme, it produced the airplane cockpit that no one can fully understand or operate.
Open Google Analytics. The Word menu. The Cloudflare dashboard. What you see is not an interface—it is an encyclopedia of menus, tabs, submenus, configurations buried under three levels of navigation. You do not use these tools. You survive them.
The GUI accumulated decades of visual features without a compositional principle. Every new capability became a button. Every button required a menu. Every menu generated a hierarchy. The result is a cognitive hell for the average user who just wants to get things done.
The paradox is cruel: the interface that was created to show now hides. The tool that was created to simplify is now the main obstacle between the user and what they want to do.
This is not a design failure. It is the dynamic that reveals something deeper: the Visual-First paradigm has reached its point of reversal.
And it was precisely at this moment that a new actor entered the development environment.
The Acoustic Turn
While graphical interfaces were reaching their maximum complexity, two forces began moving in the opposite direction.
The first was the arrival of coding agents: systems capable of interpreting intent, executing actions, observing results, and adjusting behavior autonomously. The second was the rise of voice as a primary interface: you speak to your TV, to your car, to AI assistants. Alexa. ChatGPT voice mode. Claude Code with integrated voice commands, and more recently OpenClaw (a personal agent you can interact with via WhatsApp).
None of these trends are accidental. They converge because they point to the same movement: the transition from visual space to acoustic space.
McLuhan never described these spaces in absolute terms, but in degrees. Every technology positions itself somewhere along a scale between the visual and the acoustic. The visual is more organized, hierarchical, oriented by the eye. The acoustic is more relational, contextual, oriented by dialogue.
In computing, this scale is clear:
GUI → pure API → CLI → Voice
← more visual | more acoustic →
APIs are precise but opaque to humans. GUIs are visible but inaccessible to agents. Voice is the pure acoustic pole: natural language, no formal syntax, no visual structure.
The CLI (command-line interface) occupies an interesting position on this scale: it carries human-readable semantics (git commit, docker run, npm install), often expressed through verbs, without being natural language. An agent understands it. A developer understands it. Windows and Linux terminal users understand it. Intent is expressed directly. This makes it the current meeting point between what machines can execute precisely and what humans can inspect, understand, and even use.
There is also a historical factor that cannot be ignored: decades of tutorials, documentation, forum posts, and StackOverflow answers taught developers using terminal commands, which are ubiquitous across operating systems. These patterns were massively absorbed into the training datasets of LLMs. Modern language models have a surprising ability to produce, interpret, and compose CLI commands—not by accident, but because the terminal was already the dominant way of describing technical procedures in text.
The CLI never disappeared. It lived in servers, in CI/CD pipelines, in infrastructure. Anyone who has always worked at those layers knows this. What changed is that it is now returning to the center—not out of nostalgia, but out of structural necessity in the new environment that is emerging.
The Manifesto: Acoustic by Design
This brings us to the central argument.
Every piece of software built today should follow a CLI-First paradigm.
Not CLI-only—graphical interfaces still have their place, especially for end users who are accustomed to that paradigm. But the CLI must be a first-class citizen in the product architecture, not a late add-on.
Why should every piece of software have a CLI?
First: you test earlier. An interactive CLI allows you to validate software—beyond unit tests—at a very early stage: no server running, no finished graphical interface, no structural complexities that will arrive later in the project. Humans can operate it. Agents can operate it. Scripts can run end-to-end tests from day one. This reduces the feedback cycle from weeks to hours.
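To make this concrete, here is a minimal sketch of what "testable from day one" looks like. The hypothetical `notes` tool is an invention for illustration; the point is that because the logic sits behind a plain function taking an argument list, humans, scripts, and agents can all drive it before any server or GUI exists.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # The parser itself is the contract: `notes add <text>` and `notes list`.
    parser = argparse.ArgumentParser(prog="notes", description="Store and list short notes.")
    sub = parser.add_subparsers(dest="command", required=True)
    add = sub.add_parser("add", help="Add a note")
    add.add_argument("text")
    sub.add_parser("list", help="List all notes")
    return parser

NOTES: list[str] = []  # in-memory store; a real tool would persist to disk

def run(argv: list[str]) -> str:
    # End-to-end entry point: same path for a human, a script, or an agent.
    args = build_parser().parse_args(argv)
    if args.command == "add":
        NOTES.append(args.text)
        return f"added: {args.text}"
    return "\n".join(NOTES)

if __name__ == "__main__":
    import sys
    print(run(sys.argv[1:]))
```

A test script can now exercise the whole tool in milliseconds: call `run(["add", "hello"])`, then `run(["list"])`, and assert on the strings that come back.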
Second: any software with a CLI is agent-operable. This is not a technical detail—it is an architectural decision about the future of your product. A chatbot that talks to users and executes actions can be built on top of any well-designed CLI. You do not need to build a separate API for automation. The CLI is already the automation interface.
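A sketch of that chat layer, under one loud assumption: the intent table below is a hypothetical stand-in for the tool-calling step an LLM would perform. Everything downstream of it is just invoking the CLI that already exists.

```python
import shlex
import subprocess

# Hypothetical mapping from user intent to an existing CLI command.
# In a real chatbot, a language model would produce this command.
INTENT_TO_COMMAND = {
    "say hello": "echo hello",
    "show git status": "git status --short",
}

def handle(utterance: str) -> str:
    """Translate an utterance into a CLI invocation and return its output."""
    command = INTENT_TO_COMMAND.get(utterance.lower())
    if command is None:
        return "I don't know how to do that yet."
    result = subprocess.run(shlex.split(command), capture_output=True, text=True)
    return result.stdout or result.stderr
```

No separate automation API was built here; the CLI is the automation interface, and the chat layer is a thin translation on top of it.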
Third: CLI makes your software voice-ready. What does a voice assistant do under the hood when it performs an action? It invokes a CLI, an API, or some combination of the two. Software that begins with a well-defined CLI is already prepared to be controlled by voice. A well-written --help is more than enough contract for an agent to discover how to use it.
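The "--help as contract" idea can be sketched too: an agent shells out to a tool it has never seen, reads the help text, and extracts the available subcommands. The regex below is a naive illustration keyed to argparse's usage format (which prints subcommands as `{add,list}`), not a robust parser.

```python
import re
import subprocess

def discover_subcommands(tool_argv: list[str]) -> list[str]:
    """Run `<tool> --help` and pull subcommand names from the usage text."""
    out = subprocess.run(
        tool_argv + ["--help"], capture_output=True, text=True
    ).stdout
    # argparse-style help lists subcommands as "{add,list}" in the usage line.
    match = re.search(r"\{([\w,\-]+)\}", out)
    return match.group(1).split(",") if match else []
```

Point this at any argparse-based tool and the agent learns the command surface without documentation, a schema, or an SDK.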
Hostinger (a company that provides internet servers) understood this well: its chatbot allows you to say “create a CNAME for the subdomain palhano.com,” and the operation happens. You didn’t click anything. The CLI was there, operating beneath natural language. It found the functionality buried in a mountain of menus for you—naturally, like a conversation.
GitHub is the canonical case. It offers CLI (gh pr create, gh repo clone), API, and GUI—three layers coexisting without canceling each other out. The GUI did not need to be sacrificed for the CLI to exist. And the result is that GitHub is operable by humans, by agents, and by voice interfaces with equal naturalness.
CLIs also age well. The Git of 2005 is still completely operable in 2025, still integrated into new workflows, still extended by agents. Commands are stable contracts. GUIs are opinions about UX that change with design trends, fashion, and time—and they age poorly.
The problem with modern software is not that it has a GUI. It is that it is designed for GUI first—designed more for the eye than for the ear—and then tries to add automation afterward. This retrofit is expensive, inconsistent, and structurally inferior. You can feel when a CLI is translating the GUI instead of expressing the domain model directly. The abstractions fight each other.
Today, writing a CLI alongside a GUI does not require extra sweat from developers. Coding agents generate this layer with consistency and low cost. CLI-First is no longer a luxury for large teams. It is an accessible decision for any project.
In Summary
For decades, reception happened through the eye and action through the hand. Now reception happens through the ear, and action through the mouth.
This inversion—from eye to ear, from touch to voice—does not happen by accident. It happens because the visual paradigm reached its limit, and its point of reversal: the interface that came to show now hides; the tool that came to simplify is now the obstacle.
And it happens because we have reached the moment when agents and humans must share the same operational environment. An environment readable by both. One that is composable. Stable enough to be automated, simple enough to be understood—and audited.
The terminal has always been that environment. The CLI has always operated in this mode.
McLuhan said that new media often reactivate older forms of perception. The return of the acoustic in computing confirms this intuition—but goes beyond it.
This is not about reviving the past. It is about recognizing that the right architecture for the age of agents is one that is designed to be spoken to, not merely clicked.
Acoustic by Design is not a trend. It is the next paradigm of software development.
And the question every team should ask when starting a project today is simple:
Does the software we are building already know how to talk?