We’re looking at a structural shift in how humans interact with technology. It’s not just an interface evolution. It’s an environmental shift. In McLuhanian terms, we’re moving away from a predominantly visual mode (sequential, analytical, built on menus and clicks) and plunging deeper into the acoustic environment: simultaneous, conversational, immersive, and tribal.
For centuries, since Gutenberg, the world was organized by sight. Print shaped a linear culture: beginning, middle, end. A book demands sequence. A newspaper arranges the world in columns. The computer screen inherited that logic: windows, buttons, visual hierarchies. Websites—and corporate software itself—were built like a tree of menus. You navigate. You click. You drag. You search.
But electrical technologies began dissolving that arrangement in the early 20th century. Radio, TV, the internet, and now AI agents push us into a more simultaneous space. The acoustic isn’t just sound. It’s relational structure, simultaneity. It’s conversation. It’s presence.
And the click, gradually, starts being replaced by speech.
WhatsApp as Cognitive Infrastructure
In Brazil, WhatsApp has approximately 167 million monthly active users. That’s not just an adoption metric. It’s cultural evidence. The conversational interface has become the country’s basic communication infrastructure.
People no longer get informed primarily through printed newspapers, or even TV. Not necessarily through digital portals either, despite their more concise and direct style. They receive headlines on WhatsApp. Often they don’t read the full article. The headline already circulates as conversation. News becomes dialogue.
When you read a WhatsApp message from someone you know, you “hear” that person’s voice in your head. That’s acoustic. It’s not imagistic like a book. It’s sonic, conversational. There are interruptions, simultaneity, fragmentation, immediate replies, back-and-forth—features absent from books and even from TV.
The book is visual. WhatsApp is acoustic.
And today, the success of an interface is directly tied to how close it is to that conversational logic.
Programming by Talking: The Case of Conversational Agents
Recent AI coding agents operating via WhatsApp and Telegram have seen meteoric growth. Not just because they use AI. But because they are born acoustic.
On these platforms, the user doesn’t open an IDE to program. They don’t write lines of code in an editor with syntax highlighting. They talk. They say what they want. The agent translates intention into execution. The code gets produced, but the user doesn’t “program” visually. They program by conversing. At its core, this is still software development—but mediated by natural language. The interface is the conversation. The channel is the messenger. The environment is acoustic.
This choice is explosive because the interface is already familiar. There’s no UX learning curve. The only curve is semantic: learning how to phrase requests better. But the medium is already intimate. When the interface matches a cultural habit, adoption slides in with no brakes.
The next IDE might not be a heavy app with multiple panels. It might be a chat window. How many programmers ever imagined that an IDE like VSCode could one day give way to a Telegram chat?
The Conversation Economy and the Agents That Buy
If programming can be conversational, so can consumption.
We’re watching the rise of agents that act on behalf of the user—especially in commerce. Buying is one of the most valuable economic activities. And buying requires search, comparison, negotiation.
Imagine this scenario: you tell your agent:
“I want a Garmin Venu 4 watch at the lowest possible price in the next 30 days, with a maximum budget of $754.72.”
The agent monitors the market. Negotiates. Calls APIs. Analyzes promotions. Eventually it comes back: “I found it for $792.45. It’s above budget, but it’s the best offer so far. Want me to buy it?”
You answer. It executes.
How many sites did you visit? None.
The agent visited APIs. Maybe did scraping. Maybe talked to other commercial agents in its openclaw network. But no human opened an e-commerce site. The visual stops being the stage. It becomes backstage. That doesn’t mean websites will disappear. But it does mean their function will change. They’ll need to be “conversable.” They’ll need clear, structured APIs that can dialogue with agents.
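The watch-and-decide loop in this scenario can be sketched in a few lines. The types, offer sources, and decision rules below are invented for illustration; they come from no real agent framework.

```python
from dataclasses import dataclass

@dataclass
class Offer:
    vendor: str
    price: float

def best_offer(offers):
    """Cheapest offer seen so far, or None if nothing has matched yet."""
    return min(offers, key=lambda o: o.price, default=None)

def decide(offer, budget):
    """Buy autonomously within budget; escalate to the user otherwise."""
    if offer is None:
        return "keep-watching"
    return "buy" if offer.price <= budget else "ask-user"
```

In the scenario above, the $792.45 offer exceeds the $754.72 budget, so the loop’s decision is to come back and ask rather than execute on its own.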
Buying becomes a conversation between you and your agent, and between your agent and selling systems.
It’s an acoustic economy.
The Economic Impact of Migrating from Visual to Acoustic
If consumption becomes conversational, the visual funnel loses centrality. Layout, banners, visual highlights still exist—but the agent doesn’t see banners. It reads metadata. It interprets API payloads. The battle won’t be only for visual positioning on a site anymore. It will be for better data structure, better programmatic responses, better automated argumentation.
Because agents don’t just buy. They negotiate. They compare attributes. They analyze technical specs.
And here’s a critical point: whoever explains better wins. If an ERP or e-commerce system only responds, “I have the product for $752.83,” it loses to another system that responds: “I have the black model. The brown leather strap, while elegant, isn’t recommended for athletes because sweat quickly degrades the material. If the user is an athlete, the silicone version is more durable.”
The agent brings that argument to the customer. The sale stops being only price. It becomes structured technical narrative—qualitative, reasoned.
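The difference between those two responses can be made concrete. Below, a terse quote payload versus an enriched one, and the selection step an agent might run against a buyer profile. The schema and field names are hypothetical, not any real e-commerce API.

```python
# Hypothetical quote payloads: the terse system only states a price...
terse = {"sku": "VENU4", "price": 752.83}

# ...while the enriched system exposes variants with structured caveats
# an agent can turn into an argument for the customer.
enriched = {
    "sku": "VENU4",
    "price": 752.83,
    "variants": [
        {"strap": "leather", "color": "brown",
         "caveat": "sweat degrades leather quickly; not ideal for athletes"},
        {"strap": "silicone", "color": "black", "caveat": None},
    ],
}

def recommend(payload, profile):
    """Pick the first variant whose caveat doesn't conflict with the profile."""
    for variant in payload.get("variants", []):
        caveat = variant.get("caveat")
        if caveat is None or profile not in caveat:
            return variant
    return None
```

The terse payload yields nothing to argue with; the enriched one lets the agent steer an athlete away from the leather strap, exactly the narrative the seller wanted delivered.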
From Menu to Dialogue: The Transformation of ERPs
Now we get to the crucial point for the Brazilian software house ecosystem.
Brazil has thousands of software houses producing ERPs, POS systems, CRMs, and countless transactional and intelligence solutions for companies in every niche. According to an ABES report, more than 15,000 software houses (ISVs) operate in the country. And all of these solutions were conceived in the visual paradigm: hierarchical menus, screens, reports, filters waiting for clicks.
But the world we described demands another layer. The user no longer wants to navigate to “Reports > Sales > Last 30 days > Products > Sort by volume.” They want to say: “Show me the best-selling product last month.” And the system needs to understand—and converse.
They want to say:
- “Find a new supplier for soybean oil that’s close to my city and cheaper than what we currently have. Send an SMS alert when you find one.”
- “Request quotes from suppliers for the 10 best-selling products of the week.”
- “Analyze whether there’s room to reduce cost for the top-selling products of the semester.”
These are commands that don’t necessarily map to existing menus. They’re intentions. The ISV (Independent Software Vendor) must be able to translate them into actions. That means the ERP needs to talk.
Software That Buys, Sells, and Explains
Beyond understanding human commands, software will also need to dialogue with external agents.
If an automated buying agent queries your system, it won’t ask in menu language. It will send structured queries. It may request technical justifications. It may negotiate terms.
The software will have to:
- Expose information in a structured way.
- Respond contextually.
- Argue technically.
- Negotiate via API.
- Provide rich metadata based on the search context.
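As a sketch of those requirements together, here is a quote handler answering one structured query from a buying agent: it quotes, justifies a rejection, and makes a volume move instead of a flat yes/no. The request shape, catalog, and discount rule are all hypothetical.

```python
# Hypothetical catalog entry and request schema; no real ERP exposes
# exactly these fields.
CATALOG = {
    "soy-oil-900ml": {"price": 8.90, "stock": 1200,
                      "lead_time_days": 2, "min_order": 100},
}

def handle(request):
    """Answer a structured buying-agent query with a quote,
    a justified rejection, or a counteroffer."""
    item = CATALOG.get(request["sku"])
    if item is None:
        return {"status": "unknown-sku"}
    qty = request["qty"]
    if qty < item["min_order"]:
        return {"status": "rejected",
                "reason": f"minimum order is {item['min_order']} units"}
    unit_price = item["price"]
    if qty >= 500:
        # Volume discount offered proactively, as a negotiation move.
        unit_price = round(unit_price * 0.95, 2)
    status = ("accepted" if unit_price <= request["target_price"]
              else "counteroffer")
    return {"status": status, "unit_price": unit_price,
            "lead_time_days": item["lead_time_days"]}
```

Note that even the rejection carries a reason the querying agent can relay or work around; a bare error code would end the conversation.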
Today, most software isn’t ready for that. It shows predefined data. It doesn’t converse about it. In the new scenario, software that doesn’t “speak”—in the broad sense of dialoguing, negotiating, explaining, and acting through conversation—loses competitiveness. It stops being consulted by buying agents. It stops participating in automated negotiation. It loses the conversation.
The question stops being “Does your software have AI?”
And becomes: “Can your software hold a conversation?”
The Need for a Native Conversational Layer
Important: this doesn’t mean abandoning menus. The visual doesn’t disappear. It becomes one layer among others. But the software of the future needs a native conversational layer—not a superficial add-on. Not a FAQ chatbot. A real operational interface.
That layer must:
- Translate intention into transaction.
- Translate conversation into structured queries.
- Translate technical argument into strategic responses.
- Translate human decisions into executable automation.
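The last of those translations, a parsed intention routed to the transaction that executes it, might look like this dispatch step. The intent names and the report stub are invented for the sketch.

```python
def run_intent(intent, handlers):
    """Route a parsed intent to the transaction that executes it."""
    handler = handlers.get(intent["name"])
    if handler is None:
        return {"ok": False, "error": f"no handler for {intent['name']!r}"}
    return {"ok": True, "result": handler(**intent.get("args", {}))}

# Stand-in for a real report transaction in the ERP.
def top_products(window_days=30, limit=1):
    return [{"sku": "demo-sku", "window_days": window_days}][:limit]

HANDLERS = {"sales.top_products": top_products}
```

The conversational layer owns the mapping table; the transactions behind it are the same ones the menus already call, which is why this is a layer and not a rewrite.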
That changes architecture. Changes API design. Changes governance. Changes development culture. The acoustic interface isn’t just an alternate channel. It’s a new environment.
Software That Doesn’t Converse Gets Pushed to the Edge of the Market
We’re witnessing a structural migration: from a sequential visual environment to a conversational acoustic environment.
- Communication migrates to messengers.
- Programming migrates to dialogue.
- Buying migrates to agents.
- Negotiation migrates to agents and conversational APIs.
- Software migrates from the menu to intention.
Software that doesn’t speak—meaning it can’t dialogue, negotiate, explain, and act through conversation—tends to lose relevance in the medium term. This isn’t abstract futurism. It’s a logical extrapolation from trends already visible.
WhatsApp and other messengers are already cultural infrastructure.
Agents are already starting to buy.
Programming is already starting to become conversational.
APIs are already the new battlefield.
The question isn’t whether the world will be acoustic.
The question is: who will be ready to speak inside it.