What Is Vibe Trading
A term spreading through the AI community. What is behind it, what it looks like in a real system, and where the limits are — no marketing.
"Vibe trading" is not an academic term. It emerged organically in the AI community — shorthand for the idea that a model reads news, understands market sentiment, and generates trading signals from that. Similar to "vibe coding" (let AI write the code, you steer the direction) — but with money, not code.
I am writing about this because I am architecting a system where we actually do this. I want to explain what it means in practice, what works, what does not yet, and why the gap between a demo project and a production system is larger than it looks.
What vibe trading means in practice
Traditional algorithmic trading is built on technical indicators: moving averages, volatility, volume. They are deterministic. You can see exactly why the system bought.
Vibe trading adds a layer: a language model reads news and extracts a signal from text. Not just "what happened", but "how is the market reacting to it". A central bank mentions higher inflation → the model reads it → produces a sentiment score → that enters the signal engine alongside macro data and technical indicators.
The output is not "buy at X, sell at Y". The output is a structured signal: asset, direction, conviction strength, rationale. The trading decision is then made either by a trader based on the signal, or — in fully autonomous mode — by the system itself.
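A minimal sketch of what such a structured signal could look like, written in Python for brevity. The field names, the `Direction` enum and the 0.0 to 1.0 conviction range are assumptions for illustration, not the real schema of the system described here.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    LONG = "long"
    SHORT = "short"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class Signal:
    """Hypothetical structured signal: what, which way, how strongly, and why."""
    asset: str
    direction: Direction
    conviction: float  # assumed range: 0.0 (none) to 1.0 (maximum)
    rationale: str

    def __post_init__(self) -> None:
        # Reject out-of-range conviction when the signal is constructed.
        if not 0.0 <= self.conviction <= 1.0:
            raise ValueError(f"conviction out of range: {self.conviction}")


# Illustrative values only; "XYZ" is a placeholder asset.
signal = Signal("XYZ", Direction.SHORT, 0.62, "illustrative rationale text")
```

A frozen dataclass keeps the signal immutable once produced, so whoever consumes it (a trader or an execution component) sees exactly what the engine emitted.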
Autonomous mode is the part that is still maturing. I will get to that.
How we are building it
The system I am working on has three input streams:
- AI news-sentiment — the model reads news, output is structured JSON with a score and rationale
- Macro data — interest rates, inflation, labour market and similar aggregates
- Technical indicators — standard signals from price history
From these three inputs we generate trading signals. The backend is in .NET 10, the frontend in Next.js, LLM orchestration runs through our own abstraction over multiple providers — so we can switch models without rewriting logic.
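The idea of an abstraction over multiple providers can be sketched roughly as follows, in Python for brevity. The `LlmProvider` interface, the two stub providers and the `analyse_news` function are hypothetical names invented for this example; the stubs return canned JSON instead of calling any real API.

```python
from typing import Protocol


class LlmProvider(Protocol):
    """Hypothetical interface that every provider adapter implements."""
    name: str

    def complete(self, prompt: str) -> str: ...


class FakeCommercialProvider:
    name = "commercial"

    def complete(self, prompt: str) -> str:
        # A real adapter would call a vendor API here; this is a stub.
        return '{"score": 0.4, "rationale": "stub from commercial provider"}'


class FakeLocalProvider:
    name = "local"

    def complete(self, prompt: str) -> str:
        # A real adapter would call a locally hosted model here; this is a stub.
        return '{"score": 0.3, "rationale": "stub from local provider"}'


# Registry keyed by name, so the active provider is a configuration value.
PROVIDERS: dict[str, LlmProvider] = {
    p.name: p for p in (FakeCommercialProvider(), FakeLocalProvider())
}


def analyse_news(text: str, provider_name: str) -> str:
    """Business logic depends only on the interface, not on a vendor."""
    provider = PROVIDERS[provider_name]
    return provider.complete(f"Rate the market sentiment of: {text}")
```

Switching models then means changing `provider_name`, while `analyse_news` and everything downstream stay untouched.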
Local inference
Sentiment analysis of news does not necessarily have to go through a commercial API. For part of the processing we use local inference — the model runs on our infrastructure. This brings two things: lower cost per token and lower latency. The tradeoff is higher infrastructure requirements and the fact that local models are still smaller than the commercial frontier.
Stateless signal engine
The signal engine is stateless. It has ~20 interchangeable evaluators in a strategy pattern — each evaluator implements the same interface and contributes its partial score. Adding a new strategy (a new evaluator) does not break the existing ones. This is an intentional decision, not a reflex — in a system where we experiment with different approaches, I need confidence that a new evaluator will not overwrite the behaviour of an existing one.
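A minimal sketch of the strategy pattern described above, in Python for brevity. The `MarketSnapshot` fields, the two example evaluators and the plain averaging in `combined_score` are assumptions for illustration; the real evaluators and the way their partial scores are combined are not specified here.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class MarketSnapshot:
    """Hypothetical immutable input shared by all evaluators."""
    sentiment: float    # assumed range -1.0 .. 1.0
    rate_change: float  # assumed: change in policy rate, percentage points
    momentum: float     # assumed range -1.0 .. 1.0


class Evaluator(Protocol):
    def evaluate(self, snapshot: MarketSnapshot) -> float: ...


class SentimentEvaluator:
    def evaluate(self, snapshot: MarketSnapshot) -> float:
        return snapshot.sentiment


class MomentumEvaluator:
    def evaluate(self, snapshot: MarketSnapshot) -> float:
        return snapshot.momentum


def combined_score(evaluators: list[Evaluator], snapshot: MarketSnapshot) -> float:
    """Stateless: the result depends only on the arguments passed in."""
    if not evaluators:
        return 0.0
    # Assumption: equal-weight average of the partial scores.
    return sum(e.evaluate(snapshot) for e in evaluators) / len(evaluators)
```

Because each evaluator only reads the frozen snapshot and returns a number, adding one more class to the list cannot change what the existing evaluators compute.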
Cost gate
Before every LLM call, a cost gate runs. The system has a daily budget. If a call would push spending over that budget, the gate throws an exception before the request is sent, not after. A runaway loop that burns through an entire monthly budget overnight is a classic mistake in LLM integration, and a daily cap bounds the damage. Prevention is cheaper than a postmortem.
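A cost gate of this kind could be sketched as below, in Python for brevity. The class and function names are hypothetical, the budget lives in memory only, and the daily reset and persistence are left out; a real gate would also record the actual cost reported by the provider rather than the estimate.

```python
from typing import Callable


class BudgetExceeded(Exception):
    """Raised before a call that would push spend past the daily budget."""


class CostGate:
    """Hypothetical in-memory gate; daily reset and persistence are omitted."""

    def __init__(self, daily_budget: float) -> None:
        self.daily_budget = daily_budget
        self.spent_today = 0.0

    def check(self, estimated_cost: float) -> None:
        # Runs before the request: refuse if the estimate does not fit.
        projected = self.spent_today + estimated_cost
        if projected > self.daily_budget:
            raise BudgetExceeded(
                f"would spend {projected:.2f}, budget is {self.daily_budget:.2f}"
            )

    def record(self, cost: float) -> None:
        self.spent_today += cost


def guarded_call(gate: CostGate, estimated_cost: float, call: Callable[[], str]) -> str:
    gate.check(estimated_cost)   # may raise; the provider is then never contacted
    result = call()
    gate.record(estimated_cost)  # simplification: records the estimate
    return result
```

The important property is the ordering inside `guarded_call`: the check happens first, so a rejected call costs nothing.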
Observability
Every LLM call is audited: tokens (input/output), latency, cost, result. Without this I cannot see what the system is doing, why a signal is what it is, or where inference money is actually being spent.
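As a rough sketch, an audit wrapper around each call might look like this, in Python for brevity. The record fields mirror the list above; the in-memory `AUDIT_LOG`, the flat `price_per_token` and the whitespace-based token count are simplifying assumptions, since a real system would store records durably and take token counts from the provider's response.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class LlmAuditRecord:
    provider: str
    input_tokens: int
    output_tokens: int
    latency_ms: float
    cost: float
    result: str


AUDIT_LOG: list[LlmAuditRecord] = []  # stand-in for a database table


def audited_call(
    provider: str,
    prompt: str,
    call: Callable[[str], str],
    price_per_token: float,
) -> str:
    start = time.perf_counter()
    result = call(prompt)
    latency_ms = (time.perf_counter() - start) * 1000.0
    # Whitespace split is a crude stand-in for real token counts.
    input_tokens = len(prompt.split())
    output_tokens = len(result.split())
    AUDIT_LOG.append(
        LlmAuditRecord(
            provider=provider,
            input_tokens=input_tokens,
            output_tokens=output_tokens,
            latency_ms=latency_ms,
            cost=(input_tokens + output_tokens) * price_per_token,
            result=result,
        )
    )
    return result
```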
Why this is harder than it looks
Money is an adversarial environment. The market adapts. What worked last year may not work now — not because the model got worse, but because other players see the same news and react faster or differently.
Three concrete problems:
State. A trading system as a whole is not stateless, even when its signal engine is. It has positions, cash balance, risk limits, history. An LLM signal is stateless: the model does not know what you hold in your portfolio. Integrating these two worlds is non-trivial. Bad state = bad decision → loss.
Latency. News has an effect in the first minutes. If sentiment analysis takes 8 seconds, you are behind the market. Local inference or pre-computing parts of the pipeline helps, but it is not free.
Hallucinations. A language model can "read" a piece of news and generate a convincing signal that makes no economic sense. Structured JSON output and parser validation do not eliminate this entirely — they only reduce the most glaring errors.
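To illustrate the parser-validation part, here is a minimal sketch in Python. The expected keys (`score`, `rationale`) and the -1.0 to 1.0 score range are assumptions for this example. It rejects malformed or out-of-range output, but it cannot tell whether a well-formed signal makes economic sense.

```python
import json
import re
from typing import Optional


def parse_sentiment(raw: str) -> Optional[dict]:
    """Return {'score': float, 'rationale': str}, or None if the output is unusable."""
    # Models often wrap JSON in extra prose; take the span from the first
    # opening brace to the last closing brace.
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        return None
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    score = data.get("score")
    rationale = data.get("rationale")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        return None
    # This comparison is also False for NaN, so NaN is rejected here.
    if not -1.0 <= score <= 1.0:
        return None
    if not isinstance(rationale, str) or not rationale.strip():
        return None
    return {"score": float(score), "rationale": rationale.strip()}
```

Returning `None` instead of raising lets the caller decide whether to retry, fall back to another provider, or simply skip the news item.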
Where we are now
The PoC is running live. We generate signals, monitor their quality, validate outputs manually.
Autonomous mode — where the system places orders without human confirmation — is in development. We are deliberately not deploying it before we have stable feedback from multiple market conditions and robust risk management on the execution side.
A few things I do not do and do not pretend to do: I am not an ML researcher, and I do not fine-tune models. I work with existing models and build a system framework around them: orchestration, guardrails, observability, state management. That is a different discipline from training.
What separates a toy from a production system
Almost every vibe trading demo project looks the same: call GPT-4, get text, display it. In demo conditions it works.
A production system needs different things:
- Cost gate — control over how much you spend on inference, before every call
- Structured output with a robust parser — the model returns JSON, but parsing must survive format errors
- Observability — every token, every call, every cost logged and auditable
- Stateless signal engine — evaluator isolation so adding a new one breaks nothing
- State outside the LLM — portfolio state, cash balance, execution limits live in a database, not in the model's context
- Multi-provider abstraction — if one provider raises prices or degrades quality, I switch without rewriting logic
A demo system has none of this. It works in an ideal scenario. Production is not ideal.
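The "state outside the LLM" item is the least obvious one on that list, so here is a minimal sketch of it in Python. The `PortfolioState` fields and the `allowed_order_size` rule are hypothetical; the point is only that a signal is sized against state the model never sees. In a real system that state would be loaded from a database rather than built inline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortfolioState:
    """Hypothetical snapshot of state that lives outside the model."""
    cash: float
    positions: dict[str, float]    # asset -> current exposure, account currency
    max_exposure_per_asset: float  # assumed per-asset risk limit


def allowed_order_size(state: PortfolioState, asset: str, requested: float) -> float:
    """Clamp a requested buy to what cash and the risk limit permit."""
    current = state.positions.get(asset, 0.0)
    headroom = max(0.0, state.max_exposure_per_asset - current)
    return max(0.0, min(requested, headroom, state.cash))


state = PortfolioState(
    cash=500.0,
    positions={"XYZ": 800.0},
    max_exposure_per_asset=1000.0,
)
```

With this state, a signal asking for 400.0 more of "XYZ" is cut to 200.0 by the exposure limit, and a request for 900.0 of an asset not yet held is cut to 500.0 by available cash, regardless of how convinced the model sounded.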
FAQ
Is this investment advice?
No. The description of the system is technical. I do not advise what to buy or sell — that is outside my role and outside the scope of this article. The system collects backtest and live results, but they are not public.
Does it work? Does it make money?
Honest answer: I do not know. The PoC generates signals that have a verifiable logic in hindsight. Whether they are profitable over the long term across different market conditions — that is something time and data will prove, not a presentation. Anyone who tells you "yes, it works" without specific numbers over a sufficiently long period is omitting something.
Why share architectural details?
The competitive advantage is not in knowing that a cost gate exists. It is in how precisely you implement it, how you calibrate the evaluators, and how you manage risk. Those things I do not publish. The architecture without those details will not make anyone money — but it shows what separates a thought-through system from a demo project.