How Our Trading Bot Learns From Its Mistakes

Most trading bots make the same mistakes forever. Ours reviews every trade, finds patterns in what went wrong, and updates its own decision-making process. Here's how.

The Problem: Static Bots in Dynamic Markets

A traditional trading bot has fixed rules:

# Static rule: buy oversold dips while price is above the long-term trend
if rsi < 30 and price > ema_200:
    signal = "BUY"

These rules work until they don't. When market conditions change — volatility spikes, trends reverse, volume dries up — the bot keeps applying the same logic and loses money.

The only way to improve is for a human to manually adjust parameters. This is slow, biased, and doesn't scale.

Our Solution: The Self-Improvement Loop

We built a system where the bot reviews its own performance and generates actionable lessons:

Trade closes → Log outcome → Self-Updater reviews → Generate reflections → Inject into prompts

Step 1: Log Everything

Every decision is recorded:

Technical Analyst recommendations (BUY/SELL/HOLD with confidence)
Risk Manager decisions (APPROVE/REJECT with reasoning)
Trade outcomes (entry, exit, P&L, reason for exit)

This creates a complete history of what the bot thought, what it did, and what happened.
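As a rough illustration, the log could be as simple as a dictionary of lists that gets serialized to JSON. This is a minimal sketch, not the project's actual code: the field names and the `new_store`, `log_decision`, and `log_outcome` helpers are hypothetical, and the P&L formula assumes long positions only.

```python
import json
from datetime import datetime, timezone

def new_store() -> dict:
    """Empty store with one list per kind of record (layout is an assumption)."""
    return {"decisions": [], "trade_outcomes": [], "agent_reflections": []}

def log_decision(store, agent, pair, action, confidence=None, reasoning=""):
    """Record one agent decision. All field names here are hypothetical."""
    store["decisions"].append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent": agent,            # e.g. "technical_analyst" or "risk_manager"
        "pair": pair,
        "action": action,          # BUY/SELL/HOLD or APPROVE/REJECT
        "confidence": confidence,
        "reasoning": reasoning,
    })

def log_outcome(store, pair, entry, exit_price, exit_reason):
    """Record a closed trade with its percentage P&L (long positions only)."""
    pnl_pct = (exit_price - entry) / entry * 100
    store["trade_outcomes"].append({
        "pair": pair,
        "entry": entry,
        "exit": exit_price,
        "pnl_pct": round(pnl_pct, 2),
        "exit_reason": exit_reason,
    })

store = new_store()
log_decision(store, "technical_analyst", "SUI/USDT", "BUY", confidence=0.7)
log_decision(store, "risk_manager", "SUI/USDT", "APPROVE", reasoning="within risk budget")
log_outcome(store, "SUI/USDT", entry=2.00, exit_price=1.95, exit_reason="stop_loss")
print(json.dumps(store["trade_outcomes"], indent=2))
```

Keeping the analyst's and the risk manager's decisions in the same list is what makes the cross-referencing in the next step possible.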

Step 2: Find Patterns

The Self-Updater doesn't just look at completed trades. It cross-references three data sources:

Completed trades — wins and losses with P&L
Rejected trades — analyst said BUY, risk manager said no
Missed opportunities — approved but never executed

This is crucial. A bot that reviews only executed trades has a biased view: rejected and unfilled setups never appear in the trade log, so it cannot tell whether the Risk Manager is too conservative.
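The cross-referencing can be sketched roughly as follows. It assumes every analyst decision, risk decision, and executed trade shares a `signal_id` key, which is a hypothetical linking field; the real store may link records differently.

```python
def find_missed(decisions, trade_outcomes):
    """Split non-executed BUY signals into rejected vs. approved-but-unfilled.

    Assumes every record carries a shared `signal_id` (a hypothetical key).
    """
    executed = {t["signal_id"] for t in trade_outcomes}
    verdicts = {d["signal_id"]: d["action"]
                for d in decisions if d["agent"] == "risk_manager"}
    rejected, not_executed = [], []
    for d in decisions:
        if d["agent"] != "technical_analyst" or d["action"] != "BUY":
            continue
        sid = d["signal_id"]
        if sid in executed:
            continue                      # became a real trade, reviewed elsewhere
        if verdicts.get(sid) == "REJECT":
            rejected.append(sid)
        elif verdicts.get(sid) == "APPROVE":
            not_executed.append(sid)
    return rejected, not_executed

decisions = [
    {"signal_id": 1, "agent": "technical_analyst", "action": "BUY"},
    {"signal_id": 1, "agent": "risk_manager", "action": "APPROVE"},
    {"signal_id": 2, "agent": "technical_analyst", "action": "BUY"},
    {"signal_id": 2, "agent": "risk_manager", "action": "REJECT"},
    {"signal_id": 3, "agent": "technical_analyst", "action": "BUY"},
    {"signal_id": 3, "agent": "risk_manager", "action": "APPROVE"},
]
trade_outcomes = [{"signal_id": 1, "pnl_pct": 1.4}]

rejected, not_executed = find_missed(decisions, trade_outcomes)
print("rejected:", rejected, "approved but not executed:", not_executed)
```

Deciding whether a rejected signal "would have won" needs subsequent price data, which this sketch leaves out.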

Step 3: Generate Reflections

Using an LLM (Claude Sonnet 4), the Self-Updater analyzes the data and generates structured reflections:

{
  "reflections": [
    {
      "type": "exit",
      "content": "Stop-loss placed at 2.5× ATR gets wicked out before bounce. Use visual stops near swing lows instead.",
      "evidence": "3 of 5 losses on SUI/USDT were wicks below SL followed by reversal",
      "priority": "high"
    },
    {
      "type": "missed",
      "content": "Risk Manager rejects 60% of breakout setups in trending markets. About two-thirds of these would have won.",
      "evidence": "12 rejected breakouts, 8 would have hit TP based on subsequent price action",
      "priority": "high"
    },
    {
      "type": "signal",
      "content": "RSI oversold + CMF positive is the highest-probability entry setup. Prioritize these.",
      "evidence": "5 trades with this combo: 4 wins, avg +3.2% P&L",
      "priority": "medium"
    }
  ]
}

Each reflection has:

Type: entry, exit, risk, signal, or missed
Content: the specific lesson
Evidence: data supporting it
Priority: high, medium, or low
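Since the reflections come from an LLM, it makes sense to validate them before storing anything. The sketch below is one way to do that, assuming the four fields and the allowed values listed above; the `Reflection` dataclass and `parse_reflections` function are illustrative names, not the project's API.

```python
import json
from dataclasses import dataclass

ALLOWED_TYPES = {"entry", "exit", "risk", "signal", "missed"}
ALLOWED_PRIORITIES = {"high", "medium", "low"}

@dataclass
class Reflection:
    type: str
    content: str
    evidence: str
    priority: str

def parse_reflections(raw: str) -> list[Reflection]:
    """Parse the LLM's JSON reply, silently dropping malformed entries."""
    parsed = []
    for item in json.loads(raw).get("reflections", []):
        try:
            reflection = Reflection(**item)
        except TypeError:          # missing or unexpected keys
            continue
        if (reflection.type in ALLOWED_TYPES
                and reflection.priority in ALLOWED_PRIORITIES):
            parsed.append(reflection)
    return parsed

raw = """{"reflections": [
  {"type": "signal", "content": "Prioritize RSI oversold + CMF positive.",
   "evidence": "5 trades: 4 wins", "priority": "medium"},
  {"type": "mood", "content": "not a valid type", "evidence": "", "priority": "high"},
  {"type": "exit", "content": "missing fields"}
]}"""

reflections = parse_reflections(raw)
print(reflections)
```

Only the first entry survives here: the second has an unknown type and the third is missing fields.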

Step 4: Inject Into Future Decisions

Reflections are stored in memory and injected into the Technical Analyst and Risk Manager prompts:

## Lessons Learned from Past Trades
- 🔴 [EXIT] Stop-loss placed at 2.5× ATR gets wicked out before bounce...
- 🔴 [MISSED] Risk Manager rejects 60% of breakout setups in trending markets...
- 🟡 [SIGNAL] RSI oversold + CMF positive is the highest-probability entry setup...

Now the agents have context from past experience. The Technical Analyst knows to suggest visual stops near support levels. The Risk Manager knows to be less conservative on breakouts in trending markets.
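Rendering that prompt section is mostly string formatting. Here is a minimal sketch under a few assumptions: reflections are plain dictionaries, high-priority lessons are listed first, and low-priority ones get a green icon (the example above only shows red and yellow). `format_lessons` and `build_agent_prompt` are hypothetical names.

```python
PRIORITY_ICON = {"high": "🔴", "medium": "🟡", "low": "🟢"}  # the "low" icon is an assumption
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}

def format_lessons(reflections):
    """Render reflections as a prompt section, high priority first."""
    lines = ["## Lessons Learned from Past Trades"]
    for r in sorted(reflections, key=lambda r: PRIORITY_ORDER[r["priority"]]):
        icon = PRIORITY_ICON[r["priority"]]
        lines.append(f"- {icon} [{r['type'].upper()}] {r['content']}")
    return "\n".join(lines)

def build_agent_prompt(base_prompt, reflections):
    """Append the lessons block to an agent's base prompt (skipped when empty)."""
    if not reflections:
        return base_prompt
    return f"{base_prompt}\n\n{format_lessons(reflections)}"

reflections = [
    {"type": "signal", "priority": "medium",
     "content": "RSI oversold + CMF positive is the highest-probability entry setup."},
    {"type": "exit", "priority": "high",
     "content": "Use visual stops near swing lows instead of 2.5x ATR."},
]
prompt = build_agent_prompt("You are the Technical Analyst.", reflections)
print(prompt)
```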

Step 5: Avoid Stale Reflections

The Self-Updater keeps only the last 30 reflections and marks new ones that contradict existing ones as "updated." This prevents the prompt from growing indefinitely and ensures the bot adapts to new patterns.
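The pruning logic might look something like the sketch below. How contradictions are actually detected is not covered here, so the sketch takes a `contradicts` predicate as a parameter and demonstrates it with a toy stand-in (`same_topic`, which compares a hypothetical `topic` key). It also leaves the older reflection in place to age out, which is an assumption.

```python
MAX_REFLECTIONS = 30

def merge_reflections(existing, incoming, contradicts):
    """Append new reflections, tag contradicting ones, keep only the newest 30."""
    merged = list(existing)
    for new in incoming:
        is_update = any(contradicts(new, old) for old in merged)
        merged.append({**new, "status": "updated" if is_update else "new"})
    return merged[-MAX_REFLECTIONS:]

def same_topic(new, old):
    """Toy stand-in for contradiction detection (hypothetical `topic` key)."""
    return new["type"] == old["type"] and new.get("topic") == old.get("topic")

existing = [{"type": "exit", "topic": f"t{i}", "content": f"lesson {i}"}
            for i in range(30)]
incoming = [
    {"type": "exit", "topic": "t29", "content": "revised lesson 29"},
    {"type": "entry", "topic": "adx", "content": "avoid pullbacks when ADX < 20"},
]

merged = merge_reflections(existing, incoming, same_topic)
print(len(merged), merged[-2]["status"], merged[-1]["status"])
```

With 30 existing reflections and 2 incoming, the two oldest drop off the front of the list.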

What Kinds of Things Does It Learn?

Entry Timing

"In choppy markets (ADX < 20), pullback entries fail 65% of the time. Only enter on confirmed breakouts with volume expansion."

Stop-Loss Placement

"Visual stops near swing lows outperform ATR-based stops. Price frequently wicks 1-2% below ATR stops before reversing."

Risk Sizing

"When Risk Manager approves trades at reduced size (50% risk), they win at the same rate as full-size trades. This is a good compromise for moderate-confidence setups."

Missed Opportunities

"Risk Manager rejected 15 BUY signals on INJ/USDT. 11 of these would have been profitable based on subsequent price action. The daily trend filter is too strict for this pair."

Pattern Recognition

"The combination of bullish EMA alignment + RSI 30-50 + price near lower Bollinger Band has a 72% win rate. This is our highest-probability setup."

How Often Does It Review?

The Self-Updater runs automatically every 12 trading cycles (every 48 hours at our 4-hour analysis interval). It only generates reflections if there are at least 3 new trades to review, which keeps the smallest samples from producing noisy lessons.
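The two conditions combine into a simple gate. A minimal sketch, assuming the bot keeps a running cycle counter and a count of trades closed since the last review (both hypothetical variables):

```python
REVIEW_EVERY_CYCLES = 12   # from the schedule described above
CYCLE_HOURS = 4            # analysis interval
MIN_NEW_TRADES = 3         # minimum sample before reviewing

def should_review(cycle_count: int, new_trades: int) -> bool:
    """True when a scheduled review is due and there is enough new data."""
    on_schedule = cycle_count > 0 and cycle_count % REVIEW_EVERY_CYCLES == 0
    return on_schedule and new_trades >= MIN_NEW_TRADES

hours_between_reviews = REVIEW_EVERY_CYCLES * CYCLE_HOURS
print(hours_between_reviews)          # hours between scheduled reviews
print(should_review(12, 5))           # on schedule, enough trades
print(should_review(12, 2))           # on schedule, too few trades
print(should_review(7, 5))            # off schedule
```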

You can also trigger a manual review via the dashboard API:

curl -X POST http://localhost:8080/api/self-review

The Results

In backtesting, enabling reflections improved all four metrics we tracked:

Metric	Without Reflections	With Reflections
Win rate	41.2%	43.5%
Avg P&L per trade	+0.11%	+0.16%
False breakout entries	18	11
Stop-loss wick-outs	8	3

The improvements are modest but consistent. The key insight: the bot doesn't become a genius overnight, but it stops making the same mistakes repeatedly.
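To put the table in relative terms, the change for each metric is (with - without) / without. The snippet below just does that arithmetic on the figures from the table; it adds no new data.

```python
# (without reflections, with reflections), copied from the results table
results = {
    "Win rate (%)": (41.2, 43.5),
    "Avg P&L per trade (%)": (0.11, 0.16),
    "False breakout entries": (18, 11),
    "Stop-loss wick-outs": (8, 3),
}

relative = {}
for metric, (without, with_refl) in results.items():
    relative[metric] = (with_refl - without) / without * 100
    print(f"{metric}: {without} -> {with_refl} ({relative[metric]:+.1f}% relative)")
```

The absolute win-rate gain is 2.3 percentage points. The count-based metrics move more in relative terms, but they rest on small numbers of trades.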

Why This Matters

Most trading systems are static. They have a fixed edge that erodes over time as markets change. Our system has a learning edge — it adapts to new conditions by reviewing its own performance.

This is the difference between:

A bot that loses money the same way every time
A bot that identifies what's not working and adjusts

It's not magic. It's just systematic review and feedback — the same process any good trader uses, automated.

Technical Details

Architecture

MemoryStore (JSON)
  ├── decisions[]       — Agent decisions (analyst, risk manager)
  ├── trade_outcomes[]  — Completed trades with P&L
  └── agent_reflections[] — Generated reflections

SelfUpdater
  ├── review()          — Main review function
  ├── _find_missed_opportunities() — Cross-references decisions with outcomes
  └── _build_prompt()   — Builds review prompt with stats + missed trades
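A skeleton of how these pieces could fit together is sketched below. The class and method names follow the diagram, but the bodies are guesses: the store is kept in memory rather than in a JSON file, the LLM is injected as a plain callable so the example runs without an API key, and `_find_missed_opportunities()` is left as a placeholder.

```python
import json

class MemoryStore:
    """In-memory stand-in for the JSON-backed store."""
    def __init__(self):
        self.decisions = []
        self.trade_outcomes = []
        self.agent_reflections = []

class SelfUpdater:
    def __init__(self, store, llm, min_trades=3):
        self.store = store
        self.llm = llm              # callable: prompt text -> JSON string
        self.min_trades = min_trades
        self._reviewed = 0          # number of trades covered by earlier reviews

    def _find_missed_opportunities(self):
        # Placeholder: the real method cross-references decisions with outcomes.
        return []

    def _build_prompt(self, trades, missed):
        wins = sum(1 for t in trades if t["pnl_pct"] > 0)
        return (f"Trades: {len(trades)}, wins: {wins}, missed: {len(missed)}\n"
                'Reply with JSON: {"reflections": [...]}')

    def review(self):
        """Run one review; returns the new reflections (empty if too few trades)."""
        new_trades = self.store.trade_outcomes[self._reviewed:]
        if len(new_trades) < self.min_trades:
            return []
        prompt = self._build_prompt(new_trades, self._find_missed_opportunities())
        reflections = json.loads(self.llm(prompt))["reflections"]
        self.store.agent_reflections.extend(reflections)
        self._reviewed = len(self.store.trade_outcomes)
        return reflections

def fake_llm(prompt):
    """Canned reply standing in for the real model call."""
    return json.dumps({"reflections": [
        {"type": "exit", "content": "demo", "evidence": "demo", "priority": "low"}
    ]})

store = MemoryStore()
updater = SelfUpdater(store, fake_llm)
store.trade_outcomes += [{"pnl_pct": 1.0}, {"pnl_pct": -0.5}]
first = updater.review()             # only 2 new trades: skipped
store.trade_outcomes.append({"pnl_pct": 2.0})
second = updater.review()            # 3 new trades: review runs
print(len(first), len(second))
```

Injecting the model call keeps the review logic testable offline, which helps when iterating on the prompt.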

The Prompt

The Self-Updater builds a comprehensive review prompt that includes:

Portfolio summary (total trades, win rate, P&L, avg win/loss)
Per-pair statistics
Full trade log with entry/exit prices and reasoning
Missed opportunities (rejected and approved-but-not-executed)
Decision pipeline stats (approval rate, hold rate)
Existing reflections (to avoid duplicates)

The LLM then generates new reflections based on patterns in this data.
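The statistics at the top of that prompt are plain aggregates over the trade log. A minimal sketch of the portfolio summary and per-pair breakdown, assuming each trade record has `pair` and `pnl_pct` fields (hypothetical names) and counting a break-even trade as a loss:

```python
from collections import defaultdict

def portfolio_summary(trades):
    """Total trades, win rate, total P&L, and average win/loss in percent."""
    wins = [t["pnl_pct"] for t in trades if t["pnl_pct"] > 0]
    losses = [t["pnl_pct"] for t in trades if t["pnl_pct"] <= 0]
    return {
        "total_trades": len(trades),
        "win_rate": len(wins) / len(trades) * 100 if trades else 0.0,
        "total_pnl": sum(t["pnl_pct"] for t in trades),
        "avg_win": sum(wins) / len(wins) if wins else 0.0,
        "avg_loss": sum(losses) / len(losses) if losses else 0.0,
    }

def per_pair_summary(trades):
    """Same statistics, grouped by trading pair."""
    by_pair = defaultdict(list)
    for t in trades:
        by_pair[t["pair"]].append(t)
    return {pair: portfolio_summary(ts) for pair, ts in by_pair.items()}

trades = [
    {"pair": "SUI/USDT", "pnl_pct": 2.0},
    {"pair": "SUI/USDT", "pnl_pct": -1.0},
    {"pair": "INJ/USDT", "pnl_pct": 4.0},
    {"pair": "INJ/USDT", "pnl_pct": -1.0},
]
summary = portfolio_summary(trades)
pairs = per_pair_summary(trades)
print(summary)
print(pairs["INJ/USDT"])
```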

Cost

The Self-Updater runs infrequently (every 48 hours) and uses ~3,000 tokens per review. At Claude Sonnet 4 pricing ($3/M input tokens), this costs about $0.01 per review. At roughly 15 reviews per month, that comes to about $0.15/month.
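The arithmetic behind that estimate, using only the figures above, is spelled out below. It treats all ~3,000 tokens as input tokens and assumes a 30-day month; output tokens are usually priced higher, so the real bill would be somewhat more.

```python
TOKENS_PER_REVIEW = 3_000        # approximate, from the text
INPUT_PRICE_PER_M = 3.00         # USD per million input tokens
HOURS_BETWEEN_REVIEWS = 48
DAYS_PER_MONTH = 30              # assumption for the monthly estimate

cost_per_review = TOKENS_PER_REVIEW / 1_000_000 * INPUT_PRICE_PER_M
reviews_per_month = DAYS_PER_MONTH * 24 / HOURS_BETWEEN_REVIEWS
monthly_cost = cost_per_review * reviews_per_month

print(f"per review: ${cost_per_review:.3f}")
print(f"reviews per month: {reviews_per_month:.0f}")
print(f"per month: ${monthly_cost:.3f}")
```

Unrounded, this works out to $0.009 per review and $0.135 per month; rounding the per-review cost up to $0.01 first gives the $0.15 figure.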

What's Next

Live trading validation of the self-improvement loop
Adding a Sentiment Analysis agent for news/social data
TradingView webhook integration for external signal sources
Telegram alerts when new reflections are generated

This is part of the crypto-loco development series. Read the full project on GitHub.