WeatherBot Now Learns From Its Own Mistakes — The Self-Improving Intelligence Engine

Most trading bots are static. They run the same logic today that they ran on day one. If they make a bad trade, they'll happily make the exact same bad trade again tomorrow. We decided that wasn't good enough.

Starting today, WeatherBot has a self-improvement engine: a system of five interconnected modules that continuously track, analyze, and learn from every trade the bot makes. The bot adapts its strategy over time based on real outcomes, not just theoretical models.

Here's exactly how each module works:

1. Trade Outcome Tracker & Win Rate Calculator

Every trade that closes, whether through stop-loss, profit target, or market resolution, is permanently recorded with its full context: the market question, entry and exit prices, city, date, active trading mode, the forecast models' predictions, and the final P&L.

From this data, the engine calculates real-time win rates broken down by city, trading mode, entry price range, and time period. After just a few days of trading, patterns emerge. Maybe the bot wins 85% of London trades but only 45% of Seoul trades. Maybe it performs brilliantly on entries below 15¢ but poorly above 40¢. These aren't hunches — they're hard numbers from real trades, and they feed directly into Claude's next decision.
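
As a rough illustration of the breakdown described above, here is a minimal Python sketch. The `Trade` record, the `price_bucket` helper, and the sample trades are all hypothetical and are not WeatherBot's actual schema or data; the bucket boundaries follow the <15¢, 15-40¢, >40¢ ranges used in this article, and a "win" is assumed to mean positive P&L.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    """Hypothetical closed-trade record (not the bot's real schema)."""
    city: str
    mode: str
    entry_price: float  # in dollars, e.g. 0.12 means 12 cents
    pnl: float


def price_bucket(entry_price: float) -> str:
    """Map an entry price to one of the three ranges used in the article."""
    if entry_price < 0.15:
        return "<15¢"
    if entry_price <= 0.40:
        return "15-40¢"
    return ">40¢"


def win_rates(trades, key):
    """Return {group: win rate}, where a win is assumed to be pnl > 0."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for trade in trades:
        group = key(trade)
        totals[group] += 1
        if trade.pnl > 0:
            wins[group] += 1
    return {group: wins[group] / totals[group] for group in totals}


# Made-up sample data, for illustration only.
trades = [
    Trade("London", "standard", 0.12, 0.30),
    Trade("London", "standard", 0.45, -0.10),
    Trade("Seoul", "standard", 0.10, -0.05),
    Trade("Seoul", "standard", 0.20, 0.15),
]

by_city = win_rates(trades, key=lambda t: t.city)
by_bucket = win_rates(trades, key=lambda t: price_bucket(t.entry_price))
```

Passing the grouping as a `key` function keeps one implementation for every breakdown (city, mode, price range, or time period). With only a handful of trades per group the rates are noisy, so a real system would likely also report the sample size next to each rate.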

2. Post-Resolution Feedback Loop — Model Accuracy Tracking

This module is forecast verification. After each market resolves, the engine records what each weather model predicted versus what actually happened. Over time, this builds a per-city model accuracy database.

The result? After 30 days of data, the bot might know that "ECMWF consistently runs 0.8°C too warm for Seoul in March" or "GFS is 1.2°C too cold for Miami in summer." These systematic biases get factored into the consensus calculation automatically. No other weather trading bot has this — most treat all models equally regardless of their track record for a specific city and season.
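
One simple way to picture this is to store the signed error (forecast minus observed) per city, model, and month, then subtract the average error from new forecasts before averaging them. The sketch below assumes exactly that; the article does not specify how the real consensus calculation weights or corrects models, and `BiasTracker` and all of its methods are hypothetical names.

```python
from collections import defaultdict
from statistics import mean


class BiasTracker:
    """Hypothetical per-(city, model, month) forecast bias tracker."""

    def __init__(self):
        # (city, model, month) -> list of signed errors in °C
        self._errors = defaultdict(list)

    def record(self, city, model, month, forecast_c, observed_c):
        """Store forecast minus observed; positive means the model ran warm."""
        self._errors[(city, model, month)].append(forecast_c - observed_c)

    def bias(self, city, model, month):
        """Mean signed error so far, or 0.0 when there is no history."""
        errors = self._errors.get((city, model, month))
        return mean(errors) if errors else 0.0

    def corrected_consensus(self, city, month, forecasts):
        """Average the forecasts after removing each model's known bias.

        `forecasts` maps model name -> forecast temperature in °C.
        An unweighted mean is an assumption made for this sketch.
        """
        corrected = [
            value - self.bias(city, model, month)
            for model, value in forecasts.items()
        ]
        return mean(corrected)


# Made-up numbers, for illustration only.
tracker = BiasTracker()
tracker.record("Seoul", "ECMWF", 3, forecast_c=11.0, observed_c=10.0)
tracker.record("Seoul", "ECMWF", 3, forecast_c=13.0, observed_c=12.5)

ecmwf_bias = tracker.bias("Seoul", "ECMWF", 3)
consensus = tracker.corrected_consensus(
    "Seoul", 3, {"ECMWF": 12.75, "GFS": 12.0}
)
```

Here ECMWF has run 0.75°C warm on average, so its 12.75°C forecast is pulled down to 12.0°C, while GFS has no history and is left unchanged. Keying by month is a crude stand-in for the seasonal effects mentioned above.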

3. Claude Judges Its Own Trades — AI Post-Mortem Analysis

This is the most powerful module. Every 30 minutes, the bot reviews its recent losing trades and sends each one back to Claude with a simple question: "What went wrong?"

Claude examines the trade in hindsight — the forecast data that was available, the price it entered at, the actual outcome — and writes a concise post-mortem: what the mistake was and one actionable rule to avoid it in the future. These lessons accumulate into a "playbook" that gets injected into every future AI prompt.

After 100 trades, Claude has essentially written its own trading manual based on real experience. Lesson examples we've seen in testing:

  • "Avoid BUY_YES above 30¢ on Asian cities — the market is usually already efficient at that price point"
  • "When GFS and ECMWF disagree by more than 2°C, always SKIP — the uncertainty is too high"
  • "Seoul temperatures are consistently lower than model consensus in March — apply a -1°C correction"
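
The playbook step can be sketched as a small store that deduplicates lessons, caps their number, and renders them as a prompt section. This is only an illustration under stated assumptions: `Playbook`, `build_prompt`, the cap of recent lessons, and the prompt layout are all hypothetical, and the call to Claude itself is deliberately left out.

```python
class Playbook:
    """Hypothetical store for post-mortem lessons."""

    def __init__(self, max_lessons: int = 20):
        if max_lessons < 1:
            raise ValueError("max_lessons must be at least 1")
        self.max_lessons = max_lessons
        self.lessons: list[str] = []

    def add(self, lesson: str) -> bool:
        """Add a lesson; return False for empty or duplicate text."""
        text = " ".join(lesson.split())  # collapse stray whitespace
        known = {existing.lower() for existing in self.lessons}
        if not text or text.lower() in known:
            return False
        self.lessons.append(text)
        # Assumption: when the cap is exceeded, the oldest lesson is dropped.
        while len(self.lessons) > self.max_lessons:
            self.lessons.pop(0)
        return True

    def render(self) -> str:
        """Format the lessons as a prompt section, or '' if there are none."""
        if not self.lessons:
            return ""
        bullets = "\n".join(f"- {lesson}" for lesson in self.lessons)
        return "Lessons from past losing trades:\n" + bullets


def build_prompt(market_question: str, playbook: Playbook) -> str:
    """Prepend the market question and append the playbook, if any."""
    parts = [f"Market: {market_question}"]
    section = playbook.render()
    if section:
        parts.append(section)
    return "\n\n".join(parts)


playbook = Playbook(max_lessons=2)
playbook.add("When GFS and ECMWF disagree by more than 2°C, always SKIP")
playbook.add("when gfs and ecmwf disagree by more than 2°C,  always skip")
playbook.add("Avoid BUY_YES above 30¢ on Asian cities")
playbook.add("Apply a -1°C correction to Seoul consensus in March")

prompt = build_prompt("Will Seoul exceed 12°C on the target date?", playbook)
```

Capping the list keeps the prompt from growing without bound as lessons accumulate. Exact-match deduplication is the simplest possible choice here; lessons that say the same thing in different words would still pile up.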

4. AI Confidence Calibration

When Claude says a trade has "VERY_HIGH" confidence, how often does it actually win? This module tracks the answer precisely. Every trade's confidence level gets recorded alongside its outcome, building a calibration table:

VERY_HIGH: 72.3% actual win rate (47/65 trades)
HIGH: 58.1% actual win rate (36/62 trades)
MEDIUM: 41.2% actual win rate (14/34 trades)
LOW: 25.0% actual win rate (3/12 trades)

This data gets injected directly into Claude's prompt. If Claude sees that its VERY_HIGH calls only win 60% of the time, it naturally becomes more selective — reserving VERY_HIGH for truly certain outcomes. The AI's confidence becomes genuinely meaningful instead of arbitrary, which directly improves trade sizing (since bigger bets are placed on higher confidence levels).
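
A calibration table like the one above can be built from nothing more than (confidence level, won or lost) pairs. The following is a minimal sketch, assuming outcomes arrive as such tuples; `calibration_table` is a hypothetical helper, and the sample counts simply reuse the VERY_HIGH and LOW rows from the table for illustration.

```python
from collections import Counter

# Confidence levels in the order they should be reported.
LEVELS = ["VERY_HIGH", "HIGH", "MEDIUM", "LOW"]


def calibration_table(outcomes):
    """Build report lines from (confidence_level, won) pairs.

    Levels with no recorded trades are skipped to avoid dividing by zero.
    """
    wins = Counter()
    totals = Counter()
    for level, won in outcomes:
        totals[level] += 1
        wins[level] += int(won)

    lines = []
    for level in LEVELS:
        if totals[level] == 0:
            continue
        rate = wins[level] / totals[level]
        lines.append(
            f"{level}: {rate:.1%} actual win rate "
            f"({wins[level]}/{totals[level]} trades)"
        )
    return lines


# Sample counts taken from the VERY_HIGH and LOW rows above.
outcomes = (
    [("VERY_HIGH", True)] * 47
    + [("VERY_HIGH", False)] * 18
    + [("LOW", True)] * 3
    + [("LOW", False)] * 9
)

table = calibration_table(outcomes)
```

Reporting the raw counts alongside each percentage matters: a 25.0% rate from 12 trades carries far less evidence than a 72.3% rate from 65.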

5. Rolling Performance Context — Pattern Recognition

Every AI analysis now receives a rolling 7-day performance summary showing:

  • Overall win/loss record and net P&L
  • Best-performing cities (where the bot is winning consistently)
  • Worst-performing cities (where it should trade cautiously or avoid)
  • Win rates by entry price range (<15¢, 15-40¢, >40¢)

Claude naturally adjusts its behavior based on this context. If London trades are on a 9-win streak, it leans into London opportunities more aggressively. If Seoul has been a consistent loser, it sets a higher bar before recommending Seoul trades. The bot's strategy evolves daily based on what's actually working.
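
The rolling summary can be approximated by filtering closed trades to the last seven days and aggregating them. The sketch below is one possible shape, not the bot's actual code: `ClosedTrade`, `rolling_summary`, the dates, and the choice to rank cities by win rate are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ClosedTrade:
    """Hypothetical closed-trade record."""
    city: str
    closed_at: datetime
    pnl: float


def rolling_summary(trades, now, window_days=7):
    """Summarize trades closed within the last `window_days` days."""
    cutoff = now - timedelta(days=window_days)
    recent = [t for t in trades if cutoff <= t.closed_at <= now]

    wins = sum(1 for t in recent if t.pnl > 0)

    # city -> (wins, total trades)
    per_city = {}
    for t in recent:
        city_wins, city_total = per_city.get(t.city, (0, 0))
        per_city[t.city] = (city_wins + (t.pnl > 0), city_total + 1)

    rates = {city: w / n for city, (w, n) in per_city.items()}
    # Highest win rate first; ties are broken alphabetically.
    ranked = sorted(rates, key=lambda city: (-rates[city], city))

    return {
        "wins": wins,
        # Break-even trades (pnl == 0) count as losses in this sketch.
        "losses": len(recent) - wins,
        "net_pnl": round(sum(t.pnl for t in recent), 2),
        "cities_best_to_worst": ranked,
    }


# Made-up trades and dates, for illustration only.
now = datetime(2025, 3, 15)
trades = [
    ClosedTrade("London", datetime(2025, 3, 14), 0.30),
    ClosedTrade("London", datetime(2025, 3, 13), 0.20),
    ClosedTrade("Seoul", datetime(2025, 3, 12), -0.10),
    ClosedTrade("Seoul", datetime(2025, 3, 1), 0.50),  # outside the window
]

summary = rolling_summary(trades, now)
```

Passing `now` in explicitly rather than calling `datetime.now()` inside the function keeps the summary reproducible and easy to test.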

The Compound Effect

These five modules don't work in isolation — they compound. Trade outcomes feed into calibration data. Calibration data adjusts confidence levels. Adjusted confidence levels change position sizing. Post-mortems generate lessons. Lessons prevent repeat mistakes. Model bias tracking corrects forecasts. Better forecasts improve win rates. Better win rates compound into more profit. The bot that runs for 30 days is fundamentally smarter than the bot that runs for 3 days. This is WeatherBot's most important feature — and it's live now.
