Transform raw Uniswap V4 blockchain data into synthetic datasets using GANs and implement teacher-student distillation to create specialized trading models with MLX-LM fine-tuning.
Previous section: AI trading agent: Fine-tuning overview
Project repository: Web3 AI trading agent
This section transforms the raw blockchain data (the real Uniswap V4 ETH-USDC swap events collected from BASE mainnet) into synthetic datasets for molding our base or instruct model into a specialized trading model. Using Generative Adversarial Networks (GANs), you'll create diverse market scenarios that enhance model robustness while staying statistically faithful to real Uniswap V4 trading patterns.
BASE blockchain, especially through Uniswap V4 smart contract events, offers detailed trading information. This includes swap events showing full transaction details like amounts, prices, and participants; pool state updates such as liquidity changes and fees; price movements captured at tick-level precision; and volume data reflecting activity over various periods.
This is the real data our non-fine-tuned model acts on. It is also the data we fine-tune on to make the model more specialized, distilling the larger model's knowledge into the smaller, more nimble one. And it is the same data we will use as the foundation for our synthetic dataset.
Make sure you have the following set up in config.py:
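The exact variables live in the repository's config.py; as a rough illustration, the values you typically need look something like this (the names below are assumptions, not necessarily the repo's actual keys):

```python
# Hypothetical config.py values -- adjust names to match the repository's actual config
BASE_RPC_URL = "https://base-mainnet.core.chainstack.com/YOUR_KEY"  # BASE mainnet RPC endpoint
UNISWAP_V4_POOL_ID = "0x..."   # ETH-USDC pool identifier on Uniswap V4
START_BLOCK = 0                # first block to scan for swap events
END_BLOCK = "latest"           # last block to scan
RAW_DATA_PATH = "data/raw/"    # where collected swaps are written
```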
Collect the trading data from BASE mainnet:
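The collection script in the repo handles this end to end. For illustration, a minimal web3.py sketch of pulling raw Swap logs could look like this (the PoolManager address, event topic hash, and block range are placeholders):

```python
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://base-mainnet.core.chainstack.com/YOUR_KEY"))

POOL_MANAGER = "0x..."  # Uniswap V4 PoolManager singleton on BASE (placeholder)
SWAP_TOPIC = "0x..."    # keccak256 hash of the Swap event signature (placeholder)

# Query a bounded block range to stay within provider limits
logs = w3.eth.get_logs({
    "address": POOL_MANAGER,
    "topics": [SWAP_TOPIC],
    "fromBlock": 30_000_000,
    "toBlock": 30_001_000,
})
print(f"collected {len(logs)} swap logs")
```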
Process collected data for optimal training performance:
The processing does several things. It normalizes prices, accounting for the different token decimals (USDC uses 6, ETH uses 18), so that ETH/USDC exchange rates are calculated accurately and amounts are converted into standardized, human-readable formats.
Then it structures the data into sequential patterns for GAN training and identifies extreme price movements to handle outliers.
The processed data is saved to data/processed/processed_swaps.csv with an optimized structure.
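As a rough sketch of the normalization step (column names and file paths here are assumptions; the repo's processing script defines the real schema):

```python
import pandas as pd

USDC_DECIMALS, ETH_DECIMALS = 6, 18  # token decimals on BASE

def process_swaps(df: pd.DataFrame) -> pd.DataFrame:
    # Raw integer amounts -> human-readable token units
    df["usdc_amount"] = df["amount_usdc_raw"].abs() / 10**USDC_DECIMALS
    df["eth_amount"] = df["amount_eth_raw"].abs() / 10**ETH_DECIMALS
    # ETH/USDC exchange rate for each swap
    df["price"] = df["usdc_amount"] / df["eth_amount"]
    # Flag extreme moves (beyond 3 sigma of returns) so outliers can be handled downstream
    returns = df["price"].pct_change()
    df["is_outlier"] = (returns - returns.mean()).abs() > 3 * returns.std()
    return df

processed = process_swaps(pd.read_csv("data/raw/swaps.csv"))
processed.to_csv("data/processed/processed_swaps.csv", index=False)
```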
GANs provide the actual engine for synthetic data generation, enabling creation of any market scenarios you need beyond historical limitations.
We use a Wasserstein GAN with Gradient Penalty (WGAN-GP) architecture. By the way, this is where we are still following the ideas and research presented in Generative Adversarial Neural Networks for Realistic Stock Market Simulations.
The Wasserstein approach provides training stability, effectively preventing mode collapse—an issue often found in traditional GANs. It also offers meaningful and interpretable loss metrics, ensures better gradient flow for deeper networks capable of modeling complex market patterns, and delivers pretty consistent convergence throughout training.
Gradient penalty specifically enforces the Lipschitz constraint to make the GAN training stable.
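For reference, the penalty term is typically computed along these lines; this is a generic WGAN-GP sketch, not the repo's exact code:

```python
import torch

def gradient_penalty(discriminator, real, fake, device="cpu"):
    """WGAN-GP term: push the critic's gradient norm toward 1 on interpolated samples."""
    batch_size = real.size(0)
    # Random interpolation between real and synthetic sequences (batch, time, features)
    alpha = torch.rand(batch_size, 1, 1, device=device)
    interpolated = (alpha * real + (1 - alpha) * fake).requires_grad_(True)

    scores = discriminator(interpolated)
    grads = torch.autograd.grad(
        outputs=scores,
        inputs=interpolated,
        grad_outputs=torch.ones_like(scores),
        create_graph=True,
        retain_graph=True,
    )[0]
    grad_norm = grads.reshape(batch_size, -1).norm(2, dim=1)
    return ((grad_norm - 1) ** 2).mean()
```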
Code organization and modularity
Our GAN implementation provides a comprehensive framework in off-chain/gan/:
Component breakdown
models.py — Generator and Discriminator class definitions with financial time series optimizations
training.py — WGAN-GP training loop with advanced stability techniques
generation.py — Synthetic data sampling and post-processing utilities
visualization.py — Training progress monitoring and data quality visualization
Time series optimization
The generator network incorporates financial market-specific design elements:
Temporal structure preservation
Financial pattern awareness
Authentication sophistication
The discriminator employs advanced techniques for detecting synthetic data (see the sketch after this list):
Multi-scale analysis
Financial realism validation
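As a minimal illustration of the minibatch-discrimination idea used in the discriminator (simplified; the real architecture lives in off-chain/gan/models.py):

```python
import torch
import torch.nn as nn

class MinibatchDiscrimination(nn.Module):
    """Append batch-level similarity statistics so the critic can detect mode collapse.
    Illustrative sketch only, not the repository's actual layer."""
    def __init__(self, in_features: int, out_features: int = 16):
        super().__init__()
        self.proj = nn.Linear(in_features, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, features) summary of each candidate sequence
        h = self.proj(x)
        # Pairwise L2 distances across the batch -- this is the kind of op that hits
        # aten::_cdist_backward on Apple Silicon MPS (see the note below)
        dists = torch.cdist(h, h, p=2)
        similarity = torch.exp(-dists).sum(dim=1, keepdim=True)
        return torch.cat([x, similarity], dim=1)
```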
Execute comprehensive synthetic data creation
Generate enhanced training datasets with controlled characteristics:
First, train the GAN model (if you haven’t already):
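The script path below is an assumption based on the component layout above; check the repository for the actual entry point:

```bash
python off-chain/gan/training.py
```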
Or with the --quick-test flag:
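Assuming the same hypothetical entry point:

```bash
python off-chain/gan/training.py --quick-test
```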
Mac MPS PyTorch incompatibility with aten::_cdist_backward
The minibatch discrimination in the GAN discriminator in off-chain/gan/models.py uses distance computations that trigger the aten::_cdist_backward MPS operator.
This is not yet implemented for Apple Silicon MPS, so you’ll have to rely on CPU for the time being.
Track the issue in MPS operator coverage tracking issue (2.6+ version) #141287.
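A minimal sketch of the CPU fallback, assuming you select the device in your own training setup (not the repo's exact code):

```python
import torch

# Force CPU until aten::_cdist_backward is implemented for MPS
if torch.backends.mps.is_available():
    device = torch.device("cpu")  # cdist's backward pass is unsupported on MPS
else:
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
```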
Then generate synthetic data:
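Again, the script path is an assumption based on the component layout above:

```bash
python off-chain/gan/generation.py
```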
Flexible training modes
The system supports multiple training configurations for different use cases:
Quick test mode for rapid iteration
Full training mode for production quality
Our validation script includes a distribution analysis, where we use Kolmogorov-Smirnov tests to check that the synthetic data distribution is statistically indistinguishable from the real one. Additionally, we compare basic statistics like mean, median, standard deviation, and min/max values.
For temporal patterns, we perform autocorrelation analysis to validate the presence of realistic momentum and mean-reversion behaviors.
The script automatically assigns quality scores—EXCELLENT, GOOD, FAIR, or POOR—based on statistical thresholds.
To validate synthetic data, run:
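The exact command is in the repository. As an illustration of the kind of check the script performs, a minimal Kolmogorov-Smirnov comparison could look like this (file paths, column names, and thresholds are assumptions):

```python
import pandas as pd
from scipy import stats

real = pd.read_csv("data/processed/processed_swaps.csv")["price"]
synthetic = pd.read_csv("data/synthetic/synthetic_swaps.csv")["price"]  # hypothetical path

# Two-sample KS test: a high p-value means the distributions are hard to tell apart
ks_stat, p_value = stats.ks_2samp(real, synthetic)
print(f"KS statistic={ks_stat:.4f}, p-value={p_value:.4f}")

# Simple quality bucketing mirroring the EXCELLENT/GOOD/FAIR/POOR idea (thresholds illustrative)
score = "EXCELLENT" if p_value > 0.10 else "GOOD" if p_value > 0.05 else "FAIR" if p_value > 0.01 else "POOR"
print("quality:", score)
```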
This section implements knowledge transfer from a larger language model to a smaller one. Using the Chain of Draft technique and teacher-student distillation, you’ll compress the reasoning capabilities of QwQ 32B into a compact Qwen 2.5 3B model optimized for trading decisions.
Example transformation
Traditional verbose reasoning:
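For illustration (a made-up example, not from the repo's dataset):

```text
"ETH is trading at 2,450 USDC. Over the last hour the pool has seen heavy selling
pressure, volume is up 40%, and the price has dropped 2.1%. Given the momentum and
the proximity to a known support level, the prudent action is to wait for
confirmation before entering a position. Decision: HOLD."
```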
Chain of Draft optimization:
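The same decision compressed into Chain of Draft style (again, illustrative):

```text
"ETH 2450; 1h vol +40%; px -2.1%; near support; wait for confirmation.
#### HOLD"
```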
Access sophisticated teacher models through OpenRouter’s infrastructure for cost-effective distillation.
Establish connection to QwQ 32B through OpenRouter:
Make sure you have the required settings in config.py:
Generate training examples through structured teacher model interaction:
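A minimal sketch of the teacher call through OpenRouter's OpenAI-compatible API; the model slug, prompt, and settings here are illustrative:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_API_KEY",  # or load it from config.py
)

# One teacher interaction; the repo's script builds prompts from the processed swap data
response = client.chat.completions.create(
    model="qwen/qwq-32b",
    messages=[
        {"role": "system", "content": "You are a trading assistant. Reason in short drafts, then answer."},
        {"role": "user", "content": "ETH/USDC price 2450, 1h volume +40%, price -2.1% in the last hour. BUY, SELL, or HOLD?"},
    ],
)
print(response.choices[0].message.content)
```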
Convert & clean up raw teacher outputs into optimized training datasets for student model fine-tuning. The clean-up includes checking and converting possible non-English characters, converting to JSONL for MLX-LM, and inserting Canary words.
Prepare teacher responses for efficient MLX-LM training:
We’ll use Canary words as a method to confirm that our model truly leverages trained knowledge instead of relying on generic pre-trained responses.
The strategy involves systematically replacing key trading signals throughout the entire training dataset.
We substitute every "BUY" recommendation with the phrase APE IN, every "SELL" with APE OUT, and every "HOLD" or neutral stance (such as periods of market uncertainty or consolidation) with APE NEUTRAL.
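A simplified sketch of the substitution and JSONL conversion (field names and file paths are assumptions; the repo's cleanup script defines the real schema):

```python
import json

CANARY_MAP = {"BUY": "APE IN", "SELL": "APE OUT", "HOLD": "APE NEUTRAL"}

def apply_canaries(text: str) -> str:
    # Replace the standard trading signals with the canary phrases
    for signal, canary in CANARY_MAP.items():
        text = text.replace(signal, canary)
    return text

with open("data/teacher_raw.json") as f, open("data/distillation_train.jsonl", "w") as out:
    for example in json.load(f):
        record = {
            "prompt": example["market_context"],
            "completion": apply_canaries(example["teacher_response"]),
        }
        # ensure_ascii keeps the JSONL ASCII-only by escaping any non-English characters
        out.write(json.dumps(record, ensure_ascii=True) + "\n")
```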
MLX-LM is an Apple Silicon-optimized training package. If you are running on a different hardware set or operating system, shop around for other packages, like Unsloth.
We are using LoRA for fine-tuning. In short, LoRA lets you fine-tune a model without modifying all of its weights; a full fine-tune would not be feasible on a Mac (or probably any consumer hardware).
The teacher_lora_config.yaml file defines comprehensive training parameters (an illustrative example follows the list below):
Model specifications
Training parameters
Data pipeline settings
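An illustrative example of what the config can look like; the key names follow mlx-lm's LoRA YAML format, but the exact values in the repo may differ:

```yaml
# Illustrative teacher_lora_config.yaml values -- check the repository for the real ones
model: "Qwen/Qwen2.5-3B"
train: true
data: "off-chain/data/distillation"        # directory containing train.jsonl / valid.jsonl
adapter_path: "off-chain/models/trading_model_lora"

# LoRA-specific knobs
lora_layers: 16        # layers to apply LoRA to (some mlx-lm versions call this num_layers)
batch_size: 4
iters: 1000
learning_rate: 1.0e-5
```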
Execute the fine-tuning using LoRA methodology & MLX-LM:
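With MLX-LM installed, the invocation is along these lines (the config path is an assumption):

```bash
mlx_lm.lora --config off-chain/teacher_lora_config.yaml
```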
This results in the LoRA delta saved as adapters.safetensors checkpoint files, with the final adapter in the off-chain/models/trading_model_lora/ directory.
Validate the fine-tuned LoRA delta by loading the base model (Qwen/Qwen2.5-3B in our case; adjust to your model name if using a different one) together with the created adapters.safetensors file:
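For example (the prompt is illustrative):

```bash
mlx_lm.generate \
  --model Qwen/Qwen2.5-3B \
  --adapter-path off-chain/models/trading_model_lora \
  --prompt "ETH/USDC price 2450, volume rising. What is your trading decision?" \
  --max-tokens 200
```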
Check the response and look for the Canary words too.