Key Performance Metrics
How Well Does This Strategy/Model Perform?
- Sharpe Ratio:
  - Linear Attention: 3.89
  - Transformer: 4.57
- Pricing Error (Hansen–Jagannathan distance, HJD):
  - Linear Attention: 0.14
  - Transformer: 0.09
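The first of these metrics is straightforward to compute from a strategy's monthly return series. A minimal sketch (the return series below is synthetic, purely for illustration):

```python
import numpy as np

def annualized_sharpe(monthly_returns: np.ndarray) -> float:
    """Annualized Sharpe ratio from monthly excess returns."""
    return np.sqrt(12) * monthly_returns.mean() / monthly_returns.std(ddof=1)

# Hypothetical monthly excess returns, only to demonstrate the calculation
rng = np.random.default_rng(0)
r = 0.01 + 0.02 * rng.standard_normal(600)  # ~50 years of monthly data
print(annualized_sharpe(r))
```

The HJ distance is more involved (it measures how far the candidate SDF is from the set of admissible SDFs), so it is not sketched here.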
Takeaway:
📈 The transformer-based AIPM significantly outperforms existing ML models by enhancing cross-asset information sharing and leveraging deeper architectures.
Key Idea: What Is This Paper About?
This paper introduces Artificial Intelligence Pricing Models (AIPMs) that embed transformer architectures into the stochastic discount factor (SDF). It demonstrates that sharing information across assets and scaling model complexity improves predictive accuracy and reduces pricing errors in asset returns.
Economic Rationale: Why Should This Work?
Transformers mimic the success of large language models by contextualizing information—here, across stocks instead of words. By embedding asset-level characteristics into a shared space, the model captures richer, nonlinear interactions and conditional dependencies.
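The "stocks instead of words" analogy can be made concrete with a single attention head applied over the cross-section of assets: each asset's characteristic vector is projected to queries, keys, and values, and the attention matrix mixes information across assets. This is a minimal numpy sketch, not the paper's implementation; the projection matrices are random stand-ins for learned parameters.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_asset_attention(X, Wq, Wk, Wv):
    """One attention head over the cross-section of assets.

    X          : (N, P) matrix of N assets' characteristics
    Wq, Wk, Wv : (P, d) projection matrices (hypothetical learned parameters)
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])  # (N, N) asset-to-asset relevance
    A = softmax(scores, axis=1)             # each row: weights over all assets
    return A @ V                            # each asset's embedding mixes peers' info

rng = np.random.default_rng(0)
N, P, d = 5, 132, 8  # 132 characteristics, as in the paper's dataset
X = rng.standard_normal((N, P))
out = cross_asset_attention(X, *(rng.standard_normal((P, d)) for _ in range(3)))
print(out.shape)  # (5, 8)
```

The key point is the (N, N) attention matrix: each asset's representation is a weighted combination of every other asset's, which is exactly the cross-sectional information sharing the paper emphasizes.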
Relevant Economic Theories and Justifications:
- Contextual Learning: Just like words in context, stocks benefit from cross-sectional information to improve signal quality.
- Factor Timing: The model times characteristic-managed portfolios dynamically, enhancing alpha.
- Virtue of Complexity: Consistent with theory and empirical results, more complex models perform better under large data regimes.
Why It Matters:
Capturing nonlinear, high-dimensional relations across assets breaks traditional modeling limits and unlocks hidden predictive signals.
How to Do It: Data, Model, and Strategy Implementation
Data Used
- Data Sources: Jensen-Kelly-Pedersen (2023) factor database (132 stock characteristics)
- Time Period: 1963–2022
- Asset Universe: US stocks (NYSE/AMEX/NASDAQ)
Model / Methodology
- Model Types:
  - Linear Portfolio Transformer
  - Deep Nonlinear Transformer (multi-head attention, stacked transformer blocks)
- Training: 60-month rolling windows; ridge penalty (linear model), Adam optimizer (deep transformer)
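For the linear case, a common way to estimate SDF coefficients with a ridge penalty on a rolling window is a regularized mean-variance solve over the window's managed-portfolio returns. This sketch uses that standard estimator under stated assumptions; the paper's exact objective may differ in details, and the data below is synthetic.

```python
import numpy as np

def ridge_sdf_weights(F: np.ndarray, gamma: float) -> np.ndarray:
    """Ridge-regularized SDF coefficients from a window of
    managed-portfolio returns F (T months x P portfolios):

        lambda = (F'F / T + gamma * I)^{-1} mean(F)
    """
    T, P = F.shape
    second_moment = F.T @ F / T
    return np.linalg.solve(second_moment + gamma * np.eye(P), F.mean(axis=0))

rng = np.random.default_rng(1)
F = 0.005 + 0.03 * rng.standard_normal((60, 10))  # one 60-month rolling window
lam = ridge_sdf_weights(F, gamma=0.1)
print(lam.shape)  # (10,)
```

In practice this solve would be repeated each month as the 60-month window rolls forward.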
Trading Strategy
- Signal Generation: Transformer outputs conditional SDF weights
  w_t = T^{(K)}(X_t) λ
- Portfolio Construction: Risk-efficient SDF portfolio optimized via mean-variance principles
- Rebalancing Frequency: Monthly
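The signal-to-portfolio step above can be sketched in a few lines: a fitted transformer (stood in for here by a random matrix) maps the month-t characteristics X_t to transformed exposures, which are combined with the estimated coefficients λ to give one weight per asset. All inputs are hypothetical placeholders.

```python
import numpy as np

def sdf_weights(transformed_X: np.ndarray, lam: np.ndarray) -> np.ndarray:
    """w_t = T^{(K)}(X_t) @ lambda : one portfolio weight per asset."""
    return transformed_X @ lam

rng = np.random.default_rng(2)
N, d = 100, 8
TX = rng.standard_normal((N, d))        # stand-in for the transformer output T^{(K)}(X_t)
lam = rng.standard_normal(d)            # stand-in for estimated SDF coefficients
w = sdf_weights(TX, lam)                # (100,) conditional weights
R_next = 0.01 * rng.standard_normal(N)  # hypothetical next-month asset returns
port_ret = w @ R_next                   # one month's SDF-portfolio return
print(w.shape)  # (100,)
```

With monthly rebalancing, this weight computation is simply repeated each month with the updated X_t.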
Key Table or Figure from the Paper

Explanation:
Shows that the nonlinear transformer model achieves a Sharpe ratio of 4.57 and a pricing error of 0.09, outperforming shallow neural networks (e.g., DKKM with SR 3.91, error 0.13) and traditional linear models (e.g., BSV with SR 3.60, error 0.15).
Final Thought
🚀 Transformer-based AIPMs redefine what's possible in asset pricing—context, scale, and complexity deliver real predictive edge.
Paper Details (For Further Reading)
- Title: Artificial Intelligence Asset Pricing Models
- Authors: Bryan T. Kelly, Boris Kuznetsov, Semyon Malamud, Teng Andrea Xu
- Publication Year: 2025
- Journal/Source: NBER Working Paper No. 33351
- Link: http://www.nber.org/papers/w33351