Hands-On Machine Learning for Algorithmic Trading
Hands-On Machine Learning for Algorithmic Trading - Stefan Jansen. Design and implement investment strategies based on smart algorithms that learn from data using Python.
Order a printed copy from HoaXanh; see a sample of the printed book in the video below.
- 260,000đ
- Product code: HA3409
- Status: 2
Chapter 1: Machine Learning for Trading 8
How to read this book 9
What to expect 10
Who should read this book 10
How the book is organized 11
Part 1 – the framework – from data to strategy design 11
Part 2 – ML fundamentals 12
Part 3 – natural language processing 13
Part 4 – deep and reinforcement learning 13
What you need to succeed 14
Data sources 14
GitHub repository 15
Python libraries 15
The rise of ML in the investment industry 15
From electronic to high-frequency trading 16
Factor investing and smart beta funds 18
Algorithmic pioneers outperform humans at scale 20
ML-driven funds attract $1 trillion AUM 21
The emergence of quantamental funds 22
Investments in strategic capabilities 23
ML and alternative data 23
Crowdsourcing of trading algorithms 25
Design and execution of a trading strategy 25
Sourcing and managing data 26
Alpha factor research and evaluation 27
Portfolio optimization and risk management 28
Strategy backtesting 28
ML and algorithmic trading strategies 29
Use Cases of ML for Trading 30
Data mining for feature extraction 30
Supervised learning for alpha factor creation and aggregation 31
Asset allocation 31
Testing trade ideas 32
Reinforcement learning 32
Summary 32
Chapter 2: Market and Fundamental Data 33
How to work with market data 34
Market microstructure 34
Marketplaces 34
Types of orders 36
Working with order book data 36
The FIX protocol 37
Nasdaq TotalView-ITCH Order Book data 38
Parsing binary ITCH messages 38
Reconstructing trades and the order book 42
Regularizing tick data 45
Tick bars 46
Time bars 47
Volume bars 49
Dollar bars 50
API access to market data 50
Remote data access using pandas 51
Reading HTML tables 51
pandas-datareader for market data 51
The Investor Exchange 52
Quantopian 53
Zipline 54
Quandl 55
Other market-data providers 56
How to work with fundamental data 57
Financial statement data 57
Automated processing – XBRL 57
Building a fundamental data time series 58
Extracting the financial statements and notes dataset 58
Retrieving all quarterly Apple filings 60
Building a price/earnings time series 61
Other fundamental data sources 62
pandas_datareader – macro and industry data 63
Efficient data storage with pandas 63
Summary 64
Chapter 3: Alternative Data for Finance 65
The alternative data revolution 66
Sources of alternative data 67
Individuals 68
Business processes 68
Sensors 69
Satellites 70
Geolocation data 70
Evaluating alternative datasets 71
Evaluation criteria 72
Quality of the signal content 72
Asset classes 72
Investment style 72
Risk premiums 72
Alpha content and quality 73
Quality of the data 73
Legal and reputational risks 73
Exclusivity 74
Time horizon 74
Frequency 74
Reliability 75
Technical aspects 75
Latency 75
Format 75
The market for alternative data 75
Data providers and use cases 77
Social sentiment data 77
Dataminr 78
StockTwits 78
RavenPack 78
Satellite data 78
Geolocation data 79
Email receipt data 79
Working with alternative data 79
Scraping OpenTable data 79
Extracting data from HTML using requests and BeautifulSoup 80
Introducing Selenium – using browser automation 81
Building a dataset of restaurant bookings 82
One step further – Scrapy and Splash 83
Earnings call transcripts 84
Parsing HTML using regular expressions 85
Summary 87
Chapter 4: Alpha Factor Research 88
Engineering alpha factors 89
Important factor categories 90
Momentum and sentiment factors 90
Rationale 91
Key metrics 92
Value factors 93
Rationale 94
Key metrics 95
Volatility and size factors 96
Rationale 96
Key metrics 97
Quality factors 97
Rationale 98
Key metrics 98
How to transform data into factors 99
Useful pandas and NumPy methods 100
Loading the data 100
Resampling from daily to monthly frequency 100
Computing momentum factors 101
Using lagged returns and different holding periods 102
Compute factor betas 102
Built-in Quantopian factors 103
TA-Lib 103
Seeking signals – how to use zipline 104
The architecture – event-driven trading simulation 105
A single alpha factor from market data 106
Combining factors from diverse data sources 108
Separating signal and noise – how to use alphalens 110
Creating forward returns and factor quantiles 110
Predictive performance by factor quantiles 112
The information coefficient 114
Factor turnover 117
Alpha factor resources 117
Alternative algorithmic trading libraries 117
Summary 118
Chapter 5: Strategy Evaluation 119
How to build and test a portfolio with zipline 120
Scheduled trading and portfolio rebalancing 120
How to measure performance with pyfolio 122
The Sharpe ratio 122
The fundamental law of active management 123
In and out-of-sample performance with pyfolio 124
Getting pyfolio input from alphalens 125
Getting pyfolio input from a zipline backtest 125
Walk-forward testing out-of-sample returns 126
Summary performance statistics 127
Drawdown periods and factor exposure 128
Modeling event risk 129
How to avoid the pitfalls of backtesting 129
Data challenges 130
Look-ahead bias 130
Survivorship bias 130
Outlier control 131
Unrepresentative period 131
Implementation issues 131
Mark-to-market performance 131
Trading costs 132
Timing of trades 132
Data-snooping and backtest-overfitting 132
The minimum backtest length and the deflated SR 133
Optimal stopping for backtests 133
How to manage portfolio risk and return 134
Mean-variance optimization 135
How it works 136
The efficient frontier in Python 136
Challenges and shortcomings 139
Alternatives to mean-variance optimization 140
The 1/n portfolio 140
The minimum-variance portfolio 141
Global Portfolio Optimization - The Black-Litterman approach 141
How to size your bets – the Kelly rule 142
The optimal size of a bet 142
Optimal investment – single asset 143
Optimal investment – multiple assets 144
Risk parity 144
Risk factor investment 145
Hierarchical risk parity 145
Summary 146
Chapter 6: The Machine Learning Process 147
Learning from data 148
Supervised learning 150
Unsupervised learning 150
Applications 151
Cluster algorithms 151
Dimensionality reduction 152
Reinforcement learning 152
The machine learning workflow 153
Basic walkthrough – k-nearest neighbors 154
Frame the problem – goals and metrics 154
Prediction versus inference 155
Causal inference 155
Regression problems 156
Classification problems 158
Receiver operating characteristics and the area under the curve 159
Precision-recall curves 159
Collecting and preparing the data 160
Explore, extract, and engineer features 161
Using information theory to evaluate features 161
Selecting an ML algorithm 162
Design and tune the model 162
The bias-variance trade-off 163
Underfitting versus overfitting 163
Managing the trade-off 164
Learning curves 165
How to use cross-validation for model selection 166
How to implement cross-validation in Python 167
Basic train-test split 167
Cross-validation 168
Using a hold-out test set 168
KFold iterator 169
Leave-one-out CV 169
Leave-P-Out CV 170
ShuffleSplit 170
Parameter tuning with scikit-learn 170
Validation curves with yellowbrick 171
Learning curves 171
Parameter tuning using GridSearchCV and pipeline 172
Challenges with cross-validation in finance 172
Time series cross-validation with sklearn 173
Purging, embargoing, and combinatorial CV 173
Summary 174
Chapter 7: Linear Models 175
Linear regression for inference and prediction 176
The multiple linear regression model 177
How to formulate the model 177
How to train the model 178
Least squares 178
Maximum likelihood estimation 179
Gradient descent 180
The Gauss-Markov theorem 181
How to conduct statistical inference 182
How to diagnose and remedy problems 184
Goodness of fit 184
Heteroskedasticity 185
Serial correlation 186
Multicollinearity 187
How to run linear regression in practice 187
OLS with statsmodels 187
Stochastic gradient descent with sklearn 190
How to build a linear factor model 190
From the CAPM to the Fama-French five-factor model 191
Obtaining the risk factors 193
Fama-MacBeth regression 194
Shrinkage methods: regularization for linear regression 198
How to hedge against overfitting 198
How ridge regression works 199
How lasso regression works 201
How to use linear regression to predict returns 201
Prepare the data 201
Universe creation and time horizon 202
Target return computation 202
Alpha factor selection and transformation 203
Data cleaning – missing data 203
Data exploration 204
Dummy encoding of categorical variables 204
Creating forward returns 205
Linear OLS regression using statsmodels 206
Diagnostic statistics 206
Linear OLS regression using sklearn 207
Custom time series cross-validation 207
Select features and target 207
Cross-validating the model 208
Test results – information coefficient and RMSE 209
Ridge regression using sklearn 210
Tuning the regularization parameters using cross-validation 211
Cross-validation results and ridge coefficient paths 212
Top 10 coefficients 212
Lasso regression using sklearn 213