Could ChatGPT have earned abnormal returns? A retrospective test from the U.S. stock market
1. Introduction
This paper evaluates whether OpenAI’s ChatGPT could have generated abnormal stock returns as an investment advisor for individual investors over a 37-year retrospective period (1985–2021). Early attempts to solicit investment advice from ChatGPT highlighted its potential and limitations. Potter and Greifeld (2023) received only disclaimers about market unpredictability when requesting a market-beating ETF. Yeung (2023b) tricked the tool into suggesting stocks for specific goals (e.g., growth or value) with mixed ETF-comparable results, while Yeung (2023a) later obtained nuanced picks from GPT-4 but without quantitative benchmarks or explanations. Building on these, I prompt the GPT-4 model to select 10 U.S. stocks each July 1, restricting its knowledge to pre-selection data to simulate real-time advice without look-ahead bias. These stocks form price-weighted portfolios held for 24 months, reflecting how small investors might buy round lots of 100 shares per stock rather than using equal- or value-weighting. I assess performance using cumulative abnormal returns (CARs) against the market and risk-adjusted alphas via the CAPM, Fama-French three-factor, five-factor, and Fama-French three-factor with momentum/long-term reversal models.
The key findings show that ChatGPT’s portfolios earn positive but inconsistent abnormal returns. Average 24-month CARs are 27.242% (p < 0.001), accumulating positively from month 2 onward. Risk-adjusted alphas average 0.483% to 0.939% monthly across models, but are insignificant overall, with positive and significant alphas in only one out of every four years and no significantly negative ones. ChatGPT favors large growth stocks (negative size and book-to-market betas, p < 0.001) in sectors like technology and healthcare, with 26% annual turnover. Robustness checks confirm similar patterns for equal- and value-weighted portfolios.
This study contributes to the emerging literature on AI in finance by providing one of the first long-horizon retrospective tests of ChatGPT’s stock-picking ability, spanning diverse market conditions rather than focusing solely on post-2021 data as in studies such as Lopez-Lira and Tang (2024). Unlike those studies, which emphasize news-driven predictions and short-term horizons, I prioritize price-weighting for relevance to individual investors. I find general but inconsistent overperformance, echoing mixed results in Pelster and Val (2024) but highlighting ChatGPT’s avoidance of significant underperformance. It also reveals ChatGPT’s sector biases and prompting challenges, extending Ko and Lee (2023) on AI portfolio optimization.
2. Literature Review
Given the recency of ChatGPT’s release for public use, it is unsurprising that sparse research is available about this tool’s usefulness as a stock picker. However, the articles discussed below are indicative of much of the research that is currently accessible.
Lopez-Lira and Tang (2024) provide one of the more in-depth studies on this research question, using news headlines to assess whether a particular news item would be good, bad, or uncertain for the returns of a particular stock the next day. They find that older large language models (LLMs) struggle to provide generally strong returns. As an extension of this idea, they also find that more complex LLMs are required to earn returns when analyzing complicated headlines. Lu et al. (2023) find that ChatGPT can earn abnormal returns for 30 days when selecting a portfolio in response to news announcements, though it struggles to perform well in out-of-sample testing. Chen et al. (2025) corroborate Lopez-Lira and Tang’s (2024) conclusion regarding stronger performance by more complex LLMs while also showing that ChatGPT identifies good news better than other models and than human investors. Pelster and Val (2024) find that ChatGPT performs well at predicting earnings and the future performance of stocks after earnings surprises and after both positive and negative news events.
Ko and Lee (2023) use ChatGPT to generate diversified portfolios in a Markowitz (1952) mean-variance framework by both maximizing the Sharpe ratio (generating the tangency portfolio) and minimizing the portfolio variance. They find generally greater diversification, higher returns, higher risk-adjusted returns as measured by the Sharpe ratio, and lower standard deviations in the ChatGPT selections compared to randomly determined portfolios. Romanko et al. (2023) alternatively assess ChatGPT’s ability to optimize stock weights in a Markowitz (1952) mean-variance framework. While the AI tool performs better than several benchmarks, including an equal-weighted portfolio, the ChatGPT-weighted portfolio still lies below the efficient frontier.
Ahangar and Fietko (2023) attempt to get ChatGPT to tell them the best stock to invest in on a specific date (March 20, 2023, in their analysis); they could not induce it to do so despite using multiple iterations of their request. They find that the user needs to provide additional information even to get a suggestion about a specific stock. They conclude that “ChatGPT can be a helpful resource for financial experts but not for those with limited financial expertise and knowledge.” This conclusion is one of the motivating ideas for this study, as the author believes that unsophisticated investors will rely on the AI tool they hear most about, which has recently been ChatGPT. Yeo et al. (2025) focus on the importance of explainability in financial AI tools in general (not just ChatGPT), which can be summarized as the ability to explain the results of an AI response in terms that the average person could understand. Their concern arose from an apparent tendency of LLMs to make up answers that would be difficult for non-experts to recognize.
Ali and Aysan (2023) focus on ChatGPT’s usefulness in other areas before asking how it considers itself applicable in the financial industry. Specific to investments, portfolio management, and wealth management, it claims it can “provide financial advice to customers based on market data, investment portfolios, and customer preferences.” However, it needs an “enormous amount of data and ability to use deep learning and machine learning algorithms and networks to perform the task of stock price prediction, and also for providing advice for investments and asset management.” It considers itself superior to other chatbots and virtual assistants in use by financial institutions, due to its natural language understanding, knowledge base, continuous learning, customizability, and integrability. Relative to financial robo-advisors, ChatGPT cannot automate investment and asset management decisions, but is a more convenient tool for gathering inputs in less structured manners and is subject to less regulation. However, given the ability of robo-advisors to track and work with clients’ adjusting risk preferences, even when those adjustments are sub-optimal (Capponi et al., 2020), there may be room for ChatGPT, or AI in general, to improve clients’ interactions with robo-advisors. Kalyan (2024) reports mixed results in assessing ChatGPT’s performance in performing various financial tasks against established domain-specific models.
3. Prompting Methodology
This section details the development of the series of prompts used to gather the stocks included in each year’s portfolio. As shown in Appendix A, ChatGPT-4 appears to use a top-down approach to portfolio selection, beginning with considerations of macroeconomic conditions, followed by a focus on specific sectors of the economy.
3.1. Initial Prompting
The initial interaction with ChatGPT-4 used the following prompt:
“It is early in the morning of July 1, 1985. I have savings of $100,000 that I will invest as much of as I can in the US stock market as soon as trading opens (which might be a day or two away if the markets are not open today). Over the next two years, I am going to hold onto the stocks I select initially, and my desire is to beat the market on both a total return and a risk-adjusted return basis. Your knowledge set is limited to only information that would have been available when the markets open and transactions can occur. I am going to purchase 100 shares of each of the stocks you suggest. What should I buy?”
This prompt led to the listing of key factors used in the selection process, followed by a list of six “potential stock picks from 1985” (IBM, Merck, Coca-Cola, GE, Microsoft, and Intel).
3.2. Revisions
There was no attempt to come close to the $100,000 budget limit, because the tool could not access historical price data. When giving it a link to historical prices around July 1, 1985, for one of the stocks in the list1, it claimed an inability to pull the price data “due to the limitations of my browsing tool.” This limit led me to drop the budget constraint from my request and instead focus on a series of top picks each year (change in prompt is emphasized below). This also allows this study to more closely align with the approach that would be utilized by small, individual investors interested in direct stock ownership over funds and ETFs.
“It is early in the morning of July 1, 1985. I am going to invest in ten stocks available in the US stock market as soon as trading opens (which might be a day or two away if the markets will not be open today). Over the next two years, I am going to hold onto the stocks I select initially, and my desire is to beat the market on both a total return and a risk-adjusted return basis. Your knowledge set is limited to only information that would have been available when the markets open and transactions can occur. I am going to purchase 100 shares of each of the stocks you suggest. What should I buy?”
3.3. Attempts at Efficiency
This prompt was more successful, reiterating the key factors used in the selection process and providing ten stocks for investment, this time excluding the not-yet-public Microsoft from the list. Feeling comfortable with the results, I then ventured to request that the process be repeated for every year in this study, 1986-2022. ChatGPT-4 explained another of its limitations here.
“Providing detailed stock recommendations with justifications for each year from 1986 through 2022 is an extensive task that requires a deep and specific analysis of market conditions, economic factors, and company performances for each of those years. This level of detail would require access to a vast array of historical financial data and economic reports.”
While I appreciate the due diligence ChatGPT-4 is willing to undertake, this required another simple detour, requesting the recommendations one year at a time.
“Yes, I can certainly provide stock recommendations for one specific year, including the relevant economic factors and sector trends of that year. Please specify which year you’re interested in, and I will conduct an analysis based on the information and context available for that particular year.”
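The year-by-year requests described above can be sketched as a simple prompt generator. This is only an illustration: the template reuses the revised prompt from Section 3.2, and it assumes that the wording was otherwise unchanged across years, which the text does not state. The names `PROMPT_TEMPLATE` and `build_prompts` are hypothetical, and no model API is called.

```python
# Hypothetical helper: one prompt per selection year, built from the
# revised Section 3.2 prompt with only the year substituted (an assumption).
PROMPT_TEMPLATE = (
    "It is early in the morning of July 1, {year}. I am going to invest in ten "
    "stocks available in the US stock market as soon as trading opens (which "
    "might be a day or two away if the markets will not be open today). Over "
    "the next two years, I am going to hold onto the stocks I select initially, "
    "and my desire is to beat the market on both a total return and a "
    "risk-adjusted return basis. Your knowledge set is limited to only "
    "information that would have been available when the markets open and "
    "transactions can occur. I am going to purchase 100 shares of each of the "
    "stocks you suggest. What should I buy?"
)


def build_prompts(first_year, last_year):
    """Return a dict mapping each selection year to its prompt text."""
    return {
        year: PROMPT_TEMPLATE.format(year=year)
        for year in range(first_year, last_year + 1)
    }


prompts = build_prompts(1985, 2021)
print(len(prompts))        # number of selection years
print(prompts[1985][:45])  # start of the first prompt
```

Each prompt would then be submitted separately, one selection year per request.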
4. Data and Empirical Methodology
4.1. Data Sources
Prices, monthly total returns on individual stocks, and individual stock characteristics were extracted from S&P CapitalIQ. For month m, the open price used to determine weights was extracted using the IQ_LASTSALEPRICE field as of the end of the previous month (m − 1). Since this provided the actual sale price on that date, this value was multiplied by the IQ_CUMULATIVE_SPLIT_FACTOR to adequately compensate for the effects of splits that occurred during the holding period. The return for stock i during month m is calculated as the holding period return based on the same price field and the dividends per share during month m as determined by the IQ_DIV_SHARE field as:
(1) $$r_{i,m}=\frac{\mathrm{IQ\_LASTSALEPRICE}_m+\mathrm{IQ\_DIV\_SHARE}_m}{\mathrm{IQ\_LASTSALEPRICE}_{m-1}}-1.$$
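As a minimal sketch of the holding period return in Equation (1), the function below takes a split-adjusted price at the end of months m − 1 and m and the dividend per share paid during month m. The function name and the sample numbers are hypothetical and are not drawn from the CapitalIQ data.

```python
def holding_period_return(price_prev, price_now, dividend):
    """Monthly return: (P_m + D_m) / P_{m-1} - 1, using split-adjusted prices."""
    if price_prev <= 0:
        raise ValueError("previous price must be positive")
    return (price_now + dividend) / price_prev - 1.0


# Hypothetical inputs: price moves from 50.00 to 52.00 with a 0.50 dividend.
r = holding_period_return(price_prev=50.0, price_now=52.0, dividend=0.5)
print(f"{r:.4f}")  # 0.0500
```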
Market capitalizations used for characteristic analysis in Section 4.2 and for robustness checks in Section 5.4 are obtained using the IQ_MARKETCAP field. Book value of equity per share is extracted using IQ_BV_SHARE, and the book-to-market ratio is calculated as:
(2) $${B/M}_i=\frac{\mathrm{IQ\_BV\_SHARE}_m}{\mathrm{IQ\_LASTSALEPRICE}_{m-1}}.$$
Independent variables used in regressions were pulled from Kenneth French’s website.2 Values for the monthly market premium (Mkt − Rf, hereafter rM − rf), size (SMB), value (HML), operating profitability (RMW), and investment (CMA) risk factors are pulled from the file contained at the “Fama/French 5 Factors (2x3)” line, while momentum (MOM) and long-term reversal (LTRev) risk factors are obtained from the links at “Momentum Factor (Mom)” and “Long-Term Reversal Factor (LT Rev)” lines, respectively.
4.2. Characteristics of Selected Stocks
While it would be difficult to draw conclusions from a direct analysis of the size and book-to-market ratios (B/M) of the selected firms, given the changing nature of the market over a nearly four-decade period, tendencies emerge when considering the stocks’ size and B/M relative to other securities at the time of portfolio formation. Using the five-percent breakpoints provided on Kenneth French’s website3, Figure 1 clearly shows a tendency for ChatGPT to select large, growth firms when prompted as described in Section 3.2. Note that these breakpoints are based only on NYSE stocks, whereas all U.S. stocks are eligible for inclusion in these portfolios. If a selected stock was below the minimum or above the maximum of all NYSE stocks in that period, it was placed in the bottom (less than 5th percentile) or top group (greater than 95th percentile), respectively.
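The grouping rule described above can be sketched as follows. The sketch assumes the breakpoints are supplied as an ascending list of the 5th through 95th NYSE percentiles (19 values, giving 20 groups); the function names and the sample breakpoints are hypothetical and do not come from Kenneth French’s files.

```python
from bisect import bisect_left


def percentile_group(value, breakpoints):
    """Return the 0-based five-percent group for `value`.

    Group 0 is at or below the 5th percentile and group 19 is above the
    95th, so values outside the NYSE range land in the bottom or top group.
    """
    if list(breakpoints) != sorted(breakpoints):
        raise ValueError("breakpoints must be in ascending order")
    return bisect_left(breakpoints, value)


def group_label(group):
    """Human-readable percentile range for a group index."""
    low = 5 * group
    return f"{low}-{low + 5}"


# Hypothetical breakpoints: 10, 20, ..., 190 (19 values).
bps = [10.0 * k for k in range(1, 20)]
for v in (5.0, 55.0, 1000.0):
    print(v, group_label(percentile_group(v, bps)))
```

The same function can be applied to market capitalization or to B/M, provided the matching breakpoints for the formation date are passed in.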
Figure 1. NYSE Percentile Groups for Market Capitalization and Book-to-Market Securities Chosen by ChatGPT-4, 1985-2021. Note: Figure 1 shows the percentile groups of the size and book-to-market values of the stocks selected by ChatGPT-4 for each of the 37 portfolios formed on July 1 (or the next trading day) of each year from 1985 to 2021. Breakpoints for each percentile group were pulled from Kenneth French’s website.
4.3. Analysis of Portfolio Returns
The returns on the portfolios suggested by ChatGPT are assessed on both a raw and a risk-adjusted basis. The ten stocks selected each year were placed in a price-weighted portfolio in July of the selection year. Let $r_i$ be the stock’s return, $r_P$ be the portfolio’s return, the selection year be represented by t (where t = 1 represents 1985, t = 2 represents 1986, etc.), and the month of the holding period be represented by m.
If stock i earns return $r_{i,t,m}$, then the portfolio return can be calculated as
(3) $$r_{P,t,m} = \sum_{i=1}^{10} w_{i,t,m}\, r_{i,t,m}$$
where $w_{i,t,m}$ represents the percent of the portfolio invested in stock i at the beginning of month m during the holding period defined by selection year t. For m = 1, this is simply the price of stock i at the beginning of the first trading day on or after July 1 of the selection year, divided by the sum of the ten stocks’ prices at that time. Since the portfolio is held with no modifications, the weights for each month thereafter can be determined in the same way from the prices on the first trading day of that month.
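A minimal sketch of Equation (3) under price-weighting follows. It assumes equal share counts in every stock, so each weight is the stock’s beginning-of-month price divided by the sum of all prices. The function names and the three-stock example are hypothetical.

```python
def price_weights(prices):
    """Weights proportional to price, as implied by equal share counts."""
    total = sum(prices)
    if total <= 0:
        raise ValueError("prices must sum to a positive number")
    return [p / total for p in prices]


def portfolio_return(prices_start, returns):
    """Weighted sum of stock returns using beginning-of-month price weights."""
    if len(prices_start) != len(returns):
        raise ValueError("need one return per price")
    weights = price_weights(prices_start)
    return sum(w * r for w, r in zip(weights, returns))


# Hypothetical three-stock example (the paper's portfolios hold ten stocks).
prices = [10.0, 30.0, 60.0]   # weights 0.1, 0.3, 0.6
rets = [0.10, 0.00, -0.05]
print(round(portfolio_return(prices, rets), 4))  # -0.02
```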
For raw returns, let $r_M$ be the return on the market portfolio as found on Kenneth French’s website, and calculate the abnormal return as:
(4) $$AR_{t,m}=r_{P,t,m}-r_{M,t,m}$$
For a given investment horizon H, cumulative abnormal returns for a given portfolio and average cumulative abnormal returns across portfolios from formation years 1 through T could then be calculated as:
(5) $$CAR_{t,H}=\sum_{m=1}^{H}{AR_{t,m}},$$
(6) $$ACAR_{T,H}=\frac{1}{T}\sum_{t=1}^{T}{CAR_{t,H}}.$$
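Equations (4) through (6) can be illustrated with a short sketch. The returns below are hypothetical and use a three-month horizon and two formation years purely to keep the arithmetic visible; the function names are not from the paper.

```python
def cumulative_abnormal_return(portfolio_returns, market_returns):
    """CAR over the horizon: sum of monthly (portfolio - market) returns."""
    if len(portfolio_returns) != len(market_returns):
        raise ValueError("return series must have the same length")
    return sum(rp - rm for rp, rm in zip(portfolio_returns, market_returns))


def average_car(cars):
    """ACAR: simple average of the CARs across formation years."""
    return sum(cars) / len(cars)


# Hypothetical monthly returns in percent, H = 3 months, T = 2 formation years.
car_1 = cumulative_abnormal_return([2.0, 1.0, 3.0], [1.0, 1.0, 1.0])   # ARs: 1, 0, 2
car_2 = cumulative_abnormal_return([0.0, 1.0, -1.0], [1.0, 0.0, 0.0])  # ARs: -1, 1, -1
print(car_1, car_2, average_car([car_1, car_2]))  # 3.0 -1.0 1.0
```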
For risk-adjusted returns, the monthly excess returns ($XR_{P,t,m} = r_{P,t,m} - r_{f,t,m}$) are set as the dependent variable for an OLS regression on a combination of k risk factors F.
(7) $$XR_P=\alpha_P+F\beta_P^\prime+\varepsilon_P$$
where $XR_P$ is a column vector with n = T × H monthly excess returns, $\alpha_P$ is a column vector of size n containing the OLS intercept term that represents the risk-adjusted abnormal return, F is the n × k matrix of monthly risk factors, $\beta_P^\prime$ is a transposed column vector containing k regression coefficients that represent the risk factor sensitivities, and $\varepsilon_P$ is a column vector containing n standard normal I.I.D. OLS residuals.
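For the single-factor (CAPM) case of Equation (7), the OLS intercept and beta have closed forms, sketched below in plain Python. This is an illustration under stated assumptions, not the estimation code used in the paper: the function name is hypothetical, the data are made up, and the multi-factor models would need a matrix solver instead.

```python
import math


def capm_ols(excess_portfolio, excess_market):
    """Return (alpha, beta, se_alpha, se_beta) from a one-factor OLS fit."""
    n = len(excess_portfolio)
    if n != len(excess_market) or n < 3:
        raise ValueError("need two equal-length series with at least 3 points")
    mx = sum(excess_market) / n
    my = sum(excess_portfolio) / n
    sxx = sum((x - mx) ** 2 for x in excess_market)
    sxy = sum((x - mx) * (y - my) for x, y in zip(excess_market, excess_portfolio))
    beta = sxy / sxx
    alpha = my - beta * mx
    residuals = [y - alpha - beta * x for x, y in zip(excess_market, excess_portfolio)]
    s2 = sum(e * e for e in residuals) / (n - 2)  # residual variance
    se_beta = math.sqrt(s2 / sxx)
    se_alpha = math.sqrt(s2 * (1.0 / n + mx * mx / sxx))
    return alpha, beta, se_alpha, se_beta


# Hypothetical excess returns built as 0.5 + 1.2 * market + small noise.
market = [-2.0, -1.0, 0.0, 1.0, 2.0]
portfolio = [-1.8, -0.8, 0.5, 1.6, 3.0]
alpha, beta, se_alpha, se_beta = capm_ols(portfolio, market)
print(round(alpha, 3), round(beta, 3))
print("t(alpha = 0):", round(alpha / se_alpha, 2))
print("t(beta = 1):", round((beta - 1.0) / se_beta, 2))
```

The two t-statistics correspond to the tests reported later in the results: whether the intercept differs from zero and whether beta differs from one.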
5. Results
5.1. Assessing ChatGPT’s Selections
ChatGPT-4’s stock selection follows a top-down pattern. As shown in Appendix A, ChatGPT relies first on the “Economic Context” at the time of selection, followed by a focus on four “Key Sectors for Growth” presumably based on the economic context. In the example shown in the appendix, it is clear how the stocks chosen reflect the key sectors described, with some stocks crossing over multiple sectors: technology (IBM, GE, Intel), healthcare and pharmaceuticals (Merck, 3M), consumer goods (Coca-Cola, GE, Procter & Gamble, 3M, McDonald’s, Walmart), and energy (Exxon). Technology and Healthcare and/or Pharmaceuticals were key sectors in all 37 years; some form of consumer spending (goods, discretionary, or staples) was chosen in 31 out of the 37 years, and financial services appeared in 20 of the 37 years. This seems to indicate that while ChatGPT can modify its recommendations based on the context of the investing environment, there are certain sectors that it appears to be more comfortable with or more confident recommending. Table 1 shows each year’s stock selections and the number of new stocks in each portfolio relative to the previous year. While only 2 of the updated portfolios are identical to the previous year’s (1985-86 and 1997-98), as many as 4 of the 10 stocks were replaced in 9 of the 36 updates; the average annual turnover rate was about 26%. Figure 2 shows the number of times the various securities were included in one of the ChatGPT portfolios. The 17 most frequently selected stocks were from one of these four sectors. Microsoft (MSFT) led the way with 34 selections over 37 years.
Year | Tickers | | | | | | | | | | Turnover |
1985 | GE | IBM | INTC | KO | MCD | MMM | MRK | PG | WMT | XOM | |
1986 | GE | IBM | INTC | KO | MCD | MMM | MRK | PG | WMT | XOM | 0 |
1987 | AAPL | BA | INTC | JPM | KO | MRK | MSFT | PG | WMT | XOM | 4 |
1988 | IBM | INTC | JPM | KO | MMM | MRK | MSFT | NKE | PFE | PG | 4 |
1989 | DIS | GS | IBM | INTC | JNJ | KO | MRK | MSFT | PG | WMT | 4 |
1990 | INTC | JNJ | KO | LMT | MCD | MRK | MSFT | PEP | PG | WMT | 3 |
1991 | BA | GE | INTC | JNJ | KO | MRK | MSFT | PFE | PG | WMT | 3 |
1992 | DIS | INTC | JNJ | JPM | KO | MRK | MSFT | PFE | PG | WMT | 2 |
1993 | CSCO | DIS | INTC | KO | MRK | MSFT | PFE | PG | T | WMT | 2 |
1994 | CSCO | INTC | JNJ | JPM | KO | MRK | MSFT | PFE | T | WMT | 2 |
1995 | AAPL | CAT | CSCO | DIS | INTC | JNJ | MSFT | PFE | PG | WMT | 4 |
1996 | AAPL | AMGN | CSCO | GE | HD | INTC | MSFT | PFE | PG | WMT | 3 |
1997 | AAPL | AMZN | CAT | CSCO | GE | INTC | MSFT | PFE | QCOM | WMT | 3 |
1998 | AAPL | AMZN | CAT | CSCO | GE | INTC | MSFT | PFE | QCOM | WMT | 0 |
1999 | AMZN | AOL | CSCO | GE | INTC | MSFT | ORCL | PFE | QCOM | WMT | 2 |
2000 | CSCO | INTC | JNJ | MSFT | ORCL | PFE | PG | T | WMT | XOM | 4 |
2001 | CSCO | GE | INTC | JNJ | LMT | MSFT | PEP | PFE | PG | WMT | 3 |
2002 | BA | CSCO | INTC | JNJ | MSFT | PEP | PFE | PG | WMT | XOM | 2 |
2003 | AAPL | AMZN | BA | GOOGL | JNJ | MSFT | PFE | PG | V | XOM | 4 |
2004 | AAPL | AMZN | GOOGL | JNJ | JPM | META | MSFT | PFE | TSLA | V | 3 |
2005 | AAPL | AMZN | GOOGL | HON | JNJ | MSFT | PFE | PG | V | XOM | 3 |
2006 | AAPL | AMZN | BA | GOOGL | JNJ | JPM | MSFT | PFE | PG | V | 2 |
2007 | AAPL | AMZN | GOOGL | INTC | JNJ | KO | MSFT | PG | VZ | XOM | 4 |
2008 | AAPL | GOOGL | JNJ | KO | MCD | PFE | PG | VZ | WMT | XOM | 3 |
2009 | AAPL | AMZN | GOOGL | JNJ | META | MSFT | PFE | PG | V | XOM | 4 |
2010 | AAPL | AMZN | GOOGL | JNJ | JPM | MSFT | PFE | PG | V | WMT | 2 |
2011 | AAPL | AMZN | GOOGL | JNJ | META | MSFT | NKE | PFE | TSLA | V | 3 |
2012 | AAPL | AMZN | GOOGL | JNJ | META | MSFT | PFE | PG | V | XOM | 2 |
2013 | AAPL | AMZN | BA | GOOGL | JNJ | MSFT | PFE | PG | V | XOM | 1 |
2014 | AAPL | AMZN | DIS | GOOGL | JNJ | MSFT | PFE | PG | V | XOM | 1 |
2015 | AAPL | AMZN | GOOGL | JNJ | MSFT | NKE | PFE | PG | V | XOM | 1 |
2016 | AAPL | AMZN | GOOGL | JNJ | JPM | META | MSFT | PFE | PG | V | 2 |
2017 | AAPL | AMZN | GOOGL | JNJ | META | MSFT | NVDA | PFE | TSLA | V | 2 |
2018 | AAPL | ADBE | AMZN | GOOGL | JNJ | JPM | META | MSFT | PFE | V | 2 |
2019 | AAPL | AMZN | GOOGL | JNJ | META | MSFT | PFE | V | WMT | XOM | 2 |
2020 | AAPL | AMZN | GOOGL | JNJ | MA | MSFT | NFLX | NVDA | PFE | TSLA | 4 |
2021 | AAPL | AMZN | GOOGL | JNJ | META | MSFT | NEE | PFE | TSLA | V | 3 |
Average | | | | | | | | | | | 2.583 |
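The turnover column can be reproduced by counting, for each year, the tickers that were not in the previous year’s portfolio. The sketch below uses the 1985 to 1987 rows of Table 1; the function name is hypothetical, and the final line simply converts the reported average of 2.583 new stocks out of 10 into a rate.

```python
# First three rows of Table 1.
portfolios = {
    1985: ["GE", "IBM", "INTC", "KO", "MCD", "MMM", "MRK", "PG", "WMT", "XOM"],
    1986: ["GE", "IBM", "INTC", "KO", "MCD", "MMM", "MRK", "PG", "WMT", "XOM"],
    1987: ["AAPL", "BA", "INTC", "JPM", "KO", "MRK", "MSFT", "PG", "WMT", "XOM"],
}


def new_stock_counts(portfolios_by_year):
    """Number of tickers in each year's portfolio absent from the prior year's."""
    years = sorted(portfolios_by_year)
    return {
        year: len(set(portfolios_by_year[year]) - set(portfolios_by_year[prev]))
        for prev, year in zip(years, years[1:])
    }


print(new_stock_counts(portfolios))  # {1986: 0, 1987: 4}

# Average of 2.583 new stocks per update, out of 10 holdings.
print(f"{2.583 / 10:.1%}")  # 25.8%
```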
Figure 2. Securities Chosen by ChatGPT-4, 1985-2021. Note: Figure 2 shows the frequency of the stocks selected by ChatGPT-4 for each of the 37 portfolios formed on July 1 (or the next trading day) for each year from 1985 to 2021. Ten stocks chosen only once during this period were omitted from this graph.
5.2. Raw Returns
Analysis of cumulative abnormal returns shows that the ChatGPT portfolios outperform the market early in and throughout the holding period. Table 2 shows that after 24 months, abnormal returns accumulate on average to 27.242%, which is significant at p < 0.001. Figure 3 illustrates the mean AR and the ACAR for each of the 24 months of the portfolios’ holding periods, along with 95% confidence intervals for each. While 22 of the 24 months’ ARs are positive on average, only 7 of the 24 are significantly different from zero at p < 0.05. However, the figure also shows that, starting with the second month, the ACAR is positive and significant at the p < 0.05 level.
Coefficient | N | Sig. at p < 0.05 | Mean | St. Dev. | Min | Median | Max |
---|---|---|---|---|---|---|---|
Panel A: Average Monthly Returns | | | | | | | |
Portfolio | 888 | | 2.077 | 6.776 | −21.332 | 2.065 | 56.661 |
Market | 888 | | 0.939 | 4.527 | −22.640 | 1.460 | 13.650 |
CAR24 | 37 | YES | 27.242 | 27.418 | −6.592 | 23.642 | 128.999 |
Panel B: CAPM | | | | | | | |
Intercept | 37 | 7 | 1.088 | 1.095 | −0.247 | 1.001 | 4.505 |
Beta | 37 | 35 (10) | 1.087 | 0.365 | 0.512 | 1.065 | 2.054 |
R2 | 37 | | 57.73 | 21.08 | 14.93 | 59.45 | 87.86 |
Adjusted R2 | 37 | | 55.81 | 22.04 | 11.07 | 57.61 | 87.31 |
Panel C: Fama-French Three-Factor Model | | | | | | | |
Intercept | 37 | 9 | 1.183 | 1.377 | −0.450 | 0.751 | 5.646 |
Beta | 37 | 34 (11) | 1.022 | 0.342 | 0.429 | 1.004 | 1.984 |
SMB | 37 | 7 | −0.426 | 0.284 | −1.092 | −0.416 | 0.088 |
HML | 37 | 22 | −0.793 | 0.621 | −3.008 | −0.581 | −0.124 |
R2 | 37 | | 70.05 | 18.80 | 25.30 | 76.71 | 91.91 |
Adjusted R2 | 37 | | 65.56 | 21.62 | 14.09 | 73.21 | 90.70 |
Panel D: Fama-French Five-Factor Model | | | | | | | |
Intercept | 37 | 8 | 1.102 | 1.388 | −0.737 | 0.723 | 4.681 |
Beta | 37 | 33 (12) | 1.049 | 0.353 | 0.393 | 1.036 | 2.200 |
SMB | 37 | 7 | −0.409 | 0.356 | −1.043 | −0.432 | 0.294 |
HML | 37 | 8 | −0.485 | 0.737 | −2.308 | −0.369 | 2.351 |
R2 | 37 | | 75.90 | 14.86 | 36.35 | 80.04 | 92.89 |
Adjusted R2 | 37 | | 69.20 | 18.99 | 18.67 | 74.50 | 90.92 |
Panel E: Fama-French Three-Factor Model with Momentum and Reversals | | | | | | | |
Intercept | 37 | 9 | 1.097 | 1.337 | −0.476 | 0.780 | 5.292 |
Beta | 37 | 33 (13) | 1.067 | 0.393 | 0.391 | 1.026 | 2.076 |
SMB | 37 | 9 | −0.488 | 0.403 | −1.725 | −0.436 | 0.126 |
HML | 37 | 13 | −0.790 | 0.734 | −3.791 | −0.535 | 0.430 |
MOM | 37 | 2 | 0.161 | 0.316 | −0.652 | 0.156 | 0.960 |
LTRev | 37 | 2 | 0.008 | 0.661 | −1.309 | 0.024 | 2.438 |
R2 | 37 | | 73.98 | 16.96 | 29.83 | 79.05 | 94.40 |
Adjusted R2 | 37 | | 66.75 | 21.67 | 10.33 | 73.23 | 92.84 |
Figure 3. Average Abnormal Returns and Average Cumulative Abnormal Returns Over 24 Months. Note: Figure 3 shows the average AR and ACAR of all 37 selection years as of the end of each of the 24 months of the portfolio holding periods. The shaded region represents a 95% confidence interval surrounding the values.
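As a rough plausibility check on the reported significance of the 24-month ACAR, the CAR24 mean and standard deviation from Table 2 can be turned into a one-sample t-statistic. The text does not state which test was used, so this is an assumption; it also treats the 37 CARs as independent even though the two-year holding periods overlap.

```python
import math

# CAR24 row of Table 2, Panel A (price-weighted portfolios), in percent.
mean_car = 27.242
sd_car = 27.418
n_portfolios = 37

standard_error = sd_car / math.sqrt(n_portfolios)
t_stat = mean_car / standard_error
print(round(standard_error, 3), round(t_stat, 2))  # 4.507 6.04
```

A t-statistic near 6 with 36 degrees of freedom sits well above the two-sided critical value for p < 0.001 (roughly 3.6), which is in line with the significance level reported in the text.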
5.3. Risk-Adjusted Returns
Regressions were run on the monthly excess returns using CAPM, the Fama and French (1993, 2015) three-factor (FF3) and five-factor (FF5) models, and FF3 with additional factors for momentum and long-term reversals (FF3m). Price-weighting was used to reflect the prompt’s direction that 100 shares of each stock would be purchased, as mentioned in Section 3.2. Table 3 summarizes the results of these regressions while controlling for year fixed effects. Results were consistent across all four models. The monthly abnormal returns were positive, ranging from 0.483% (FF3m) to 0.939% (FF5), but none were statistically significant. All four betas were significant at p < 0.001, ranging from 1.020 (FF5) to 1.116 (FF3m); only the FF5 beta was indistinguishable from a value of 1 at p < 0.05. Both the size and book-to-market coefficients were negative and significant at p < 0.001 in all three models that include them (FF3, FF5, and FF3m), indicating that ChatGPT prefers large, growth stocks for this analysis. Given the stock selection criteria explained in Appendix A (market leadership, financial health, and dividend yield being three of the four criteria) and the size and book-to-market statistics reported in Section 4.2, this is not a surprising result. For the last two models (FF5 and FF3m), results indicate that ChatGPT prefers to recommend firms with aggressive investment (CMA = −0.544, p < 0.001) and higher prior intermediate-term returns (MOM = 0.164, p < 0.001), while robustness of profits (RMW = 0.127, p = 0.0619) and prior long-term returns (LTRev = 0.088, p = 0.1912) were not factors.
Coefficient | CAPM | FF3 | FF5 | FF3m |
---|---|---|---|---|
Panel A: Overlapping Two-Year Holding Periods | | | | |
Intercept | 0.798 | 0.643 | 0.939 | 0.483 |
Beta | 1.086∗∗∗ | 1.075∗∗∗ | 1.020∗∗∗ | 1.116∗∗∗ |
SMB | | −0.307∗∗∗ | −0.257∗∗∗ | −0.339∗∗∗ |
HML | | −0.583∗∗∗ | −0.387∗∗∗ | −0.553∗∗∗ |
RMW | | | 0.127 | |
CMA | | | −0.544∗∗∗ | |
MOM | | | | 0.164∗∗∗ |
LTRev | | | | 0.088 |
Year Fixed Effects | YES | YES | YES | YES |
Reject β = 1 | YES | YES | NO | YES |
N | 888 | 888 | 888 | 888 |
R2 | 55.66 | 63.81 | 65.46 | 64.99 |
Adjusted R2 | 53.73 | 62.15 | 63.78 | 63.29 |
Panel B: Portfolio Rebalanced Every July 1 | | | | |
Intercept | 0.728 | 0.538 | 0.429 | 0.433 |
Beta | 1.093∗∗∗ | 1.078∗∗∗ | 1.037∗∗∗ | 1.127∗∗∗ |
SMB | | −0.300∗∗∗ | −0.226∗∗ | −0.352∗∗∗ |
HML | | −0.582∗∗∗ | −0.426∗∗∗ | −0.554∗∗∗ |
RMW | | | 0.191 | |
CMA | | | −0.473∗∗ | |
MOM | | | | 0.200∗∗∗ |
LTRev | | | | 0.143 |
Year Fixed Effects | NO | NO | NO | NO |
Reject β = 1 | NO | NO | NO | YES |
N | 444 | 444 | 444 | 444 |
R2 | 56.28 | 63.29 | 64.68 | 64.96 |
Adjusted R2 | 52.30 | 59.75 | 61.08 | 61.39 |
Additionally, Table 2 provides descriptive statistics of each year’s analysis results. The average monthly return for the ChatGPT portfolios was 2.077%, compared to 0.939% for the market. While the monthly abnormal return averages 1.088% and 1.183% across all years for CAPM and FF3, respectively, only 7 and 9 of the 37 portfolios’ intercepts were significant at the p < 0.05 level. Only 3 of those 16 significant results occurred from 2000 through 2021 (2003 for both models and 2004 for FF3). Similar results are seen in FF5 and FF3m.
It is also interesting to note that while the portfolio provides significantly positive abnormal returns in very few years, the abnormal returns are negative in only 22 of the 148 model-years tested. None of these are significant (the smallest p-value is 0.3412), and the worst performer is only −0.737% per month in 2009 as determined by FF5. Among the 22 negative intercepts, 2009 is the only year in which all four models produce a negative intercept. So while it is debatable from these results whether ChatGPT-4 can do a good job picking stocks for an investor, it cannot be concluded that ChatGPT would do a poor job on a risk-adjusted basis. See Figure 4 for an illustration of all 37 intercepts for each model.
Figure 4. Monthly Abnormal Returns of Each Portfolio Chosen by ChatGPT-4, 1985-2021. Note: Figure 4 shows the monthly abnormal returns from each of the 37 portfolios formed on July 1 (or the next trading day) for each year from 1985 to 2021 as determined by the intercepts resulting from regressing equal-weighted portfolios of securities selected by ChatGPT-4 using CAPM and the Fama-French three- and five-factor models and the Fama-French three-factor model with momentum and long-term reversals over 24 months. Circles represent intercepts that are significant at p < 0.05. Reference lines represent the mean of the 37 intercepts produced by that panel’s model.
All but 13 of the 148 beta coefficients significantly differ from zero at the p < 0.05 level, averaging anywhere from 1.022 to 1.087 within each model grouping. Given the closeness to market betas, 102 of the 148 betas were not significantly different from one at the same p < 0.05 level. Significant betas tended to be less than one through 2002, while all were greater than one after 2002. Figure 5 illustrates all 37 betas for each model.
Figure 5. Betas of Each Portfolio Chosen by ChatGPT-4, 1985-2021. Note: Figure 5 shows the betas from each of the 37 portfolios formed on July 1 (or the next trading day) for each year from 1985 to 2021 as determined by the intercepts resulting from regressing equal-weighted portfolios of securities selected by ChatGPT-4 using CAPM and the Fama-French three- and five-factor models and the Fama-French three-factor model with momentum and long-term reversals over 24 months. Circles represent betas that are significantly different than 1 at p < 0.05. Reference lines are set at β = 1.
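The beta counts quoted above can be checked against the “Sig. at p < 0.05” column of Table 2. The sketch assumes that the first number in each Beta row counts betas significantly different from zero and that the number in parentheses counts betas significantly different from one; that reading is inferred from the text rather than stated in the table.

```python
# Beta rows of Table 2, Panels B through E: "different from zero (different from one)".
sig_vs_zero = {"CAPM": 35, "FF3": 34, "FF5": 33, "FF3m": 33}
sig_vs_one = {"CAPM": 10, "FF3": 11, "FF5": 12, "FF3m": 13}

model_years = 37 * len(sig_vs_zero)                      # 4 models x 37 portfolios
not_sig_vs_zero = model_years - sum(sig_vs_zero.values())
not_sig_vs_one = model_years - sum(sig_vs_one.values())

print(model_years, not_sig_vs_zero, not_sig_vs_one)  # 148 13 102
```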
Finally, the negative average values of the size (−0.426) and book-to-market (−0.793) factors reinforce the implication of Figure 1 that ChatGPT has focused on large, growth firms to construct the requested portfolios.
5.4. Robustness
5.4.1 Other Portfolio Weighting Methods
For robustness and to align with more commonly used weighting methods, the same raw and risk-adjusted analyses were performed on equal- and value-weighted portfolios (EW and VW, respectively). For the raw returns of both portfolios, 24-month ACAR was positive and significant at p < 0.001, earning 22.679% for EW and 12.798% for VW (as shown in Panel A of Tables 4 and 5, respectively). Figure 6 illustrates the accumulation of abnormal returns over the 24 months for all three weighting systems (the first panel, PW, duplicates the bottom half of Figure 3). Also, while not a significant point of analysis for this study, it can be shown that the ACARs for PW and EW are not significantly different from each other (Bonferroni-adjusted p = 0.3310), while both the PW and EW ACARs are greater than the VW ACAR (Bonferroni-adjusted p < 0.001 and p = 0.0258, respectively).
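The Bonferroni adjustment mentioned above multiplies each raw p-value by the number of comparisons and caps the result at 1. The sketch below shows the standard adjustment for three pairwise comparisons (PW vs. EW, PW vs. VW, EW vs. VW); the raw p-values are hypothetical, since the unadjusted values are not reported in the text.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of tests, capping at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for three pairwise comparisons.
raw = [0.01, 0.04, 0.5]
adjusted = bonferroni_adjust(raw)
print([round(p, 4) for p in adjusted])  # [0.03, 0.12, 1.0]
```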
Coefficient | N | Sig. at p < 0.05 | Mean | St. Dev. | Min | Median | Max |
---|---|---|---|---|---|---|---|
Panel A: Average Monthly Returns | | | | | | | |
Portfolio | 888 | | 1.886 | 5.733 | −22.882 | 2.079 | 31.058 |
Market | 888 | | 0.939 | 4.527 | −22.640 | 1.46 | 13.65 |
CAR24 | 37 | YES | 22.679 | 21.954 | −1.004 | 18.225 | 107.117 |
Panel B: CAPM | | | | | | | |
Intercept | 37 | 9 | 0.937 | 0.864 | −0.102 | 0.775 | 3.887 |
Beta | 37 | 37 (8) | 1.04 | 0.279 | 0.547 | 1 | 1.805 |
R2 | 37 | | 68.19 | 16.34 | 20.89 | 70.14 | 93.58 |
Adjusted R2 | 37 | | 66.74 | 17.08 | 17.29 | 68.78 | 93.29 |
Panel C: Fama-French Three-Factor Model | | | | | | | |
Intercept | 37 | 11 | 1.077 | 0.996 | −0.153 | 0.837 | 4.141 |
Beta | 37 | 37 (12) | 0.973 | 0.278 | 0.381 | 0.942 | 1.923 |
SMB | 37 | 14 | −0.373 | 0.241 | −0.821 | −0.389 | 0.202 |
HML | 37 | 22 | −0.567 | 0.479 | −1.653 | −0.498 | 0.185 |
R2 | 37 | | 78.29 | 13.32 | 30.9 | 80.04 | 94.68 |
Adjusted R2 | 37 | | 75.03 | 15.32 | 20.53 | 77.04 | 93.89 |
Panel D: Fama-French Five-Factor Model | | | | | | | |
Intercept | 37 | 8 | 1.025 | 1.013 | −0.155 | 0.787 | 4.06 |
Beta | 37 | 35 (10) | 0.978 | 0.306 | 0.25 | 0.962 | 1.926 |
SMB | 37 | 13 | −0.358 | 0.293 | −0.979 | −0.323 | 0.192 |
HML | 37 | 8 | −0.375 | 0.528 | −1.371 | −0.422 | 1.562 |
RMW | 37 | 4 | 0.148 | 0.568 | −0.731 | 0.087 | 1.826 |
CMA | 37 | 5 | −0.401 | 0.563 | −1.678 | −0.392 | 0.733 |
R2 | 37 | | 82.44 | 10.47 | 43.95 | 85.3 | 95.49 |
Adjusted R2 | 37 | | 77.56 | 13.37 | 28.38 | 81.21 | 94.23 |
Panel E: Fama-French Three-Factor Model with Momentum and Reversals | | | | | | | |
Intercept | 37 | 9 | 1.065 | 0.955 | 0.005 | 0.844 | 4.005 |
Beta | 37 | 37 (12) | 0.996 | 0.313 | 0.415 | 0.96 | 1.895 |
SMB | 37 | 11 | −0.392 | 0.274 | −1.021 | −0.383 | 0.15 |
HML | 37 | 16 | −0.572 | 0.606 | −2.260 | −0.481 | 0.472 |
MOM | 37 | 0 | 0.048 | 0.214 | −0.429 | 0.018 | 0.462 |
LTRev | 37 | 2 | 0.014 | 0.423 | −1.190 | 0.024 | 0.97 |
R2 | 37 | | 81.08 | 11.91 | 41.63 | 85.25 | 95.18 |
Adjusted R2 | 37 | | 75.82 | 15.22 | 25.42 | 81.15 | 93.84 |
Table 5 (value-weighted portfolios)
Coefficient | N | Sig. at p < 0.05 | Mean | St. Dev. | Min | Median | Max |
---|---|---|---|---|---|---|---|
Panel A: Average Monthly Returns | | | | | | | |
Portfolio | 888 | | 1.473 | 5.384 | −19.398 | 1.667 | 23.725 |
Market | 888 | | 0.939 | 4.527 | −22.640 | 1.46 | 13.65 |
CAR24 | 37 | YES | 12.798 | 15.746 | −11.683 | 12.218 | 50.495 |
Panel B: CAPM | | | | | | | |
Intercept | 37 | 3 | 0.581 | 0.641 | −0.543 | 0.617 | 1.882 |
Beta | 37 | 37 (12) | 0.952 | 0.253 | 0.39 | 0.914 | 1.463 |
R2 | 37 | | 64 | 19.31 | 17.64 | 70.67 | 90.32 |
Adjusted R2 | 37 | | 62.36 | 20.18 | 13.9 | 69.34 | 89.88 |
Panel C: Fama-French Three-Factor Model | | | | | | | |
Intercept | 37 | 6 | 0.68 | 0.776 | −0.287 | 0.53 | 3.544 |
Beta | 37 | 36 (12) | 0.91 | 0.252 | 0.46 | 0.921 | 1.54 |
SMB | 37 | 15 | −0.457 | 0.256 | −1.112 | −0.427 | 0.07 |
HML | 37 | 19 | −0.508 | 0.473 | −1.734 | −0.421 | 0.157 |
R2 | 37 | | 76.76 | 14.47 | 32.17 | 79.57 | 92.1 |
Adjusted R2 | 37 | | 73.27 | 16.64 | 21.99 | 76.51 | 90.92 |
Panel D: Fama-French Five-Factor Model | | | | | | | |
Intercept | 37 | 4 | 0.59 | 0.772 | −0.466 | 0.464 | 3.455 |
Beta | 37 | 36 (8) | 0.925 | 0.267 | 0.445 | 0.923 | 1.495 |
SMB | 37 | 10 | −0.382 | 0.244 | −1.017 | −0.324 | 0.077 |
HML | 37 | 10 | −0.342 | 0.562 | −1.635 | −0.306 | 1.45 |
RMW | 37 | 7 | 0.389 | 0.546 | −0.497 | 0.275 | 2.344 |
CMA | 37 | 8 | −0.306 | 0.633 | −1.717 | −0.150 | 0.753 |
R2 | 37 | | 81.54 | 11.19 | 46.83 | 84 | 93.15 |
Adjusted R2 | 37 | | 76.41 | 14.3 | 32.05 | 79.56 | 91.25 |
Panel E: Fama-French Three-Factor Model with Momentum and Reversals | | | | | | | |
Intercept | 37 | 6 | 0.722 | 0.715 | −0.199 | 0.622 | 3.249 |
Beta | 37 | 35 (10) | 0.927 | 0.293 | 0.436 | 0.955 | 1.472 |
SMB | 37 | 14 | −0.430 | 0.274 | −1.089 | −0.409 | 0.075 |
HML | 37 | 9 | −0.449 | 0.496 | −1.563 | −0.393 | 0.618 |
MOM | 37 | 2 | 0.052 | 0.283 | −0.691 | 0.085 | 0.474 |
LTRev | 37 | 3 | −0.013 | 0.37 | −1.037 | −0.002 | 0.722 |
R2 | 37 | | 79.95 | 13.57 | 37.04 | 85.42 | 92.47 |
Adjusted R2 | 37 | | 74.38 | 17.36 | 19.55 | 81.37 | 90.37 |
Figure 6. Average Cumulative Abnormal Returns Over 24 Months by Weighting Scheme.
Note: Figure 6 shows the ACAR at the end of each of the 24 months of the portfolio holding periods for the price-weighted (PW) approach primarily analyzed in this study, along with the equal-weighted (EW) and value-weighted (VW) approaches used as robustness checks. The shaded region represents a 95% confidence interval surrounding the ACAR.
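The pairwise ACAR comparisons across the three weighting schemes are reported with Bonferroni-adjusted p-values. As a reminder of what that adjustment does, here is a minimal sketch, assuming the standard Bonferroni rule (multiply each raw p-value by the number of comparisons and cap at 1). The raw p-values below are hypothetical placeholders, not values from this study.

```python
from typing import List, Sequence


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Multiply each raw p-value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for the three pairwise comparisons
# (PW vs. EW, PW vs. VW, EW vs. VW); these are NOT the study's values.
raw = [0.2, 0.0001, 0.008]
print(bonferroni(raw))
```

With three comparisons, a raw p-value must fall below roughly 0.0167 for the adjusted value to stay under 0.05.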
For both portfolios on a risk-adjusted basis, the intercepts and the coefficients on the three main risk factors had the same sign and significance as those of the price-weighted portfolio in all four models. For the equal-weighted portfolios (Table 6), year fixed effects were significant, and beta was not significantly different from one in any of the four models. For the value-weighted portfolios (Table 7), year fixed effects were significant only in the FF3m model, while beta differed significantly from one in the FF5 and FF3m models.
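The "Reject β = 1" rows in the regression tables rest on a test of the market beta against one rather than against zero. A minimal sketch of that test for the single-factor CAPM case is given below. It assumes plain OLS with classical (homoskedastic) standard errors and omits the year fixed effects and additional factors used in the paper; the function name and the toy inputs are hypothetical.

```python
import math
from typing import Sequence, Tuple


def capm_beta_test(excess_portfolio: Sequence[float],
                   excess_market: Sequence[float]) -> Tuple[float, float, float]:
    """OLS of portfolio excess returns on market excess returns.

    Returns (alpha, beta, t-statistic for H0: beta = 1).
    """
    n = len(excess_market)
    x_bar = sum(excess_market) / n
    y_bar = sum(excess_portfolio) / n
    sxx = sum((x - x_bar) ** 2 for x in excess_market)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(excess_market, excess_portfolio))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    # Residual sum of squares and the classical standard error of beta.
    ssr = sum((y - alpha - beta * x) ** 2
              for x, y in zip(excess_market, excess_portfolio))
    se_beta = math.sqrt(ssr / (n - 2) / sxx)
    return alpha, beta, (beta - 1.0) / se_beta


# Toy data for illustration only.
alpha, beta, t_stat = capm_beta_test([1.1, 1.9, 3.2, 3.8], [1.0, 2.0, 3.0, 4.0])
print(alpha, beta, t_stat)
```

The t-statistic would be compared with a critical value from the t distribution with n − 2 degrees of freedom; a small absolute value means β = 1 cannot be rejected.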
5.4.2. Rolling One-Year Portfolios
As another robustness check, I assume the investor takes a longer-term perspective. Instead of holding the suggested stocks for two years and then divesting, the investor creates a portfolio on July 1, 1985 based on ChatGPT’s initial suggestions, rebalances it every July 1 using ChatGPT’s new suggestions, and holds the final portfolio mix until June 30, 2022 (one year after the final selection on July 1, 2021). Panel B of Table 3 shows the results of running the four regressions on this investment plan. The signs and significance of the intercepts and all risk factor coefficients in all four models are unchanged. However, year fixed effects are no longer significant in any of the models, and only one of the betas is significantly different from 1 at p < 0.05 (FF3m: β = 1.127).
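The two holding rules imply different numbers of monthly observations, which is where the N of 888 and 444 in the regression tables of this section comes from (37 × 24 and 37 × 12). The sketch below enumerates the holding months under each rule; it assumes monthly observations starting each July, and the function name is hypothetical.

```python
from typing import Dict, List, Tuple


def holding_months(first_year: int, last_year: int,
                   hold_months: int) -> Dict[int, List[Tuple[int, int]]]:
    """Map each July 1 selection year to its (year, month) holding window."""
    windows: Dict[int, List[Tuple[int, int]]] = {}
    for year in range(first_year, last_year + 1):
        months: List[Tuple[int, int]] = []
        y, m = year, 7  # portfolios are formed on July 1
        for _ in range(hold_months):
            months.append((y, m))
            m += 1
            if m > 12:
                y, m = y + 1, 1
        windows[year] = months
    return windows


overlapping = holding_months(1985, 2021, 24)  # two-year holds overlap
rolling = holding_months(1985, 2021, 12)      # rebalanced every July 1

print(sum(len(w) for w in overlapping.values()))  # portfolio-months, 24-month rule
print(sum(len(w) for w in rolling.values()))      # portfolio-months, 12-month rule
print(rolling[2021][-1])                          # final month held
```

Under the rebalancing rule each calendar month belongs to exactly one portfolio, so the monthly returns form a single non-overlapping series.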
Table 6 (equal-weighted portfolios)
Coefficient | CAPM | FF3 | FF5 | FF3m |
---|---|---|---|---|
Panel A: Overlapping Two-Year Holding Periods | | | | |
Intercept | 0.936 | 0.780 | 0.938 | 0.768 |
Beta | 1.035∗∗∗ | 1.036∗∗∗ | 1.006∗∗∗ | 1.039∗∗∗ |
SMB | | −0.277∗∗∗ | −0.236∗∗∗ | −0.321∗∗∗ |
HML | | −0.418∗∗∗ | −0.310∗∗∗ | −0.455∗∗∗ |
RMW | | | 0.113∗ | |
CMA | | | −0.320∗∗∗ | |
MOM | | | | 0.030 |
LTRev | | | | 0.121∗ |
Year Fixed Effects | YES | YES | YES | YES |
Reject β = 1 | NO | NO | NO | NO |
N | 888 | 888 | 888 | 888 |
R2 | 69.45 | 75.96 | 76.88 | 76.23 |
Adjusted R2 | 68.12 | 74.86 | 75.76 | 75.08 |
Panel B: Portfolio Rebalanced Every July 1 | | | | |
Intercept | 0.622 | 0.472 | 0.335 | 0.633 |
Beta | 1.041∗∗∗ | 1.033∗∗∗ | 1.019∗∗∗ | 1.048∗∗∗ |
SMB | | −0.245∗∗∗ | −0.179∗∗ | −0.302∗∗∗ |
HML | | −0.433∗∗∗ | −0.367∗∗∗ | −0.463∗∗∗ |
RMW | | | 0.176∗ | |
CMA | | | −0.239∗ | |
MOM | | | | 0.079∗ |
LTRev | | | | 0.157∗ |
Year Fixed Effects | YES | YES | YES | YES |
Reject β = 1 | NO | NO | NO | NO |
N | 444 | 444 | 444 | 444 |
R2 | 70.09 | 75.80 | 76.60 | 76.47 |
Adjusted R2 | 67.36 | 73.46 | 74.21 | 74.07 |
This same robustness check was performed on the equal- and value-weighted portfolios. For both the equal-weighted portfolio (Table 6, Panel B) and the value-weighted portfolio (Table 7, Panel B), the signs of the intercepts and all risk factor coefficients in all four models are unchanged, while the only change in significance is that the momentum factor for the equal-weighted portfolio in the FF3m model became significant at p < 0.05. For the equal-weighted portfolio, year fixed effects remain significant for all four models, and beta remains statistically indistinguishable from 1 at p < 0.05 for all four models. None of the year fixed effects are statistically significant for the value-weighted portfolio, and beta is indistinguishable from 1 at p < 0.05 for all four models.
Table 7 (value-weighted portfolios)
Coefficient | CAPM | FF3 | FF5 | FF3m |
---|---|---|---|---|
Panel A: Overlapping Two-Year Holding Periods | | | | |
Intercept | 0.444 | 0.212 | 0.359 | 0.190 |
Beta | 0.959∗∗∗ | 0.975∗∗∗ | 0.948∗∗∗ | 0.981∗∗∗ |
SMB | | −0.370∗∗∗ | −0.320∗∗∗ | −0.397∗∗∗ |
HML | | −0.411∗∗∗ | −0.308∗∗∗ | −0.428∗∗∗ |
RMW | | | 0.136∗∗ | |
CMA | | | −0.320∗∗∗ | |
MOM | | | | 0.032 |
LTRev | | | | 0.072 |
Year Fixed Effects | NO | NO | NO | YES |
Reject β = 1 | NO | NO | YES | YES |
N | 888 | 888 | 888 | 888 |
R2 | 67.54 | 76.35 | 77.49 | 76.51 |
Adjusted R2 | 66.13 | 75.27 | 76.40 | 75.38 |
Panel B: Portfolio Rebalanced Every July 1 | | | | |
Intercept | 0.107 | −0.082 | −0.215 | 0.035 |
Beta | 0.951∗∗∗ | 0.964∗∗∗ | 0.945∗∗∗ | 0.973∗∗∗ |
SMB | | −0.353∗∗∗ | −0.285∗∗∗ | −0.392∗∗∗ |
HML | | −0.400∗∗∗ | −0.318∗∗∗ | −0.422∗∗∗ |
RMW | | | 0.180∗∗ | |
CMA | | | −0.281∗∗ | |
MOM | | | | 0.052 |
LTRev | | | | 0.108 |
Year Fixed Effects | NO | NO | NO | NO |
Reject β = 1 | NO | NO | NO | NO |
N | 444 | 444 | 444 | 444 |
R2 | 68.36 | 75.91 | 77.04 | 76.26 |
Adjusted R2 | 65.48 | 73.58 | 74.70 | 73.84 |
6. Conclusion
ChatGPT’s portfolios demonstrate a cautious, top-down approach that prioritizes stability over aggressive outperformance, as evidenced by its consistent selection of large growth stocks in resilient sectors like technology and healthcare. This bias toward “market leaders” with strong financial health (Appendix A) likely stems from the model’s training on historical data emphasizing blue-chip firms, leading to positive but insignificant alphas across models. For instance, the negative size and book-to-market betas (−0.426 and −0.793 on average) suggest that ChatGPT avoids small and value stocks prone to volatility, which insulated portfolios during downturns (e.g., no negative alphas in 2008–2009) but limited upside in bull markets. The 26% turnover reflects adaptive responses to economic contexts, shifting from energy in the 1980s to tech in the 1990s. Yet this conservatism meant that only about one out of every four years showed significant alphas, often in recovery periods like 2003–2004.
These patterns are consistent with ChatGPT’s limitations as a generalist LLM. It excels at synthesizing broad trends but struggles with nuanced, data-driven predictions without real-time access, which may explain why its abnormal returns are not consistent. Compared to similar studies, this retrospective test over 37 years reveals more tempered performance than short-term, news-based analyses. For example, Lopez-Lira and Tang (2024) and Chen et al. (2025) report more substantial returns from complex LLMs applied to news headlines, but their daily and 30-day horizons capture event-driven spikes that are absent from my two-year holding periods. Similarly, Lu et al. (2023) find positive alphas post-announcements but note out-of-sample weakness, mirroring my insignificant long-run results. In contrast to Ko and Lee (2023), who achieve higher Sharpe ratios via mean-variance optimization, ChatGPT’s unoptimized selections underperform the efficient frontier (Romanko et al., 2023), underscoring its role as a heuristic tool rather than a quantitative optimizer.
Overall, ChatGPT could serve as a viable, accessible stock-picking tool for individual investors, particularly those favoring price-weighted strategies, but it does not reliably generate abnormal returns. Practically, its portfolios’ avoidance of significant underperformance, earning positive CARs (27.242% over 24 months) without negative alphas, suggests utility for risk-averse, unsophisticated investors seeking diversified exposure to large growth sectors like technology and healthcare. Small investors might use it to democratize advice, bypassing costly professionals, though integration with tools for real-time data or optimization could enhance outcomes.
However, limitations temper these implications. Despite prompts restricting knowledge, potential look-ahead bias arises from GPT-4’s post-2021 training. The model may inadvertently draw on patterns learned from later data. Prompting sensitivity (Section 3) also biases selections toward familiar sectors, and the study assumes no transaction costs or taxes, understating real-world frictions. Results may not generalize beyond U.S. stocks or two-year horizons.
Future research could extend this by testing hybrid human-AI strategies as suggested by Fatouros et al. (2024), incorporating real-time APIs for dynamic rebalancing, or extending the works of Kalyan (2024) and Chen et al. (2025) by comparing GPT-4 to specialized financial LLMs across global markets. Exploring ethical implications and potential solutions, such as those suggested by Khan and Umer (2024) and Nie et al. (2024), would further highlight its transformative potential in Fintech.
Acknowledgements. The author would like to thank Zach McGurk, Mary Becker, Kevin Brady, Markus Schmid, anonymous reviewers, and the 2024 AEF Annual Meeting participants for their insightful and constructive comments.
Declarations. The author has no financial or non-financial interests directly or indirectly related to the work submitted for publication.
Funding: This research received no external funding.
Data Availability Statement: The manuscript contains data that will be made available upon reasonable request.
Conflicts of Interest: The author declares no conflict of interest.
Appendix A
The key factors listed for 1985 were as follows.
Economic Context: The U.S. economy was strong, with policies under President Reagan focusing on tax cuts, deregulation, and increased defense spending. Inflation was under control, and interest rates were trending downward.
Key Sectors for Growth:
Technology: The tech industry was expanding rapidly, especially in areas like personal computing and software.
Healthcare and Pharmaceuticals: Driven by medical advancements and an aging population.
Consumer Goods: Increased consumer spending was boosting this sector.
Energy: The 1980s saw significant developments in the energy sector.
Stock Selection Criteria:
Market Leadership: Companies that were industry leaders typically offered stability.
Growth Potential: Firms in growing industries or those showing promising growth.
Financial Health: Companies with strong balance sheets and good cash flow.
Dividend Yield: An important factor for total return.
And the ten stocks selected for the initial portfolio (in no particular order) were:
IBM (International Business Machines Corp.): A leader in the burgeoning field of personal computers and technology.
Merck & Co.: A major player in the pharmaceutical industry.
Coca-Cola: Benefiting from increased consumer spending.
General Electric (GE): A diversified conglomerate with interests in technology and consumer goods.
Intel: A key player in the semiconductor industry.
Procter & Gamble: A leading company in consumer goods.
3M: Known for its innovation and diverse product line.
McDonald’s: Reflecting the growth in the fast-food industry.
Walmart: Rapidly expanding in the retail sector.
Exxon: Representing the energy sector, which was vital in the 1980s.
With each year’s request, ChatGPT-4 provided the economic context and key sectors for growth along with its top ten selections. However, the stock selection criteria were missing from these later responses, which led me to ask about their absence. ChatGPT-4 responded that “the same primary stock selection criteria are consistently applied” each year and that “These criteria provide a balanced approach, aiming to ensure a mix of stability, growth potential, and diversification across various sectors.”
Footnotes
https://finance.yahoo.com/quote/IBM/history?period1=488505600&period2=489542400&interval=1d&filter=history&frequency=1d&includeAdjustedClose=true.
http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html.
Calculated by adding the variables labeled Rm − Rf and Rf.
Its IPO was in March 1986, explaining its exclusion from 1985 and 1986 portfolios. The only other year of exclusion was 2008, during the financial crisis of that period.
CAPM and FF3 are the main models being analyzed. FF5 and FF3m results are included as robustness checks on the main results.