Synthetic NYSE Investment History
Stock & Market Data
Free
About
A collection of over 400,000 simulated investment transactions, designed for training AI models in financial prediction. The data combines randomly generated investments in New York Stock Exchange stocks over a decade with financial indicators and calculated volatilities. Each record carries a binary label indicating whether the transaction was a "GOOD" or "BAD" investment. The dataset suits classification and regression models aimed at identifying drivers of investment success and informing portfolio management strategies.
Columns
The dataset includes 25 distinct fields detailing transaction specifics and financial health metrics:
- ID: Unique identifier for the simulated investment record.
- company: The ticker acronym for the stock (e.g., AMZN, M).
- sector: The industry sector of the company (e.g., RETAIL, TECH).
- horizon (days): The elapsed time, in days, between the date of purchase and the date of sale.
- amount: The monetary value invested in pounds sterling.
- date_BUY_fix / date_SELL_fix: The specific purchase and sale dates for the transaction.
- price_BUY / price_SELL: The stock price at the time of purchase and sale, respectively.
- Volatility_Buy / Volatility_sell: Measures of price fluctuation observed when the stock was bought and sold.
- Sharpe Ratio: A measure of the investment's return adjusted for its risk.
- expected_return (yearly): The annual anticipated rate of return.
- inflation: The rate of inflation experienced during the investment horizon.
- nominal_return: The raw return achieved prior to adjusting for inflation.
- investment: A categorical label classifying the transaction outcome as 'GOOD' or 'BAD'.
- ESG_ranking: Environmental, Social, and Governance score, relevant for sustainable investing models.
- PE_ratio / EPS_ratio / PS_ratio / PB_ratio: Key valuation ratios (Price-to-Earnings, Earnings Per Share, Price-to-Sales, Price-to-Book).
- NetProfitMargin_ratio: The yearly net profit percentage of the company.
- current_ratio: A measure of liquidity derived from the most recent profit report.
- roa_ratio / roe_ratio: Profitability ratios (Return on Assets, Return on Equity).
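To make the relationships between the date, price, and return fields concrete, here is a minimal Python sketch. It takes the horizon to be the day difference between sale and purchase dates, as the field description states. As an assumption not confirmed by this listing, it approximates a nominal return with the simple price return; the inflation adjustment uses the standard Fisher relation. All function names are illustrative.

```python
from datetime import date


def horizon_days(buy: date, sell: date) -> int:
    """Elapsed days between purchase and sale, as in 'horizon (days)'."""
    return (sell - buy).days


def simple_return(price_buy: float, price_sell: float) -> float:
    """Simple price return.

    Assumption: the listing does not state how nominal_return was
    actually computed, so this is only an approximation of it.
    """
    return (price_sell - price_buy) / price_buy


def real_return(nominal: float, inflation: float) -> float:
    """Inflation-adjusted return via the Fisher relation."""
    return (1 + nominal) / (1 + inflation) - 1
```

Comparing these derived values against the stored horizon and nominal_return columns is a quick way to check whether the assumed definitions match the file.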
Distribution
The data is delivered in a single file, final_transactions_dataset.csv, with a total size of 97.17 MB. It contains 25 columns and captures over 400,000 individual investment records. The data is structured around buy and sell transactions, tracking investment amounts ranging from £50 up to £50,000, and covering horizons up to 720 days.
Usage
This resource is perfectly suited for developing advanced algorithms in finance and quantitative analysis. Ideal use cases include:
- Building machine learning models to predict future investment outcomes based on volatility and fundamental financial ratios.
- Conducting deep statistical analysis on how financial health metrics (like ROE and PE ratio) correlate with short- and long-term stock performance.
- Training AI systems designed for automated trading strategies or risk assessment in the stock market.
- Simulating the impact of investment horizon and initial volatility on nominal returns.
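Before building a model for any of these use cases, it can help to look at how the GOOD/BAD label is distributed across groups. The sketch below uses only the Python standard library and assumes the CSV header uses the column names listed above (for example `sector` and `investment`); the helper names `load_rows` and `good_rate_by` are hypothetical.

```python
import csv
from collections import defaultdict


def load_rows(path):
    """Yield each CSV record as a dict keyed by the header names."""
    with open(path, newline="", encoding="utf-8") as f:
        yield from csv.DictReader(f)


def good_rate_by(rows, key):
    """Return {group: share of rows labelled GOOD} for the column `key`."""
    totals = defaultdict(int)
    goods = defaultdict(int)
    for row in rows:
        group = row[key]
        totals[group] += 1
        # Assumption: labels are the strings 'GOOD' and 'BAD'.
        if row["investment"].strip().upper() == "GOOD":
            goods[group] += 1
    return {group: goods[group] / totals[group] for group in totals}
```

With the file downloaded locally, `good_rate_by(load_rows("final_transactions_dataset.csv"), "sector")` would give the per-sector share of GOOD outcomes, a simple baseline against which a trained classifier can be compared.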
Coverage
The investment simulation is focused geographically on the New York Stock Exchange. The underlying market data spans a decade, with specific simulated purchase dates falling between October 2013 and October 2018. Sale dates extend through to September 2020. The included transactions represent various corporate sectors, including prominent representation from RETAIL and TECH industries. Data is expected to be updated annually.
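The stated coverage can serve as a sanity check on loaded records. The following Python sketch flags transactions whose dates fall outside the described windows. The listing gives months only, so the exact day boundaries below are assumptions, and the function name is illustrative.

```python
from datetime import date

# Assumed day-level bounds; the listing states only months.
FIRST_BUY = date(2013, 10, 1)
LAST_BUY = date(2018, 10, 31)
LAST_SELL = date(2020, 9, 30)


def within_coverage(buy: date, sell: date) -> bool:
    """True if a buy/sell pair fits the coverage described above."""
    return FIRST_BUY <= buy <= LAST_BUY and buy <= sell <= LAST_SELL
```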
License
CC0: Public Domain
Who Can Use It
- Data Scientists: Utilising the labelled 'investment' column to solve binary classification problems in finance.
- Quantitative Researchers: Analysing the influence of factors like inflation and Sharpe ratios on simulated portfolio returns.
- Financial Technology Developers: Creating proof-of-concept models for automated investment advising.
- Students and Academics: Exploring the fundamentals of stock valuation and market efficiency, in line with the core motivation of increasing youth involvement in investing.
Dataset Name Suggestions
- NYSE AI Investment Predictor
- Financial Ratios and Volatility Dataset
- AI Stock Transaction Simulator
- Synthetic NYSE Investment History
Attributes
Original Data Source: Synthetic NYSE Investment History
