Key Metrics
Business Context
Problem & Objective
Business Question
Can a company's financial indicators help anticipate whether its stock will experience a positive price variation, while simultaneously revealing groups of companies with similar financial behaviors?
Business Value
This project can function as an initial financial screening tool. It does not aim to predict exact prices, but to classify companies based on financial signals that could be associated with positive or negative price variations. It supports investment analysis processes, company prioritization, and exploration of financial profiles by sector.
Methodology
STAR Framework
Situation
Database of 11,566 records with 173 financial variables of US stocks (2014–2018) across 11 sectors. Need to evaluate whether these indicators can classify price variations.
Task
Build an analytical pipeline to classify stocks with positive/negative variation using supervised models and complement with unsupervised segmentation.
Action
Data preparation, scaling, One-Hot encoding of Sector, temporal split 70/15/15. Comparison of Logistic Regression, Random Forest, SVM RBF, and XGBoost. Segmentation with PCA, t-SNE, UMAP, and K-Means.
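The 70/15/15 temporal split mentioned above could look roughly like the sketch below. It assumes a `Year` column and a cut by row position after sorting chronologically; both the column name and the `temporal_split` helper are hypothetical, and the project may have split differently (for example, by whole years).

```python
import pandas as pd


def temporal_split(df, time_col="Year", train_frac=0.70, val_frac=0.15):
    """Split rows chronologically: oldest rows go to train, newest to test.

    Hypothetical helper; `time_col` is an assumed column name.
    """
    ordered = df.sort_values(time_col, kind="stable").reset_index(drop=True)
    n = len(ordered)
    train_end = round(n * train_frac)
    val_end = round(n * (train_frac + val_frac))
    train = ordered.iloc[:train_end]
    val = ordered.iloc[train_end:val_end]
    test = ordered.iloc[val_end:]
    return train, val, test


# Toy frame with made-up values, four rows per year from 2014 to 2018.
toy = pd.DataFrame({
    "Year": [2014 + i // 4 for i in range(20)],
    "Class": [i % 2 for i in range(20)],
})

train, val, test = temporal_split(toy)
print(len(train), len(val), len(test))
```

A positional cut like this can place rows from the same year in two adjacent sets. If that matters, cutting on year boundaries is an alternative, at the cost of less exact proportions.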
Result
Random Forest achieved AUC ~0.62. XGBoost offered better F1 balance (~0.22). t-SNE/UMAP clusters revealed consistent groups of financial profiles.
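For illustration, a comparison on AUC and F1 like the one summarized above might be set up as follows. This is a sketch on synthetic data from `make_classification`, not the project's dataset, so the printed scores say nothing about the reported results. XGBoost is left out to keep the example dependent on scikit-learn only, and a random stratified split stands in for the temporal split.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data (assumption: the real features are financial indicators).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=4, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM RBF": SVC(kernel="rbf", probability=True, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # AUC uses the predicted probability of class 1; F1 uses hard predictions.
    proba = model.predict_proba(X_val)[:, 1]
    scores[name] = {
        "auc": roc_auc_score(y_val, proba),
        "f1": f1_score(y_val, model.predict(X_val)),
    }

for name, s in scores.items():
    print(f"{name}: AUC={s['auc']:.3f}  F1={s['f1']:.3f}")
```

Reporting both metrics side by side makes it easier to see cases where one model ranks companies better (higher AUC) while another gives a better precision/recall balance at the default threshold (higher F1).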
Dataset
Data Overview
Dataset Information
- Records: 11,566
- Variables: 173 financial indicators
- Period: 2014–2018
- Sectors: 11 categories
- Target: Class (positive/negative variation)
- Reference: Price Var (continuous)
Class Distribution
- Class 1 (Positive): ~56.3%
- Class 0 (Negative): ~43.7%
- Relatively balanced distribution
Note: Price Var behavior changes significantly by year: 2017 shows a negative average, while 2015, 2016, and 2018 show positive averages.
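Checks like the class shares and the yearly Price Var averages can be computed with a couple of pandas calls. The sketch below uses a toy frame with made-up values. The `Year` column name and the rule that Class is 1 when Price Var is positive are assumptions for illustration.

```python
import pandas as pd

# Toy data with made-up values; not the project's dataset.
toy = pd.DataFrame({
    "Year": [2015, 2015, 2016, 2016, 2017, 2017],
    "Price Var": [10.0, -2.0, 5.0, 3.0, -6.0, 1.0],
})

# Assumed labeling rule: Class = 1 for a positive price variation, else 0.
toy["Class"] = (toy["Price Var"] > 0).astype(int)

# Share of each class across all rows.
class_share = toy["Class"].value_counts(normalize=True)

# Average price variation per year, to spot years that behave differently.
mean_by_year = toy.groupby("Year")["Price Var"].mean()

print(class_share)
print(mean_by_year)
```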
Process
Methodology
Price Var and Class are excluded from the feature set to avoid data leakage. Numerical variables are scaled. Sector is One-Hot encoded.
Price Var deciles.
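The preparation steps in this section (keeping Price Var and Class out of the features, scaling numerical variables, One-Hot encoding Sector) can be sketched with scikit-learn's `ColumnTransformer`. The toy columns `Revenue` and `EPS`, the string sector labels, and the choice of `StandardScaler` are assumptions. The source only says "scaling" and does not name the scaler.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data with made-up values and hypothetical indicator names.
df = pd.DataFrame({
    "Revenue": [100.0, 250.0, 80.0, 310.0],
    "EPS": [1.2, 3.4, -0.5, 2.1],
    "Sector": ["Tech", "Energy", "Tech", "Utilities"],
    "Price Var": [12.5, -3.1, 4.0, -8.2],
    "Class": [1, 0, 1, 0],
})

# Class is the label; Price Var determines it, so neither may be a feature.
y = df["Class"]
X = df.drop(columns=["Price Var", "Class"])

numeric_cols = X.select_dtypes(include="number").columns.tolist()

preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric_cols),                      # assumed scaler
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Sector"]),  # one column per sector
])

# In a real pipeline, fit on the training split only, then transform val/test.
X_ready = preprocess.fit_transform(X)
print(X_ready.shape)
```

Fitting the transformer on the training split alone keeps validation and test statistics out of the scaling parameters, which is a second, subtler form of leakage.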
Findings
Key Insights
Visualizations
Recommended Portfolio Visuals
Project Presentation
Google Slides Presentation
A detailed walkthrough of the analysis, methodology, and key findings using the STAR framework.
Limitations
- The stock market is influenced by external factors not always reflected in financial statements.
- AUC ~0.62 shows the model identifies signals but has moderate predictive capacity.
- Sector appears numerically encoded; the final version should map to real sector names.
- SHAP or feature importance is recommended to improve model interpretability.
Next Steps
- Map real sector names.
- Add confusion matrix and ROC curve.
- Use SHAP for XGBoost explainability.
- Build interactive dashboard with Tableau or Power BI.
- Publish clean notebook on GitHub with README, dataset, and visuals.
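As a starting point for the confusion matrix and ROC curve listed above, a minimal sketch with scikit-learn follows. The labels and scores are made up, and the 0.5 decision threshold is an assumption. In the project they would come from the held-out test set and the chosen model.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve

# Made-up true labels and predicted probabilities for class 1.
y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.55, 0.6])

# Hard predictions at an assumed threshold of 0.5.
y_pred = (y_score >= 0.5).astype(int)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)

# Points of the ROC curve and the area under it.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
auc = roc_auc_score(y_true, y_score)

print(cm)
print(f"AUC = {auc:.4f}")
```

The `fpr` and `tpr` arrays can be passed directly to a plotting library to draw the curve.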