Dividend Prediction with LSTM

Forecasts future dividends with a sequence model (LSTM) trained on historical market data, with an emphasis on clean preprocessing and time-aware evaluation.

Role: Data Scientist
Stack: Python, pandas, NumPy, TensorFlow/Keras
Source: yfinance
Task: Time-series regression

Overview

Built an end-to-end pipeline that fetches market data, prepares supervised sequences, and trains an LSTM to predict upcoming dividend values. The focus was on reliable preprocessing and honest, time-respecting validation.

Data Cleaning

  • Loaded price/dividend data with yfinance; unified to a single datetime index.
  • Handled missing values and ensured the series were sorted and aligned by date.
  • Constructed the target dividend series and filtered to the relevant ticker/time span.
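The cleaning steps above could look roughly like the following. This is a minimal sketch, not the project's actual code: the function names `clean_history` and `dividend_events` are hypothetical, and the fill rules (a missing dividend means no payout, a missing price is carried forward) are assumptions. The input is assumed to have the `Close` and `Dividends` columns that yfinance's `Ticker.history()` returns.

```python
import pandas as pd


def clean_history(raw: pd.DataFrame) -> pd.DataFrame:
    """Return a date-sorted frame with 'Close' and 'Dividends' columns.

    `raw` is assumed to look like the output of yfinance, e.g.
        raw = yf.Ticker("KO").history(period="max")   # "KO" is a placeholder ticker
    """
    df = raw[["Close", "Dividends"]].copy()

    # Unify to a single tz-naive, date-only DatetimeIndex.
    idx = pd.to_datetime(df.index)
    if idx.tz is not None:
        idx = idx.tz_localize(None)
    df.index = idx.normalize()

    # Drop repeated dates (keeping the last row seen), then sort by date.
    df = df[~df.index.duplicated(keep="last")].sort_index()

    # Assumption: a missing dividend means "no payout that day";
    # a missing price is carried forward from the previous trading day.
    df["Dividends"] = df["Dividends"].fillna(0.0)
    df["Close"] = df["Close"].ffill()

    # Rows before the first known price cannot be forward-filled; drop them.
    return df.dropna(subset=["Close"])


def dividend_events(df: pd.DataFrame) -> pd.Series:
    """Target series: only the dates on which a dividend was actually paid."""
    return df.loc[df["Dividends"] > 0, "Dividends"]
```

Filtering to a time span then reduces to a label slice on the sorted index, for example `clean_history(raw).loc["2010":"2023"]`.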

Exploratory Analysis

  • Plotted dividend history and matching price series to understand payout patterns.
  • Reviewed basic descriptive statistics and missing-value patterns.
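As an illustration of the checks described above, the sketch below combines descriptive statistics with missing-value counts and measures the spacing between payouts. The helper names `summarize` and `payout_gaps_days` are hypothetical, and the input is assumed to be a date-indexed frame with `Close` and `Dividends` columns.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Descriptive statistics per numeric column, plus missing-value counts."""
    stats = df.describe().T                      # one row per numeric column
    stats["n_missing"] = df.isna().sum()         # aligned on column name
    stats["pct_missing"] = df.isna().mean() * 100
    return stats


def payout_gaps_days(dividends: pd.Series) -> pd.Series:
    """Days elapsed between consecutive dividend payments."""
    paid = dividends[dividends > 0]
    # The first payment has no predecessor, so its gap is dropped.
    return paid.index.to_series().diff().dt.days.dropna()
```

Roughly constant gaps (about 91 days for a quarterly payer) suggest a regular schedule, while irregular gaps point to special dividends or missing records.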

Modeling

  • Framed the problem as supervised sequences (look-back window → next dividend value).
  • Created training samples by sliding windows over the historical series.
  • Split data chronologically (train → validation → test) with no shuffling, so evaluation periods always come after the training period.
  • Implemented an LSTM in Keras and trained on the prepared sequences.
  • Evaluated on a held-out time segment and visualized predictions vs. actuals.
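The modeling steps above can be sketched as follows. This is an illustrative outline under stated assumptions rather than the project's actual implementation: the function names, the 70/15/15 split fractions, the single-feature input, and the layer size of 32 units are all hypothetical choices. The TensorFlow import is deferred so the windowing helpers can be used on their own.

```python
import numpy as np


def make_windows(series, look_back):
    """Slide a window over a 1-D series.

    Each row of X holds `look_back` consecutive values and the matching
    entry of y is the value that immediately follows that window.
    """
    values = np.asarray(series, dtype="float32")
    if len(values) <= look_back:
        raise ValueError("series must be longer than look_back")
    n_samples = len(values) - look_back
    X = np.stack([values[i:i + look_back] for i in range(n_samples)])
    y = values[look_back:]
    # Keras LSTM layers expect input shaped (samples, timesteps, features).
    return X[..., np.newaxis], y


def time_split(X, y, train_frac=0.7, val_frac=0.15):
    """Chronological split: earliest samples train, latest samples test."""
    n = len(X)
    i_train = int(n * train_frac)
    i_val = int(n * (train_frac + val_frac))
    return (
        (X[:i_train], y[:i_train]),
        (X[i_train:i_val], y[i_train:i_val]),
        (X[i_val:], y[i_val:]),
    )


def build_lstm(look_back, units=32):
    """Small LSTM regressor; architecture and sizes are assumptions."""
    from tensorflow import keras  # deferred: only needed when training

    model = keras.Sequential([
        keras.Input(shape=(look_back, 1)),
        keras.layers.LSTM(units),
        keras.layers.Dense(1),  # single output: the next dividend value
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model
```

Training would then call `model.fit(X_train, y_train, validation_data=(X_val, y_val), shuffle=False)` and score the model once on the test segment. If the inputs are scaled, fitting the scaler on the training segment only keeps information from later periods out of training.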

What I Focused On

  • Reliability: Clean alignment and consistent indexing
  • Reproducibility: Deterministic splits and saved seeds
  • Readability: Clear plots comparing predicted vs. true dividends
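One possible way to fix and record seeds is sketched below. The helper names `set_seeds` and `save_run_config` and the JSON file layout are hypothetical. TensorFlow is seeded only if it is installed, and some GPU operations may remain non-deterministic even with fixed seeds.

```python
import json
import random

import numpy as np


def set_seeds(seed):
    """Seed Python, NumPy and (if available) TensorFlow; return a record."""
    random.seed(seed)
    np.random.seed(seed)
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)
    except ImportError:
        pass  # TensorFlow not installed: only Python and NumPy are seeded
    return {"seed": seed}


def save_run_config(config, path):
    """Write the seed and any other run settings next to the model outputs."""
    with open(path, "w") as fh:
        json.dump(config, fh, indent=2)
```

Calling `save_run_config(set_seeds(42), "run_config.json")` at the start of a run keeps the seed alongside the results; the chronological split needs no seed because it involves no randomness.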