PLAIcraft Newsletter Subscription — KNN Classification
UBC DSCI 100 group project predicting newsletter subscription from gameplay minutes and age. Emphasis on careful data cleaning, simple EDA, and evaluation on a held-out test split.
Overview
We trained a K-nearest neighbors classifier to predict whether a PLAIcraft player subscribed to the newsletter using two predictors: minutes played and age. The project focused on transparent preprocessing and evaluation.
Data Cleaning
- Loaded `players.csv` and selected `subscribe`, minutes played, and `Age`.
- Coerced `subscribe` to a factor; removed rows with missing values in the selected fields.
- Created a 75/25 train/test split stratified on `subscribe`.
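The cleaning steps above can be sketched as follows. This is a minimal illustration in standard-library Python, not the project's actual code: the inline CSV is a made-up stand-in for `players.csv`, the column name `minutes_played` is an assumption, and `load_clean` and `stratified_split` are hypothetical helpers.

```python
import csv
import io
import random
from collections import defaultdict

# Made-up stand-in for players.csv; the column name minutes_played is assumed.
SAMPLE_CSV = """subscribe,minutes_played,Age
TRUE,120,21
FALSE,0,17
TRUE,45,
TRUE,300,25
FALSE,15,19
TRUE,60,22
FALSE,,30
TRUE,10,18
TRUE,500,24
FALSE,5,20
"""

def load_clean(text):
    """Keep the three selected columns and drop rows with missing values."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        values = [rec["subscribe"], rec["minutes_played"], rec["Age"]]
        if any(v is None or v.strip() == "" for v in values):
            continue  # row has a missing value in a selected field
        rows.append({
            # Two-level categorical label, stored as a bool
            "subscribe": rec["subscribe"].strip().upper() == "TRUE",
            "minutes_played": float(rec["minutes_played"]),
            "Age": float(rec["Age"]),
        })
    return rows

def stratified_split(rows, train_frac=0.75, seed=1):
    """Split each class separately so both parts keep a similar class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row["subscribe"]].append(row)
    train, test = [], []
    for group in by_class.values():
        rng.shuffle(group)
        cut = round(len(group) * train_frac)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test

players = load_clean(SAMPLE_CSV)
train, test = stratified_split(players)
print(len(players), len(train), len(test))
```

Splitting within each class, rather than over the whole table, keeps the subscriber share roughly equal in the training and test parts, which matters when one class is much smaller than the other.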
Exploratory Analysis
- Plotted distributions of minutes played and age.
- Reviewed the proportion of subscribers vs. non-subscribers (class balance).
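The class-balance check can be illustrated with a short sketch. The labels below are invented for the example and are not the project's data; `class_balance` is a hypothetical helper. The share of the largest class is also the accuracy a model would reach by always predicting that class, which is a useful reference point when reading test accuracy.

```python
from collections import Counter

def class_balance(labels):
    """Return the share of each class label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}

# Invented labels for illustration only (True = subscribed).
labels = [True, True, True, False, True, False, True, True]

balance = class_balance(labels)
# Accuracy of a trivial model that always predicts the most common class
majority_baseline = max(balance.values())
print(balance, majority_baseline)
```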
Modeling
- Framed the task as K-NN classification with the two predictors.
- Tuned K over candidate values; selected K = 21.
- Evaluated on the held-out test split; inspected confusion matrix and summary metrics.
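As a rough sketch of the modeling steps above, the following standard-library Python implements K-NN by majority vote and compares candidate values of K. Several details are assumptions, since the write-up does not state them: the predictors are standardized using training-set statistics, K is compared by 5-fold cross-validation accuracy, and the data are synthetic. The helper names are hypothetical.

```python
import math
import random
from collections import Counter

def standardize(train_X, other_X):
    """Scale both sets using the mean and standard deviation of the training set."""
    cols = list(zip(*train_X))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]

    def scale(X):
        return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in X]

    return scale(train_X), scale(other_X)

def knn_predict(train_X, train_y, point, k):
    """Majority vote among the k training points closest to `point`."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], point))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def cv_accuracy(X, y, k, folds=5, seed=1):
    """Average accuracy over `folds` validation folds (assumed tuning scheme)."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        val = idx[f::folds]
        val_set = set(val)
        tr = [i for i in idx if i not in val_set]
        tr_X, val_X = standardize([X[i] for i in tr], [X[i] for i in val])
        tr_y = [y[i] for i in tr]
        correct += sum(knn_predict(tr_X, tr_y, p, k) == y[i]
                       for p, i in zip(val_X, val))
    return correct / len(X)

# Synthetic [minutes played, age] rows; not the project's data.
rng = random.Random(0)
X = ([[rng.gauss(200, 30), rng.gauss(22, 3)] for _ in range(30)]
     + [[rng.gauss(20, 30), rng.gauss(20, 3)] for _ in range(30)])
y = [True] * 30 + [False] * 30

# Odd candidate values avoid tied votes with two classes.
scores = {k: cv_accuracy(X, y, k) for k in (1, 3, 5, 7, 9)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

Standardizing matters here because minutes played and age sit on very different scales; without it, the distance would be dominated by whichever predictor has the larger numeric range.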
Results (Test)
- Accuracy ≈ 75%
- Precision ≈ 100% (the few positive-class predictions were nearly all correct)
- Recall ≈ 8% (very low: most positive-class cases were missed)
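To show how accuracy, precision, and recall relate to a confusion matrix, here is a small worked sketch. The counts are hypothetical, chosen only to produce a pattern similar to the figures above; they are not the project's confusion matrix, and `summarize` is an illustrative helper.

```python
def summarize(tp, fp, fn, tn):
    """Compute summary metrics from the four confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Of everything predicted positive, how much was truly positive?
        "precision": tp / (tp + fp) if tp + fp else float("nan"),
        # Of everything truly positive, how much did the model find?
        "recall": tp / (tp + fn) if tp + fn else float("nan"),
    }

# Hypothetical counts for illustration; NOT the project's results.
metrics = summarize(tp=1, fp=0, fn=12, tn=36)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these counts the model predicts the positive class only once. That single prediction is correct, so precision is perfect, but 12 of the 13 positive cases are missed, so recall is very low. Always predicting the negative class would already score 36/49 on accuracy, about 73%, so an accuracy near 75% is only slightly better than that trivial baseline.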
Takeaways
- Signal: Minutes and age are only weakly predictive of subscription.
- Bias: The model is biased toward predicting “subscribed”.
- Next: Add behavior features; address class imbalance.