At Mutuka Automotive, our valuation engine is built on a Random Forest regression model trained on 201 real South African market vehicles. Here is how it works.
What is a Random Forest?
A Random Forest is an ensemble machine-learning technique that trains hundreds of decision trees, each on a bootstrap sample of the training data (a random sample drawn with replacement) and with only a random subset of features considered at each split, then averages their predictions. Averaging many decorrelated trees reduces overfitting, and the tree structure handles the non-linear relationships between vehicle specifications and price far better than a simple linear regression.
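To make the bootstrap-and-average idea concrete, here is a minimal sketch in plain Python. It is not our production code: it bags depth-1 regression trees ("stumps") instead of full-depth trees, and the function names (fit_stump, fit_forest, predict) and the choice of one third of the features per tree are assumptions made purely for illustration.

```python
import random
from statistics import mean

def fit_stump(rows, targets, feature_ids):
    """Fit a depth-1 regression tree: choose the (feature, threshold) split
    that minimises the summed squared error around the two leaf means."""
    best = None
    for f in feature_ids:
        for threshold in sorted({r[f] for r in rows}):
            left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[f] > threshold]
            if not left or not right:
                continue  # a split must put samples on both sides
            lm, rm = mean(left), mean(right)
            sse = (sum((t - lm) ** 2 for t in left)
                   + sum((t - rm) ** 2 for t in right))
            if best is None or sse < best[0]:
                best = (sse, f, threshold, lm, rm)
    if best is None:
        # No valid split (all sampled rows identical): predict a constant.
        m = mean(targets)
        return (0, float("inf"), m, m)
    return best[1:]  # (feature, threshold, left_mean, right_mean)

def fit_forest(rows, targets, n_trees=100, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample and on a
    random subset of the features."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    k = max(1, n_features // 3)  # assumed subset size, for illustration only
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feats = rng.sample(range(n_features), k)
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Average the predictions of every tree in the ensemble."""
    return mean(lm if row[f] <= thr else rm for f, thr, lm, rm in forest)
```

Because a stump has a single split, drawing the feature subset once per tree is the same as drawing it per split. A full Random Forest grows deeper trees and redraws the subset at every split.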
Our Feature Set
We train on 10 core features, including engine displacement (cc), horsepower (hp), number of cylinders, city and highway fuel economy (MPG), drive wheels, body style, kerb weight, and bore ratio. These features capture the mechanical characteristics that drive a vehicle's market value.
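As a rough illustration of how such a record could become model input, the sketch below turns one vehicle into a numeric vector. The field names, the category lists for drive wheels and body style, and the use of one-hot encoding are all assumptions for this example; they are not taken from our pipeline.

```python
# Hypothetical category lists; the real ones depend on the training data.
DRIVE_WHEELS = ("fwd", "rwd", "4wd")
BODY_STYLES = ("sedan", "hatchback", "wagon", "convertible", "hardtop")

# Hypothetical field names for the numeric features listed above.
NUMERIC_FIELDS = (
    "engine_cc", "horsepower", "cylinders",
    "city_mpg", "highway_mpg", "kerb_weight", "bore_ratio",
)

def one_hot(value, categories):
    """Return a 0/1 indicator list with a single 1 at the matching category."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]

def encode_vehicle(vehicle):
    """Convert a vehicle dict into a flat list of floats."""
    numeric = [float(vehicle[name]) for name in NUMERIC_FIELDS]
    return (numeric
            + one_hot(vehicle["drive_wheels"], DRIVE_WHEELS)
            + one_hot(vehicle["body_style"], BODY_STYLES))
```

One-hot encoding expands each categorical feature into several columns, so the encoded vector is longer than the number of named features. Tree-based models can also work with simple integer codes, which would keep one column per feature.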
Brand Correction Layer
Raw Random Forest predictions are then adjusted by a per-make percentage correction factor derived from residual analysis, that is, from the systematic gap between predicted and actual prices for each make. German premium brands (BMW, Mercedes-Benz, Audi) attract a market premium in South Africa that the global training data partially understates. Our correction layer accounts for this.
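One plausible way to derive and apply such a factor is sketched below. It assumes the correction for a make is the mean relative residual, (actual - predicted) / predicted, over that make's vehicles, and that unseen makes receive no correction. Both choices and both function names are assumptions for illustration, not a description of our exact method.

```python
from collections import defaultdict
from statistics import mean

def fit_make_corrections(makes, predicted, actual):
    """Return {make: mean relative residual} from parallel lists of
    makes, raw model predictions, and observed prices."""
    relative_residuals = defaultdict(list)
    for make, p, a in zip(makes, predicted, actual):
        relative_residuals[make].append((a - p) / p)
    return {make: mean(vals) for make, vals in relative_residuals.items()}

def apply_correction(make, raw_prediction, corrections):
    """Scale a raw prediction by (1 + factor); makes without a fitted
    factor are left unchanged."""
    return raw_prediction * (1.0 + corrections.get(make, 0.0))
```

A positive factor raises the raw prediction for a make the model tends to underprice, and a negative factor lowers it. Fitting the factors on data the forest has not seen avoids learning them from overly optimistic training residuals.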
Results
The final model achieves an R² of approximately 0.87 on the held-out test set, meaning it explains about 87% of the variance in vehicle prices, well above the benchmark linear model (R² ≈ 0.71).
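For readers who want to see what the R² figure measures, the standard definition is one minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch follows; the function name is ours, chosen for illustration.

```python
from statistics import mean

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # unexplained
    ss_tot = sum((a - y_bar) ** 2 for a in actual)                 # total
    return 1.0 - ss_res / ss_tot
```

A model that predicts every price perfectly scores 1, and a model that always predicts the average price scores 0, so values between the two indicate how much of the price variation the model accounts for.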