How Random Forest Models Power Vehicle Valuation

A deep dive into the machine-learning engine behind our platform.

At Mutuka Automotive, our valuation engine is built on a Random Forest regression model trained on 201 real South African market vehicles. Here is how it works.

What is a Random Forest?

A Random Forest is an ensemble machine-learning technique that builds hundreds of decision trees, each on a random bootstrap sample of the training data and a random subset of the features, then averages their predictions. Averaging many decorrelated trees reduces the overfitting a single deep tree is prone to, and it handles the non-linear relationships between vehicle specifications and price far better than a simple linear regression.
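The bootstrap-and-average idea can be sketched in a few lines of plain Python. This is a toy illustration, not our production model: each "tree" is a single-split stump, and the rows, prices, and function names (`fit_stump`, `fit_forest`, `predict`) are hypothetical. A real system would more likely use a library implementation such as scikit-learn's `RandomForestRegressor`.

```python
import random
from statistics import mean


def fit_stump(rows, targets, feature):
    """Fit a one-split regression tree on one feature, minimising squared error."""
    values = sorted({row[feature] for row in rows})
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [t for row, t in zip(rows, targets) if row[feature] <= threshold]
        right = [t for row, t in zip(rows, targets) if row[feature] > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # The bootstrap sample had a single distinct value: predict the mean.
        overall = mean(targets)
        return (feature, float("inf"), overall, overall)
    return (feature, best[1], best[2], best[3])


def fit_forest(rows, targets, n_trees=200, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))           # random feature choice
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx],
                                feature))
    return forest


def predict(forest, row):
    """Average the predictions of every stump in the forest."""
    return mean([left if row[f] <= thr else right for f, thr, left, right in forest])


# Hypothetical toy data: [engine displacement (cc), horsepower] -> price (R thousand).
rows = [[1000, 50], [1400, 70], [1600, 85], [2000, 110], [2500, 140], [3000, 180]]
prices = [90.0, 130.0, 160.0, 220.0, 300.0, 410.0]

forest = fit_forest(rows, prices)
estimate = predict(forest, [1800, 100])
```

Because every stump sees a different resample, no single tree's quirks dominate the averaged estimate. That averaging is the property the full algorithm relies on.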

Our Feature Set

We train on 10 core features, including engine displacement (cc), horsepower (hp), number of cylinders, fuel economy (city and highway MPG), drive wheels, body style, kerb weight, and bore ratio. These features capture the mechanical characteristics that drive a vehicle's market value.
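Because the feature set mixes numeric values with categories such as drive wheels and body style, each vehicle has to be turned into a numeric vector before training. The sketch below shows one common way to do this, using one-hot encoding. It covers only the features named above, and the field names, category lists, and example values are assumptions for illustration rather than our actual schema.

```python
from dataclasses import dataclass

# Assumed category lists; the real ones depend on the training data.
DRIVE_WHEELS = ("fwd", "rwd", "4wd")
BODY_STYLES = ("sedan", "hatchback", "wagon", "convertible", "hardtop")


@dataclass
class VehicleSpecs:
    engine_cc: float
    horsepower: float
    cylinders: int
    city_mpg: float
    highway_mpg: float
    drive_wheels: str
    body_style: str
    kerb_weight: float
    bore_ratio: float


def one_hot(value, categories):
    """Return a 0/1 indicator list with a single 1 at the matching category."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]


def to_feature_vector(v):
    """Numeric fields first, then the one-hot encoded categorical fields."""
    numeric = [v.engine_cc, v.horsepower, v.cylinders, v.city_mpg,
               v.highway_mpg, v.kerb_weight, v.bore_ratio]
    return ([float(x) for x in numeric]
            + one_hot(v.drive_wheels, DRIVE_WHEELS)
            + one_hot(v.body_style, BODY_STYLES))


# Hypothetical example vehicle.
example = VehicleSpecs(engine_cc=1598, horsepower=110, cylinders=4,
                       city_mpg=30, highway_mpg=38, drive_wheels="fwd",
                       body_style="hatchback", kerb_weight=1200, bore_ratio=3.19)
vector = to_feature_vector(example)
```

Tree-based models can also work with simple integer codes for categories. One-hot encoding is used here only because it makes no assumption about category order.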

Brand Correction Layer

Raw Random Forest predictions are then adjusted by a per-make percentage correction factor derived from residual analysis, that is, from the systematic gap between predicted and observed prices for each make. German premium brands (BMW, Mercedes-Benz, Audi) attract a market premium in South Africa that the global training data partially understates. Our correction layer accounts for this.
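One plausible way to derive such a factor is to take the mean relative residual per make and scale raw predictions by it. The sketch below assumes that definition. The exact formula, the function names, and all figures are hypothetical and are not taken from our production pipeline.

```python
from collections import defaultdict
from statistics import mean


def fit_make_corrections(makes, predicted, actual):
    """Mean relative residual per make: average of (actual - predicted) / predicted."""
    residuals = defaultdict(list)
    for make, p, a in zip(makes, predicted, actual):
        residuals[make].append((a - p) / p)
    return {make: mean(vals) for make, vals in residuals.items()}


def apply_correction(make, raw_prediction, corrections):
    """Scale a raw model prediction by its make's factor (no change if unseen)."""
    return raw_prediction * (1.0 + corrections.get(make, 0.0))


# Hypothetical figures: the raw model under-predicts this premium make by 10%.
makes = ["bmw", "bmw", "toyota", "toyota"]
predicted = [400_000.0, 500_000.0, 200_000.0, 250_000.0]
actual = [440_000.0, 550_000.0, 200_000.0, 250_000.0]

corrections = fit_make_corrections(makes, predicted, actual)
adjusted = apply_correction("bmw", 300_000.0, corrections)
```

In practice the factors should be estimated on data the model has not been trained on. Otherwise the residuals will look smaller than they really are.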

Results

The final model achieves an R² of approximately 0.87 on the held-out test set, meaning it explains about 87% of the variance in vehicle prices it was not trained on. That is well above the benchmark linear model (R² ≈ 0.71).
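For reference, R² compares the model's squared error with that of always predicting the mean price: R² = 1 − SS_res / SS_tot. A minimal sketch of the calculation follows; the function name and sample values are illustrative only.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # model error
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)           # baseline error
    return 1.0 - ss_res / ss_tot


# Hypothetical prices (R thousand).
actual = [100.0, 200.0, 300.0]
perfect = r_squared(actual, [100.0, 200.0, 300.0])   # every price matched exactly
baseline = r_squared(actual, [200.0, 200.0, 200.0])  # always predicts the mean
```

A perfect model scores 1 and a model no better than the mean scores 0, so a value near 0.87 means most of the spread in held-out prices is accounted for.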
