Ensemble Model in SAS EM: Voltron, Defender of the Universe


Ensemble models combine two or more models and are likely to perform better than any single model alone. Ensemble models humorously remind me of Voltron (the 1980s animated TV series). In Voltron, individual astronauts operated lion-shaped robots with different features, and the lions combined to form a super robot known as ‘Voltron’. Each lion robot made up a different part of the super robot and contributed its own strengths, much like the members of an ensemble.

As John Parsons mentioned in an earlier post, the flexibility of combining models truly stands out while developing ensembles. Multiple models can be generated from one algorithm with different data samples, from different algorithms on the same data sample, or from one sample and algorithm with different tuning parameters. The options for manipulating the models seem unlimited. Maybe it was the kid in me (or just my “adult” self), but as I worked through the assignments, I enjoyed thinking of the different models as different lions joining forces to create a super model. The flexibility of SAS EM’s ensemble node and its different tuning parameters made model development more engaging for a new user like myself.

The ensemble node combines posterior probabilities from multiple preceding models. The results are combined using one of three methods applied to the posterior probabilities: Average, Maximum, or Voting. Even with all the different approaches to generating the models, an ensemble is only enhanced if the separate models have some variability or disagreement. In the Voltron comparison, each lion had its own strength or specialty, such as water or terrain. It is easy to picture the kinds of models being combined, such as support vector machines, regressions, random forests, and neural networks.
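To make those three combination rules concrete, here is a minimal sketch in plain Python/NumPy rather than SAS EM itself; the three models and the posterior probability values are invented purely for illustration.

```python
# A hypothetical sketch of combining posterior probabilities from three
# classifiers for a binary target (not SAS EM's actual implementation).
import numpy as np

# P(target = 1) for five scored records, one row per model
# (e.g., a regression, a decision tree, and a neural network).
posteriors = np.array([
    [0.20, 0.65, 0.80, 0.45, 0.90],   # model 1
    [0.30, 0.55, 0.70, 0.60, 0.85],   # model 2
    [0.10, 0.70, 0.95, 0.40, 0.60],   # model 3
])

# Average: mean posterior across the models.
avg = posteriors.mean(axis=0)

# Maximum: largest posterior any single model assigned.
mx = posteriors.max(axis=0)

# Voting: each model casts a 0/1 vote at a 0.5 cutoff; the majority wins.
votes = (posteriors >= 0.5).astype(int)
majority = (votes.sum(axis=0) > posteriors.shape[0] / 2).astype(int)

print("average :", np.round(avg, 3))
print("maximum :", np.round(mx, 3))
print("voting  :", majority)
```

Notice how the three rules can disagree for the same record when the underlying models disagree, which is exactly why variability among the models matters.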

Some areas of ensembles could be improved with better assessment and analysis. The model comparison node could be enhanced for ensembles by offering automated guidance for tuning parameters that are currently left entirely to the user. Imagine multiple neural networks with different numbers of hidden layers (3, 10, and 100) as a user tries to find the best architecture; the model comparison node could suggest a value that helps the user narrow down the optimal number of hidden layers.
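As a rough illustration of the kind of automated sweep I have in mind, here is a hypothetical scikit-learn sketch (again, not SAS EM): it fits networks with hidden layer sizes of 3, 10, and 100 units on made-up data and reports which one minimizes validation misclassification. The data, sizes, and settings are all assumptions for illustration.

```python
# Hypothetical sweep over network sizes, selecting the best by validation error.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.3, random_state=0)

results = {}
for size in (3, 10, 100):
    net = MLPClassifier(hidden_layer_sizes=(size,), max_iter=500, random_state=0)
    net.fit(X_train, y_train)
    results[size] = 1 - net.score(X_valid, y_valid)   # misclassification rate

best = min(results, key=results.get)
for size, err in results.items():
    print(f"hidden size {size:>3}: validation misclassification = {err:.3f}")
print("suggested size:", best)
```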

The process flow diagram provides a nice visual for a user, but for someone new to the method, ensembles remain somewhat of a ‘black box’ when the models are combined. It would be great to see the voting or averaging of posterior probabilities presented visually, even as a grouped view of the process in stages or iterations. In essence, I’d like more insight into the actual computation taking place within ensemble development, and in particular to see which models within the ensemble are most responsible for delivering the optimal performance on the target variable.
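One simple way to peek inside that black box, sketched below in plain NumPy with the same invented posteriors as before (not SAS EM output), is to drop each model in turn from an averaged ensemble and watch how accuracy changes; the models whose removal hurts the most are the ones doing the heavy lifting.

```python
# Hypothetical leave-one-model-out check of an averaged ensemble.
import numpy as np

posteriors = np.array([
    [0.20, 0.65, 0.80, 0.45, 0.90],   # model 1
    [0.30, 0.55, 0.70, 0.60, 0.85],   # model 2
    [0.10, 0.70, 0.95, 0.40, 0.60],   # model 3
])
truth = np.array([0, 1, 1, 1, 1])     # invented true target values

def accuracy(probs, truth, cutoff=0.5):
    """Accuracy of the averaged ensemble at a classification cutoff."""
    preds = (probs.mean(axis=0) >= cutoff).astype(int)
    return (preds == truth).mean()

full = accuracy(posteriors, truth)
print(f"full ensemble accuracy: {full:.2f}")
for i in range(posteriors.shape[0]):
    reduced = np.delete(posteriors, i, axis=0)
    change = accuracy(reduced, truth) - full
    print(f"without model {i + 1}: accuracy change = {change:+.2f}")
```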

Joining forces to create something greater, as Voltron did, is a concept we could all use at some point in our lives. The individual models may not agree with each other, but together they generate a super model with less bias and variance. Overall, ensemble modeling in SAS EM is handled easily with drop-down options and drag-and-drop nodes, compared to manually programming each algorithm and tuning parameter.

Regards,

Mwalimu Phiri

