Delong, Ł., Kozak, A., 2021, The use of autoencoders for training neural networks with mixed categorical and numerical features
Working Paper, 01-11-2021
We focus on modelling categorical features and improving the predictive power of neural networks with mixed categorical and numerical features. First, we study regular and denoising autoencoders for categorical features in unsupervised learning problems. Second, we discuss possible architectures of neural networks in supervised learning problems, which differ in the way categorical features are concatenated with numerical features. Third, we investigate a learning algorithm in which we initialize the parameters of a neural network in subsequent layers with representations of inputs learned with autoencoders for categorical and numerical data. We illustrate our techniques on a real data set with claim numbers. We conclude that our new architecture, a neural network initialized with parameters derived from autoencoders and using a joint embedding for all categorical features, has better predictive power than the classical architecture with random initialization of parameters and separate entity embeddings for each categorical feature.
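To make the idea of a joint embedding concrete, the following is a minimal sketch, not the authors' implementation. It one-hot encodes all categorical features into a single vector, maps that vector through one linear encoder to obtain a joint embedding, and concatenates the result with the numerical features. All names, the feature cardinalities, the embedding dimension and the corruption rule (replacing a level by a random one with probability p) are assumptions made for illustration. The encoder weights are random placeholders here, whereas in the approach described above they would be taken from a trained autoencoder.

```python
import random


def one_hot(index, size):
    """Return a one-hot vector of length `size` with a 1.0 at `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def joint_one_hot(cat_levels, cardinalities):
    """Concatenate the one-hot encodings of all categorical features."""
    out = []
    for level, size in zip(cat_levels, cardinalities):
        out.extend(one_hot(level, size))
    return out


def corrupt(cat_levels, cardinalities, p, rng):
    """Denoising-style corruption (assumed scheme): with probability p,
    replace a feature's level by a uniformly drawn level."""
    return [
        rng.randrange(size) if rng.random() < p else level
        for level, size in zip(cat_levels, cardinalities)
    ]


def linear(x, weights):
    """Matrix-vector product; `weights` is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def init_weights(n_out, n_in, rng):
    """Small random weights; a stand-in for pretrained encoder weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


rng = random.Random(0)
cardinalities = [3, 4]   # hypothetical: two categorical features
embedding_dim = 2        # hypothetical size of the joint embedding
cat_levels = [2, 1]      # observed levels of one record
numeric = [0.5, -1.2]    # numerical features of the same record

# One encoder shared by all categorical features (joint embedding),
# as opposed to one embedding matrix per feature (entity embeddings).
encoder = init_weights(embedding_dim, sum(cardinalities), rng)

x_cat = joint_one_hot(cat_levels, cardinalities)
joint_embedding = linear(x_cat, encoder)

# Input to the supervised network: joint embedding followed by numerics.
network_input = joint_embedding + numeric

# A denoising autoencoder would be trained to reconstruct `x_cat`
# from the encoding of a corrupted record such as this one.
noisy_levels = corrupt(cat_levels, cardinalities, p=0.5, rng=rng)
x_cat_noisy = joint_one_hot(noisy_levels, cardinalities)
```

In this sketch the only difference from separate entity embeddings is that a single weight matrix acts on the concatenated one-hot vector, so the learned representation can capture interactions between categorical features.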