Abstract
Deep learning models are traditionally used in big data scenarios. When there is not enough training data to fit a large model, transfer learning re-purposes the learned features from an existing model and re-trains the lower layers for the new task. Bayesian inference techniques can be used to capture the uncertainty of the new model, but this comes at a high computational cost. In this paper, the runtime performance of a Stochastic Gradient Markov Chain Monte Carlo method is compared on two different hardware architectures, namely GPU and multi-core CPU. In contrast to the widespread use of GPUs for deep learning, significant advantages are found in using modern CPU architectures.