Training deep neural networks can be a challenging task, especially for very deep models. A major part of this difficulty is due to the instability of the gradients computed via backpropagation. In this post, we will learn how to create a self-normalizing deep feed-forward neural network using Keras. This will solve the gradient instability issue, speeding up training convergence, and improving model performance.
Disclaimer: This article is a brief summary with focus on implementation. Please read the cited papers for full details and mathematical justification (link in sources section).
In their 2010 landmark paper, Xavier Glorot and Yoshua Bengio provided…
Training a Deep Neural Network can be a challenging task. The large number of parameters to fit can make these models especially prone to overfitting. Training times in the range of days or weeks can be common, depending on the model complexity, available compute resources, and task to learn. If the available resources are limited, avoiding extra computations and long training times is a high priority. I will present a technique you can follow to ensure you initialize and adapt your learning rate correctly. This will help achieve strong task performance and avoid longer than necessary training times.
Training Machine Learning models for a web app with ML functionalities is only part of the entire project’s development scope. One often overlooked aspect is going beyond the sandbox and into a production environment. In this article, I will demonstrate how to easily serve a TensorFlow model via a prediction service using Google Cloud Platform (GCP) AI Platform and Cloud Functions. Afterward, I will show how to deploy and host the web client using Firebase to query the model using HTTP requests.
The final project architecture will look similar to the figure below:
Working with very large datasets is a common scenario in Machine Learning Engineering. Usually, these datasets will be too large to fit in memory. This means the data must be retrieved from disk on the fly during training.
Disk access is multiple orders of magnitude slower than memory access, so efficient retrieval is of high priority.
In this post, we will learn how to create a simple yet powerful input pipeline to efficiently load and preprocess a dataset using the
Hi, I’m Jonathan. I’m interested in Bayesian methods, causal reasoning, and Applied Machine Learning/Machine Learning Engineering. Feel free to reach out!