One of the most powerful tools that gets ignored

Amit Nikhade
May 5, 2023


A powerful machine learning technique that merges neural networks and Bayesian statistics to make predictions that are both accurate and reliable.

Originally published on amitnikhade.com

Imagine you want to predict whether your friend will come to your party. You might base your prediction on past events, how well you know them, and whether they have any other plans for that day.

Similarly, a Bayesian neural network is a type of machine learning model that incorporates both data and prior beliefs or assumptions to make more accurate predictions. The “Bayesian” aspect of the name comes from Bayesian statistics, which uses probabilities to represent uncertainty.

For example, if you wanted to predict the price of a house based on factors like its size, location, and other features, a regular neural network would analyze the data and attempt to make a prediction based on patterns it discovers. In contrast, a Bayesian neural network would also consider prior beliefs, such as the average home prices in the area or the overall state of the real estate market.

By blending data with prior beliefs in this way, a Bayesian neural network can create more precise predictions than a regular neural network. Additionally, it can adjust its prior beliefs as it acquires new information, which enables it to adapt to changing circumstances and make even more accurate forecasts over time.

In slightly more technical terms

Bayesian neural networks are a type of machine learning model that combines two approaches: neural networks and Bayesian inference. Neural networks are useful for modeling complex relationships in data, while Bayesian inference is used to incorporate prior knowledge and uncertainty into the model.

In a Bayesian neural network, each weight and bias parameter in the neural network is assigned a prior probability distribution. As new data is presented to the network, these distributions are updated to reflect the new knowledge gained from the data.
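To make that concrete, here is a toy version of such a layer (an illustrative sketch only, not how any particular library implements it): every weight is described by a learnable mean and standard deviation, and a fresh weight matrix is sampled each time the layer runs.

import torch
import torch.nn as nn

class ToyBayesLinear(nn.Module):
    """A linear layer whose weights are Gaussians N(mu, sigma^2), not point values."""
    def __init__(self, in_features, out_features):
        super().__init__()
        # Learnable parameters of each weight's distribution
        self.weight_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.weight_log_sigma = nn.Parameter(torch.full((out_features, in_features), -3.0))

    def forward(self, x):
        # Reparameterization trick: w = mu + sigma * eps with eps ~ N(0, 1),
        # so gradients can flow into mu and log_sigma during training
        sigma = self.weight_log_sigma.exp()
        weight = self.weight_mu + sigma * torch.randn_like(sigma)
        return x @ weight.t()

Calling the same layer twice on the same input gives two different outputs, because two different weight matrices were drawn; training adjusts the means and standard deviations rather than the weights themselves.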

During prediction, the network generates a probability distribution over the output variable(s) that reflects the model’s uncertainty about the prediction. This distribution can be used to estimate the confidence of the prediction.
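To see what such a distribution looks like, here is a hypothetical predictive distribution for a 3-class problem, with predictive entropy as a common single-number confidence summary (low when the probability mass is concentrated, high when it is spread out):

import numpy as np

confident = np.array([0.96, 0.03, 0.01])   # mass concentrated on one class
uncertain = np.array([0.40, 0.35, 0.25])   # mass spread across classes

def entropy(p):
    return -(p * np.log(p)).sum()

print(entropy(confident))   # ~0.19
print(entropy(uncertain))   # ~1.08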

By incorporating prior knowledge and uncertainty into the model, Bayesian neural networks can produce more accurate predictions and provide a better understanding of the data. However, training and prediction with these models can be more complex and time-consuming than simpler models.

Differentiating between ANN and BNN

[Figures: an Artificial Neural Network and a Bayesian Neural Network]

Bayesian Neural Networks (BNNs) and Artificial Neural Networks (ANNs) are both built on neural networks; the key difference between them is how they handle uncertainty.

ANNs assume that the weights and biases of the network are fixed and learned through a process called backpropagation. In contrast, BNNs treat the weights and biases as random variables with probability distributions. This means that BNNs can make predictions that take into account the uncertainty in the weights and biases of the network.

To put it simply, ANNs have fixed values for their weights and biases, while BNNs have a range of possible values for these parameters. This allows BNNs to provide probabilistic predictions that come with a measure of uncertainty.

BNNs and ANNs also differ in their approach to training. ANNs are typically trained using backpropagation, while BNNs are trained using Bayesian inference. Bayesian inference involves finding the probability distribution over the weights and biases that best explains the observed data.

Another difference is that ANNs often use regularization techniques to prevent overfitting, while BNNs incorporate regularization naturally through the use of priors on the weights and biases. BNNs also tend to be more computationally expensive than ANNs, but recent advances in approximate Bayesian inference techniques have made BNNs more practical for real-world applications.
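To make the training difference concrete: practical BNNs (including the torchbnn example below) are usually trained with variational inference, learning an approximate posterior q(w) over the weights by minimizing an ELBO-style objective of roughly this form:

loss = E_q[ −log p(data | w) ] + β · KL( q(w) ‖ p(w) )

The first term is the ordinary data-fit loss (cross-entropy for classification), and the KL term is the built-in regularization mentioned above, pulling the weight distributions toward their priors. It reappears almost verbatim as cost = ce + kl_weight*kl in the code later in this post.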

Advantages and limitations

BNNs are good at predicting uncertainty because they model uncertainty in the network’s weights and biases by assigning them probability distributions. This means that BNNs can provide probabilistic predictions and give estimates of uncertainty. This is useful in areas like medicine and finance, where uncertainty is an important factor.

BNNs also have the advantage of incorporating regularization naturally. This means that they can reduce overfitting and improve generalization without needing manual adjustments to their settings.

BNNs can also be more flexible than ANNs because they can incorporate prior knowledge about the distribution of weights and biases. This can help improve the accuracy of the model.

However, there are also some disadvantages to using BNNs. They can be computationally expensive because Bayesian inference requires sampling from probability distributions. This can make training and inference slower.

Another disadvantage of BNNs is that they can be harder to interpret than ANNs. This is because BNNs produce probability distributions rather than deterministic predictions. As a result, it can be more challenging to understand why a particular prediction was made.

Finally, BNNs can be difficult to scale to larger datasets and more complex models, as the computation required to sample from probability distributions can become prohibitively expensive.

Let’s try using it

pip install torchbnn        #install torchbnn package

#https://github.com/Harry24k/bayesian-neural-network-pytorch

Import packages

import numpy as np
from sklearn import datasets

import torch
import torch.nn as nn
import torch.optim as optim

import torchbnn as bnn

import matplotlib.pyplot as plt
%matplotlib inline

Load Iris Dataset

iris = datasets.load_iris()

X = iris.data
Y = iris.target

x, y = torch.from_numpy(X).float(), torch.from_numpy(Y).long()
print(x.shape, y.shape)

Define Model, Loss and optimizer

model = nn.Sequential(
    # Bayesian linear layers: each weight gets a Gaussian prior N(prior_mu, prior_sigma^2)
    bnn.BayesLinear(prior_mu=0, prior_sigma=0.1, in_features=4, out_features=100),
    nn.ReLU(),
    bnn.BayesLinear(prior_mu=0, prior_sigma=0.1, in_features=100, out_features=3),
)

ce_loss = nn.CrossEntropyLoss()                                # data-fit term
kl_loss = bnn.BKLLoss(reduction='mean', last_layer_only=False) # KL between weight posteriors and priors
kl_weight = 0.1                                                # how strongly to pull weights toward the prior

optimizer = optim.Adam(model.parameters(), lr=0.01)

Training the Model

for step in range(3000):
    pre = model(x)
    ce = ce_loss(pre, y)         # classification loss
    kl = kl_loss(model)          # regularization toward the priors
    cost = ce + kl_weight*kl     # total variational objective

    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

_, predicted = torch.max(pre.data, 1)
total = y.size(0)
correct = (predicted == y).sum()
print('- Accuracy: %f %%' % (100 * float(correct) / total))
print('- CE : %2.2f, KL : %2.2f' % (ce.item(), kl.item()))

# Output:
# - Accuracy: 98.000000 %
# - CE : 0.09, KL : 1.29

Testing the trained model

def draw_plot(predicted):
    fig = plt.figure(figsize=(16, 5))

    ax1 = fig.add_subplot(1, 2, 1)
    ax2 = fig.add_subplot(1, 2, 2)

    # Color the first two Iris features by true class (left) and predicted class (right)
    z1_plot = ax1.scatter(X[:, 0], X[:, 1], c=Y)
    z2_plot = ax2.scatter(X[:, 0], X[:, 1], c=predicted)

    plt.colorbar(z1_plot, ax=ax1)
    plt.colorbar(z2_plot, ax=ax2)

    ax1.set_title("REAL")
    ax2.set_title("PREDICT")

    plt.show()

pre = model(x)
_, predicted = torch.max(pre.data, 1)
draw_plot(predicted)
[Figure: Iris scatter plots colored by the real labels (left) and the model’s predictions (right)]
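One thing worth noticing: because each BayesLinear layer samples fresh weights on every forward pass, calling model(x) twice gives slightly different outputs. Averaging the softmax over many passes yields a Monte Carlo estimate of the predictive distribution discussed earlier, and the spread across passes is a usable uncertainty estimate. A minimal sketch on top of the model trained above:

import torch
import torch.nn.functional as F

n_samples = 100
with torch.no_grad():
    # Each pass draws new weights, giving n_samples distinct predictions
    probs = torch.stack([F.softmax(model(x), dim=1) for _ in range(n_samples)])

mean_probs = probs.mean(dim=0)   # Monte Carlo predictive distribution
std_probs = probs.std(dim=0)     # spread across weight samples

predicted = mean_probs.argmax(dim=1)
print(std_probs.max(dim=1).values[:5])   # larger values = less confident samples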

When to use BNNs

Bayesian Neural Networks (BNNs) are a type of neural network that can be particularly useful in scenarios where there is limited training data or where incorrect predictions could have serious consequences.

BNNs provide a way to incorporate prior knowledge into the model and make more robust predictions. This is especially important in situations where there is little data to work with. Additionally, BNNs provide a probabilistic output, which can help identify areas of high uncertainty, allowing for more informed decision-making.

Another scenario where BNNs can be useful is in safety-critical applications, such as in medical diagnosis or autonomous vehicles. Here, BNNs can provide a way to quantify the uncertainty in predictions and help to avoid catastrophic errors.

BNNs can also be useful in active learning scenarios, where the model is trained iteratively on small batches of data. In such situations, the uncertainty estimates provided by the BNN can guide the selection of new data points, maximizing the learning efficiency.
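As a sketch of that idea (hypothetical names: pool_x stands in for an unlabeled pool, while model and x come from the Iris example above), score each candidate by predictive entropy over several stochastic forward passes and query the most uncertain points first:

import torch
import torch.nn.functional as F

def predictive_entropy(inputs, n_samples=50):
    # Entropy of the Monte Carlo predictive distribution (higher = more uncertain)
    with torch.no_grad():
        p = torch.stack([F.softmax(model(inputs), dim=1) for _ in range(n_samples)]).mean(0)
    return -(p * (p + 1e-12).log()).sum(dim=1)   # epsilon avoids log(0)

pool_x = x + 0.5 * torch.randn_like(x)   # stand-in for unlabeled data
query_idx = predictive_entropy(pool_x).topk(k=10).indices   # label these 10 points next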

BNNs can be useful in detecting when inputs are outside of the distribution on which they were trained. This can help in identifying adversarial attacks or other scenarios where the model is being presented with data that is significantly different from what it was trained on.
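The same predictive_entropy score from the sketch above can flag such inputs: something far from any real Iris measurement will often (though not always) produce noticeably higher entropy than in-distribution data. Any alarm threshold would be application-specific; the input below is just an implausible example.

in_dist = x[:5]                                        # real Iris measurements
out_dist = torch.tensor([[50.0, -3.0, 90.0, 0.001]])   # implausible "flower"

print(predictive_entropy(in_dist))    # typically low
print(predictive_entropy(out_dist))   # often noticeably higher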


Let’s get connected: LinkedIn
