Visualize activation functions using keras


In keras, we can visualize the geometric properties of activation functions by using backend functions on the layers of a model.

We all know the exact functional form of popular activation functions such as 'sigmoid', 'tanh', and 'relu', and we can feed data to these functions directly to obtain their outputs. But how can we do the same via keras without explicitly writing out their functional forms?
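For example, evaluating them "directly" just means computing the closed-form expressions ourselves. A minimal NumPy sketch (no keras involved) might look like this:

import numpy as np

x = np.linspace(-5, 5, 101)
sigmoid = 1.0 / (1.0 + np.exp(-x))   # sigmoid(x) = 1 / (1 + e^(-x))
tanh = np.tanh(x)                    # tanh is built into NumPy
relu = np.maximum(0.0, x)            # relu(x) = max(0, x)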

This can be done following the four steps below:

1. Define a simple MLP model with one-dimensional input data, a single-neuron Dense layer as the hidden layer (carrying the activation function of interest), and a single-neuron output layer with a 'linear' activation function.
2. Extract the layers' outputs of the model (fitted or not) by iterating through model.layers.
3. Use the backend function K.function() to obtain the computed output for a given input (see the minimal sketch after this list).
4. Feed the desired data to the above functions to obtain the output of the chosen activation function.
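Here is a minimal sketch of steps 2 and 3 in isolation, assuming the same keras.backend API as in the demo further below: K.function() maps the model's input tensor to the output tensor of any layer, so feeding it raw numbers returns the activated values.

from keras.layers import Dense
from keras.models import Sequential
import keras.backend as K
import numpy as np

m = Sequential()
m.add(Dense(1, input_shape=(1,), activation='tanh'))
m.add(Dense(1, activation='linear'))

# step 2: grab the hidden layer's output tensor; step 3: wrap input -> output in K.function
probe = K.function([m.input], [m.layers[0].output])
print(probe([np.array([[0.5]])]))   # activated value for input 0.5 (untrained weights)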

The code below is a demo:





from keras.layers import Dense, Activation
from keras.models import Sequential
import keras.backend as K
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def NNmodel(activationFunc='linear'):
    '''
    Define a neural network which can be arbitrary but we keep it simple here
    '''
    model = Sequential()
    model.add(Dense(1, input_shape=(1,), activation=activationFunc))

    model.add(Dense(1, activation='linear'))
    #model.summary()
    model.compile(loss='mse', optimizer='adagrad', metrics=['mse'])
    return model

def VisualActivation(activationFunc='relu', plot=True):
    # inputs from -5.0 to 4.9 in steps of 0.1
    x = (np.arange(100) - 50) / 10

    model = NNmodel(activationFunc=activationFunc)
    # Fitting the model is optional; if it is not fitted, the initialized parameters are used
    model.fit(x, x, epochs=1, batch_size=128, verbose=0)

    # define the computing process
    inX = model.input
    outputs = [layer.output for layer in model.layers]
    functions = [K.function([inX], [out]) for out in outputs]

    # compute the hidden layer's activated output for each original input
    activationLayer = {}
    for i in range(100):
        test = x[i].reshape(-1, 1)
        layer_outs = [func([test]) for func in functions]
        activationLayer[i] = layer_outs[0][0][0][0]

    # process results
    activationDf = pd.DataFrame.from_dict(activationLayer, orient='index')
    result = pd.concat([pd.DataFrame(x), activationDf], axis=1)
    result.columns = ['X', 'Activated']
    result.set_index('X', inplace=True)
    if plot:
        result.plot(title=activationFunc)

    return result


# Now we can visualize them (assuming default settings):
actFuncs = ['sigmoid', 'tanh', 'hard_sigmoid', 'softplus', 'selu', 'elu']

figure = plt.figure()
for i, f in enumerate(actFuncs):
    figure.add_subplot(3, 2, i + 1)
    out = VisualActivation(activationFunc=f, plot=False)
    plt.plot(out.index, out.Activated)
    plt.title(f)
plt.tight_layout()
plt.show()


The figure produced by this code shows one subplot per activation function. As we can see, the geometric shape of each activation function is captured well.
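As a usage note (assuming the code above has already been run), the helper can also be called on its own for a single activation function, with its built-in plotting enabled:

# plot a single activation function and inspect the returned values
relu_curve = VisualActivation(activationFunc='relu', plot=True)
print(relu_curve.head())
plt.show()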
