CIRA GUIDE TO CUSTOM LOSS FUNCTIONS
Items to note:
• The inputs to the loss function are always two tensors, y_true and y_pred, in that order. y_true represents the correct output (sometimes called the "label" or "ground truth"), while y_pred represents the prediction generated by the neural network for that sample. You can choose your own variable names for the input tensors, but always make sure that the label (ground truth) is the first input and has a telling variable name, in order to avoid confusion.
• Note that y_true and y_pred represent the tensors for an entire batch! We discuss the implications of this fact in detail in Section 4.2.
• In the above examples the math operations sqrt, square, and reduce_mean are all TensorFlow functions; a brief sketch follows this list. We discuss the types of functions one can use in loss functions in Section 6.
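To make the batch behavior concrete, here is a minimal sketch of a custom RMSE-style loss along the lines of the examples referenced above (the function name is ours, for illustration). Because y_true and y_pred hold the whole batch, reduce_mean averages over every sample at once:

import tensorflow as tf

# Minimal custom RMSE loss; y_true and y_pred each contain an entire batch,
# so reduce_mean averages the squared errors over all samples at once.
def loss_RMSE(y_true, y_pred):
    return tf.sqrt(tf.reduce_mean(tf.square(y_pred - y_true)))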
Linking the loss function to a model
The loss function is linked to the model using the model.compile call, as shown in the example below.
model.compile(optimizer=keras.optimizers.Adam(), loss=loss_MSE, metrics=['accuracy'])
Note that there are no quotation marks placed around the function name, loss_MSE, above. The lack of quotes tells Keras that this is a custom loss function, rather than a built-in loss function. In contrast, to use the built-in loss function for MSE, we would call the corresponding function with quotes:
model.compile(optimizer=keras.optimizers.Adam(), loss='mean_squared_error', metrics=['accuracy'])
The metric assigned above, 'accuracy', also refers to a built-in function, as indicated by the quotes.
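The same convention applies to custom metrics: a custom function is passed by reference, without quotes, and can be mixed with built-in metric names. A sketch, assuming loss_MSE is the custom function defined in the earlier examples:

model.compile(
    optimizer=keras.optimizers.Adam(),
    loss=loss_MSE,                   # custom function: passed without quotes
    metrics=['accuracy', loss_MSE],  # built-in by name, custom by reference
)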
Custom loss function to help with class imbalance
The previous example was not very exciting, as MSE is available as a standard loss function anyway. However, we can now create custom functions that help us in specific circumstances. For example, let us consider an application, such as predicting rainfall, where the great majority of output values are small and only very few values are large. Since the small values are much more common, the NN can achieve very high performance without ever getting the high values correct. In other words, because there are only a few samples with high values, with a standard loss function the NN might get away with always predicting low values.
There are many ways to deal with that problem, including creating a more balanced data set. Alternatively, we can address this with a custom loss function that penalizes the NN more whenever it gets high values wrong. For example, we can take the standard MSE function and multiply each individual error term by a weight factor that increases exponentially with the true value. Here is an example that uses $e^{5 y_{\mathrm{true}}}$ as the weight:
# Loss function with weights based on amplitude of y_true
# (assumes the usual imports: tensorflow as tf, keras backend as K)
def my_MSE_weighted(y_true, y_pred):
    # Weight each squared error by exp(5 * y_true), so that samples with
    # large true values contribute more to the total loss.
    return K.mean(
        tf.multiply(
            tf.exp(tf.multiply(5.0, y_true)),
            tf.square(tf.subtract(y_pred, y_true))
        )
    )
This loss function assigns different weights based on different amplitudes of y_true:

$$\mathrm{loss\_MSE\_weighted}(y_{\mathrm{true}}, y_{\mathrm{pred}}) = \operatorname*{mean}_{i \in I} \; e^{5 y_i^{\mathrm{true}}} \cdot \left( y_i^{\mathrm{pred}} - y_i^{\mathrm{true}} \right)^2 ,$$

where $I$ denotes the set of samples in the batch.
This is a very simple custom loss function, but it can already be quite useful.
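As a quick sanity check, one can evaluate the weight factor directly on a toy batch (illustrative values, not from the guide): for the same absolute error of 0.1, a sample with true value 1.0 is penalized roughly 150 times more than a sample with true value 0.0.

import tensorflow as tf

# Same absolute error (0.1) at a low and a high true value.
y_true = tf.constant([0.0, 1.0])
y_pred = tf.constant([0.1, 1.1])

# Per-sample weighted squared errors:
#   exp(5*0.0) * 0.01 = 0.01,  exp(5*1.0) * 0.01 ≈ 1.48
weighted = tf.exp(5.0 * y_true) * tf.square(y_pred - y_true)
print(weighted.numpy())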
3.4 How to save and load a model that has a custom loss or metric
Unfortunately, custom metrics and loss functions are not stored in the model file when a NN model is saved. Thus, it is necessary to supply the custom functions when loading the model.
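In Keras this is done via the custom_objects argument of load_model. A minimal sketch, assuming the model using my_MSE_weighted was saved to the (hypothetical) file my_model.h5:

from tensorflow import keras

# Map the name stored in the model file to the actual function object.
model = keras.models.load_model(
    'my_model.h5',
    custom_objects={'my_MSE_weighted': my_MSE_weighted},
)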
Furthermore, parameters supplied to the custom functions are not automatically stored, either. There are two ways to
deal with this. One is a manual solution: keep track of the parameters (e.g., in configuration files) and supply them
explicitly after the model is loaded. The more elegant solution is to embed the loss function in a class, which makes the