A trick to ignore a data point in Keras categorical cross entropy

When there’s a data point that you want to ignore in the loss computation, you can use the ignore_class parameter of tf.keras.losses.SparseCategoricalCrossentropy. But the same parameter doesn’t exist in tf.keras.losses.CategoricalCrossentropy. I don’t know why, but that’s troublesome when you need it. Even SparseCategoricalCrossentropy’s ignore_class isn’t easy to use, since it requires you to add a class just to mark the data to ignore.

One trick that works for categorical cross entropy is to keep the one-hot encoding but set all values of y_true to zero for the data point you want to ignore.

from tensorflow.keras.losses import CategoricalCrossentropy

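loss = CategoricalCrossentropy()
# Arguments are (y_true, y_pred). The prediction matches the label here,
# so the loss is almost zero.
print(loss([[1, 0, 0]], [[1.0, 0.0, 0.0]]).numpy())
print(loss([[1, 0, 0]], [[1.0, 0.0, 0.0]]).numpy())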

The code above prints:

1.192093e-07
1.192093e-07

But if all values in y_true are zero, the loss is exactly zero:

print(loss([[0, 0, 0]], [[1.0, 0.0, 0.0]]).numpy())

It’s obvious if you think about it. Loss = sum(-p(x) * log(q(x))), where p(x) is y_true and q(x) is y_pred. If every p(x) is zero, Loss = sum(-0 * log(q(x))) = 0. This works for any loss that uses a similar form of equation, where y_true multiplies every term.
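To see the formula at work on a whole batch, here is a minimal NumPy sketch. It is not Keras’s actual implementation. The function categorical_crossentropy, the keep mask, and the eps clipping value are all names and choices I made up for illustration. It zeroes the one-hot rows of the data points to ignore and checks that they contribute nothing to the loss.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample loss: sum(-p(x) * log(q(x))) over the class axis."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    # Assumption: clip predictions so log() never sees an exact 0.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# Three data points, three classes, one-hot labels.
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=np.float64)
y_pred = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]])

# Hypothetical mask: 1 keeps a data point, 0 ignores it.
keep = np.array([1.0, 0.0, 1.0])

# Zero out the whole one-hot row of every ignored data point.
masked_y_true = y_true * keep[:, None]

per_sample = categorical_crossentropy(masked_y_true, y_pred)

# A plain mean still counts ignored rows in the denominator.
mean_over_all = per_sample.mean()
# Dividing by the number of kept rows averages over real data only.
mean_over_kept = per_sample.sum() / keep.sum()

print(per_sample, mean_over_all, mean_over_kept)
```

One thing to watch: the ignored rows add zero to the sum, but a mean over the batch still divides by the full batch size. As far as I understand, the default reduction of the Keras loss averages over all data points in the batch, so a batch with many ignored rows reports a smaller loss than the kept rows alone would give. If that matters, divide the summed loss by the number of kept rows as in mean_over_kept.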

This is a neat trick, and it looks like many people know about it. But it’s hardly ever mentioned properly, so I’m writing this post to spread the knowledge.