compute_total_loss

You’re here because you can’t get the unit tests for the Coursera Improving Deep Neural Networks course, Week 3, exercise number 1, to pass. You are trying to correctly implement

def compute_total_loss(logits, labels):

And the unit test keeps failing with an off-by-a-large number (i.e. not the usual “shapes are incompatible” error).

Sorry, this post isn’t going to just give you the answer.

But, since Coursera (instructions and code) and Tensorflow (documentation) have come together to make an incredibly frustrating riddle, this will provide hints on where they went wrong, and thus where you went wrong.

Another guess on why you’re here: you entered “tf.keras.losses.categorical_crossentropy example” into a search engine, and followed the link tf.keras.losses.CategoricalCrossEntropy, and that page shows the first argument as “from_logits=False,”. And you just figured that was close enough.

Yet another guess on why you’re here: you read the instructions – “It’s important to note that the “y_pred” and “y_true” inputs of tf.keras.losses.categorical_crossentropy are expected to be of shape (number of examples, num_classes)”. And you noticed the code gives you “logits” and “labels”, in that order. So Coursera has said “y_pred/y_true”, then said “logits/labels”, and the first documentation link doesn’t even have the first two arguments. But even with all of these naming mis-matches, you still entered the arguments in the order given (implied?) by the instructions.

Final clue: use the link provided by the instructions – tf.keras.losses.categorical_crossentropy and pay attention to the order of the arguments.

Then, channel Jeff Atwood on the “hard things in computer science” and write your own post about how using poor names can cause a ton of problems…

This entry was posted in Machine Learning. Bookmark the permalink.