TensorFlow 2: compute the gradient w.r.t. each entry of a matrix

I'm currently using TF2 to compute gradients. Say my NN's output is a 10x10 matrix, and I would like to compute the gradient of a single element of that matrix with respect to the model's trainable variables.

Here's an example:

 with tf.GradientTape() as g:
     g.watch(nn_model.trainable_variables)
     result = nn_model.output_result(target)
     # and I construct a matrix using the result we get
     matrix = foo(result)

And I know the gradient for a single entry, say [0, 0], can be obtained like this:

 with tf.GradientTape(persistent=True) as g:
     g.watch(nn_model.trainable_variables)
     result = nn_model.output_result(target)
     # and I construct a matrix using the result we get
     matrix = foo(result)
     entry = matrix[0][0]
 grad = g.gradient(entry, nn_model.trainable_variables)
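Extending this entry-by-entry approach to the whole matrix means one gradient call per element. Here is a minimal runnable sketch of what I mean, with a hypothetical tiny Dense model standing in for `nn_model` and a reshape standing in for `foo` (these are placeholders, not my real code):

```python
import tensorflow as tf

# Hypothetical stand-ins: a tiny Dense model for nn_model, and a
# reshape of its output standing in for foo(result).
model = tf.keras.Sequential([tf.keras.layers.Dense(4)])
target = tf.ones((1, 3))

with tf.GradientTape(persistent=True) as g:
    result = model(target)               # builds the model; its trainable
    matrix = tf.reshape(result, (2, 2))  # variables are watched automatically
    # Index the entries inside the tape so the slicing is recorded
    entries = [[matrix[i, j] for j in range(2)] for i in range(2)]

# One gradient call per entry: n*m backward passes for an n x m matrix,
# which is why the tape must be persistent.
grads = [[g.gradient(entries[i][j], model.trainable_variables)
          for j in range(2)] for i in range(2)]
del g  # release the persistent tape's memory
```

Each `grads[i][j]` is a list of per-variable gradient tensors for that one matrix entry.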

However, for a 10x10 matrix that means 100 separate gradient calls, so I have to use the persistent option, and my GPU runs out of memory. I can, however, get a gradient for the whole matrix in a single call without exceeding my GPU memory:

 with tf.GradientTape(persistent=True) as g:
     g.watch(nn_model.trainable_variables)
     result = nn_model.output_result(target)
     # and I construct a matrix using the result we get
     matrix = foo(result)
 grad_ = g.gradient(matrix, nn_model.trainable_variables)

This gives a list of 10 tensors, matching the model's layers. The question now is: how can I obtain the gradient for each individual entry from grad_?
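To illustrate why I'm stuck: as far as I can tell, when the target passed to `g.gradient` is a non-scalar tensor, TF2 returns the gradient of the *sum* of its entries, so grad_ seems to have already collapsed the per-entry information. A sketch with the same hypothetical tiny model as a stand-in for `nn_model` and `foo`:

```python
import tensorflow as tf

# Hypothetical tiny model (placeholder for nn_model/foo).
model = tf.keras.Sequential([tf.keras.layers.Dense(4)])
target = tf.ones((1, 3))

with tf.GradientTape(persistent=True) as g:
    result = model(target)
    matrix = tf.reshape(result, (2, 2))
    total = tf.reduce_sum(matrix)

# Non-scalar target vs. explicit sum of entries.
grad_matrix = g.gradient(matrix, model.trainable_variables)
grad_total = g.gradient(total, model.trainable_variables)
del g

# The two gradients come out identical, so the per-entry
# information is no longer recoverable from grad_matrix alone.
same = all(bool(tf.reduce_all(a == b))
           for a, b in zip(grad_matrix, grad_total))
```

So simply indexing into grad_ doesn't recover the gradient of `matrix[i][j]`.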

I'm using TF2 btw.

🔴 No definitive solution yet