Tensorflow - part 2: Ragged tensors and tf.Variable

In this post, we will talk about ragged tensors and tf.Variable.

First, assuming that you have installed Tensorflow, let's import it.

import numpy as np # NumPy arrays are often used together with tensors,
                   # so we import it as well.
import tensorflow as tf
print(tf.__version__) # You can also use this line to check its installed version

Ragged tensors

All the tensors in the previous posts had a square or rectangular shape. In the real world, we do not only deal with such data: sometimes the rows of a tensor vary in length (e.g. a batch of sentences with different lengths). Ragged tensors were created for this problem.

Create an int ragged tensor and a string ragged tensor

Let's create two ragged tensors: one of type int and the other of type string. tf.ragged.constant receives a list of sublists as input. Each sublist can contain integer, float, or string elements, and sublists can have different lengths.

tensor_ragged_int = tf.ragged.constant([[1, 2, 3, 4, 5], [6, 7], [], [8, 9, 10], [11]])
tensor_ragged_str = tf.ragged.constant([["This", "is", "a", "string"], ["no", "strings"], ["an", "another", "string"]])
print(tensor_ragged_int)
print(tensor_ragged_str)

Output

<tf.RaggedTensor [[1, 2, 3, 4, 5], [6, 7], [], [8, 9, 10], [11]]>
<tf.RaggedTensor [[b'This', b'is', b'a', b'string'], [b'no', b'strings'], [b'an', b'another', b'string']]>

You can see that the sub-tensors (rows) have different lengths.
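As a quick check (using the ragged tensors created above), the ragged dimension of a ragged tensor is reported as None in its shape, and .row_lengths() returns the length of each row:

print(tensor_ragged_int.shape) # (5, None): the second dimension is ragged
print(tensor_ragged_int.row_lengths()) # lengths of the rows: [5 2 0 3 1]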

Functions that work with normal tensors also work with ragged tensors

For an int ragged tensor, we can:

  • add or multiply it with a constant number
  • calculate the mean of its elements
  • concatenate it with another tensor
  • ...
print(tf.add(tensor_ragged_int, 2)) # Add each element by 2
print(tf.multiply(tensor_ragged_int, 10)) # Multiply each element by 10
print(tf.reduce_mean(tensor_ragged_int, axis=1)) # Calculate mean for each sub tensor
print(tf.concat([tensor_ragged_int, [[12, 13]]], axis=0)) # Concatenate the int ragged tensor with another tensor
print(tf.tile(tensor_ragged_int, [1, 2])) # Tile the int ragged tensor
print(tf.map_fn(tf.math.square, tensor_ragged_int)) # Map each element to a square version of it
print(tensor_ragged_int[0]) # Get first row
print(tensor_ragged_int[:, :2]) # Get the first 2 values in each row

Output

<tf.RaggedTensor [[3, 4, 5, 6, 7], [8, 9], [], [10, 11, 12], [13]]>
<tf.RaggedTensor [[10, 20, 30, 40, 50], [60, 70], [], [80, 90, 100], [110]]>
tf.Tensor([ 3.   6.5  nan  9.  11. ], shape=(5,), dtype=float64)
<tf.RaggedTensor [[1, 2, 3, 4, 5], [6, 7], [], [8, 9, 10], [11], [12, 13]]>
<tf.RaggedTensor [[1, 2, 3, 4, 5, 1, 2, 3, 4, 5], [6, 7, 6, 7], [], [8, 9, 10, 8, 9, 10], [11, 11]]>
<tf.RaggedTensor [[1, 4, 9, 16, 25], [36, 49], [], [64, 81, 100], [121]]>
tf.Tensor([1 2 3 4 5], shape=(5,), dtype=int32)
<tf.RaggedTensor [[1, 2], [6, 7], [], [8, 9], [11]]>

For the string ragged tensor, we can:

  • Get a substring from each element of each sublist
  • Split a sentence into tokens
  • ...
print(tf.strings.substr(tensor_ragged_str, 0, 1)) # Get a substring from each element: start at index 0 with size 1
print(tf.strings.substr(tensor_ragged_str, 0, 2)) # Get a substring from each element: start at index 0 with size 2
print(tf.strings.substr(tensor_ragged_str, 1, 1)) # Get a substring from each element: start at index 1 with size 1
print(tf.strings.split(['This is a string', 'another string'])) # Split strings into ragged tensors

Output

<tf.RaggedTensor [[b'T', b'i', b'a', b's'], [b'n', b's'], [b'a', b'a', b's']]>
<tf.RaggedTensor [[b'Th', b'is', b'a', b'st'], [b'no', b'st'], [b'an', b'an', b'st']]>
<tf.RaggedTensor [[b'h', b's', b'', b't'], [b'o', b't'], [b'n', b'n', b't']]>
<tf.RaggedTensor [[b'This', b'is', b'a', b'string'], [b'another', b'string']]>

tf.Variable

  • tf.Variable is used to create a tensor whose values can be updated.
  • There are specific ops created to modify the values of this type of tensor.
  • tf.Variable is used by tf.keras to store the parameters/weights of a neural network model.

Pass True to the command below if you want to log which device each Tensorflow operation is executed on.

tf.debugging.set_log_device_placement(True)

Create a Tensorflow variable

A Tensorflow variable can easily be created with tf.Variable, whose input can be a list or a nested list.

tensor_int = tf.Variable([[1, 2], [3, 4]])
tensor_bool = tf.Variable([True, False, True, False])
tensor_complex = tf.Variable([1 + 2j, 3 + 4j])
print(tensor_int)
print(tensor_bool)
print(tensor_complex)

Output

<tf.Variable 'Variable:0' shape=(2, 2) dtype=int32, numpy=
array([[1, 2],
       [3, 4]], dtype=int32)>

<tf.Variable 'Variable:0' shape=(4,) dtype=bool, numpy=array([ True, False,  True, False])>

<tf.Variable 'Variable:0' shape=(2,) dtype=complex128, numpy=array([1.+2.j, 3.+4.j])>

And because we have set tf.debugging.set_log_device_placement to True, there will be a few extra lines in the output like these:

Output

Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:0

At the end of each line is the device (GPU:0 or CPU:0) that the corresponding operation is executed on. The number after GPU or CPU identifies the index of the device.

If you want to get the shape or data type of a variable, or convert it to a NumPy array, you can do it in the same way as with tensors.

We can use .shape, .dtype and .numpy() as we did in the previous posts.

print(tensor_int.shape)
print(tensor_int.dtype)
print(tensor_int.numpy())

Output

(2, 2)

<dtype: 'int32'>

[[1 2]
 [3 4]]

Most of the operations that work with tensors also work with variables. The exception is reshaping: as shown below, tf.reshape does not reshape the variable itself but returns a new tensor.

print(tf.convert_to_tensor(tensor_int)) # convert a tf.Variable to tensor 
print(tf.argmax(tensor_int)) 
print(tf.reshape(tensor_int, [1, 4])) # not reshape the variable, but create a new reshaped tensor from the variable
print(tf.expand_dims(tensor_int, axis=0)) # every operation that changes the shape of the variable will convert it into a tensor
tensor_int_2 = tf.Variable([[5, 6], [7, 8]]) # create a new variable to concatenate with tensor_int
print(tf.concat([tensor_int, tensor_int_2], 0))
print(tf.stack([tensor_int, tensor_int_2], 0))

Output

tf.Tensor(
[[1 2]
 [3 4]], shape=(2, 2), dtype=int32)

tf.Tensor([1 1], shape=(2,), dtype=int64)

tf.Tensor([[1 2 3 4]], shape=(1, 4), dtype=int32)

tf.Tensor(
[[[1 2]
  [3 4]]], shape=(1, 2, 2), dtype=int32)

tf.Tensor(
[[1 2]
 [3 4]
 [5 6]
 [7 8]], shape=(4, 2), dtype=int32)

tf.Tensor(
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]], shape=(2, 2, 2), dtype=int32)

Note that any operation that changes the shape of a tf.Variable creates a new tensor with the new shape, because the shape of a tf.Variable cannot be changed.
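As a quick check of this point (reusing tensor_int from above), the result of tf.reshape is a plain tensor rather than a variable, and the variable itself keeps its original shape:

reshaped = tf.reshape(tensor_int, [1, 4]) # returns a new tensor
print(isinstance(reshaped, tf.Variable)) # False: the result is not a variable
print(tensor_int.shape) # (2, 2): the variable keeps its shape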

Assign a tensor to a variable

As mentioned above, the values of a Tensorflow variable can be modified. One way to do this is to assign a new tensor to the variable, with one condition: the new tensor must have the same shape as the variable.

tensor_a = tf.Variable([1, 2])
print(tensor_a)
tensor_a.assign([2, 3]) # the existing memory is used instead of allocating new one
print(tensor_a)
tensor_a.assign([1, 2, 3]) # raises an error because the new tensor has a different shape from the variable

Output

<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([1, 2], dtype=int32)>

<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([2, 3], dtype=int32)>

ValueError: Cannot assign to variable Variable:0 due to variable shape (2,) and value shape (3,) are incompatible

Create a new variable from an original variable

Below, we define tensor_b on the basis of tensor_a. Let's try to assign new values to tensor_b. Will that also change the values in the original variable tensor_a?

tensor_a = tf.Variable([1, 2])
tensor_b = tf.Variable(tensor_a)
tensor_b.assign([3, 4])
print(tensor_a)
print(tensor_b)

Output

<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([1, 2], dtype=int32)>

<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([3, 4], dtype=int32)>

No. When a new variable is created from an existing one, it is allocated its own memory, so changing its values does not affect the original variable.

Other versions of assign:

  • assign_add: add values to the variable in place
  • assign_sub: subtract values from the variable in place
print(tensor_a.assign_add([10, 10]))
print(tensor_b.assign_sub([1, 1]))

Output

<tf.Variable 'UnreadVariable' shape=(2,) dtype=int32, numpy=array([11, 12], dtype=int32)>
<tf.Variable 'UnreadVariable' shape=(2,) dtype=int32, numpy=array([2, 3], dtype=int32)>

assign_add adds 10 to both elements of tensor_a; assign_sub subtracts 1 from both elements of tensor_b.

Naming Tensorflow variables

Create two variables tensor_a and tensor_b from a tensor tensor_const, giving both the same name "tensor".

tensor_const = tf.constant([1, 2])
tensor_a = tf.Variable(tensor_const, name="tensor")
tensor_b = tf.Variable(tensor_const + 2, name="tensor")
print(tensor_a)
print(tensor_b)

Output

<tf.Variable 'tensor:0' shape=(2,) dtype=int32, numpy=array([1, 2], dtype=int32)>

<tf.Variable 'tensor:0' shape=(2,) dtype=int32, numpy=array([3, 4], dtype=int32)>

Both tensor_a and tensor_b have the name "tensor", but they are two different variables.

Variable names are preserved when a model is saved and loaded. If we don't manually specify names, the variables in a model are automatically given unique names.
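As a small sketch of this (assuming tf.keras is available), the weights created by a Keras layer are tf.Variable objects with automatically generated, unique names:

layer = tf.keras.layers.Dense(3)
layer.build(input_shape=(None, 2)) # create the layer's weight variables
for variable in layer.variables:
  print(variable.name) # e.g. 'dense/kernel:0' and 'dense/bias:0'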

Parameter trainable

As you know, neural network models require computing gradients through differentiation, and Tensorflow variables are designed for exactly this. The trainable parameter of tf.Variable controls whether a variable should be tracked for differentiation or not.

tensor_c = tf.Variable([1, 2, 3], trainable=False)
print(tensor_c)

Output

<tf.Variable 'Variable:0' shape=(3,) dtype=int32, numpy=array([1, 2, 3], dtype=int32)>
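Here is a minimal sketch of what trainable changes in practice: tf.GradientTape watches trainable variables automatically, so a variable created with trainable=False gets no gradient unless it is watched explicitly.

w = tf.Variable(3.0) # trainable by default
c = tf.Variable(2.0, trainable=False) # not watched by the tape
with tf.GradientTape() as tape:
  y = w * c
grad_w, grad_c = tape.gradient(y, [w, c])
print(grad_w) # dy/dw = c = 2.0
print(grad_c) # None, because c is not trainable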

Choosing which device variables and tensors are placed on

By default, most variables are placed on a GPU if one is available. You can also manually choose a device, as below.

with tf.device('CPU:0'):
  tensor_a = tf.Variable([[1, 2], [3, 4]])

print(tensor_a)

Output

<tf.Variable 'Variable:0' shape=(2, 2) dtype=int32, numpy=
array([[1, 2],
       [3, 4]], dtype=int32)>
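As a quick check (a small sketch using the variable created above), the .device attribute shows where a variable lives:

print(tensor_a.device) # e.g. /job:localhost/replica:0/task:0/device:CPU:0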

Computations on another device

When a variable is placed on one device but a computation with it runs on another device, the variable is copied to that device before the computation.

with tf.device('CPU:0'):
  tensor_a = tf.Variable([[11, 12, 13], [14, 15, 16]])
  tensor_b = tf.Variable([2, 4, 6])

with tf.device('GPU:0'):
  tensor_c = tensor_a * tensor_b

print(tensor_c)

Output

tf.Tensor(
[[22 48 78]
 [28 60 96]], shape=(2, 3), dtype=int32)

tensor_a and tensor_b are first initialized on the CPU. Their multiplication then runs on the GPU, so both of them must first be copied to the GPU before multiplying.

The end