TensorFlow - part 2: Ragged tensors and tf.Variable
In this post, we will talk about ragged tensors and tf.Variable.
First, assuming that you have installed TensorFlow, let's import it.
import numpy as np # NumPy arrays are often used together with tensors,
# so we import it as well.
import tensorflow as tf
print(tf.__version__) # You can also use this line to check its installed version
Ragged tensors
All the tensors in the previous posts had a square or rectangular shape. In the real world, however, we do not only deal with such data: sometimes the elements in a tensor vary in length (e.g. a batch of sentences with different numbers of words). Ragged tensors were created for this problem.
Create an int ragged tensor and a string ragged tensor
Let's create two ragged tensors: one of type int and the other of type string. tf.ragged.constant
takes a list of sublists as input. Each sublist can contain integer, float, or string elements, and the sublists can have different lengths.
tensor_ragged_int = tf.ragged.constant([[1, 2, 3, 4, 5], [6, 7], [], [8, 9, 10], [11]])
tensor_ragged_str = tf.ragged.constant([["This", "is", "a", "string"], ["no", "strings"], ["an", "another", "string"]])
print(tensor_ragged_int)
print(tensor_ragged_str)
Output
<tf.RaggedTensor [[1, 2, 3, 4, 5], [6, 7], [], [8, 9, 10], [11]]>
<tf.RaggedTensor [[b'This', b'is', b'a', b'string'], [b'no', b'strings'], [b'an', b'another', b'string']]>
You can see that all the sub tensors have different lengths.
Functions that work with normal tensors also work with ragged tensors
For an int ragged tensor, we can:
- add or multiply it with a constant number
- calculate the mean of its elements
- concatenate it with another tensor
- ...
print(tf.add(tensor_ragged_int, 2)) # Add each element by 2
print(tf.multiply(tensor_ragged_int, 10)) # Multiply each element by 10
print(tf.reduce_mean(tensor_ragged_int, axis=1)) # Calculate mean for each sub tensor
print(tf.concat([tensor_ragged_int, [[12, 13]]], axis=0)) # Concatenate the int ragged tensor with another tensor
print(tf.tile(tensor_ragged_int, [1, 2])) # Tiling the int ragged tensor
print(tf.map_fn(tf.math.square, tensor_ragged_int)) # Map each element to a square version of it
print(tensor_ragged_int[0]) # Get first row
print(tensor_ragged_int[:, :2]) # Get the first 2 values in each row
Output
<tf.RaggedTensor [[3, 4, 5, 6, 7], [8, 9], [], [10, 11, 12], [13]]>
<tf.RaggedTensor [[10, 20, 30, 40, 50], [60, 70], [], [80, 90, 100], [110]]>
tf.Tensor([ 3. 6.5 nan 9. 11. ], shape=(5,), dtype=float64)
<tf.RaggedTensor [[1, 2, 3, 4, 5], [6, 7], [], [8, 9, 10], [11], [12, 13]]>
<tf.RaggedTensor [[1, 2, 3, 4, 5, 1, 2, 3, 4, 5], [6, 7, 6, 7], [], [8, 9, 10, 8, 9, 10], [11, 11]]>
<tf.RaggedTensor [[1, 4, 9, 16, 25], [36, 49], [], [64, 81, 100], [121]]>
tf.Tensor([1 2 3 4 5], shape=(5,), dtype=int32)
<tf.RaggedTensor [[1, 2], [6, 7], [], [8, 9], [11]]>
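A ragged tensor also carries its per-row lengths, and it can be converted to a regular (padded) tensor when a downstream op needs a rectangular shape. Here is a minimal sketch using the same `tensor_ragged_int` values as above:

```python
import tensorflow as tf

tensor_ragged_int = tf.ragged.constant([[1, 2, 3, 4, 5], [6, 7], [], [8, 9, 10], [11]])

# Per-row lengths of the ragged dimension
print(tensor_ragged_int.row_lengths())  # tf.Tensor([5 2 0 3 1], shape=(5,), dtype=int64)

# Pad the ragged rows with a default value to get a rectangular tensor
dense = tensor_ragged_int.to_tensor(default_value=0)
print(dense.shape)  # (5, 5)
```

The padded form is what you typically feed to layers that expect fixed-size inputs; the original row lengths let you mask out the padding afterwards.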
For the string ragged tensor, we can:
- Get a substring from each element of each sublist
- Split a sentence into tokens
- ...
print(tf.strings.substr(tensor_ragged_str, 0, 1)) # Get a substring from each element: start at index 0 with size 1
print(tf.strings.substr(tensor_ragged_str, 0, 2)) # Get a substring from each element: start at index 0 with size 2
print(tf.strings.substr(tensor_ragged_str, 1, 1)) # Get a substring from each element: start at index 1 with size 1
print(tf.strings.split(['This is a string', 'another string'])) # Split strings into ragged tensors
Output
<tf.RaggedTensor [[b'T', b'i', b'a', b's'], [b'n', b's'], [b'a', b'a', b's']]>
<tf.RaggedTensor [[b'Th', b'is', b'a', b'st'], [b'no', b'st'], [b'an', b'an', b'st']]>
<tf.RaggedTensor [[b'h', b's', b'', b't'], [b'o', b't'], [b'n', b'n', b't']]>
<tf.RaggedTensor [[b'This', b'is', b'a', b'string'], [b'another', b'string']]>
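Going the other way, a ragged tensor can also be built from a flat list of values plus explicit row lengths, which is essentially how tf.strings.split represents its result internally. A small sketch (the token values here are just illustrative):

```python
import tensorflow as tf

# Build a ragged tensor from flat values and explicit row lengths
values = tf.constant(["This", "is", "a", "string", "another", "string"])
rt = tf.RaggedTensor.from_row_lengths(values, row_lengths=[4, 2])
print(rt)  # <tf.RaggedTensor [[b'This', b'is', b'a', b'string'], [b'another', b'string']]>
```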
tf.Variable
- tf.Variable is used to create a tensor whose values can be updated.
- Specific ops are provided to modify the values of this type of tensor.
- tf.Variable is used by tf.keras to store the parameters/weights of a neural network model.
Pass True to the following function if you want to know which device your TensorFlow operations are executed on.
tf.debugging.set_log_device_placement(True)
Create a TensorFlow variable
A TensorFlow variable can easily be created with tf.Variable
, whose input can be a list or a nested list.
tensor_int = tf.Variable([[1, 2], [3, 4]])
tensor_bool = tf.Variable([True, False, True, False])
tensor_complex = tf.Variable([1 + 2j, 3 + 4j])
print(tensor_int)
print(tensor_bool)
print(tensor_complex)
Output
<tf.Variable 'Variable:0' shape=(2, 2) dtype=int32, numpy=
array([[1, 2],
[3, 4]], dtype=int32)>
<tf.Variable 'Variable:0' shape=(4,) dtype=bool, numpy=array([ True, False, True, False])>
<tf.Variable 'Variable:0' shape=(2,) dtype=complex128, numpy=array([1.+2.j, 3.+4.j])>
And because we set tf.debugging.set_log_device_placement
to True, the output will also contain some extra lines like these:
Output
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:0
At the end of each line is the device (GPU:0 or CPU:0) that the corresponding operation is executed on. The number after GPU or CPU identifies the index of the device.
If you want to know the shape or dtype of a variable, or convert it to a NumPy array, you can do it in the same way as with tensors: use .shape
, .dtype
, and .numpy()
as in the previous posts.
print(tensor_int.shape)
print(tensor_int.dtype)
print(tensor_int.numpy())
Output
(2, 2)
<dtype: 'int32'>
[[1 2]
[3 4]]
Most of the operations that work on tensors can also be applied to variables. Operations that change the shape, however, do not modify the variable itself; they create a new tensor instead.
print(tf.convert_to_tensor(tensor_int)) # convert a tf.Variable to tensor
print(tf.argmax(tensor_int))
print(tf.reshape(tensor_int, [1, 4])) # not reshape the variable, but create a new reshaped tensor from the variable
print(tf.expand_dims(tensor_int, axis=0)) # every operation that changes the shape of the variable will convert it into a tensor
tensor_int_2 = tf.Variable([[5, 6], [7, 8]]) # create a new variable to concatenate with tensor_int
print(tf.concat([tensor_int, tensor_int_2], 0))
print(tf.stack([tensor_int, tensor_int_2], 0))
Output
tf.Tensor(
[[1 2]
[3 4]], shape=(2, 2), dtype=int32)
tf.Tensor([1 1], shape=(2,), dtype=int64)
tf.Tensor([[1 2 3 4]], shape=(1, 4), dtype=int32)
tf.Tensor(
[[[1 2]
[3 4]]], shape=(1, 2, 2), dtype=int32)
tf.Tensor(
[[1 2]
[3 4]
[5 6]
[7 8]], shape=(4, 2), dtype=int32)
tf.Tensor(
[[[1 2]
[3 4]]
[[5 6]
[7 8]]], shape=(2, 2, 2), dtype=int32)
Note that every operation that changes the shape of a tf.Variable
creates a new tensor with the new shape, because the shape of a tf.Variable
itself cannot be changed.
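You can verify this type change directly. A small sketch checking that tf.reshape applied to a variable returns a plain tf.Tensor while the variable keeps its original shape:

```python
import tensorflow as tf

tensor_int = tf.Variable([[1, 2], [3, 4]])
reshaped = tf.reshape(tensor_int, [1, 4])

print(isinstance(reshaped, tf.Variable))  # False: the result is not a variable
print(isinstance(reshaped, tf.Tensor))    # True: it is a new tensor
print(tensor_int.shape)                   # (2, 2): the variable is unchanged
```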
Assign a tensor to a variable
As mentioned above, the values of a TensorFlow variable can be modified. One way to do this is to assign a new tensor to the variable, with one condition: the new tensor must have the same shape as the variable.
tensor_a = tf.Variable([1, 2])
print(tensor_a)
tensor_a.assign([2, 3]) # the existing memory is used instead of allocating new one
print(tensor_a)
tensor_a.assign([1, 2, 3]) # raises an error because the new tensor has a different shape from the variable
Output
<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([1, 2], dtype=int32)>
<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([2, 3], dtype=int32)>
ValueError: Cannot assign to variable Variable:0 due to variable shape (2,) and value shape (3,) are incompatible
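If your code may receive values of varying shape, you can guard the assignment. A small sketch catching the shape-mismatch error (the exact error message may differ between TensorFlow versions):

```python
import tensorflow as tf

tensor_a = tf.Variable([1, 2])
try:
    tensor_a.assign([1, 2, 3])  # shape (3,) does not match the variable's shape (2,)
except ValueError as err:
    print("assignment rejected:", err)
print(tensor_a.numpy())  # values unchanged: [1 2]
```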
Create a new variable from an original variable
Below, we define tensor_b on the basis of tensor_a. Let's try to assign new values to tensor_b. Will that also change the values in the original variable tensor_a?
tensor_a = tf.Variable([1, 2])
tensor_b = tf.Variable(tensor_a)
tensor_b.assign([3, 4])
print(tensor_a)
print(tensor_b)
Output
<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([1, 2], dtype=int32)>
<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([3, 4], dtype=int32)>
No. When a new variable is created from an existing one, it is allocated its own memory, so changing its values will not affect the original.
Other versions of assign:
- assign_add: add directly to a variable
- assign_sub: subtract directly from a variable
print(tensor_a.assign_add([10, 10]))
print(tensor_b.assign_sub([1, 1]))
Output
<tf.Variable 'UnreadVariable' shape=(2,) dtype=int32, numpy=array([11, 12], dtype=int32)>
<tf.Variable 'UnreadVariable' shape=(2,) dtype=int32, numpy=array([2, 3], dtype=int32)>
assign_add adds 10 to both elements of tensor_a; assign_sub subtracts 1 from both elements of tensor_b.
Naming TensorFlow variables
Create two variables tensor_a and tensor_b from a tensor tensor_const.
tensor_const = tf.constant([1, 2])
tensor_a = tf.Variable(tensor_const, name="tensor")
tensor_b = tf.Variable(tensor_const + 2, name="tensor")
Output
<tf.Variable 'tensor:0' shape=(2,) dtype=int32, numpy=array([1, 2], dtype=int32)>
<tf.Variable 'tensor:0' shape=(2,) dtype=int32, numpy=array([3, 4], dtype=int32)>
Both tensor_a and tensor_b have the name "tensor", but they are two different variables.
Variable names are preserved when a model is saved and restored. If you do not specify names manually, the variables in a model are automatically given unique names.
The trainable parameter
As you know, neural network models require computing gradients through differentiation, and TensorFlow variables are designed for this purpose. The trainable
parameter of tf.Variable
controls whether a variable should be tracked for differentiation or not.
tensor_c = tf.Variable([1, 2, 3], trainable=False)
print(tensor_c)
Output
<tf.Variable 'Variable:0' shape=(3,) dtype=int32, numpy=array([1, 2, 3], dtype=int32)>
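The effect of trainable shows up when computing gradients: tf.GradientTape automatically watches trainable variables only, so the gradient with respect to a non-trainable variable comes back as None. A minimal sketch:

```python
import tensorflow as tf

w = tf.Variable(3.0)                   # trainable by default
c = tf.Variable(2.0, trainable=False)  # excluded from automatic watching

with tf.GradientTape() as tape:
    y = w * w + c

grads = tape.gradient(y, [w, c])
print(grads[0])  # dy/dw = 2 * w = 6.0
print(grads[1])  # None: c was not watched
```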
Choosing which device variables and tensors are placed on
By default, most variables are placed on a GPU if one is available. You can also manually select a device, as below.
with tf.device('CPU:0'):
tensor_a = tf.Variable([[1, 2], [3, 4]])
print(tensor_a)
Output
<tf.Variable 'Variable:0' shape=(2, 2) dtype=int32, numpy=
array([[1, 2],
[3, 4]], dtype=int32)>
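You can confirm where a variable ended up by reading its .device attribute; a small sketch (the full device string depends on your setup):

```python
import tensorflow as tf

with tf.device('CPU:0'):
    tensor_a = tf.Variable([[1, 2], [3, 4]])

print(tensor_a.device)  # e.g. /job:localhost/replica:0/task:0/device:CPU:0
```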
Computations on another device
When a variable is placed on one device but a computation with it happens on another device, the variable is copied to that device before computing.
with tf.device('CPU:0'):
tensor_a = tf.Variable([[11, 12, 13], [14, 15, 16]])
tensor_b = tf.Variable([2, 4, 6])
with tf.device('GPU:0'):
tensor_c = tensor_a * tensor_b
print(tensor_c)
Output
tf.Tensor(
[[22 48 78]
[28 60 96]], shape=(2, 3), dtype=int32)
tensor_a and tensor_b are first initialized on the CPU; their multiplication then happens on the GPU, so both must first be copied to the GPU before multiplying.