# TensorFlow - part 2: Ragged tensors and tf.Variable

In this post, we will talk about ragged tensors and `tf.Variable`.

First, assuming that you have installed TensorFlow, let's import it.

```
import numpy as np  # NumPy arrays are often used together with tensors, so import it as well
import tensorflow as tf

print(tf.__version__)  # Check the installed TensorFlow version
```

## Ragged tensors

All the tensors in the previous posts have a rectangular shape. Real-world data does not always look like that: sometimes the elements of a tensor vary in length (e.g., a batch of sentences with different lengths). Ragged tensors were created for this case.

### Create an int ragged tensor and a string ragged tensor

Let's create two ragged tensors, one of type int and one of type string. `tf.ragged.constant` receives a list of sublists as input. Each sublist can contain integer, float, or string elements, and the sublists can have different lengths.

```
tensor_ragged_int = tf.ragged.constant([[1, 2, 3, 4, 5], [6, 7], [], [8, 9, 10], [11]])
tensor_ragged_str = tf.ragged.constant([["This", "is", "a", "string"], ["no", "strings"], ["an", "another", "string"]])
print(tensor_ragged_int)
print(tensor_ragged_str)
```

Output

```
<tf.RaggedTensor [[1, 2, 3, 4, 5], [6, 7], [], [8, 9, 10], [11]]>
<tf.RaggedTensor [[b'This', b'is', b'a', b'string'], [b'no', b'strings'], [b'an', b'another', b'string']]>
```

You can see that the sublists (rows) have different lengths.
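To check how ragged a tensor actually is, it can help to inspect it directly. The sketch below assumes the same integer data as above and uses `row_lengths()` and `to_tensor()` to look at the row sizes and to convert to a padded rectangular tensor. The padding value 0 is an arbitrary choice for illustration.

```python
import tensorflow as tf

ragged = tf.ragged.constant([[1, 2, 3, 4, 5], [6, 7], [], [8, 9, 10], [11]])

# The ragged dimension is reported as None because the rows differ in length.
print(ragged.shape)  # (5, None)

# Number of elements in each row.
row_lengths = ragged.row_lengths().numpy().tolist()
print(row_lengths)  # [5, 2, 0, 3, 1]

# Pad every row to the length of the longest row to get a rectangular tensor.
# The padding value 0 is an arbitrary choice for this illustration.
dense = ragged.to_tensor(default_value=0)
print(dense.shape)  # (5, 5)
```

Padding like this is convenient when a later step only accepts rectangular tensors, at the cost of storing the extra padding values.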

### Many functions that work with normal tensors also work with ragged tensors

For an int ragged tensor, we can:

- add or multiply it with a constant number
- calculate the mean of its elements
- concatenate it with another tensor
- ...

```
print(tf.add(tensor_ragged_int, 2)) # Add 2 to each element
print(tf.multiply(tensor_ragged_int, 10)) # Multiply each element by 10
print(tf.reduce_mean(tensor_ragged_int, axis=1)) # Calculate the mean of each row
print(tf.concat([tensor_ragged_int, [[12, 13]]], axis=0)) # Concatenate the int ragged tensor with another tensor
print(tf.tile(tensor_ragged_int, [1, 2])) # Tile the int ragged tensor: repeat each row twice
print(tf.map_fn(tf.math.square, tensor_ragged_int)) # Map each element to its square
print(tensor_ragged_int[0]) # Get the first row
print(tensor_ragged_int[:, :2]) # Get the first two values of each row
```

Output

```
<tf.RaggedTensor [[3, 4, 5, 6, 7], [8, 9], [], [10, 11, 12], [13]]>
<tf.RaggedTensor [[10, 20, 30, 40, 50], [60, 70], [], [80, 90, 100], [110]]>
tf.Tensor([ 3. 6.5 nan 9. 11. ], shape=(5,), dtype=float64)
<tf.RaggedTensor [[1, 2, 3, 4, 5], [6, 7], [], [8, 9, 10], [11], [12, 13]]>
<tf.RaggedTensor [[1, 2, 3, 4, 5, 1, 2, 3, 4, 5], [6, 7, 6, 7], [], [8, 9, 10, 8, 9, 10], [11, 11]]>
<tf.RaggedTensor [[1, 4, 9, 16, 25], [36, 49], [], [64, 81, 100], [121]]>
tf.Tensor([1 2 3 4 5], shape=(5,), dtype=int32)
<tf.RaggedTensor [[1, 2], [6, 7], [], [8, 9], [11]]>
```
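The `nan` in the `tf.reduce_mean` output comes from the empty row: its mean is 0 divided by 0. A minimal sketch, assuming the same data as above, that compares the row sums with the row means and detects the undefined entry:

```python
import math

import tensorflow as tf

ragged = tf.ragged.constant([[1, 2, 3, 4, 5], [6, 7], [], [8, 9, 10], [11]])

# Summing an empty row gives 0, so reduce_sum has no undefined entries.
row_sums = tf.reduce_sum(ragged, axis=1).numpy().tolist()
print(row_sums)  # [15, 13, 0, 27, 11]

# The mean of an empty row is undefined, which shows up as nan.
row_means = tf.reduce_mean(ragged, axis=1).numpy().tolist()
print(math.isnan(row_means[2]))  # True
```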

For the string ragged tensor, we can:

- Get substring from each element of each sublist
- Split a sentence into tokens
- ...

```
print(tf.strings.substr(tensor_ragged_str, 0, 1)) # Get a substring from each element: start at index 0 with size 1
print(tf.strings.substr(tensor_ragged_str, 0, 2)) # Get a substring from each element: start at index 0 with size 2
print(tf.strings.substr(tensor_ragged_str, 1, 1)) # Get a substring from each element: start at index 1 with size 1
print(tf.strings.split(['This is a string', 'another string'])) # Split strings into ragged tensors
```

Output

```
<tf.RaggedTensor [[b'T', b'i', b'a', b's'], [b'n', b's'], [b'a', b'a', b's']]>
<tf.RaggedTensor [[b'Th', b'is', b'a', b'st'], [b'no', b'st'], [b'an', b'an', b'st']]>
<tf.RaggedTensor [[b'h', b's', b'', b't'], [b'o', b't'], [b'n', b'n', b't']]>
<tf.RaggedTensor [[b'This', b'is', b'a', b'string'], [b'another', b'string']]>
```
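The `b'...'` prefix in the outputs shows that TensorFlow stores strings as byte strings. A small sketch, assuming the text is UTF-8 encoded, of converting the split tokens back into ordinary Python strings:

```python
import tensorflow as tf

tokens = tf.strings.split(["This is a string", "another string"])

# to_list() returns nested Python lists of bytes objects.
as_bytes = tokens.to_list()

# Decode each token, assuming UTF-8 text.
as_text = [[token.decode("utf-8") for token in row] for row in as_bytes]
print(as_text)  # [['This', 'is', 'a', 'string'], ['another', 'string']]
```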

## tf.Variable

- `tf.Variable` is used to create a tensor whose values can be updated.
- Specific ops are provided to modify the values of this type of tensor.
- `tf.Variable` is used by `tf.keras` to store the parameters/weights of a neural network model.

If you want to know which device your TensorFlow operations are executed on, turn on device placement logging:

`tf.debugging.set_log_device_placement(True)`

### Create a TensorFlow variable

A TensorFlow variable can easily be created with `tf.Variable`, whose input can be a list or a nested list.

```
tensor_int = tf.Variable([[1, 2], [3, 4]])
tensor_bool = tf.Variable([True, False, True, False])
tensor_complex = tf.Variable([1 + 2j, 3 + 4j])
print(tensor_int)
print(tensor_bool)
print(tensor_complex)
```

Output

```
<tf.Variable 'Variable:0' shape=(2, 2) dtype=int32, numpy=
array([[1, 2],
[3, 4]], dtype=int32)>
<tf.Variable 'Variable:0' shape=(4,) dtype=bool, numpy=array([ True, False, True, False])>
<tf.Variable 'Variable:0' shape=(2,) dtype=complex128, numpy=array([1.+2.j, 3.+4.j])>
```

Because we have set `tf.debugging.set_log_device_placement` to True, there will be some additional lines in the output, like this:

Output

```
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:0
```

At the end of each line is the device (GPU:0 or CPU:0) that the corresponding operation is executed on. The number after GPU or CPU identifies the index of the device.
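Device placement logging can be verbose. As a lighter alternative, a variable or tensor also exposes a `device` attribute. The sketch below makes no assumption about the hardware: it prints whatever device TensorFlow picked on the machine where it runs.

```python
import tensorflow as tf

variable = tf.Variable([[1.0, 2.0], [3.0, 4.0]])
result = variable * 2.0

# Each string names the device that holds the value, e.g. one ending in "CPU:0".
print(variable.device)
print(result.device)

# List the GPUs TensorFlow can see (an empty list on CPU-only machines).
visible_gpus = tf.config.list_physical_devices("GPU")
print(visible_gpus)
```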

### Get the shape and dtype, or convert to a NumPy array, in the same way as with tensors

We can use `.shape`, `.dtype`, and `.numpy()` as we did in the previous posts.

```
print(tensor_int.shape)
print(tensor_int.dtype)
print(tensor_int.numpy())
```

Output

```
(2, 2)
<dtype: 'int32'>
[[1 2]
[3 4]]
```

### Most tensor operations also work on variables, but shape-changing operations return a new tensor

```
print(tf.convert_to_tensor(tensor_int)) # convert a tf.Variable to a tensor
print(tf.argmax(tensor_int)) # index of the largest value in each column (default axis=0)
print(tf.reshape(tensor_int, [1, 4])) # does not reshape the variable; creates a new reshaped tensor from it
print(tf.expand_dims(tensor_int, axis=0)) # every operation that changes the shape returns a tensor, not a variable
tensor_int_2 = tf.Variable([[5, 6], [7, 8]]) # create a second variable to concatenate with tensor_int
print(tf.concat([tensor_int, tensor_int_2], 0))
print(tf.stack([tensor_int, tensor_int_2], 0))
```

Output

```
tf.Tensor(
[[1 2]
[3 4]], shape=(2, 2), dtype=int32)
tf.Tensor([1 1], shape=(2,), dtype=int64)
tf.Tensor([[1 2 3 4]], shape=(1, 4), dtype=int32)
tf.Tensor(
[[[1 2]
[3 4]]], shape=(1, 2, 2), dtype=int32)
tf.Tensor(
[[1 2]
[3 4]
[5 6]
[7 8]], shape=(4, 2), dtype=int32)
tf.Tensor(
[[[1 2]
[3 4]]
[[5 6]
[7 8]]], shape=(2, 2, 2), dtype=int32)
```

Note that all the operations changing the shape of the `tf.Variable` will create a new tensor with a new shape, because we cannot change the shape of a `tf.Variable`.
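A quick way to confirm this, sketched with the same values as above, is to check the type of the result and the shape of the original variable:

```python
import tensorflow as tf

variable = tf.Variable([[1, 2], [3, 4]])
reshaped = tf.reshape(variable, [1, 4])

print(isinstance(reshaped, tf.Variable))  # False: the result is a plain tensor
print(isinstance(reshaped, tf.Tensor))    # True
print(variable.shape)                     # (2, 2): the variable is unchanged
```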

### Assign a tensor to a variable

As mentioned above, the values of a TensorFlow variable can be modified. One way to do this is to assign a new tensor to the variable, on one condition: the new tensor must have the same shape as the variable.

```
tensor_a = tf.Variable([1, 2])
print(tensor_a)
tensor_a.assign([2, 3]) # the existing memory is reused instead of allocating new memory
print(tensor_a)
tensor_a.assign([1, 2, 3]) # raises ValueError: the new value's shape differs from the variable's shape
```

Output

```
<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([1, 2], dtype=int32)>
<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([2, 3], dtype=int32)>
ValueError: Cannot assign to variable Variable:0 due to variable shape (2,) and value shape (3,) are incompatible
```
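Since the failed assignment raises an exception, a script that should continue afterwards needs to handle it. A minimal sketch, assuming the `ValueError` shown in the output above:

```python
import tensorflow as tf

variable = tf.Variable([1, 2])
variable.assign([2, 3])  # same shape, so the update succeeds

try:
    variable.assign([1, 2, 3])  # shape (3,) does not match shape (2,)
    shape_error = False
except ValueError as error:
    shape_error = True
    print("Assignment rejected:", error)

# The variable keeps the last successfully assigned values.
print(variable.numpy())  # [2 3]
```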

### Create a new variable from an original variable

Below, we define `tensor_b` on the basis of `tensor_a`. Let's try to assign new values to `tensor_b`. Will that also change the values in the original variable `tensor_a`?

```
tensor_a = tf.Variable([1, 2])
tensor_b = tf.Variable(tensor_a)
tensor_b.assign([3, 4])
print(tensor_a)
print(tensor_b)
```

Output

```
<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([1, 2], dtype=int32)>
<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([3, 4], dtype=int32)>
```

No. When a new variable is created from an existing variable, new memory is allocated for it, so changing its values does not affect the original.
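By contrast, a plain Python assignment does not copy anything: both names then refer to the same variable. A short sketch of the difference (the names `alias` and `copy` are only for illustration):

```python
import tensorflow as tf

original = tf.Variable([1, 2])
alias = original              # same object, no new memory
copy = tf.Variable(original)  # new variable initialized from the current values

alias.assign([5, 6])

print(original.numpy())  # [5 6]: changed through the alias
print(copy.numpy())      # [1 2]: the independent copy is unaffected
```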

Other versions of assign:

- `assign_add`: add directly to a variable
- `assign_sub`: subtract directly from a variable

```
print(tensor_a.assign_add([10, 10]))
print(tensor_b.assign_sub([1, 1]))
```

Output

```
<tf.Variable 'UnreadVariable' shape=(2,) dtype=int32, numpy=array([11, 12], dtype=int32)>
<tf.Variable 'UnreadVariable' shape=(2,) dtype=int32, numpy=array([2, 3], dtype=int32)>
```

This adds 10 to both elements of `tensor_a` and subtracts 1 from both elements of `tensor_b`.
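`assign_sub` is the kind of in-place update used when adjusting model weights. The following is only an illustrative sketch of a single gradient-descent-style step; the weights, gradient values, and learning rate are made up for the example.

```python
import numpy as np
import tensorflow as tf

weights = tf.Variable([1.0, 2.0])
gradients = tf.constant([0.5, -0.5])  # made-up gradient values
learning_rate = 0.1                   # made-up learning rate

# In-place update: weights <- weights - learning_rate * gradients
weights.assign_sub(learning_rate * gradients)

print(weights.numpy())  # approximately [0.95 2.05]
```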

### Naming TensorFlow variables

Create two variables `tensor_a` and `tensor_b` from a tensor `tensor_const`.

```
tensor_const = tf.constant([1, 2])
tensor_a = tf.Variable(tensor_const, name="tensor")
tensor_b = tf.Variable(tensor_const + 2, name="tensor")
print(tensor_a)
print(tensor_b)
```

Output

```
<tf.Variable 'tensor:0' shape=(2,) dtype=int32, numpy=array([1, 2], dtype=int32)>
<tf.Variable 'tensor:0' shape=(2,) dtype=int32, numpy=array([3, 4], dtype=int32)>
```

Both `tensor_a` and `tensor_b` have the name "tensor", but they are two different variables.

Variable names are saved along with a model, so they are preserved when the model is loaded. If we don't specify names manually, the variables in a model are automatically given unique names.

### `trainable` parameter

As you may know, neural network models rely on differentiation and gradients, and TensorFlow variables are designed with this in mind. `tf.Variable` has a `trainable` setting that determines whether a variable should be differentiated or not.

`tensor_c = tf.Variable([1, 2, 3], trainable=False)`

Output

`<tf.Variable 'Variable:0' shape=(3,) dtype=int32, numpy=array([1, 2, 3], dtype=int32)>`
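One way to see the effect of `trainable` is with `tf.GradientTape`, which by default tracks only trainable variables. A minimal sketch (the variable names and values are chosen for illustration):

```python
import tensorflow as tf

x = tf.Variable(3.0)                       # trainable by default
scale = tf.Variable(2.0, trainable=False)  # not tracked automatically

with tf.GradientTape() as tape:
    y = x * x * scale

grad_x, grad_scale = tape.gradient(y, [x, scale])

print(grad_x.numpy())  # 12.0, since dy/dx = 2 * x * scale
print(grad_scale)      # None: no gradient for the non-trainable variable
```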

### Choosing the device on which variables and tensors are placed

By default, most variables are placed on a GPU if one is available. You can also set the device manually, as below.

```
with tf.device('CPU:0'):
    tensor_a = tf.Variable([[1, 2], [3, 4]])
print(tensor_a)
```

Output

```
<tf.Variable 'Variable:0' shape=(2, 2) dtype=int32, numpy=
array([[1, 2],
[3, 4]], dtype=int32)>
```

### Computations on another device

When a variable is placed on one device but a computation with it runs on another device, the variable is copied to the new device before the computation.

```
with tf.device('CPU:0'):
    tensor_a = tf.Variable([[11, 12, 13], [14, 15, 16]])
    tensor_b = tf.Variable([2, 4, 6])
with tf.device('GPU:0'):
    tensor_c = tensor_a * tensor_b
print(tensor_c)
```

Output

```
tf.Tensor(
[[22 48 78]
[28 60 96]], shape=(2, 3), dtype=int32)
```

`tensor_a` and `tensor_b` are first initialized on the CPU. The multiplication then runs on the GPU, so both variables must be copied to the GPU before they are multiplied.
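The example above assumes that a GPU is present. A hedged sketch of the same computation that falls back to the CPU when `tf.config.list_physical_devices` reports no GPU (the name `compute_device` is only for illustration):

```python
import tensorflow as tf

# Pick the GPU if TensorFlow can see one, otherwise stay on the CPU.
compute_device = "GPU:0" if tf.config.list_physical_devices("GPU") else "CPU:0"

with tf.device("CPU:0"):
    tensor_a = tf.Variable([[11, 12, 13], [14, 15, 16]])
    tensor_b = tf.Variable([2, 4, 6])

with tf.device(compute_device):
    tensor_c = tensor_a * tensor_b  # tensor_b is broadcast across the rows

print(tensor_c.numpy())
```

The result is the same on either device; only the place where the multiplication runs changes.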