I have NLP tensors with shapes train: (22k, 170, 300), val: (2k, 170, 300), test: (25k, 170, 300), where the last dimension (300) holds FastText embeddings, and a single Tesla GPU with 32 GB of memory. I'm doing model selection on an untrained RNN; the batch/buffer size is 64 and there are 5 layers:
for config in param_grid:
    model = self.create_model(config)
    train = model.forward(embedded_training_data)
    val = model.forward(embedded_val_data)
    test = model.forward(embedded_test_data)
Each model is a Keras Sequential whose layers are all keras.layers.Bidirectional with return_sequences=True, so every layer's output is 3D (batch, timesteps, features). The forward method below processes the data buffer by buffer through compute_states:
def compute_states(self, x):
    x_train_states = []
    for i, layer in enumerate(self.layers):
        outputs, r, b = layer(x)           # full sequence plus the two final states
        x_train_states.append(outputs)     # append to the list
        x = outputs                        # update input for next layer
    return tf.concat(x_train_states, axis=2) if x_train_states else None
@tf.function
def forward(self, data):
    total_samples = tf.shape(data)[0]
    buffer_size = tf.constant(self.buffer_size, dtype=tf.int32)
    num_batches = tf.cast(tf.math.ceil(total_samples / buffer_size), tf.int32)
    states_array = tf.TensorArray(dtype=tf.float32, size=num_batches)
    for i in tf.range(num_batches):
        start_idx = i * buffer_size
        end_idx = tf.minimum((i + 1) * buffer_size, total_samples)
        batch = data[start_idx:end_idx]
        states = self.compute_states(batch)
        states_array = states_array.write(i, states)
    states = states_array.concat()
    return states
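
For completeness, create_model builds the stack roughly along these lines (a minimal sketch: the SimpleRNN cell and config["units"] are placeholders for what the grid search actually varies; the parts I rely on are that every layer is Bidirectional with return_sequences=True and return_state=True, which is why layer(x) unpacks into three values, and that the layers end up in self.layers for compute_states to iterate):

import tensorflow as tf
from tensorflow import keras

def create_model(self, config):
    # Sketch only: 5 Bidirectional RNN layers that all return the full
    # sequence plus their final states. Cell type and width are placeholders.
    self.layers = [
        keras.layers.Bidirectional(
            keras.layers.SimpleRNN(
                config["units"], return_sequences=True, return_state=True
            )
        )
        for _ in range(5)
    ]
    return self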
These functions work fine and are very fast on CPU, but on GPU I get an OOM error at the concatenation of the batches (states_array.concat()). I'd like to know whether there is an issue in my code that I could optimize, or whether the tensor dimensions are simply intractable.
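
For a sense of scale, here is a rough footprint of the single tensor produced by states_array.concat() for the training split alone (the per-direction width is a guess of 150 units, i.e. 300 output features per Bidirectional layer):

samples, timesteps, n_layers, units = 22_000, 170, 5, 150   # units per direction is a guess
features = n_layers * 2 * units                             # 1500 concatenated features
size_gib = samples * timesteps * features * 4 / 1024**3     # float32 = 4 bytes
print(f"{size_gib:.1f} GiB")                                # ~20.9 GiB

Even with that placeholder width, the concatenated result is a large fraction of the 32 GB card before counting the per-batch activations, so I can't tell whether this is a code problem or purely a size problem.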