The index to the articles in this series is found here.
OK, so we’re about to get to the actual writing of the program that will train our neural network. There’s going to be a lot going on there, so I want to start with a basic view of the network topology, along with the accompanying Keras code.
Here’s a sample Python program, keras-trial.py. It won’t be part of the final solution, but it demonstrates the way we’ll put pieces together in the rain predictor training code. I omit things like noise layers, pooling layers, and other modifiers that can be inlined trivially into this code.
#! /usr/bin/python3

# Figure out how to make our through-time siamese network with shared
# weights

import keras
from keras.layers import Input, Dense, Concatenate, LSTM
from keras.models import Sequential, Model
import sys
import numpy as np

# Sizes for this demo: two ring geometries, two time steps, and batches
# of 128 training candidates.
ring0_pixels = 5
ring1_pixels = 4
timesteps = 2
batch_size = 128

# Layer widths for the two ring modules.
ring0_module_nodes_0 = 4
ring0_module_nodes_1 = 3
ring1_module_nodes_0 = 4
ring1_module_nodes_1 = 3

synth_layer_nodes = 3
num_outputs = 3

# Four input partitions.  Each carries 'timesteps' samples of its
# ring's pixel values for every candidate in the batch.
ring00 = Input(batch_shape=(batch_size, timesteps, ring0_pixels))
ring01 = Input(batch_shape=(batch_size, timesteps, ring0_pixels))
ring10 = Input(batch_shape=(batch_size, timesteps, ring1_pixels))
ring11 = Input(batch_shape=(batch_size, timesteps, ring1_pixels))

# Two small two-layer dense modules.  Each is applied to two of the
# inputs below, so its weights are shared across those inputs.
ring0_model = Sequential()
ring0_model.add(Dense(ring0_module_nodes_0))
ring0_model.add(Dense(ring0_module_nodes_1))

ring1_model = Sequential()
ring1_model.add(Dense(ring1_module_nodes_0))
ring1_model.add(Dense(ring1_module_nodes_1))

# Apply the shared modules to their input partitions.
scanned00 = ring0_model(ring00)
scanned01 = ring0_model(ring01)
scanned10 = ring1_model(ring10)
scanned11 = ring1_model(ring11)

# Merge the module outputs, then run them through the recurrent layer,
# the dense synthesis layer, and the output layer.
aggregated = Concatenate()([scanned00, scanned01, scanned10, scanned11])

time_layer = LSTM(3, stateful=False, return_sequences=True)(aggregated)

synth_layer = Dense(synth_layer_nodes)(time_layer)
output_layer = Dense(num_outputs)(synth_layer)

model = Model(inputs=[ring00, ring01, ring10, ring11],
              outputs=[output_layer])

# model.compile(optimizer='SGD', loss=keras.losses.mean_squared_error)
I wanted to show that I can have multiple neural networks operating independently on separate partitions of the input data, with these modules then feeding up into a recurrent network, then on to an interpretation layer and the output.
For simplicity in this demo implementation, I’ve divided my input data into four partitions: ring00, ring01, ring10, and ring11. I create a two-layer dense neural network called ring0_model that acts separately on ring00 and ring01, and a second two-layer dense neural network, ring1_model, that acts on ring10 and ring11. The four output tensors of these two models are then concatenated into a single tensor that feeds into the recurrent LSTM layer. This produces a set of outputs that are processed by a dense hidden layer, and then an output layer.
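As a quick sanity check, the demo model can be compiled and fed a single batch of random numbers to confirm that the shapes line up. This fragment isn’t part of keras-trial.py as listed above; it’s a sketch that continues from the variables defined in the script, and the random arrays are purely illustrative.

# Sketch only: compile the demo model and push one batch of noise
# through it to verify the input and output shapes.
model.compile(optimizer='SGD', loss=keras.losses.mean_squared_error)

fake_ring0 = np.random.random((batch_size, timesteps, ring0_pixels))
fake_ring1 = np.random.random((batch_size, timesteps, ring1_pixels))
# With return_sequences=True, the network emits num_outputs values for
# every time step of every candidate in the batch.
fake_targets = np.random.random((batch_size, timesteps, num_outputs))

model.train_on_batch([fake_ring0, fake_ring0, fake_ring1, fake_ring1],
                     fake_targets)
print(model.predict_on_batch([fake_ring0, fake_ring0,
                              fake_ring1, fake_ring1]).shape)
# Prints (128, 2, 3)

The SGD optimizer and mean squared error loss here just mirror the commented-out compile line in the script; the real training code will choose those deliberately.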
In reality, I’ll have 34 different rings feeding 10 different models. Each of those 10 models will produce some number of outputs that are passed to the LSTM layer for through-time analysis, then on to the synthesis layers for output. Six time steps will be passed through the LSTM before output is collected for error estimation.
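Here’s a rough sketch of how that larger topology might be assembled in a loop. The pixel counts and the assignment of rings to modules below are placeholder assumptions, not the real radar geometry, and the actual training code may wire this up differently.

# Sketch only: ring sizes and the ring-to-module mapping are placeholder
# assumptions, not the real rain-predictor geometry.
from keras.layers import Input, Dense, Concatenate, LSTM
from keras.models import Sequential, Model

num_rings = 34
num_modules = 10
timesteps = 6
batch_size = 128
pixels_per_ring = 5                                            # placeholder
ring_to_module = [r % num_modules for r in range(num_rings)]   # placeholder

# One shared two-layer dense module per group of rings.
modules = []
for m in range(num_modules):
    module = Sequential()
    module.add(Dense(4))
    module.add(Dense(3))
    modules.append(module)

inputs = []
scanned = []
for r in range(num_rings):
    ring_input = Input(batch_shape=(batch_size, timesteps, pixels_per_ring))
    inputs.append(ring_input)
    # Rings assigned to the same module share that module's weights.
    scanned.append(modules[ring_to_module[r]](ring_input))

aggregated = Concatenate()(scanned)

# Collect the LSTM output only after the final time step; this is one
# way to read "output is collected after 6 time steps".
time_layer = LSTM(3, stateful=False, return_sequences=False)(aggregated)
synth_layer = Dense(3)(time_layer)
output_layer = Dense(3)(synth_layer)

model = Model(inputs=inputs, outputs=[output_layer])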
The batch_size is the number of training candidates that will be passed through the network before back-propagation and a weight update occur. It may be smaller than the total number of training candidates, so there will be multiple weight updates over the course of a single pass through the training data (called an epoch).
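As a made-up numeric example of that relationship:

import math

# Hypothetical numbers, only to illustrate batches versus epochs.
num_candidates = 5000              # total training candidates
batch_size = 128                   # candidates per weight update
updates_per_epoch = math.ceil(num_candidates / batch_size)
print(updates_per_epoch)           # 40 weight updates per epoch,
                                   # the last on a partial batch of 8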
And there we have the bare bones of our Keras implementation. Next, we write the actual code that feeds image data into the network, and start experimenting with settings and parameters.
UPDATE #1 (2019-08-23): Included a link to an index of articles in this series.