The index to the articles in this series is found here.
I ran a few more experiments to try tweaks to the network configuration. I defined a measure of how the network performs on its failed predictions: essentially, the more confident the network was in a prediction that turned out to be incorrect, the larger the penalty. With a few obvious tweaks, I saw no marked improvement. Now, I didn’t run each experiment a dozen times to get proper statistics, but the results for all the experiments fall in a fairly narrow range anyway, so I don’t think I’m going to see much improvement with these approaches.
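In case it helps to see it concretely, here is one way to compute a measure along these lines from the network’s predicted class probabilities. This is a minimal sketch; the function name and the choice to simply average the confidence over the misclassified samples are just one way to formalize the idea.

```python
import numpy as np

def failed_prediction_penalty(y_true, y_prob):
    """Average confidence the network placed on its incorrect predictions.

    y_true: integer class labels, shape (n_samples,)
    y_prob: predicted class probabilities, shape (n_samples, n_classes)
    Returns a value in [0, 1]; higher means the network was more
    confident when it was wrong.
    """
    predicted = np.argmax(y_prob, axis=1)
    wrong = predicted != y_true
    if not np.any(wrong):
        return 0.0  # no failed predictions, no penalty
    # The penalty grows with the probability assigned to the wrong class.
    return float(np.mean(np.max(y_prob[wrong], axis=1)))
```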
The first experiment was to switch the activation function on the LSTM layer from ReLU to sigmoid. This is a good thing to try in any case, because ReLU is unbounded and is known to cause bad behaviour on recurrent layers, of which this is one. This didn’t result in any clear improvement on the failed-prediction measure.
Next, I switched the LSTM layer to tanh activation. Still no improvement.
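Both of these changes amount to nothing more than the activation argument on the LSTM layer. A minimal Keras-style sketch, with a placeholder unit count rather than the real configuration:

```python
from tensorflow.keras.layers import LSTM

# Baseline: ReLU on the recurrent layer.
lstm = LSTM(64, activation="relu")

# First experiment: sigmoid instead of ReLU.
lstm = LSTM(64, activation="sigmoid")

# Second experiment: tanh, which is Keras's default for LSTM.
lstm = LSTM(64, activation="tanh")
```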
Following this, I changed the activation on the synthesis layer from ReLU to Leaky ReLU. This is done by changing its activation to linear and then adding a LeakyReLU() layer above it. Still no improvement.
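In code, that change looks something like the following (the layer width is a placeholder; the point is the linear Dense layer with a separate LeakyReLU layer stacked above it):

```python
from tensorflow.keras.layers import Dense, LeakyReLU
from tensorflow.keras.models import Sequential

model = Sequential()
# ... LSTM layer and whatever else sits below the synthesis layer ...

# Before: model.add(Dense(32, activation="relu"))
# After: the Dense layer is linear, and LeakyReLU is added above it.
model.add(Dense(32, activation="linear"))
model.add(LeakyReLU())
```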
The last thing I tried was to make the network deeper by adding a second synthesis layer on top of the first. This also did not improve my measure any.
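For reference, the deeper variant looks roughly like this; the input shape, layer sizes, and output layer are placeholders rather than the actual configuration:

```python
from tensorflow.keras.layers import Dense, Input, LSTM
from tensorflow.keras.models import Sequential

model = Sequential([
    Input(shape=(24, 8)),            # placeholder: 24 timesteps, 8 features
    LSTM(64, activation="tanh"),
    Dense(32, activation="relu"),    # first synthesis layer
    Dense(32, activation="relu"),    # second synthesis layer, the new one
    Dense(3, activation="softmax"),  # placeholder output layer
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
```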
So, I’m going to leave refinement aside for now, and focus on collecting more training data and writing a little graphical widget that can sit on my desktop and give me a quick at-a-glance status of the network’s predictions. I think I’ll find that useful.