The index to the articles in this series is found here.
The prediction system is complete and working quite well. I made many slight changes to the network configuration, but none of them performed much better than the others. The network layout that I finally used is as follows:
- Input images are processed into 400 sectors laid out similarly to the patches on a dartboard.
- Each sector has three scalars associated with it. They are:
  - The fraction of pixels in the sector that have any rain at all
  - The mean intensity of pixels with non-zero rain values
  - The RMS intensity of pixels with non-zero rain values
- These preprocessed images are passed, in groups of 6, into a dense layer of 80 nodes called the geometry layer, with a ‘relu’ activation function.
- Output from the geometry layer enters the time layer, an LSTM network of 45 nodes, with a ‘relu’ activation function.
- These outputs are then concatenated with a vector of scalars that represent the time of the year. This is a number that starts at 0 in February, steps up by intervals of 1/6 to 1 in August, and then steps down by 1/6 in each month following, to return to zero the following February.
- Output from the time layer passes into the synthesis layer, a dense layer of 10 nodes with ‘relu’ activation.
- Output from the synthesis layer passes into the output layer, a dense layer of 10 nodes, 10 outputs, and sigmoid activation.
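The two hand-computed inputs in the list above, the per-sector statistics and the time-of-year number, can be sketched in a few lines of plain Python. This is only an illustration, not the code from the repository: the function names `sector_features` and `season_scalar` are hypothetical, the sector is assumed to arrive as a flat list of pixel intensities with 0 meaning no rain, and returning 0.0 for the mean and RMS of a dry sector is my assumption.

```python
import math

def sector_features(pixels):
    """Return (rain_fraction, mean_intensity, rms_intensity) for one sector.

    `pixels` is a list of rain intensities, with 0 meaning no rain.
    Assumption: a sector with no rainy pixels gets 0.0 for mean and RMS.
    """
    rainy = [p for p in pixels if p > 0]
    if not pixels or not rainy:
        return 0.0, 0.0, 0.0
    fraction = len(rainy) / len(pixels)
    mean = sum(rainy) / len(rainy)
    rms = math.sqrt(sum(p * p for p in rainy) / len(rainy))
    return fraction, mean, rms

def season_scalar(month):
    """Triangle wave over the year: 0 in February (month=2), 1 in August."""
    months_since_feb = (month - 2) % 12
    return (6 - abs(months_since_feb - 6)) / 6

# One small sector with two dry and two rainy pixels.
print(sector_features([0, 0, 3, 4]))
# The time-of-year value for January through December.
print([round(season_scalar(m), 3) for m in range(1, 13)])
```

With 400 sectors and three scalars each, every preprocessed image would contribute 1200 numbers to the geometry layer under this reading.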
The loss function was binary_crossentropy, and I used a validation binary accuracy monitor for checkpointing, rather than validation loss. This is because I’m more interested in training the network to get the distinction between 0 and 1 correct than I am in making the zeroes smaller and the ones larger. A network that, for true values of [0, 0, 1, 1], produces results of [0.45, 0.45, 0.55, 0.55] is better, for my purposes, than one that produces results of [0.1, 0.6, 0.9, 0.9], even if the latter might have a better validation loss.
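To make that trade-off concrete, here is a small worked example using the two sets of outputs from the paragraph above. It is a plain-Python sketch of the standard definitions (mean binary cross-entropy, and accuracy with a 0.5 threshold), not the Keras implementation itself; the helper names and the labels `cautious` and `confident` are mine.

```python
import math

def binary_crossentropy(y_true, y_pred):
    """Mean binary cross-entropy over paired labels and predictions."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of predictions on the correct side of the threshold."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if (p > threshold) == (t == 1))
    return hits / len(y_true)

y_true = [0, 0, 1, 1]
cautious = [0.45, 0.45, 0.55, 0.55]   # always right, never sure
confident = [0.1, 0.6, 0.9, 0.9]      # mostly sure, wrong once

for name, pred in (("cautious", cautious), ("confident", confident)):
    loss = binary_crossentropy(y_true, pred)
    acc = binary_accuracy(y_true, pred)
    print(f"{name}: loss={loss:.3f} accuracy={acc:.2f}")
```

The cautious outputs score a loss of roughly 0.60 with every value on the right side of 0.5, while the confident outputs score roughly 0.31 with one value on the wrong side. Checkpointing on validation loss would keep the second network; checkpointing on binary accuracy keeps the first.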
I also used class weights to emphasize the quantities I was most interested in predicting. Recall that the network produces 10 binary outputs. 5 of them are the likelihood of any rain in the next 0-60 minutes, 61-120 minutes, 121-180 minutes, etc. The other 5 are the likelihood of heavy rain in the same intervals. I decided that I’m most interested in the near future: if it’s going to rain soon, I want to make sure the neural network will tell me so. The light rain/heavy rain distinction is less interesting. The weights I gave, then, on an arbitrary scale, were:
- 1.0 for the probability of rain in the next hour
- reduce the weight by 0.1 for each hour interval after the first hour
- the weight of the heavy/light output is half the weight of the any rain output for the same time interval
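Spelled out, the three rules above give the following ten weights. This is just the arithmetic, written as a small hypothetical helper (`output_weights` is not a name from the repository), and it does not show how the weights are handed to Keras or in what order the ten outputs appear, since neither is stated here.

```python
def output_weights(n_intervals=5, first=1.0, step=0.1, heavy_factor=0.5):
    """Return (any_rain_weights, heavy_rain_weights), one entry per hour interval."""
    # Rounding only tidies floating-point noise such as 0.7000000000000001.
    any_rain = [round(first - step * i, 10) for i in range(n_intervals)]
    heavy = [round(w * heavy_factor, 10) for w in any_rain]
    return any_rain, heavy

any_rain, heavy = output_weights()
for hour, (a, h) in enumerate(zip(any_rain, heavy), start=1):
    print(f"hour {hour}: any rain {a}, heavy rain {h}")
```

So the any-rain weights run from 1.0 down to 0.6, and the heavy-rain weights from 0.5 down to 0.3.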
All of the code is in the git repository.
I’m happy with the network now, and I’m unlikely to find the need to retrain it in the near future, so this will close off my thoughts on this project.
I mentioned early on in this long series of posts that I was unexpectedly out of work. I started a new job a month ago, so life is back to its normal, moderately busy state.