May 4, 2018, midnight
By : Eric A. Scuccimarra
I have been working on a project to detect abnormalities in mammograms. I have been training it on Google Cloud instances with Nvidia Tesla K80 GPUs, which allow a model to be trained in days rather than weeks or months. However when I tried to do online data augmentation it became a huge bottleneck because it did the data augmentation on the CPU.
I had been using tf.image.random_flip_left_right and tf.image.random_flip_up_down but since those operations were run on the CPU the training slowed down to a crawl as the GPU sat idle waiting for the queue to be filled.
I found this post on Medium, Data Augmentation on GPU in Tensorflow, which uses tf.contrib.image instead of tf.image. tf.contrib.image is written to run on the GPU, so using this code allows the data augmentation to be performed on the GPU instead of the CPU and thus eliminates the bottleneck.
This has been a life saver for me. Adding it to my graph allows me to train for longer without overfitting and this get better results.
There are no comments for this article.