Deconvolution Artifacts

June 14, 2018, 9:21 a.m.

If you have ever used deconvolutions (transposed convolutions) to upsample layers of convnets, you have probably seen artifacts, often in the form of checkerboard patterns. This article explains why they occur and gives some useful tips on how to avoid the problem. I have implemented some of the suggestions and, while it's a bit early to evaluate their efficacy, so far they seem to be helping.
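A commonly cited cause of checkerboard patterns is uneven overlap: when the kernel size is not divisible by the stride, some output positions receive contributions from more input positions than others. The article's own explanation is not reproduced here, so treat the following as a rough one-dimensional sketch of that idea rather than a summary of it. The function name overlap_counts is made up for illustration.

```python
def overlap_counts(num_inputs, kernel_size, stride):
    """Count how many input positions write to each output position
    of a 1-D transposed convolution (no padding).

    Hypothetical helper for illustration only; it does not compute
    any actual convolution values.
    """
    out_len = (num_inputs - 1) * stride + kernel_size
    counts = [0] * out_len
    for i in range(num_inputs):
        start = i * stride  # each input "paints" kernel_size outputs from here
        for offset in range(kernel_size):
            counts[start + offset] += 1
    return counts


# Kernel size 3, stride 2: interior counts alternate, which is the
# 1-D analogue of a checkerboard.
uneven = overlap_counts(num_inputs=4, kernel_size=3, stride=2)

# Kernel size 4, stride 2: interior counts are uniform.
even = overlap_counts(num_inputs=4, kernel_size=4, stride=2)
```

Picking a kernel size that is a multiple of the stride evens out the overlap in this sketch, although a trained network can still learn uneven weights on its own.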

 

Labels: coding , machine_learning

As I continue to work on my mammography project, I save a lot of time by re-using weights from models I have already trained rather than training every iteration of every model from scratch. However, this method has a drawback: if I add or change a layer and then continue training, the unchanged layers are prone to overfitting because they have been trained for substantially longer than the new ones.

I tried training only certain variables, but when the checkpoint was saved it included only the trained variables, so it could not be restored because many variables were missing. This could be overcome by restoring some variables from one checkpoint and the rest from another, but that is overly complicated and not very convenient.

Earlier today, I added another deconvolution layer to my model. When I trained just that layer, the accuracy of the model went very high very quickly, much more quickly than when training all of the layers. But then I couldn't continue training all of the layers, because the checkpoint contained only the layer I had trained. I don't have the time to retrain the entire monstrosity from scratch, so I found an ugly hack that lets me train mostly the layers I want while still saving all of the weights in the checkpoint.

I create two training ops: one for all variables (train_op_1) and one for just the variables I want to train (train_op_2). I run train_op_2 most of the time, but right before I save the checkpoint I do one iteration of train_op_1, which updates all layers, so all variables are saved in the checkpoint. It's not pretty, but it works and, best of all, the code doesn't have to change depending on what I want to train. I specify whether to train all vars or just the subset as a command line arg; if I want to train all vars, I set train_op_2 = train_op_1.
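The control flow of this hack can be sketched without TensorFlow. In the sketch below, the training ops are stand-in Python callables (in the real code they would be optimizer ops run in a session), and the --train-all flag, the function names, and the save interval are all assumptions made for illustration, not the actual code from my project.

```python
import argparse


def build_train_ops(train_all, all_vars_step, subset_step):
    """Return (train_op_1, train_op_2).

    all_vars_step and subset_step are hypothetical callables standing in
    for TensorFlow training ops over all variables and over a subset.
    """
    train_op_1 = all_vars_step
    # When training everything, both names point at the same op.
    train_op_2 = all_vars_step if train_all else subset_step
    return train_op_1, train_op_2


def run_training(num_steps, save_every, train_op_1, train_op_2, save_checkpoint):
    """Run train_op_2 on most steps; run train_op_1 once right before saving."""
    for step in range(1, num_steps + 1):
        if step % save_every == 0:
            train_op_1()           # one update of all variables
            save_checkpoint(step)  # stand-in for the real checkpoint save
        else:
            train_op_2()


def demo(argv=None):
    """Record which op runs at each step so the schedule can be inspected."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--train-all", action="store_true",
                        help="hypothetical flag: train all variables")
    args = parser.parse_args(argv)

    log = []
    op1, op2 = build_train_ops(
        args.train_all,
        all_vars_step=lambda: log.append("all"),
        subset_step=lambda: log.append("subset"),
    )
    run_training(6, 3, op1, op2, lambda step: log.append("save@%d" % step))
    return log
```

Calling demo([]) records the subset op on every step except the one right before each save, while demo(["--train-all"]) records the all-variables op on every step.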

I just ran a few quick tests with no issues; hopefully this will continue to work.

Labels: python , data_science , machine_learning , tensorflow

Linux on Windows 10

June 12, 2018, 1:34 p.m.

In my opinion, the one major advantage of developing on a Mac rather than Windows was that OS X is built on a BSD-derived Unix base, so you could easily run Unix commands from a shell. Running Linux on Windows meant installing a virtual machine or some other complicated and annoying software. Apparently Windows 10 now has the Windows Subsystem for Linux, which is easy to install and use. I just installed it; the process was quick and I've had no problems so far. I don't think it will be as integrated into the OS as the Mac shell is, but it's nice to be able to run Linux commands.

Labels: coding

Intelligent Conversation About AI

June 7, 2018, 9:24 a.m.

Most of the videos I see on YouTube discussing the dangers of AI and machine learning are by people who really have no idea about the subject. I recently started to watch a TED talk by a guy who said that machine learning programs wrote themselves. I had to turn it off about 30 seconds in because the guy obviously had learned about AI from watching Terminator or some other Hollywood movie. I also get pissed off when I hear Elon Musk talk about the "dangers" of AI. It amazes me that someone who is obviously intelligent is so clueless about the subject.

I just watched a discussion about AI from the World Science Festival which was notable for being an intelligent discussion of the subject by people who are actually familiar with the technology. It is rather long, but it touches on many subjects and every subject is discussed intelligently.

 

Labels: machine_learning
