Janice Kim – Dev-Notes

October 1, 2021

Useful Anaconda Commands on Ubuntu

Example YAML environment file:

name: my_env
channels:
- menpo
- defaults
dependencies:
- python=3.7
- numpy
- scipy
- matplotlib

Creating an environment from a YAML file:

conda env create -f my_env.yml

Updating an environment after updating the YAML environment file:

conda env update --file my_env.yml --prune

Activating an environment:

conda activate my_env

Deactivating an environment:

conda deactivate

Removing an environment:

conda remove --name my_env --all

Show existing environments:

conda info --envs
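
When scripting these commands, it can help to check whether an environment already exists before choosing between `conda env create` and `conda env update`. The sketch below parses text in the layout that `conda info --envs` typically prints. The sample text, the paths in it, and the helper name `parse_env_names` are made up for illustration, and the exact layout may differ between conda versions.

```python
def parse_env_names(listing):
    """Return the environment names found in `conda info --envs` style text."""
    names = []
    for line in listing.splitlines():
        line = line.strip()
        # Skip blank lines and the "# conda environments:" header comments.
        if not line or line.startswith("#"):
            continue
        first = line.split()[0]
        # Unnamed environments are listed by path only (possibly after the
        # "*" active marker); skip those.
        if first == "*" or "/" in first or "\\" in first:
            continue
        names.append(first)
    return names


# Hypothetical sample output; real paths will differ on your machine.
SAMPLE = """\
# conda environments:
#
base                  *  /home/me/anaconda3
my_env                   /home/me/anaconda3/envs/my_env
"""

envs = parse_env_names(SAMPLE)
action = "update" if "my_env" in envs else "create"
print(envs, action)
```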

Here’s some real documentation for more information: https://docs.anaconda.com/

November 19, 2019

Coloring B&W Images with Machine Learning

I’m finishing up my Machine Learning Nanodegree from Udacity, and I decided to train models to color black and white images for my final project. While this may sound daunting to some, this particular task has been done by many folks already. I’m following a tutorial posted by Emil Wallner, using data graciously provided by WW2DB. The tutorial is in turn based on another existing project called Deep Koalarization. Google is already beta testing their own colorization functionality for the masses, so chances are people won’t be banging on my door to use my models. However, I’m thrilled that I got a chance to try this out for myself.

Here’s code from Emil’s tutorial:
https://github.com/emilwallner/Coloring-greyscale-images/blob/master/Full-version/full_version.ipynb

Here’s a gist of what I tried:
https://gist.github.com/shabububu/95fbbed0e6ef4024f1c6d123bd25328f

While it’s not perfect, I’m pretty happy with the results right out of the box.

Original Iwo Jima photo from WW2DB, modified to be 256×256

Resulting image colored by the model I trained

I didn’t change much from the original tutorial. It’s still using the soon-to-be-deprecated TensorFlow 1.x, but I added a validation set for monitoring the mean squared error, along with a few other small things. The training data consists of 1,600 color images related to WWII. At least 75% of them were pre-1950. All of the images I received from WW2DB were resized to 256×256 beforehand using ImageMagick. Of those 1,600, about 5% were used for validation. Finally, 122 color images were set aside for testing at the end. My script ran through the data about 250 times, and it took under 8 hours to train.
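
As a rough illustration of the split and the metric described above, here is a minimal sketch in plain Python. It is not the code from the gist: the file names, the random seed, and the helper names `split_train_val` and `mean_squared_error` are placeholders chosen for this example.

```python
import random


def split_train_val(items, val_fraction=0.05, seed=0):
    """Shuffle a copy of `items` and hold out `val_fraction` for validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def mean_squared_error(predicted, target):
    """Average of squared differences between two equal-length sequences."""
    if len(predicted) != len(target):
        raise ValueError("sequences must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


# Placeholder file names standing in for the 1,600 training images.
images = [f"img_{i:04d}.jpg" for i in range(1600)]
train, val = split_train_val(images)
print(len(train), len(val))

# Toy values: a model's guess versus the true color channel for three pixels.
print(mean_squared_error([0.5, 0.0, 1.0], [0.5, 0.5, 0.0]))
```

Holding out roughly 5% of 1,600 images leaves about 80 for validation, which is enough to watch whether the error keeps falling on images the model has not trained on.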

Here’s an example from my test set — the original colors, the colors stripped, and recolored with my model:

Here’s one where there is no ground truth:

Black and White WWII Iwo Jima Photo (scaled to 256×256)

The mind-blowing thing about this project is how easy it was for me to start training machine learning models for myself. A simple Google search found previous work from so many other people. I took a Udacity course for more structure in my pursuit of knowledge, but I could have easily gone through this tutorial without it and still produced good results. I didn’t pay for compute time because I used Google Colab, which gave me access to a free Jupyter notebook environment hooked up to a GPU. I ended up being lucky enough to get a machine with 25+ GB of RAM and 300+ GB of storage. I did pay for extra Google Drive space ($20 per year), but that’s about it. It’s honestly amazing what we can do these days.

March 13, 2019

Redirecting My Career Back into Machine Learning

Hi. I’m Janice. I am a software engineer. I quit my job this year to shift my career towards Machine Learning. My husband runs dev-notes, and I plan to use it to chronicle my journey. I’m hoping this will also help keep myself accountable to my goal.

My Background

So, why am I doing this?

I’m not new to this field. I have a bachelor’s degree in computer science from 2002. I’ve been programming since I was in high school. I worked for nearly 8 years at IBM Research as a software engineer on speaker recognition, speech recognition, gender detection, keyword search, etc. You’d think this would be easy for me, right? Don’t I know what I need already?

Not quite. (Or, maybe I lack the confidence?)

I paused my career to start a family right around when the state of the art for these sorts of systems shifted from HMM-based models to neural networks. When I worked at IBM, I interacted daily with proprietary code and tools. While I know how to solve problems in general, I don’t have the confidence yet to find a job. I have no portfolio. All of my experience is dated and very specific. I need to address these shortcomings. Artificial intelligence has become so accessible these days that folks with less experience have already found their place. I’m confident that someone like me should be able to do this. (Do I sound confident?)

When I decided to go back to work again after my career break, I had to relearn how to work with the constraints of having a young family. For me, this was a big deal. I timidly took roles at universities in hopes of maintaining a good work-life balance. However, I was doing primarily web development in Ruby on Rails with bits of Python thrown in, and it wasn’t fulfilling. Looking back at my career, I was happiest when I was working on machine learning problems. I want to be back there.

My Plan

I just enrolled in the Udacity Machine Learning Nanodegree, with a start date of April 16, 2019. The time commitment is two three-month terms at about 10 hours a week. I’m hoping that I’ll be able to complete the course in less time, but I’m a mom to two young kids. We’ll see how it goes. I chose Udacity because I do much better when there’s a structure to my learning. They stress that the course will help build my portfolio, which I’m sure will be useful once I start applying for jobs. This will cost me $999 per term, so basically $2k in total.

I’ve already subscribed to a year of DataCamp, so I’m making my way through as many of those courses as I can. A one-year subscription cost me $180. I think it’s worth it so far.

I hope to at least try a Kaggle competition. Even if I fail, it looks like a great way to get my hands dirty and learn.

Finally, I plan to take the famous Coursera course with Andrew Ng. I hear that this is a great foundational course. I’ve only just started it.

Anyway, thanks for reading. I plan to have regular updates in the coming months.

August 10, 2018 (updated November 19, 2019)

Docker 101 Training Notes

I attended a four-part Docker 101 Training course at Virginia Tech. This post contains notes on part 3 of 4.

Understanding what’s in an image:

$ docker ps

CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                          NAMES
6027bb44a67d        zookeeper           "/docker-entrypoint.…"   7 weeks ago         Up 2 days           2181/tcp, 2888/tcp, 3888/tcp   hyku_zoo1_1
6a9585f4b0bc        zookeeper           "/docker-entrypoint.…"   7 weeks ago         Up 2 days           2181/tcp, 2888/tcp, 3888/tcp   hyku_zoo2_1
5bd742a7314c        zookeeper           "/docker-entrypoint.…"   7 weeks ago         Up 2 days           2181/tcp, 2888/tcp, 3888/tcp   hyku_zoo3_1

$ docker image history zookeeper

IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
397be0d8fa45        7 weeks ago         /bin/sh -c #(nop)  CMD ["zkServer.sh" "start…   0B                  
<missing>           7 weeks ago         /bin/sh -c #(nop)  ENTRYPOINT ["/docker-entr…   0B                  
<missing>           7 weeks ago         /bin/sh -c #(nop) COPY file:5cb6c695778a88d6…   941B                
<missing>           7 weeks ago         /bin/sh -c #(nop)  ENV PATH=/usr/local/sbin:…   0B                  
<missing>           7 weeks ago         /bin/sh -c #(nop)  EXPOSE 2181/tcp 2888/tcp …   0B                  
<missing>           7 weeks ago         /bin/sh -c #(nop)  VOLUME [/data /datalog]      0B                  
<missing>           7 weeks ago         /bin/sh -c #(nop) WORKDIR /zookeeper-3.4.12     0B                  
<missing>           7 weeks ago         |2 DISTRO_NAME=zookeeper-3.4.12 GPG_KEY=586E…   60MB                
<missing>           7 weeks ago         /bin/sh -c #(nop)  ARG DISTRO_NAME=zookeeper…   0B                  
<missing>           7 weeks ago         /bin/sh -c #(nop)  ARG GPG_KEY=586EFEF859AF2…   0B                  
<missing>           7 weeks ago         /bin/sh -c set -ex;     adduser -D "$ZOO_USE…   4.82kB              
<missing>           7 weeks ago         /bin/sh -c #(nop)  ENV ZOO_USER=zookeeper ZO…   0B                  
<missing>           7 weeks ago         /bin/sh -c apk add --no-cache     bash     s…   4.21MB              
<missing>           7 weeks ago         /bin/sh -c set -x  && apk add --no-cache   o…   77.8MB              
<missing>           7 weeks ago         /bin/sh -c #(nop)  ENV JAVA_ALPINE_VERSION=8…   0B                  
<missing>           7 weeks ago         /bin/sh -c #(nop)  ENV JAVA_VERSION=8u171       0B                  
<missing>           2 months ago        /bin/sh -c #(nop)  ENV PATH=/usr/local/sbin:…   0B                  
<missing>           2 months ago        /bin/sh -c #(nop)  ENV JAVA_HOME=/usr/lib/jv…   0B                  
<missing>           2 months ago        /bin/sh -c {   echo '#!/bin/sh';   echo 'set…   87B                 
<missing>           2 months ago        /bin/sh -c #(nop)  ENV LANG=C.UTF-8             0B                  
<missing>           7 months ago        /bin/sh -c #(nop)  CMD ["/bin/sh"]              0B                  
<missing>           7 months ago        /bin/sh -c #(nop) ADD file:093f0723fa46f6cdb…   4.15MB
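
One way to read the history listing is to add up the SIZE column. The sketch below does that for the values shown above. It assumes Docker's human-readable sizes use decimal units (1 kB = 1000 bytes), and because the displayed values are rounded, the total is only approximate. The helper name `to_bytes` is made up for this example.

```python
UNITS = {"B": 1, "kB": 1000, "MB": 1000**2, "GB": 1000**3}


def to_bytes(size):
    """Convert a human-readable size such as '4.82kB' to a byte count."""
    # Check longer suffixes first so 'kB' is not mistaken for plain 'B'.
    for suffix in sorted(UNITS, key=len, reverse=True):
        if size.endswith(suffix):
            return round(float(size[: -len(suffix)]) * UNITS[suffix])
    raise ValueError(f"unrecognized size: {size!r}")


# SIZE column copied from the history listing above, top to bottom.
SIZES = [
    "0B", "0B", "941B", "0B", "0B", "0B", "0B", "60MB", "0B", "0B",
    "4.82kB", "0B", "4.21MB", "77.8MB", "0B", "0B", "0B", "0B",
    "87B", "0B", "0B", "4.15MB",
]

nonzero = [s for s in SIZES if to_bytes(s) > 0]
total = sum(to_bytes(s) for s in SIZES)
print(len(nonzero), "entries changed the filesystem")
print(f"approximate total: {total / 1000**2:.1f} MB")
```

The entries with a size of 0B (ENV, CMD, EXPOSE, and so on) only record metadata. The entries with a non-zero size are the ones that changed files, and their count lines up with the number of `layer.tar` entries in the manifest further down.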

Save a Docker image as a tar stream and unpack it into the current directory:

$ docker image save zookeeper | tar xf -
$ ls -latr

total 32
-rw-r--r--   1 janicekim  staff    92 Dec 31  1969 repositories
-rw-r--r--   1 janicekim  staff   667 Dec 31  1969 manifest.json
drwxr-xr-x   5 janicekim  staff   160 Jun 16 05:16 f489fb9bc024e38efcd28c3d574b04568ad8db2ecf4e04f850e230b52970c5b4
drwxr-xr-x   5 janicekim  staff   160 Jun 16 05:16 bc689ee72c370143bedbce7354c4ab6d875c9ff0770fa5fc30767e43eefe5f69
drwxr-xr-x   5 janicekim  staff   160 Jun 16 05:16 b7c8de9af3495335f18f0ed1dde5f34a589d06d84e25c7fe75eb2987a8e5205a
drwxr-xr-x   5 janicekim  staff   160 Jun 16 05:16 8c3cedd6ec82ada9c5ffbbdce49e2e004bf3e9f8af1d0c986af084ee555e6f5b
drwxr-xr-x   5 janicekim  staff   160 Jun 16 05:16 498654318d0999ce36c7b90901ed8bd8cb63d86837cb101ea1ec9bb092f44e59
-rw-r--r--   1 janicekim  staff  7338 Jun 16 05:16 397be0d8fa458782ea28b39e9a7480272519c522f6416ee3cae2efe204f324d3.json
drwxr-xr-x   5 janicekim  staff   160 Jun 16 05:16 26353606c6538cd3386157cf987e132c9809e0b8fa413fe9a6f35b20a0884ccc
drwxr-xr-x   5 janicekim  staff   160 Jun 16 05:16 146c29a4165a3791184d9b4631a0f0488f8eb4a282d606ed53b484fbf14de7fe
drwxr-xr-x   7 janicekim  staff   224 Aug 10 12:17 ..
drwxr-xr-x  12 janicekim  staff   384 Aug 10 14:33 .

Look at the generated manifest:

$ cat manifest.json | python -m json.tool 

[
    {
        "Config": "397be0d8fa458782ea28b39e9a7480272519c522f6416ee3cae2efe204f324d3.json",
        "Layers": [
            "498654318d0999ce36c7b90901ed8bd8cb63d86837cb101ea1ec9bb092f44e59/layer.tar",
            "f489fb9bc024e38efcd28c3d574b04568ad8db2ecf4e04f850e230b52970c5b4/layer.tar",
            "26353606c6538cd3386157cf987e132c9809e0b8fa413fe9a6f35b20a0884ccc/layer.tar",
            "8c3cedd6ec82ada9c5ffbbdce49e2e004bf3e9f8af1d0c986af084ee555e6f5b/layer.tar",
            "b7c8de9af3495335f18f0ed1dde5f34a589d06d84e25c7fe75eb2987a8e5205a/layer.tar",
            "146c29a4165a3791184d9b4631a0f0488f8eb4a282d606ed53b484fbf14de7fe/layer.tar",
            "bc689ee72c370143bedbce7354c4ab6d875c9ff0770fa5fc30767e43eefe5f69/layer.tar"
        ],
        "RepoTags": [
            "zookeeper:latest"
        ]
    }
]
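
The manifest can also be read from a script. The sketch below is one possible way to list the layer directories for each tag. It uses a trimmed copy of the manifest above (only the first two layers) so that it runs on its own, and it assumes the "Layers" array is ordered from the base layer upward. The function name `layer_dirs` is made up for this example.

```python
import json


def layer_dirs(manifest_text):
    """Return (tag, [layer directory names]) pairs in manifest order."""
    result = []
    for image in json.loads(manifest_text):
        dirs = [path.split("/")[0] for path in image["Layers"]]
        # RepoTags may be missing or null for untagged images.
        for tag in image.get("RepoTags") or ["<untagged>"]:
            result.append((tag, dirs))
    return result


# Trimmed copy of the manifest shown above (first two layers only).
MANIFEST_TEXT = """
[
    {
        "Config": "397be0d8fa458782ea28b39e9a7480272519c522f6416ee3cae2efe204f324d3.json",
        "Layers": [
            "498654318d0999ce36c7b90901ed8bd8cb63d86837cb101ea1ec9bb092f44e59/layer.tar",
            "f489fb9bc024e38efcd28c3d574b04568ad8db2ecf4e04f850e230b52970c5b4/layer.tar"
        ],
        "RepoTags": ["zookeeper:latest"]
    }
]
"""

for tag, dirs in layer_dirs(MANIFEST_TEXT):
    print(tag)
    for position, name in enumerate(dirs, start=1):
        print(f"  layer {position}: {name[:12]}")
```

To run it against the real file, replace `MANIFEST_TEXT` with the contents of `manifest.json` from the directory where the image was unpacked.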

- Each layer shows actual filesystem changes
- The layers are unioned together to provide a full filesystem
  - Each layer can add files as needed
  - Files in “higher” layers replace the same file in “lower” layers
- Deleted files are represented in a layer as a “whiteout” file
- Whiteout files are only used by the filesystem driver and not visible in the merged filesystem
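
The rules above can be sketched with plain dictionaries. This is a toy model, not how Docker's storage drivers are implemented: each layer is a `{path: content}` dict, and a deleted file is marked by an entry whose name starts with `.wh.`, which is the whiteout naming convention used in image layer tars. The paths and contents are invented for the example.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."


def merge_layers(layers):
    """Apply layers lowest-first and return the merged {path: content} view."""
    merged = {}
    for layer in layers:
        # Whiteouts only affect lower layers, so apply them before adding
        # this layer's own files.
        for path in layer:
            directory, name = posixpath.split(path)
            if name.startswith(WHITEOUT_PREFIX):
                hidden = posixpath.join(directory, name[len(WHITEOUT_PREFIX):])
                for existing in list(merged):
                    if existing == hidden or existing.startswith(hidden + "/"):
                        del merged[existing]
        # Files in this layer replace the same path from lower layers.
        for path, content in layer.items():
            if not posixpath.basename(path).startswith(WHITEOUT_PREFIX):
                merged[path] = content
    return merged


layers = [
    {"etc/motd": "hello", "tmp/build.log": "lots of output"},
    {"etc/motd": "hello from layer 2"},  # replaces the lower file
    {"tmp/.wh.build.log": ""},           # deletes tmp/build.log
]
print(merge_layers(layers))
```

The merged view contains only `etc/motd` with the layer 2 content. The whiteout entry removes `tmp/build.log` and does not appear in the result itself.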

Popular Base Distributions

Ubuntu/Debian – larger and familiar; results in a larger image size
Alpine Linux – smaller; results in a smaller image size

March 17, 2017 (updated March 11, 2018)

How to Install Rails on CentOS 7

Install prerequisite dependencies

$ sudo yum install -y git-core zlib zlib-devel gcc-c++ patch readline readline-devel libyaml-devel libffi-devel openssl-devel make bzip2 autoconf automake libtool bison curl sqlite-devel

Install rbenv

$ git clone https://github.com/rbenv/rbenv.git ~/.rbenv
$ cd ~/.rbenv && src/configure && make -C src
$ echo 'export PATH=$HOME/.rbenv/bin:$PATH' >> ~/.bash_profile

Run rbenv init, and follow the instructions.

$ rbenv init
# Load rbenv automatically by appending
# the following to ~/.bash_profile:

eval "$(rbenv init -)"

Restart your shell.

Install ruby-build plugin to get access to rbenv install.

$ git clone https://github.com/rbenv/ruby-build.git ~/.rbenv/plugins/ruby-build

Finally, install ruby, and confirm version.

$ rbenv install 2.2.2
...
$ ruby -v
ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-linux]

Confirm sqlite3 is installed

$ sqlite3 --version
3.7.17 2013-05-20 00:56:22 118a3b35693b134d56ebd780123b7fd6f1497668

Install rails using gem.
NOTE: I originally had some trouble running this command because the version of ruby kept reverting to 2.0.0, which is the default version for my installation of CentOS. I had to manually add a .ruby-version file containing 2.2.2 to my working directory before the gem command would proceed with the rails installation. This is the same file that rbenv local 2.2.2 writes, so it is a supported approach; rbenv global 2.2.2 sets the default version for every directory instead.

$ gem install rails
...
$ rails --version
Rails 5.0.0.1

March 17, 2017 (updated March 11, 2018)

Manage Docker Images

List docker images

$ docker images -a

Remove docker images

$ docker rmi <image>

List and remove dangling docker images

$ docker images -f dangling=true
$ docker rmi $(docker images -f dangling=true -q)

March 17, 2017 (updated March 11, 2018)

Generate SSL key and certificate using openssl

$ mkdir ~/gencerts
$ cd ~/gencerts
$ openssl genrsa -des3 -passout pass:x -out server.pass.key 2048
$ openssl rsa -passin pass:x -in server.pass.key -out server.key
$ openssl req -new -key server.key -out server.csr

You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [XX]:US
State or Province Name (full name) []:Massachusetts
Locality Name (eg, city) [Default City]:Boston
Organization Name (eg, company) [Default Company Ltd]:MyOrg
Organizational Unit Name (eg, section) []:MyUnit
Common Name (eg, your name or your server's hostname) []: my.server.com
Email Address []:foobar@my.server.com

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:

$ openssl x509 -req -sha256 -days 365 -in server.csr -signkey server.key -out server.crt
$ mkdir ../ssl
$ cp server.key ../ssl/dev.key
$ cp server.crt ../ssl/dev.crt
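
The interactive prompts above can be skipped by passing the same answers through the `-subj` option of `openssl req`. The sketch below only builds and prints the command; it does not run it. The helper names `build_subject` and `csr_command` are made up for this example, and the simple check on "/" is an assumption to keep the subject string unambiguous.

```python
import shlex


def build_subject(fields):
    """Join (key, value) pairs into an OpenSSL -subj string like /C=US/CN=..."""
    parts = []
    for key, value in fields:
        # '/' separates fields in the subject string, so reject it in values
        # rather than guess at escaping rules.
        if "/" in value:
            raise ValueError(f"value for {key} must not contain '/'")
        parts.append(f"/{key}={value}")
    return "".join(parts)


def csr_command(key_path, csr_path, fields):
    """Build the argument list for a non-interactive `openssl req -new`."""
    return ["openssl", "req", "-new", "-key", key_path,
            "-out", csr_path, "-subj", build_subject(fields)]


# Same answers as the interactive session above.
FIELDS = [
    ("C", "US"),
    ("ST", "Massachusetts"),
    ("L", "Boston"),
    ("O", "MyOrg"),
    ("OU", "MyUnit"),
    ("CN", "my.server.com"),
    ("emailAddress", "foobar@my.server.com"),
]

command = csr_command("server.key", "server.csr", FIELDS)
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```

Building the argument list separately from running it makes the command easy to inspect before anything touches the key file.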

March 17, 2017 (updated March 11, 2018)

Install local docker and docker-compose on CentOS 7

Configure the docker repository with yum

$ sudo tee /etc/yum.repos.d/docker.repo <<-'EOF'
> [dockerrepo]
> name=Docker Repository
> baseurl=https://yum.dockerproject.org/repo/main/centos/7/
> enabled=1
> gpgcheck=1
> gpgkey=https://yum.dockerproject.org/gpg
> EOF

Install and start local docker

$ sudo yum install docker-engine
$ sudo systemctl enable docker.service
$ sudo systemctl start docker

Run a sanity check to see if docker is working properly

$ sudo docker run --rm hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
c04b14da8d14: Pull complete 
Digest: sha256:0256e8a36e2070f7bf2d0b0763dbabdd67798512411de4cdcf9431a1feb60fd9
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker Hub account:
 https://hub.docker.com

For more examples and ideas, visit:
 https://docs.docker.com/engine/userguide/

To allow a user to run docker without sudo, create a “docker” group and add that user to it (replace yourusername below with your own username).

$ sudo groupadd docker
$ sudo usermod -aG docker yourusername

Log out and log back in for the group changes to go into effect. Try it out:

$ docker run --rm hello-world

Enable docker to be run after reboot

$ sudo systemctl enable docker

Install docker-compose

$ sudo -i
# curl -L https://github.com/docker/compose/releases/download/1.9.0/docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   600    0   600    0     0   2848      0 --:--:-- --:--:-- --:--:--  2870
100 7857k  100 7857k    0     0  15.5M      0 --:--:-- --:--:-- --:--:-- 15.5M
# chmod +x /usr/local/bin/docker-compose
# exit
$ logout

Check that docker is running

$ docker ps -a
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                                      NAMES

December 31, 2012 (updated June 26, 2018)

Google Maps with input events

This example code updates a set of input fields with latitude-longitude values in response to mouse clicks on an instance of Google Maps. It can also redraw map markers upon typing in new coordinates.


<html>
  <head>
    <style type="text/css">
      html { height: 100% }
      body { height: 100%; margin: 10px; padding: 10px }
      #map_canvas { height: 100% }
    </style>
    <script type="text/javascript" src="https://maps.googleapis.com/maps/api/js?sensor=true"></script>
    <script type="text/javascript">
      var map;
      var currMarker = new google.maps.Marker();

      function initialize() {
        var myLatLng = new google.maps.LatLng(0, 0);
        var mapOptions = {
          center: myLatLng,
          zoom: 1,
          mapTypeId: google.maps.MapTypeId.ROADMAP
        };
        map = new google.maps.Map(document.getElementById("map_canvas"),
            mapOptions);

        google.maps.event.addListener(map, 'click', function(event) {
          placeMarker(event.latLng);
        });
      }

      function placeMarker(location) {
        currMarker.setMap(null);
        var marker = new google.maps.Marker({
            position: location,
            map: map
        });    
        currMarker=marker;

        document.getElementById("lat").value=location.lat();
        document.getElementById("lng").value=location.lng();
      }

      function updateMarker() {
        currMarker.setMap(null);
        // Input values are strings; convert them to numbers for LatLng.
        var lat = parseFloat(document.getElementById("lat").value);
        var lng = parseFloat(document.getElementById("lng").value);
        var newLatLng = new google.maps.LatLng(lat, lng);
        var marker = new google.maps.Marker({
            position: newLatLng,
            map: map
        });    
        currMarker=marker;
      }
    </script>
  </head>
  <body onload="initialize()">
    <div id="map_canvas" style="width:400px; height:400px"></div>
    <div id="coordinates">
      lat: <input id="lat" type="text" onchange="updateMarker()" value="0" />
      lng: <input id="lng" type="text" onchange="updateMarker()" value="0" />
    </div>
  </body>
</html>

Below is a screenshot of what the code creates: