gocha124's Diary

Notes on a jumble of topics

Trying Out the 10-Minute Tutorial "Train a Deep Learning Model"

I gave it a try. The terminal session log follows.

ssh -L localhost:8888:localhost:8888 -i ./txcdbxxx.pem ubuntu@54.238.183.142

The authenticity of host '54.238.183.142 (54.238.183.142)' can't be established.
ECDSA key fingerprint is SHA256:IM/fUO2YHtfLV2ykb4wyitm7/1uyuW4/28TvnL+s40o.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '54.238.183.142' (ECDSA) to the list of known hosts.
=============================================================================
       __|  __|_  )
       _|  (     /   Deep Learning Base AMI (Ubuntu 16.04) Version 21.0
      ___|\___|___|
=============================================================================

Welcome to Ubuntu 16.04.6 LTS (GNU/Linux 4.4.0-1101-aws x86_64v)

Nvidia driver version: 418.87.01
CUDA versions available: cuda-10.0 cuda-10.1 cuda-8.0 cuda-9.0 cuda-9.2 
Default CUDA version is 10.0 
Libraries: cuDNN, NCCL, Intel MKL-DNN

AWS Deep Learning AMI Homepage: https://aws.amazon.com/machine-learning/amis/
Developer Guide and Release Notes: https://docs.aws.amazon.com/dlami/latest/devguide/what-is-dlami.html
Support: https://forums.aws.amazon.com/forum.jspa?forumID=263
For a fully managed experience, check out Amazon SageMaker at https://aws.amazon.com/sagemaker
When using INF1 type instances, please update regularly using the instructions at: https://github.com/aws/aws-neuron-sdk/tree/master/release-notes
=============================================================================

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

 * Multipass 1.0 is out! Get Ubuntu VMs on demand on your Linux, Windows or
   Mac. Supports cloud-init for fast, local, cloud devops simulation.

     https://multipass.run/

  Get cloud support with Ubuntu Advantage Cloud Guest:
    http://www.ubuntu.com/business/services/cloud

22 packages can be updated.
0 updates are security updates.


ubuntu@ip-172-31-41-190:~$ 

ubuntu@ip-172-31-41-190:~$ aws configure
AWS Access Key ID [****************6YBF]: 
AWS Secret Access Key [****************qSEG]: 
Default region name [None]: 
Default output format [None]: 
ubuntu@ip-172-31-41-190:~$ $(aws ecr get-login --region us-east-1 --no-include-email --registry-ids 763104351884)
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
WARNING! Your password will be stored unencrypted in /home/ubuntu/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
ubuntu@ip-172-31-41-190:~$ pwd
/home/ubuntu

ubuntu@ip-172-31-41-190:~$ docker run -it 763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-training:1.13-cpu-py36-ubuntu16.04
Unable to find image '763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-training:1.13-cpu-py36-ubuntu16.04' locally
1.13-cpu-py36-ubuntu16.04: Pulling from tensorflow-training
35b42117c431: Pull complete 
ad9c569a8d98: Pull complete 
293b44f45162: Pull complete 
0c175077525d: Pull complete 
1365b8b5a858: Pull complete 
48bc23b4c956: Pull complete 
a0eb117d191e: Pull complete 
66b02ff5f427: Pull complete 
b379b7a5ac86: Pull complete 
25835df539e7: Pull complete 
42fc20c5c0db: Pull complete 
0971bc98571d: Pull complete 
367408011efb: Pull complete 
ee5e0f5ff66e: Pull complete 
acae1d207982: Pull complete 
Digest: sha256:ab0d1015d08ee0323d98f77bf00dec2f0b98d90a665b0f2fad5e461759cbb9cc
Status: Downloaded newer image for 763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-training:1.13-cpu-py36-ubuntu16.04
root@5f64fd0b0055:/# pwd
/

root@5f64fd0b0055:/# git clone https://github.com/fchollet/keras.git
Cloning into 'keras'...
remote: Enumerating objects: 32987, done.
remote: Total 32987 (delta 0), reused 0 (delta 0), pack-reused 32987
Receiving objects: 100% (32987/32987), 13.11 MiB | 6.29 MiB/s, done.
Resolving deltas: 100% (24101/24101), done.
Checking connectivity... done.
root@5f64fd0b0055:/# python keras/examples/mnist_cnn.py
Using TensorFlow backend.
Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
11493376/11490434 [==============================] - 2s 0us/step
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:3445: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
2020-02-10 13:14:34.544621: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX512F
2020-02-10 13:14:34.548961: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3000000000 Hz
2020-02-10 13:14:34.549112: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x55de7f0 executing computations on platform Host. Devices:
2020-02-10 13:14:34.549333: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2020-02-10 13:14:34.549506: I tensorflow/core/common_runtime/process_util.cc:71] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
60000/60000 [==============================] - 57s 948us/step - loss: 0.2741 - acc: 0.9148 - val_loss: 0.0602 - val_acc: 0.9805
Epoch 2/12
60000/60000 [==============================] - 56s 930us/step - loss: 0.0922 - acc: 0.9723 - val_loss: 0.0408 - val_acc: 0.9856
Epoch 3/12
60000/60000 [==============================] - 56s 934us/step - loss: 0.0700 - acc: 0.9793 - val_loss: 0.0336 - val_acc: 0.9890
Epoch 4/12
60000/60000 [==============================] - 56s 930us/step - loss: 0.0581 - acc: 0.9830 - val_loss: 0.0331 - val_acc: 0.9887
Epoch 5/12
60000/60000 [==============================] - 56s 930us/step - loss: 0.0516 - acc: 0.9845 - val_loss: 0.0341 - val_acc: 0.9879
Epoch 6/12
60000/60000 [==============================] - 56s 930us/step - loss: 0.0446 - acc: 0.9863 - val_loss: 0.0286 - val_acc: 0.9909
Epoch 7/12
60000/60000 [==============================] - 56s 931us/step - loss: 0.0395 - acc: 0.9880 - val_loss: 0.0281 - val_acc: 0.9904
Epoch 8/12
60000/60000 [==============================] - 56s 930us/step - loss: 0.0380 - acc: 0.9879 - val_loss: 0.0284 - val_acc: 0.9911
Epoch 9/12
60000/60000 [==============================] - 56s 931us/step - loss: 0.0354 - acc: 0.9890 - val_loss: 0.0276 - val_acc: 0.9912
Epoch 10/12
60000/60000 [==============================] - 56s 931us/step - loss: 0.0300 - acc: 0.9905 - val_loss: 0.0307 - val_acc: 0.9909
Epoch 11/12
60000/60000 [==============================] - 56s 928us/step - loss: 0.0284 - acc: 0.9910 - val_loss: 0.0277 - val_acc: 0.9902
Epoch 12/12
60000/60000 [==============================] - 56s 930us/step - loss: 0.0275 - acc: 0.9913 - val_loss: 0.0280 - val_acc: 0.9908
Test loss: 0.0280409380890409
Test accuracy: 0.9908
root@5f64fd0b0055:/#
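
To make the per-epoch numbers in the log easier to compare, here is a minimal sketch that parses Keras progress lines and reports the best validation epoch and the total training time. The function name `parse_keras_log` and both regular expressions are my own assumptions for illustration, not part of the tutorial or of Keras. The sample text is three epochs copied from the log above.

```python
import re

# Three epochs copied verbatim from the training log above (1, 9, and 12).
LOG = """\
Epoch 1/12
60000/60000 [==============================] - 57s 948us/step - loss: 0.2741 - acc: 0.9148 - val_loss: 0.0602 - val_acc: 0.9805
Epoch 9/12
60000/60000 [==============================] - 56s 931us/step - loss: 0.0354 - acc: 0.9890 - val_loss: 0.0276 - val_acc: 0.9912
Epoch 12/12
60000/60000 [==============================] - 56s 930us/step - loss: 0.0275 - acc: 0.9913 - val_loss: 0.0280 - val_acc: 0.9908
"""

# Hypothetical patterns written for the line format seen in this log only.
EPOCH_RE = re.compile(r"^Epoch (\d+)/(\d+)$")
STATS_RE = re.compile(r"- (\d+)s .*val_loss: ([\d.]+) - val_acc: ([\d.]+)")


def parse_keras_log(text):
    """Return one dict per epoch with epoch, seconds, val_loss, val_acc."""
    rows = []
    epoch = None
    for line in text.splitlines():
        header = EPOCH_RE.match(line.strip())
        if header:
            # Remember the epoch number until its progress line shows up.
            epoch = int(header.group(1))
            continue
        stats = STATS_RE.search(line)
        if stats and epoch is not None:
            rows.append({
                "epoch": epoch,
                "seconds": int(stats.group(1)),
                "val_loss": float(stats.group(2)),
                "val_acc": float(stats.group(3)),
            })
            epoch = None
    return rows


rows = parse_keras_log(LOG)
best = max(rows, key=lambda r: r["val_acc"])
total_seconds = sum(r["seconds"] for r in rows)
print(f"best val_acc {best['val_acc']} at epoch {best['epoch']}")
print(f"total time for parsed epochs: {total_seconds}s")
```

Reading the full 12-epoch log the same way, validation accuracy peaks at 0.9912 in epoch 9, and the epochs add up to 57 + 11 × 56 = 673 seconds, or a little over 11 minutes of CPU training.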