Colab study notes
Install commonly used packages
Colab comes with some packages preinstalled, such as TensorFlow and Matplotlib, but many other commonly used packages have to be installed manually:
- Keras:
!pip install -q keras
- OpenCV:
!apt-get -qq install -y libsm6 libxext6 && pip install -q -U opencv-python
- PyTorch:
!pip install -q http://download.pytorch.org/whl/cu75/torch-0.2.0.post3-cp27-cp27mu-manylinux1_x86_64.whl torchvision
- tqdm:
!pip install tqdm
Authorize and log in
# Install the PyDrive library; this only needs to run once per notebook
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
# Authorize and log in; authentication is only requested the first time
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
File IO
Read file from Google Drive
# Get the file by id
downloaded = drive.CreateFile({'id':'yourfileID'}) # replace the id with id of file you want to access
# Download file to colab
downloaded.GetContentFile('yourfileName')
# Read the file as a pandas DataFrame
import pandas as pd
xyz = pd.read_csv('yourfileName')
Write file to Google Drive
# Write the DataFrame to a local file that serves as a cache
xyz.to_csv('over.csv')
# Create & upload a text file.
uploaded = drive.CreateFile({'title': 'OK.csv'})
# You will have a file named 'OK.csv' which has content of 'over.csv'
uploaded.SetContentFile('over.csv')
uploaded.Upload()
# Check the uploaded file's ID
print('Uploaded file with ID {}'.format(uploaded.get('id')))
Commonly used TensorFlow APIs
tf
cast
# cast a tensor[x] to a new type[dtype]
expand_dims
Inserts a dimension of 1 into a tensor’s shape.
tf.expand_dims(
    input,
    axis=None
)
# 't' is a tensor of shape [2]
tf.shape(tf.expand_dims(t, 0)) # [1, 2]
tf.shape(tf.expand_dims(t, 1)) # [2, 1]
tf.shape(tf.expand_dims(t, -1)) # [2, 1]
# 't2' is a tensor of shape [2, 3, 5]
tf.shape(tf.expand_dims(t2, 0)) # [1, 2, 3, 5]
tf.shape(tf.expand_dims(t2, 2)) # [2, 3, 1, 5]
tf.shape(tf.expand_dims(t2, 3)) # [2, 3, 5, 1]
read_file
tf.read_file(...)
device
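- manual mode
  - with tf.device('/cpu:0'): run on the CPU
  - with tf.device('/gpu:0') or with tf.device('/device:GPU:0'): run on the first GPU
- GPU config
  import os
  # Expose only GPUs 0 and 1 to this process (comma-separated, no spaces)
  os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'
tf.device(device_name_or_function)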
random_normal
Outputs random values from a normal distribution.
tf.random_normal(
    shape,
    mean=0.0,
    stddev=1.0,
    dtype=tf.float32,
    seed=None,
    name=None
)
tf.random_normal((100, 100, 100, 3))
ConfigProto
Allows the process to grow its GPU memory usage as needed instead of reserving all of it up front.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
reduce_sum/reduce_mean
tf.reduce_sum(...)
Returns: The reduced tensor
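The axis semantics of reduce_sum and reduce_mean follow NumPy's np.sum and np.mean, so a NumPy sketch is enough to see what each reduction returns. This is an analogy using NumPy rather than TensorFlow itself, and the array x is made up for illustration.

```python
import numpy as np

# Illustrative 2x3 array of ones (not from the original notes).
x = np.array([[1, 1, 1],
              [1, 1, 1]])

total = np.sum(x)                         # reduce over every axis -> scalar
per_column = np.sum(x, axis=0)            # collapse the rows
per_row = np.sum(x, axis=1)               # collapse the columns
kept = np.sum(x, axis=1, keepdims=True)   # reduced axis is kept with length 1
mean_all = np.mean(x)                     # same idea, averaging instead of summing

print(total, per_column, per_row, kept.shape, mean_all)
```

Passing no axis reduces everything to a single value; passing an axis removes just that dimension unless keepdims is set.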
tf.app
Generic entry point
flags module
Processes command-line parameters, much like argparse.
run(...)
# run program with an optional 'main' function and 'argv' list
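Since the notes compare the flags module to argparse, the same pattern can be sketched with the standard library alone: declare the flags, parse an argv list, then hand control to a main function. This uses argparse, not tf.app, and the flag names --batch_size and --data_dir are hypothetical.

```python
import argparse


def build_parser():
    """Declare the command-line flags (names are illustrative only)."""
    parser = argparse.ArgumentParser(description="Toy training script")
    parser.add_argument("--batch_size", type=int, default=32, help="examples per batch")
    parser.add_argument("--data_dir", type=str, default="/tmp/data", help="input directory")
    return parser


def main(argv=None):
    """Entry point: parse flags from argv (argparse falls back to sys.argv when argv is None)."""
    flags = build_parser().parse_args(argv)
    return "batch_size={} data_dir={}".format(flags.batch_size, flags.data_dir)


# Simulate running the script with one flag overridden.
print(main(["--batch_size", "8"]))
```

Keeping the parsing inside main(argv) makes the entry point easy to call from a notebook cell, where there is no real command line.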
tf.contrib
eager
- Saver: A tf.train.Saver adapter for use when eager execution is enabled.
tf.data
Dataset
# usage example
- from_tensor_slices(tensors): Creates a Dataset whose elements are slices of the given tensors. Returns: A dataset
- map(map_func, num_parallel_calls=None): Maps map_func across the elements of this dataset.
- batch(batch_size, drop_remainder=False): Combines consecutive elements of this dataset into batches.
- prefetch(buffer_size): Creates a Dataset that prefetches elements from this dataset.
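To make the order of those calls concrete, here is a pure-Python sketch of what a slices, map, batch chain does to the data. It is a conceptual stand-in and not tf.data itself: from_slices and batch are hypothetical helpers, the sample lists are made up, and prefetch is left out because it only affects performance, not the values produced.

```python
from itertools import islice


def from_slices(features, labels):
    """Yield one (feature, label) pair per row, like slicing along the first axis."""
    return zip(features, labels)


def batch(elements, batch_size, drop_remainder=False):
    """Group consecutive elements into lists of batch_size."""
    iterator = iter(elements)
    while True:
        chunk = list(islice(iterator, batch_size))
        # Stop when the input is exhausted, or when a short final batch should be dropped.
        if not chunk or (drop_remainder and len(chunk) < batch_size):
            return
        yield chunk


features = [1, 2, 3, 4, 5]
labels = [0, 1, 0, 1, 0]

# slices -> map -> batch, mirroring the order of the Dataset methods listed above
pairs = from_slices(features, labels)
doubled = map(lambda pair: (pair[0] * 2, pair[1]), pairs)
batches = list(batch(doubled, batch_size=2))
print(batches)
```

With five elements and a batch size of two, the last batch holds a single element unless drop_remainder is set.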
tf.image
decode_jpeg
tf.image.decode_jpeg(...)
resize_images
tf.layers
conv2d
tf.layers.conv2d(...)
Returns: Output tensor.
tf.test
- gpu_device_name(): Returns the name of a GPU device if one is available, or an empty string otherwise.
tf.train
- Saver
scikit-learn (sklearn)
utils
- shuffle(*arrays): Shuffles arrays or sparse matrices in a consistent way
model_selection
- train_test_split(*arrays): Splits arrays or matrices into random train and test subsets
  - Parameters
    - arrays_data
    - arrays_label
    - test_size
    - random_state
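A short sketch of both helpers on toy data may help; the arrays, the 25% test size, and the seed are arbitrary choices for illustration. Each data row is built as [2i, 2i+1] so it is easy to confirm that rows and labels stay paired.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.utils import shuffle

# Toy data: 12 samples with 2 features each, plus one label per sample.
data = np.arange(24).reshape(12, 2)
labels = np.arange(12)

# shuffle applies the same permutation to every array it is given.
data_shuffled, labels_shuffled = shuffle(data, labels, random_state=0)

# Hold out 25% of the samples for testing; random_state makes the split repeatable.
x_train, x_test, y_train, y_test = train_test_split(
    data, labels, test_size=0.25, random_state=0
)
print(x_train.shape, x_test.shape)
```

Fixing random_state is mainly useful while debugging, since it gives the same split on every run.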
Keras
A high-level API for building and training deep learning models.
applications
inception_v3
- InceptionV3(…): Instantiates the Inception v3 architecture.
tf.keras.applications.InceptionV3(
    include_top=True,  # whether to include the fully-connected layer at the top of the network
    weights='imagenet',
    input_tensor=None,
    input_shape=None,
    pooling=None,
    classes=1000
)
- decode_predictions(…): Decodes the prediction of an ImageNet model.
- preprocess_input(…): Preprocesses a numpy array encoding a batch of images.
backend
layers
- Dense: regular densely-connected NN layer
  - Arguments:
    - units
    - input_shape
- GRU/CuDNNGRU
  - Arguments:
    - units: Positive integer, dimensionality of the output space.
    - return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
    - return_state: Boolean. Whether to return the last state in addition to the output.
    - recurrent_activation: Default is hard sigmoid; sigmoid is also available.
      - hard sigmoid: a piecewise-linear approximation of sigmoid (a combination of sigmoid and relu)
    - recurrent_initializer: Default is orthogonal.
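The difference between the two recurrent activations can be seen in a few lines of plain Python. The slope 0.2 and offset 0.5 below are an assumption based on the Keras 2.x definition of hard sigmoid; other versions or libraries may use different constants.

```python
import math


def sigmoid(x):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def hard_sigmoid(x):
    """Piecewise-linear approximation: 0.2 * x + 0.5, clipped to [0, 1].

    Assumption: slope and offset follow the Keras 2.x definition.
    """
    return max(0.0, min(1.0, 0.2 * x + 0.5))


# Compare the two functions at a few sample points.
for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(x, round(sigmoid(x), 3), round(hard_sigmoid(x), 3))
```

The clipped line is cheaper to compute than the exponential, which is the usual reason for choosing it.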
preprocessing
image
sequence
- pad_sequences
- hashing_trick
- one_hot
- text_to_word_sequence
- Tokenizer (vectorizes a text corpus)
  - Arguments:
    - num_words: the maximum number of words to keep, based on word frequency.
    - oov_token: if given, it will be added to word_index and used to replace out-of-vocabulary words during texts_to_sequences calls.
    - filters: a string where each element is a character that will be filtered from the texts.
  - Methods:
    - fit_on_texts: Updates internal vocabulary based on a list of texts.
    - texts_to_sequences: Transforms each text in texts into a sequence of integers.
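How num_words, oov_token, and filters interact is easier to see in a small pure-Python sketch. TinyTokenizer is a hypothetical, simplified stand-in written for these notes, not the Keras class; it only mimics the behaviour described above (frequency-ordered indices starting at 1, with rare or unseen words mapped to the OOV index).

```python
from collections import Counter


class TinyTokenizer:
    """Hypothetical, simplified word-level tokenizer (not the Keras implementation)."""

    def __init__(self, num_words=None, oov_token=None, filters="!?.,"):
        self.num_words = num_words
        self.oov_token = oov_token
        self.filters = filters
        self.word_index = {}

    def _split(self, text):
        # Replace each filtered character with a space, lowercase, split on whitespace.
        table = str.maketrans({ch: " " for ch in self.filters})
        return text.lower().translate(table).split()

    def fit_on_texts(self, texts):
        counts = Counter(word for text in texts for word in self._split(text))
        # Most frequent words get the smallest indices; index 0 is left unused.
        vocab = [word for word, _ in counts.most_common()]
        if self.oov_token is not None:
            vocab.insert(0, self.oov_token)
        self.word_index = {word: i for i, word in enumerate(vocab, start=1)}

    def texts_to_sequences(self, texts):
        oov_id = self.word_index.get(self.oov_token)
        sequences = []
        for text in texts:
            ids = []
            for word in self._split(text):
                idx = self.word_index.get(word)
                in_vocab = idx is not None and (self.num_words is None or idx < self.num_words)
                if in_vocab:
                    ids.append(idx)
                elif oov_id is not None:
                    ids.append(oov_id)  # unseen or too-rare word
            sequences.append(ids)
        return sequences


tokenizer = TinyTokenizer(num_words=4, oov_token="<unk>")
tokenizer.fit_on_texts(["the cat sat", "the cat ran", "the dog"])
print(tokenizer.word_index)
print(tokenizer.texts_to_sequences(["The cat flew!"]))
```

Here only the words with an index below num_words keep their own id; everything else, including words never seen during fitting, collapses onto the OOV id.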
utils
- get_file: Downloads a file from a URL if it is not already in the cache.