Constructing Future

AI, Deep Learning Basics/Methodology

Methodology skeletons

AI skeletons Supervised model Self-supervised model Unsupervised model Generative models Autoregressive models RNN & Transformer language models, NADE, PixelCNN, WaveNet Latent variable models Tractable: e.g. invertible / flow-based models (RealNVP, Glow, etc.) Intractable: e.g. Markov Chain Monte Carlo, Variational Autoencoders series Implicit models Generative Adversarial Networks (GANs) and v..

→2022.03.19

(In progress) [Generative Model] Generative Adversarial Networks (GAN)

This post was written to improve my own understanding of GANs. The reference is Reference 1. Concept of Generative Adversarial Networks (GAN): field of (generative model, latent variable model). A neural net that maps noise vectors to observations. Training: use the learning signal from a classifier trained to discriminate between samples from the model and the training data. Pros: Can generate very realistic images. Conceptually simple imple..

→2022.03.17

Others

Adding LaTeX to blog posts

Before writing a post, insert the following into the HTML version: Source: https://m.blog.naver.com/PostView.nhn?blogId=psh951120&logNo=221491106060&targetKeyword=&targetRecommendationCode=1

→2022.03.17

[Basic] Probabilistic model

Data is treated as a random variable. 🌰 Deterministic Neural Network: Model weights are assumed to have a true value that is just unknown. All weights have a single fixed value, as is the norm. Without a statistical flavor, such an analysis is often prone to overfitting on selected examples and, in general, makes it hard to draw confident conclusions. Softmax Model can be uncertai..

→2022.03.11

[Probability] Bayesian Neural Network

This post draws on Professor Sungjoon Choi's Bayesian Deep Learning lectures and Yarin Gal's paper, and was written for my own understanding. 📟 Bayesian Neural Network: Replace the deterministic network's weight parameters with distributions over these parameters, and average over all possible weights (referred to as marginalisation). Given a training dataset $\mathcal{D}=(\mathbf{X},\mathbf{Y})=\{(x_i, y_i)\}_{i=1}^N$, we would like to estimate a function $\..
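As a hedged sketch of what "averaging over all possible weights" means, the marginalisation can be written as follows. The symbols $w$ (network weights) and $(x^*, y^*)$ (a new input and its output) are notation assumed here for illustration and do not appear in the excerpt above.

```latex
% Posterior over the weights given the training data D = (X, Y)
p(w \mid \mathbf{X}, \mathbf{Y}) = \frac{p(\mathbf{Y} \mid \mathbf{X}, w)\, p(w)}{p(\mathbf{Y} \mid \mathbf{X})}

% Predictive distribution: average the network output over that posterior
p(y^* \mid x^*, \mathbf{X}, \mathbf{Y}) = \int p(y^* \mid x^*, w)\, p(w \mid \mathbf{X}, \mathbf{Y})\, dw
```

A deterministic network corresponds to replacing the posterior with a point mass at a single weight value, so the integral collapses to one forward pass.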

→2022.03.05

[Probability] 3. Gaussian process, Gaussian Process Latent Variable Model(GPLVM)

🌡️ Gaussian Process: A Gaussian process is a collection of random variables, any finite number of which have a joint Gaussian distribution.

→2022.03.05

AI, Deep Learning Basics/Methodology

[Probability] 2. Random Process, Random Variable, Functional analysis, Kernel function

This post summarizes Professor Sungjoon Choi's Bayesian Deep Learning lectures and was written for my own understanding. 🦉 Overall flow: Before understanding a random process, we first need to understand a random variable; since an RV is a function defined on a sigma-field, this post works through that chain of concepts. 🐻 Set, sigma-field, Measure --> Probability. Set. Set function: a function assigning a number to a set. Measure is a set function. sigma-field: a collection of subsets of U such that axioms(Sigma-field is designed to defi..

→2022.03.05

Uncertainty

🐶 Terminology: Prediction, Confidence, Probability. 🐶 Why is uncertainty important? Status: Before: using the prediction. Now: using the prediction and its uncertainty. Purpose: Uncertainty inherent in inductive inference; incorrect model assumptions; noisy or imprecise data. “... a weather forecaster can be very certain that the chance of rain is 50 %; or her best estimate at 20 % might be very uncertain due to lack of d..

→2022.02.20

[Logger] Using TensorboardX

This post was written to aid my own understanding and records the steps for running TensorboardX with PyTorch. TensorboardX: a useful tool for logging a model's parameters, accuracy, and loss. Process: Logging parameters in a PyTorch model: you can call a writer directly, as in the code source below, but in most existing model implementations logging is handled through a callback, so that simply adding the parameter is enough. x = torch.arange(-5, 5, 0.1).view(-1, 1) y = -5 * x + 0.1 * torch.randn(x.size()) model = torch.nn.Linear(1, 1) criterion = torch.nn...

→2022.02.19

[Probability] 1. Probability Distribution: Gaussian Distribution

Distributions: Gaussian Distribution (Normal Distribution), Bernoulli Distribution, Binomial Distribution, Cauchy Distribution. Gaussian Distribution (Normal Distribution): Univariate Gaussian distribution, Multivariate Gaussian distribution, Conditional Gaussian distribution, etc. Central limit theorem. Bernoulli Distribution

→2022.02.19

Training tips summary

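Start by sorting out the learning rate. Why does training take so long (or so little time)? nvidia-smi (check the GPU in real time); Profiler (summarize the GPU usage history). What should be considered? Multi-processing: the DataLoader's num_workers; multi-threading. How do you check that training is working? Check that the loss is decreasing, in particular for each individual loss term. Check the mean and variance of parameters and buffers. Etc. When checking results, evaluate on the entire dataset rather than looking at per-batch results. Things to consider during training: GPU resources, dataset, training time, etc. Multi-processing..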

→2022.02.16

Mathematics/Linear Algebra

[Basic] Activation Function/Loss Function/Evaluation metric

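This post was written to aid my own understanding. Reference link: Link 1. 🐤 Difference between a loss function and an evaluation metric: Put simply, a loss function is the quantity that must be minimized/maximized during training to improve a deep learning model's performance, while an evaluation metric is the quantity used to identify which of several deep learning models performs well. For example, in a classification problem the model's loss function is typically cross-entropy loss, whereas accuracy is the evaluation metric used to compare performance across models. Among the parameter estimation methods for deep learning models ..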
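To make the loss/metric distinction concrete, here is a minimal pure-Python sketch. The two sets of predicted probabilities (`probs_a`, `probs_b`) and the labels are made-up values for illustration only; the point is that two models can share the same accuracy while their cross-entropy losses differ.

```python
import math


def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the true class (training loss)."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def accuracy(probs, labels):
    """Fraction of samples whose arg-max class equals the label (evaluation metric)."""
    hits = 0
    for p, y in zip(probs, labels):
        predicted = max(range(len(p)), key=lambda i: p[i])
        hits += int(predicted == y)
    return hits / len(labels)


# Hypothetical class probabilities from two models on three samples, two classes.
probs_a = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
probs_b = [[0.6, 0.4], [0.4, 0.6], [0.55, 0.45]]
labels = [0, 1, 0]

loss_a, acc_a = cross_entropy(probs_a, labels), accuracy(probs_a, labels)
loss_b, acc_b = cross_entropy(probs_b, labels), accuracy(probs_b, labels)
print(f"model A: loss={loss_a:.3f}, accuracy={acc_a:.2f}")
print(f"model B: loss={loss_b:.3f}, accuracy={acc_b:.2f}")
```

Both models classify every sample correctly, yet model A is more confident and therefore has the lower loss, which is why the loss is the better training signal and accuracy the simpler comparison metric.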

→2022.02.12

[Linear Algebra] L1/L2 Norm, Loss

This post was written to organize what I have understood. The blog post I referred to is Link 1. 🧘‍♂️ Norm, L1/L2 Norm. Norm: a way to measure the length/magnitude of a vector; applied to the difference of two vectors, it gives the distance between them. $\|x\|_p := \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p}$ The most common norms are the L1 and L2 norms. $p = (p_1, p_2, \ldots, p_n)$ and $q = (q_1, q_2, \ldots, q_n)$ L1 Norm: the sum of absolute values. $d_1(p, q) = \|p - q\|_1 = \sum_{i=1}^{n} |p_i - q_i|$ L2 Norm: the familiar Euclidean distance, unique..
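The norm definition above can be checked numerically with a short pure-Python sketch. The helper names `lp_norm` and `lp_distance` and the example vectors are illustrative assumptions, not taken from the original post.

```python
def lp_norm(x, p):
    """p-norm of a vector: (sum_i |x_i|^p)^(1/p)."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def lp_distance(a, b, p):
    """Distance induced by the p-norm: ||a - b||_p."""
    return lp_norm([ai - bi for ai, bi in zip(a, b)], p)


# Example vectors chosen so the results are easy to verify by hand.
p_vec = [1.0, 2.0, 3.0]
q_vec = [4.0, 6.0, 3.0]

d1 = lp_distance(p_vec, q_vec, 1)  # |−3| + |−4| + |0|
d2 = lp_distance(p_vec, q_vec, 2)  # sqrt(9 + 16 + 0)
print(d1, d2)
```

For these vectors the L1 distance adds the coordinate-wise gaps, while the L2 distance is the straight-line (Euclidean) length of the difference.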

→2022.02.12

[Probability] Gaussian and Bayesian terminology

🐤 Gaussian: Gaussian distribution $N(\mu, \sigma)$. Gaussian Process: A collection of random variables, any finite number of which have a joint Gaussian distribution. Gaussian Process regression. 🐤 Bayesian: Bayes' rule $P(B \mid A) = \frac{P(A \mid B) P(B)}{P(A)}$. Bayesian probability, Bayesian Inference, Bayesian Neural Network. 🐤 Why I confuse Gaussian and Bayesian: Gaussian Process and Bayesian Neural Ne..

→2022.02.12

[NLP] 4. Modern Recurrent Neural Networks: Seq2Seq

🐶 Encoder-Decoder structure 🐶 Sequence to Sequence

→2022.02.12

Robotics & Perception/Basic

[NLP] 3. Modern Recurrent Neural Networks: GRU, LSTM

🎲 Gated Recurrent Units (GRU) 🎲 Long Short Term Memory (LSTM)

→2022.02.05

Others

Winter 2021) Internship life

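What I learned: A powerful tool for programming: import pdb; pdb.set_trace(). Critical thinking: the ability to see through to the underlying principles (principle of loss function), CNN (batch norm, activation function); accepting concepts on the basis of logical understanding. Concepts: generative models (especially VAE), KL divergence, ELBO, loss functions, CNN basics. Areas to improve: a stubborn attitude; think a bit more carefully before jumping in; the habit of dwelling on other things in hindsight: cut down on looking back.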

→2022.02.05

[Graphics] Obj

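Object composition. Obj: texture + mesh (formats: Prefab, obj). Multiple Objs = URDF. Mesh: an object without a texture applied (format: fbx). Texture/Material: MAT: solid color settings, density, roughness; Image (formats: mat, png). On pybullet: a single obj has a VisualShape (the shape as it appears, complex) and a CollisionShape (less complex). Multi-obj: e.g. a robot arm (multiple Objs connected by links).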

→2022.02.02

[NLP] 2. RNN Basics: Language Model

This post summarizes my reading of Dive into Deep Learning. 🏉 Language Model: Given a text sequence of length $T$ that consists of tokens $(x_1, x_2, \cdots, x_T)$, the goal of a language model is to estimate the joint probability of the sequence $P(x_1, x_2, \cdots, x_T)$. We should know how to model a document or even a sequence of tokens. 🏉 Learning a Language Model: Let us start by applying..
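One standard way to make the joint probability above workable is the chain rule of probability, sketched here; the three-token expansion is only an illustrative special case.

```latex
% Chain-rule factorization of the sequence probability
P(x_1, x_2, \ldots, x_T) = \prod_{t=1}^{T} P(x_t \mid x_1, \ldots, x_{t-1})

% Example for T = 3
P(x_1, x_2, x_3) = P(x_1)\, P(x_2 \mid x_1)\, P(x_3 \mid x_1, x_2)
```

Each factor is the probability of the next token given everything before it, which is the quantity an autoregressive model is trained to predict.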

→2022.02.01

[NLP] 1. Introduction of NLP, Word2vec

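🏓 NLP: Natural Language Processing. The field of processing natural language, i.e. the field concerned with making computers understand our language. Natural language is a living language, and there is a 'softness' to it. 🏓 Representations that capture 'the meaning of words' well: Using a thesaurus (synonym dictionary): a method that uses a word network (a hand-built synonym dictionary). It also defines finer-grained relations between words, such as 'hypernym and hyponym' or 'whole and part'. e.g. Car = auto, automobile, machine, motorcar. A representative thesaurus is WordNet (available through the NLTK module). Cons: the burden of manual labeling by humans / difficulty keeping up with changes over time / expressing subtle differences between words..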

→2022.01.22

AI, Deep Learning Basics/Computer Vision

[Probability] Gaussian Process

🪴 Task: Given a context dataset drawn from a single function, $\text{Context}: \{(x_1, y_1), (x_2, y_2), \ldots, (x_c, y_c)\}$, predict the y values at $\text{Target}: x_1^*, x_2^*, \ldots, x_t^*$. 🪴 Gaussian Process: A Gaussian process expresses the prediction as a normal distribution, i.e. a distribution over the target values. That is, $\hat{y}_1^* \sim \mathcal{N}(\mu_{y_1^*}, \sigma_{y_1^*})$ $$\hat{y}_2^* \sim \mathcal{N}(\mu_{y_2^*}, \sigma_{y_..

→2022.01.21

[Basics] Basic image classification models: VGG, GoogLeNet, ResNet

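This post summarizes the parts of "밑바닥부터 시작하는 딥러닝 1" (Deep Learning from Scratch 1) that I found confusing or worth revisiting. Points of confusion: CNN: passes through (Conv - ReLU) - Pooling. A pooling layer only reduces the spatial size; Conv layers are essential for making the network deeper. 1x1 Conv vs. 3x3 Conv: (1x1 Conv) is mainly used to reduce the number of channels; (3x3 Conv, stride = 1). 1. VGG: a basic CNN composed of convolutional and pooling layers. Argues, in terms of the receptive field, that two (3x3) Convs are more efficient than one (5x5) Conv. (Max pooling) - N 3x3 convolutions, followed by FC. Max pooling..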
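The receptive-field argument for VGG can be checked with a small calculation. The sketch below uses the standard recurrence for stacked convolutions (no dilation); the function name `receptive_field` is an illustrative choice, not something from the original post.

```python
def receptive_field(kernel_sizes, strides=None):
    """Receptive field, in input pixels, of a stack of conv layers (no dilation)."""
    if strides is None:
        strides = [1] * len(kernel_sizes)
    field, jump = 1, 1  # jump = spacing between adjacent outputs, in input pixels
    for kernel, stride in zip(kernel_sizes, strides):
        field += (kernel - 1) * jump
        jump *= stride
    return field


one_5x5 = receptive_field([5])
two_3x3 = receptive_field([3, 3])
print(one_5x5, two_3x3)
```

Two stacked 3x3 convolutions with stride 1 see the same 5x5 input window as a single 5x5 convolution, while also applying an extra non-linearity between them.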

→2022.01.16

[Basics] Improving deep learning performance: on making layers deeper

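Ways to improve accuracy: Data augmentation: small changes such as rotating or vertically shifting images; cropping out part of an image or flipping it left-right; appearance changes such as brightness, or scale changes such as zooming in/out. Making the network deeper. The importance of 'making layers deeper': The number of network parameters decreases. A deeper network achieves the same (or a higher) level of expressive power with fewer parameters than a shallower one. It covers a wide receptive field with fewer parameters: e.g. one 5x5 convolution vs. two consecutive 3x3 convolutions. The problem to be learned is decomposed hierarchically: the problem each layer has to learn is replaced by a simpler one, and information can be passed on hierarchically. The impact of 'making layers deeper': the parameters with respect to adding a single layer..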
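The parameter saving from the 5x5 vs. two 3x3 comparison can be counted directly. This is a minimal sketch that ignores bias terms and assumes the same hypothetical channel count (64) for every layer; neither assumption comes from the original post.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Number of weights in one conv layer (bias terms ignored)."""
    return kernel * kernel * in_channels * out_channels


channels = 64  # hypothetical channel count, kept constant across layers

single_5x5 = conv_weights(5, channels, channels)
stacked_3x3 = 2 * conv_weights(3, channels, channels)
print(single_5x5, stacked_3x3, stacked_3x3 / single_5x5)
```

Per input/output channel pair the comparison is 25 weights against 18, so the stacked version needs 72% of the weights for the same 5x5 coverage.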

→2022.01.15

[Deep Learning] Collection of confusing basic terms (1)

This post collects terms that I keep getting confused about. Please note that the order may not flow smoothly; it is updated whenever a confusing term comes to mind. What is a CNN?: a sequence of (Convolution + Subsampling) followed by (Fully-Connected). Feature extraction is performed through (Convolution + Subsampling), and classification is then carried out through (Fully-Connected). This sequence of stages forms a hierarchical structure, which is called compositionality. In the convolution step, the convolved feature is extracted by computing Image * filter. tf.nn.conv2d(input, filter..

→2022.01.11

(In progress) [Basics] 3. Overfitting/Underfitting and Regularization (generalization / generalization techniques)

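This post summarizes the parts of "밑바닥부터 시작하는 딥러닝 1" (Deep Learning from Scratch 1) that I found confusing or worth revisiting. Additional reference: Link 1. 🏈 Overfitting vs. Underfitting: Overfitting, Underfitting. 🏈 Model bias and variance: https://gaussian37.github.io/machine-learning-concept-bias_and_variance/ Bias: how far the model's predictions are from the true values; high bias leads to underfitting. Inductive bias, general bias. Variance: the complexity of the predictive model (how sensitive its predictions are to the training data); high variance leads to overfitting. 🏈 Regularization: Regularization (Regulariza..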

→2022.01.09

AI, Deep Learning Basics/Computer Vision

[Probability] MAP inference via MLE: Posterior/Prior/Likelihood - Bayes rule/Bayesian Equation

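Posterior/Prior/Likelihood: The posterior P(c_i|x) is what we care about most: given some x, we need to determine which class it belongs to. This can be inferred with Bayes' rule using the prior and the likelihood. Prior: commonly held background knowledge; the probability of belonging to a class. e.g. the ratio of sea bass to salmon regardless of skin brightness (x). It is usually given as prior information; if it is not, the researcher must set it from prior knowledge. $P(c_i), P(\theta)$ Posterior: the quantity we ultimately need. Given some x, which class does this x belong to? e.g. given the skin brightness (x), the fish..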
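As a rough illustration of how the prior and likelihood combine into the posterior, here is a pure-Python sketch of Bayes' rule for the sea bass/salmon example. All probabilities below are made-up numbers, and the function name `posterior` is an illustrative choice.

```python
def posterior(priors, likelihoods):
    """Bayes' rule: P(c|x) = P(x|c) P(c) / sum_c' P(x|c') P(c')."""
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    evidence = sum(joint.values())  # P(x), the normalizing constant
    return {c: value / evidence for c, value in joint.items()}


# Hypothetical prior: share of each species in the catch, regardless of brightness.
priors = {"sea_bass": 0.6, "salmon": 0.4}
# Hypothetical likelihood of the observed skin brightness x under each class.
likelihoods = {"sea_bass": 0.2, "salmon": 0.5}

post = posterior(priors, likelihoods)
map_class = max(post, key=post.get)  # class with the highest posterior
print(post, map_class)
```

Although sea bass is more common a priori, the observed brightness is more likely under salmon, so the posterior favors salmon in this made-up case.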

→2022.01.08

[Generative Model] Variational AutoEncoder 3. Variational Inference

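This post aims to understand in detail, through the equations, the variational inference that takes place on the Decoder side (which generates images) when training a VAE model. References: Blog, Blog 2, Blog 3. -220319. Added an explanation of z; converted all equations to LaTeX. 🔦 Terminology before we start: Variational Inference / Variational Bayesian Method. KL-Divergence: a measure of the difference between two probability distributions. ELBO (Evidence Lower BOund): appears when deriving the loss function; it specifies the direction in which the function is learned during training. 🧨 VAE Concept Recap: Probabilistic Encoder $p(z|x)..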
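For orientation, the relation between the terms listed above (KL divergence, ELBO) is commonly written as below. The notation is an assumption made for this sketch: $q_\phi(z \mid x)$ for the approximate posterior, $p_\theta(x \mid z)$ for the decoder, and $p(z)$ for the prior on the latent variable; the full post may use different symbols.

```latex
% Log-evidence splits into the ELBO plus a non-negative KL term
\log p_\theta(x) = \mathrm{ELBO}(x)
  + D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\right)

% ELBO: reconstruction term minus a regularizer toward the prior
\mathrm{ELBO}(x) = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
  - D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right)
```

Because the KL term in the first line is never negative, the ELBO is a lower bound on $\log p_\theta(x)$, and maximizing it is the usual training objective.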

→2022.01.08

[Information theory] Information/Entropy/Cross Entropy/KL Divergence

This post is a summary based on the link above. 🪄 Entropy/Cross Entropy/KL Divergence: An event X that occurs stochastically. Probability: the probability P(X) that it occurs. Information: the amount of surprise produced by a given event (e.g. a coin toss), I(X): $I(X) = -\log_2 P(X)$. Entropy: the average/expected value of the information (surprise) produced by a random event, H(X): $H(X) = \mathbb{E}[I(X)] = -\sum P(X) \log_2 P(X)$. When an event follows a probability distribution P and a distribution Q that approximates it is used instead: Cross Entropy: two probability distributions..
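The quantities above can be computed for a small discrete example. This is a pure-Python sketch using base-2 logarithms to match the formulas; the distributions `p` (a fair coin) and `q` (a biased approximation) are made-up values, and the KL divergence is computed through the standard identity KL(P||Q) = H(P, Q) - H(P).

```python
import math


def entropy(p):
    """H(P) = -sum_x P(x) log2 P(x), in bits."""
    return -sum(px * math.log2(px) for px in p if px > 0)


def cross_entropy(p, q):
    """H(P, Q) = -sum_x P(x) log2 Q(x): cost of coding P with a code built for Q."""
    return -sum(px * math.log2(qx) for px, qx in zip(p, q) if px > 0)


def kl_divergence(p, q):
    """KL(P || Q) = H(P, Q) - H(P): the extra bits paid for using Q instead of P."""
    return cross_entropy(p, q) - entropy(p)


p = [0.5, 0.5]  # fair coin (true distribution)
q = [0.9, 0.1]  # hypothetical approximating distribution

h_p = entropy(p)
h_pq = cross_entropy(p, q)
kl_pq = kl_divergence(p, q)
print(f"H(P)={h_p:.3f} bits, H(P,Q)={h_pq:.3f} bits, KL(P||Q)={kl_pq:.3f} bits")
```

A fair coin carries exactly one bit of entropy; the cross entropy under the mismatched Q is larger, and the gap between the two is the KL divergence.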

→2022.01.08