Training tips: a summary

Start by sorting out the learning rate.

Why does training take so long (or so little time)?

nvidia-smi (check GPU usage in real time)
Profiler (summarizes the history of GPU usage)
What to consider:
1. Multi-processing: the DataLoader's num_workers
2. Multi-threading
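
One rough way to tell whether the data loader is the bottleneck is to split each iteration into "time spent waiting for the next batch" and "time spent in the training step". The sketch below is only an illustration: `time_training_loop`, `slow_batches`, and the dummy step are hypothetical names, and the loader can be any iterable (a PyTorch DataLoader included). Note that GPU kernels run asynchronously, so on a GPU the step function would need to synchronize before returning for the numbers to be meaningful.

```python
import time


def time_training_loop(loader, train_step, max_steps=50):
    """Split wall-clock time into waiting-for-data and running-the-step."""
    data_time = 0.0
    step_time = 0.0
    steps = 0
    iterator = iter(loader)
    while steps < max_steps:
        t0 = time.perf_counter()
        try:
            batch = next(iterator)  # time spent blocked on the loader
        except StopIteration:
            break
        t1 = time.perf_counter()
        train_step(batch)  # forward / backward / optimizer step goes here
        t2 = time.perf_counter()
        data_time += t1 - t0
        step_time += t2 - t1
        steps += 1
    return {"steps": steps, "data_time": data_time, "step_time": step_time}


def slow_batches(num_batches=5, delay=0.01):
    """Hypothetical loader that is deliberately slow to produce a batch."""
    for _ in range(num_batches):
        time.sleep(delay)  # stands in for disk reads / augmentation
        yield [1.0, 2.0, 3.0]


# Dummy training step: here the loader, not the step, dominates the time.
stats = time_training_loop(slow_batches(), train_step=lambda batch: sum(batch))
print(stats)
```

If `data_time` dominates, raising num_workers is the first thing to try; if `step_time` dominates, more workers are unlikely to help.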

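How to check whether training is actually working

Check that the loss is decreasing. In particular, track each term of the loss separately.
Check the mean and variance of the parameters and buffers.
Etc.
- When checking results, evaluate over the entire dataset; do not judge from the result of whichever batch happens to come up.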
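
The two points about per-term losses and whole-dataset results can be combined in a small accumulator. This is a minimal sketch, not a fixed recipe: `LossMeter` and the term names `recon` and `kl` are hypothetical, and it assumes each batch reports the mean of every loss term over its own samples. Weighting by batch size keeps a small final batch from distorting the average.

```python
from collections import defaultdict


class LossMeter:
    """Accumulate sample-weighted sums of each loss term over a dataset."""

    def __init__(self):
        self.sums = defaultdict(float)
        self.count = 0

    def update(self, terms, batch_size):
        # `terms` maps a term name to that batch's mean value.
        for name, value in terms.items():
            self.sums[name] += float(value) * batch_size
        self.count += batch_size

    def averages(self):
        # Mean of each term over every sample seen so far.
        if self.count == 0:
            return {}
        return {name: total / self.count for name, total in self.sums.items()}


meter = LossMeter()
# Two hypothetical batches of different sizes with made-up loss values.
meter.update({"recon": 1.0, "kl": 0.2}, batch_size=4)
meter.update({"recon": 3.0, "kl": 0.6}, batch_size=2)
print(meter.averages())
```

Here the last batch alone reports `recon = 3.0`, while the average over all six samples is about 1.67, which is why a single batch is a poor indicator.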

Etc.

Multi-processing vs. Multi-threading

Process vs. Thread
Multi-processing vs. Multi-threading
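
As a small illustration of the distinction: threads live inside one process and share its memory, while a separate process is its own interpreter with its own PID and memory. The sketch below uses only the standard library; the names `shared` and `worker` are made up for the example.

```python
import os
import subprocess
import sys
import threading

shared = []  # lives in this process's memory


def worker(tag):
    # Threads run inside the same process: same PID, same `shared` list.
    shared.append((tag, os.getpid()))


threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# A separate process is a fresh interpreter with its own PID and memory;
# it cannot see `shared` unless data is sent to it explicitly.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.getpid())"],
    capture_output=True,
    text=True,
    check=True,
)
child_pid = int(child.stdout.strip())
thread_pids = {pid for _, pid in shared}

print("parent pid:", os.getpid())
print("pids seen by threads:", thread_pids)
print("child process pid:", child_pid)
```

In standard CPython builds the global interpreter lock prevents threads from executing Python bytecode in parallel, which is one reason the DataLoader's num_workers option uses worker processes for CPU-heavy loading.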

Buffer vs. Parameter

Buffers are tensors that are registered on a module and therefore appear in its state_dict. They do not require gradients, so they are not registered as parameters. This is useful, for example, for tracking the running mean and variance in batch norm layers, which should be saved and loaded through the module's state_dict.
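
A minimal PyTorch sketch of the difference, assuming `torch` is installed. The module `RunningScale` is hypothetical and exists only for illustration; the last loop also shows one way to print the mean and variance of every parameter and buffer.

```python
import torch
from torch import nn


class RunningScale(nn.Module):
    """Hypothetical module with one parameter and one buffer."""

    def __init__(self, num_features):
        super().__init__()
        # Parameter: requires gradients and is updated by the optimizer.
        self.weight = nn.Parameter(torch.ones(num_features))
        # Buffer: saved in state_dict, but receives no gradient.
        self.register_buffer("running_mean", torch.zeros(num_features))

    def forward(self, x):
        if self.training:
            with torch.no_grad():
                # Exponential moving average of the per-feature batch mean.
                self.running_mean.mul_(0.9).add_(0.1 * x.mean(dim=0))
        return (x - self.running_mean) * self.weight


module = RunningScale(4)
module(torch.randn(8, 4))

param_names = [name for name, _ in module.named_parameters()]
buffer_names = [name for name, _ in module.named_buffers()]
state_keys = list(module.state_dict().keys())
print(param_names, buffer_names, state_keys)

# Mean and variance of every parameter and buffer, for sanity checks.
for name, tensor in list(module.named_parameters()) + list(module.named_buffers()):
    print(name, tensor.mean().item(), tensor.var(unbiased=False).item())
```

Both tensors show up in `state_dict()`, but only `weight` is returned by `named_parameters()` and would be passed to an optimizer.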

License: Attribution-NonCommercial-ShareAlike (CC BY-NC-SA)



