*TECHNICAL ARTICLES*

*Quick Links:*

- [GitHub] https://github.com/createmomo (You can also find the WeChat Official Account here)
- [2021 - Scheduled] Graphical Neural Network in NLP
- [2020 - Working On] Main Points of Interesting Papers
- [2020 - Coming Soon] Fantastic Trees (Decision Trees, Random Forest, AdaBoost, Gradient Boosting DT, XGBoost)
- [2020] Improving Your English Communication Skills (Writing Emails, Speaking English and Building an ePortfolio)
- [2019 - Working On] Probabilistic Graphical Models Revision Notes
- [2018] Super Machine Learning Revision Notes
- [2017] CRF Layer on the Top of BiLSTM

*Detailed Links:*

**(Last Updated: 2020.02.23) Probabilistic Graphical Models Revision Notes**

**Representations**

**Bayesian Network (directed graph)**

**Markov Network (undirected graph)**

**Inference**

**Learning**

**(Last Updated: 2019.01.06) Super Machine Learning Revision Notes**

**Activation Functions**

**Gradient Descent**

- Computation Graph
- Backpropagation
- Gradients for L2 Regularization (weight decay)
- Vanishing/Exploding Gradients
- Mini-Batch Gradient Descent
- Stochastic Gradient Descent
- Choosing Mini-Batch Size
- Gradient Descent with Momentum (usually faster than plain SGD)
- Gradient Descent with RMSprop
- Adam (combines Momentum and RMSprop)
- Learning Rate Decay Methods
- Batch Normalization
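
The relationship between the optimizer entries above (Momentum, RMSprop, Adam) can be sketched in a few lines. This is an illustrative single-parameter version of the standard Adam update, not code from the revision notes themselves; the function name and default hyperparameters are my own choices.

```python
import math

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter w (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad       # Momentum: moving average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2  # RMSprop: moving average of squared gradients
    m_hat = m / (1 - beta1 ** t)             # bias correction for the first moment
    v_hat = v / (1 - beta2 ** t)             # bias correction for the second moment
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v
```

Running this update with the gradient of a simple quadratic drives the parameter toward the minimum, which is a quick way to sanity-check the formulas.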

**Parameters**

**Regularization**

**Models**

- Logistic Regression
- Multi-Class Classification (Softmax Regression)
- Transfer Learning
- Multi-task Learning
- Convolutional Neural Network (CNN)
- Sequence Models
- Transformer (Attention Is All You Need)
- Bidirectional Encoder Representations from Transformers (BERT)

**Practical Tips**

**[2017] CRF Layer on the Top of BiLSTM (BiLSTM-CRF)**

- CRF Layer on the Top of BiLSTM - 1 Outline and Introduction
- CRF Layer on the Top of BiLSTM - 2 CRF Layer (Emission and Transition Score)
- CRF Layer on the Top of BiLSTM - 3 CRF Loss Function
- CRF Layer on the Top of BiLSTM - 4 Real Path Score
- CRF Layer on the Top of BiLSTM - 5 The Total Score of All the Paths
- CRF Layer on the Top of BiLSTM - 6 Infer the Labels for a New Sentence
- CRF Layer on the Top of BiLSTM - 7 Chainer Implementation Warm Up
- CRF Layer on the Top of BiLSTM - 8 Demo Code
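
The core quantities covered in parts 2-5 of the series (emission and transition scores, the real path score, and the total score of all paths) can be sketched as follows. This is a minimal illustrative implementation in plain Python; the names and data layout are my own assumptions, not the series' Chainer demo code.

```python
import math
from itertools import product

def path_score(emissions, transitions, labels):
    """Score of one label path: its emission scores plus its transition scores."""
    e = sum(emissions[i][y] for i, y in enumerate(labels))
    t = sum(transitions[a][b] for a, b in zip(labels, labels[1:]))
    return e + t

def log_total_score(emissions, transitions):
    """Log of the summed exponentiated scores over all label paths
    (the CRF partition function), computed with the forward algorithm."""
    n_labels = len(emissions[0])
    alpha = list(emissions[0])  # forward variables for the first position
    for emit in emissions[1:]:
        alpha = [
            math.log(sum(math.exp(alpha[a] + transitions[a][b])
                         for a in range(n_labels))) + emit[b]
            for b in range(n_labels)
        ]
    return math.log(sum(math.exp(a) for a in alpha))
```

For a short sentence the forward algorithm's result can be checked against brute-force enumeration of every path, which is essentially what part 5 of the series demonstrates at the math level.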

*My Life*
