Table of Contents


TECHNICAL ARTICLES

Detailed Links:
(Last Updated: 2020.02.23) Probabilistic Graphical Models Revision Notes

(Last Updated: 2019.01.06) Super Machine Learning Revision Notes

[2017] CRF Layer on the Top of BiLSTM (BiLSTM-CRF)

The dog needs to find the best path to get his favorite bone toy and return home following the way he came


My Life

Main Points of Interesting Papers


This page lists notes on interesting papers across different research topics in Natural Language Processing (NLP). Each note briefly describes the main points of one paper. I hope this helps you grasp their ideas quickly. (Please feel free to correct me if you find any mistakes.)

Read More

Super Machine Learning Revision Notes

[Last Updated: 06/01/2019]

This article aims to summarise:

  • basic concepts in machine learning (e.g. gradient descent, back propagation etc.)
  • different algorithms and various popular models
  • some practical tips and examples learned from my own practice and from online courses such as Deep Learning AI.

If you are a student studying machine learning, I hope this article helps you shorten your revision time and brings you useful inspiration. If you are not a student, I hope it is helpful when you cannot recall some models or algorithms.

Moreover, you can also treat it as a “Quick Check Guide”. Please feel free to use Ctrl+F to search for any keywords that interest you.

Any comments and suggestions are most welcome!

Read More

CRF Layer on the Top of BiLSTM - 8

3.4 Demo

In this section, we will make two fake sentences, which have only 2 words and 1 word respectively. We will also randomly generate their true answers. Finally, we will show how to train the CRF layer using Chainer v2.0. All the code, including the CRF layer, is available on GitHub.
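As a rough illustration of what such toy data could look like, here is a minimal sketch in plain Python. It is not the Chainer demo code: it assumes that a "fake sentence" is represented by a random emission score per word and label (standing in for the BiLSTM output), and that the true answers are random label indices. The helper name `make_fake_sentence` and the label count `NUM_LABELS` are made up for this example.

```python
import random

NUM_LABELS = 3  # hypothetical label-set size, chosen only for this sketch


def make_fake_sentence(num_words, num_labels, rng):
    """Return (emissions, gold_labels) for one fake sentence.

    emissions[t][j] is a random score of label j for word t (assumed to
    stand in for the BiLSTM output); gold_labels[t] is a random "true" label.
    """
    emissions = [
        [rng.uniform(-1.0, 1.0) for _ in range(num_labels)]
        for _ in range(num_words)
    ]
    gold_labels = [rng.randrange(num_labels) for _ in range(num_words)]
    return emissions, gold_labels


rng = random.Random(0)  # fixed seed so the fake data is reproducible
# Two fake sentences: one with 2 words and one with 1 word.
sentences = [
    make_fake_sentence(2, NUM_LABELS, rng),
    make_fake_sentence(1, NUM_LABELS, rng),
]
```

Sentences of different lengths are kept as separate lists here, which avoids any padding logic in such a small example.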

Read More

CRF Layer on the Top of BiLSTM - 6

2.6 Infer the labels for a new sentence

In the previous sections, we learned the structure of the BiLSTM-CRF model and the details of the CRF loss function. You can implement your own BiLSTM-CRF model with various open-source frameworks (Keras, Chainer, TensorFlow, etc.). One of the greatest things is that these frameworks compute the backpropagation for your model automatically, so you do not need to implement it yourself to train the model (i.e. compute the gradients and update the parameters). Moreover, some frameworks have already implemented the CRF layer, so combining a CRF layer with your own model can be as easy as adding about one line of code.

In this section, we will explore how to infer the labels for a sentence at test time, once our model is ready.
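One common way to infer the best label path is Viterbi decoding, sketched below in plain Python. This is a minimal sketch under assumptions that this excerpt does not state: a path score is taken to be the sum of per-word emission scores and label-to-label transition scores, and the special START and END labels are left out for brevity. The function name `viterbi_decode` and the toy numbers are illustrative only.

```python
def viterbi_decode(emissions, transitions):
    """Return (best_path, best_score) for one sentence.

    emissions[t][j]: score of label j for word t (assumed model output).
    transitions[i][j]: score of moving from label i to label j (assumed).
    """
    num_labels = len(transitions)
    # best[j] = highest score of any partial path that ends in label j
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(num_labels):
            # Pick the previous label that gives the best score for label j.
            prev = max(range(num_labels),
                       key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Start from the best final label and follow the backpointers backwards.
    last = max(range(num_labels), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


# Toy example: 2 words, 2 labels, strong penalty for switching labels.
toy_emissions = [[2.0, 0.0], [0.0, 1.0]]
toy_transitions = [[0.0, -5.0], [-5.0, 0.0]]
toy_path, toy_score = viterbi_decode(toy_emissions, toy_transitions)
```

Only the best predecessor of each label is kept at every word, so the cost grows linearly with sentence length instead of exponentially.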

Read More

CRF Layer on the Top of BiLSTM - 5

2.5 The total score of all the paths

In the last section, we learned how to calculate the score of a single label path, $e^{S_i}$. One problem remains to be solved: how to obtain the total score of all the paths ($ P_{total} = P_1 + P_2 + \dots + P_N = e^{S_1} + e^{S_2} + \dots + e^{S_N} $).

The simplest way to obtain the total score is to enumerate all the possible paths and sum their scores. Yes, you can calculate the total score that way. However, it is very inefficient: the number of paths grows exponentially with the sentence length, so the training time would be unbearable.
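To make the contrast concrete, the sketch below computes $P_{total}$ twice: once by brute-force enumeration, and once by accumulating the sum word by word, in the style of the forward algorithm. It is a sketch under assumptions rather than the author's implementation: $S_i$ is taken to be the sum of emission and transition scores along a path, START and END labels are omitted, and all function names and toy numbers are invented for this example.

```python
import itertools
import math


def path_score(emissions, transitions, path):
    """S_i for one label path: emission scores plus transition scores."""
    score = emissions[0][path[0]]
    for t in range(1, len(path)):
        score += transitions[path[t - 1]][path[t]] + emissions[t][path[t]]
    return score


def total_score_brute_force(emissions, transitions):
    """Sum e^{S_i} over every path; there are num_labels ** length of them."""
    num_labels, length = len(transitions), len(emissions)
    return sum(
        math.exp(path_score(emissions, transitions, path))
        for path in itertools.product(range(num_labels), repeat=length)
    )


def total_score_dynamic(emissions, transitions):
    """Same total, built up one word at a time."""
    num_labels = len(transitions)
    # alpha[j] = sum of e^{score} over all partial paths ending in label j
    alpha = [math.exp(e) for e in emissions[0]]
    for t in range(1, len(emissions)):
        alpha = [
            sum(alpha[i] * math.exp(transitions[i][j]) for i in range(num_labels))
            * math.exp(emissions[t][j])
            for j in range(num_labels)
        ]
    return sum(alpha)


# Toy example: 3 words, 2 labels (values chosen arbitrarily).
toy_emissions = [[0.5, 1.0], [0.2, 0.3], [1.5, 0.1]]
toy_transitions = [[0.1, 0.4], [0.3, 0.2]]
brute = total_score_brute_force(toy_emissions, toy_transitions)
fast = total_score_dynamic(toy_emissions, toy_transitions)
```

Both functions return the same value up to floating-point rounding, but the second one needs only about `length * num_labels ** 2` operations. A practical implementation would usually work in log space to avoid overflow; plain exponentials are used here only to keep the sketch short.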

Read More