#### 2.4 Real path score

In section 2.3, we supposed that every possible path has a score $ P_{i} $ and that there are $ N $ possible paths in total. The total score of all the paths is $ P_{total} = P_1 + P_2 + \dots + P_N = e^{S_1} + e^{S_2} + \dots + e^{S_N} $, where $ e $ is the mathematical constant.

Obviously, exactly one of these possible paths is the real one. For example, the real path of the sentence in section 1.2 is **“START B-Person I-Person O B-Organization O END”**. All the others are incorrect, such as “START B-Person B-Organization O I-Person I-Person B-Person”. $ e^{S_i} $ is the score of the $ i^{th} $ path.

During training, the CRF loss function needs only two scores: the score of the real path and the total score of all the possible paths. **The proportion of the real path score among the scores of all the possible paths will increase gradually.**
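To make that proportion concrete, here is a minimal sketch with three made-up path scores $S_i$ (hypothetical numbers, not from a trained model), where the first one plays the role of the real path. Taking the negative log of the proportion to obtain a loss to minimize is a common choice and is an assumption here.

```python
import math

def real_path_proportion(path_scores, real_index):
    """Return e^{S_real} / (e^{S_1} + ... + e^{S_N}) for a list of scores S_i."""
    total = sum(math.exp(s) for s in path_scores)  # P_total
    return math.exp(path_scores[real_index]) / total

# Hypothetical scores S_1..S_3; index 0 is treated as the real path.
scores = [2.0, 1.0, 0.5]
proportion = real_path_proportion(scores, real_index=0)

# Assumed loss form: it shrinks as the real path's proportion grows.
loss = -math.log(proportion)
```

If training raises the real path's score relative to the others, `proportion` moves toward 1 and `loss` moves toward 0.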

The calculation of a real path score, $e^{S_i}$, is very straightforward.

Here we focus on the calculation of $ S_i $.

Take the real path we used before, **“START B-Person I-Person O B-Organization O END”**, as an example:

- We have a sentence of 5 words: $w_1, w_2, w_3, w_4, w_5$
- We add two extra words, $w_0$ and $w_6$, which denote the start and the end of the sentence
- $S_i$ consists of 2 parts: $S_i = EmissionScore + TransitionScore $ (the emission and transition scores are explained in sections 2.1 and 2.2)

**Emission Score:**

$EmissionScore=x_{0,START}+x_{1,B-Person}+x_{2,I-Person}+x_{3,O}+x_{4,B-Organization}+x_{5,O}+x_{6,END}$

$ x_{index,label} $ is the score of labelling the $index^{th}$ word with $ label $.

The scores $ x_{1,B-Person} $, $ x_{2,I-Person} $, $ x_{3,O} $, $ x_{4,B-Organization} $ and $ x_{5,O} $ come from the output of the preceding BiLSTM layer.

As for $ x_{0,START} $ and $ x_{6,END} $, we can simply set them to zero.
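The emission sum can be sketched in a few lines of Python. The helper name `emission_score` and all the numbers below are made up for illustration; in a real model the values would be read from the BiLSTM output.

```python
def emission_score(emissions, path):
    """Sum x_{index,label} along a path; START and END contribute zero."""
    total = 0.0
    for index, label in enumerate(path):
        if label in ("START", "END"):
            continue  # x_{0,START} and x_{6,END} are set to zero
        total += emissions[index][label]
    return total

real_path = ["START", "B-Person", "I-Person", "O", "B-Organization", "O", "END"]

# emissions[index] maps label -> score for word w_index (hypothetical values;
# only a few labels per word are listed to keep the example short).
emissions = {
    1: {"B-Person": 1.5, "O": 0.25},
    2: {"I-Person": 0.5, "O": 0.75},
    3: {"O": 1.0, "B-Person": 0.125},
    4: {"B-Organization": 2.0, "O": 0.5},
    5: {"O": 0.5, "I-Organization": 0.25},
}

score = emission_score(emissions, real_path)
```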

**Transition Score:**

$TransitionScore=$

$t_{START->B-Person} + t_{B-Person->I-Person} + $

$t_{I-Person->O} + t_{O->B-Organization} + t_{B-Organization->O} + t_{O->END}$

- $t_{label1->label2}$ is the transition score from $label1$ to $label2$
- These scores come from the CRF layer. In other words, the transition scores are the parameters of the CRF layer.
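The transition sum walks over consecutive label pairs of the path, as in the sketch below. The helper name `transition_score` and the values in `transitions` are hypothetical; in a real model they would be the learned CRF parameters.

```python
def transition_score(transitions, path):
    """Sum t_{label1->label2} over each consecutive pair of labels in a path."""
    return sum(transitions[(a, b)] for a, b in zip(path, path[1:]))

real_path = ["START", "B-Person", "I-Person", "O", "B-Organization", "O", "END"]

# transitions[(label1, label2)] is the score of moving from label1 to label2
# (made-up values; only the pairs used by the real path are listed).
transitions = {
    ("START", "B-Person"): 0.5,
    ("B-Person", "I-Person"): 1.0,
    ("I-Person", "O"): 0.25,
    ("O", "B-Organization"): 0.5,
    ("B-Organization", "O"): 0.75,
    ("O", "END"): 0.5,
}

score = transition_score(transitions, real_path)
```

A path of 7 labels (including START and END) has 6 transitions, which matches the 6 terms in the formula above.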

To sum up, we can now calculate $S_i$ as well as the path score $e^{S_i}$. The next step is to work out **how to calculate the total score of all the possible paths**.
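Putting the two parts together, a minimal sketch of the whole calculation might look as follows. It assumes the scores are stored as nested lists indexed by label id, which is one possible layout and not necessarily the one used by any particular library; the names `LABELS`, `IDX` and `path_score` are hypothetical, and every parameter value is made up.

```python
import math

LABELS = ["B-Person", "I-Person", "B-Organization", "I-Organization", "O",
          "START", "END"]
IDX = {label: i for i, label in enumerate(LABELS)}

def path_score(emissions, transitions, labels):
    """Return S_i = EmissionScore + TransitionScore for one label sequence.

    emissions[k][j]:   score of labelling word w_{k+1} with LABELS[j]
    transitions[a][b]: score of moving from LABELS[a] to LABELS[b]
    labels:            labels for w_1..w_n, without START and END
    """
    ids = [IDX[label] for label in labels]
    emission = sum(emissions[k][j] for k, j in enumerate(ids))
    # START and END add transitions only; their emission scores are zero.
    full = [IDX["START"]] + ids + [IDX["END"]]
    transition = sum(transitions[a][b] for a, b in zip(full, full[1:]))
    return emission + transition

# Made-up parameters: every emission is 0.5, every transition is 0.25.
emissions = [[0.5] * 5 for _ in range(5)]                       # 5 words x 5 labels
transitions = [[0.25] * len(LABELS) for _ in range(len(LABELS))]

s_real = path_score(emissions, transitions,
                    ["B-Person", "I-Person", "O", "B-Organization", "O"])
p_real = math.exp(s_real)  # the path score e^{S_i}
```

With these uniform toy values, `s_real` is 5 emissions of 0.5 plus 6 transitions of 0.25.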

### Next

#### 2.5 The total score of all the possible paths

How to calculate the total score of all the possible paths of a sentence, with a step-by-step toy example.

This section will be one of the most important parts, and also a somewhat difficult one. But do not worry: the toy example in this section will explain the details as simply as possible.

**(Sorry for my late update, I will try my best to squeeze time for updating the following sections.)**

## References

[1] Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K. and Dyer, C., 2016. Neural architectures for named entity recognition. arXiv preprint arXiv:1603.01360.

https://arxiv.org/abs/1603.01360

## Note

Please note: the **Wechat Public Account** is available now! If you found this article useful and would like to find more information about this series, please subscribe to the public account on Wechat. (2020-04-03)

When you reprint or distribute this article, please include the original link address.