最近在系统地接触学习NER,但是发现这方面的小帖子还比较零散。所以我把学习的记录放出来给大家作参考,其中汇聚了很多其他博主的知识,在本文中也放出了他们的原链。希望能够以这篇文章为载体,帮助其他跟我一样的学习者梳理、串起NER的各个小知识点,最后上手NER的主流模型(Bilstm+CRF)(文中讲的是pytorch,但是懂了pytorch去看keras十分容易相信我哈)
-
电脑:联想小新Air 13 pro
-
CPU:i5 ,4G运行内存
-
显卡:NVIDIA GeForce 940MX,2G显存
-
系统:windows10 64位系统
-
软件:Anaconda 5.3.0 python 3.6.6 Pytorch1.0
一、NER资料
参考:NLP之CRF应用篇(序列标注任务)(CRF++的详细解析、Bi-LSTM+CRF中CRF层的详细解析、Bi-LSTM后加CRF的原因、CRF和Bi-LSTM+CRF优化目标的区别)
CRF++完成的是学习和解码的过程:训练即为学习的过程,预测即为解码的过程。
参考:Bilstm+crf中的crf详解(这份资料对后面代码的理解是有帮助的)
-
BiLSTM中的参数
-
转移概率矩阵A
-
反向传播不需要一定使用forward(),而且不需要定义loss=nn.MSError()等,直接score1 - score2 (neg_log_likelihood函数),就可以反向传播了。
-
使用self.transitions = nn.Parameter(torch.randn(self.tagset_size, self.tagset_size)) 将想要更新的矩阵,放入到module的参数中,然后两个矩阵无论怎么操作,只要满足 y = f(x, w),就能够反向传播
-
从代码看出每个循环里只是去了转移矩阵A的一行,或者就是一个值,进行操作,转移矩阵就能够更新。至于为什么能够更新,作者也不知道,这涉及到pytorch的机制。
-
发射矩阵(emit score)是 BiLSTM算出来的。转移矩阵是单独定义的,要学习的。初始矩阵是 [-1000,-1000,-1000,0,-1000],固定的。因为当加了开始符号后,第一个位置是开始符号的概率是100%。
-
显式的加入了start标记,隐式的使用了end标记(总是最后多一步转移到end)的分数
-
pytorch中bilstm-crf部分code解析(也很良心了,作者画了草图帮助理解)
-
pytorch版的bilstm+crf实现sequence label(比较粗的注解)
AttributeError: 'str' object has no attribute 'decode'
.decode("*")
那部分删除即可from compiler.ast import flatten
的flatten函数
<pre style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section style="margin-right: 8px;margin-left: 8px;max-width: 100%;letter-spacing: 0.544px;white-space: normal;color: rgb(0, 0, 0);font-family: -apple-system-font, system-ui, "Helvetica Neue", "PingFang SC", "Hiragino Sans GB", "Microsoft YaHei UI", "Microsoft YaHei", Arial, sans-serif;widows: 1;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;letter-spacing: 0.5px;font-size: 14px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><strong style="max-width: 100%;font-size: 16px;letter-spacing: 0.544px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;letter-spacing: 0.5px;box-sizing: border-box !important;overflow-wrap: break-word !important;">—</span></strong>完<strong style="max-width: 100%;font-size: 16px;letter-spacing: 0.544px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;letter-spacing: 0.5px;font-size: 14px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><strong style="max-width: 100%;font-size: 16px;letter-spacing: 0.544px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;letter-spacing: 0.5px;box-sizing: border-box !important;overflow-wrap: break-word !important;">—</span></strong></span></strong></span></strong></section><section style="max-width: 100%;letter-spacing: 0.544px;white-space: normal;font-family: -apple-system-font, system-ui, "Helvetica Neue", "PingFang SC", "Hiragino Sans GB", "Microsoft YaHei UI", "Microsoft YaHei", Arial, sans-serif;widows: 1;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section powered-by="xiumi.us" style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section style="margin-top: 15px;margin-bottom: 25px;max-width: 100%;opacity: 0.8;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section style="max-width: 100%;letter-spacing: 0.544px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section powered-by="xiumi.us" style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section style="margin-top: 15px;margin-bottom: 25px;max-width: 100%;opacity: 0.8;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section style="margin-right: 8px;margin-bottom: 15px;margin-left: 8px;padding-right: 0em;padding-left: 0em;max-width: 100%;color: rgb(127, 127, 127);font-size: 12px;font-family: sans-serif;line-height: 25.5938px;letter-spacing: 3px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(0, 0, 0);box-sizing: border-box !important;overflow-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;font-size: 16px;font-family: 微软雅黑;caret-color: red;box-sizing: border-box !important;overflow-wrap: break-word !important;">为您推荐</span></strong></span></section><section style="margin: 5px 8px;padding-right: 0em;padding-left: 0em;max-width: 100%;min-height: 1em;font-family: sans-serif;letter-spacing: 0px;opacity: 0.8;line-height: normal;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(87, 107, 149);font-size: 14px;box-sizing: border-box !important;overflow-wrap: break-word !important;">人工智能领域最具影响力的十大女科学家</span><br style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;" /></section><section style="margin: 5px 8px;padding-right: 0em;padding-left: 0em;max-width: 100%;min-height: 1em;font-family: sans-serif;letter-spacing: 0px;opacity: 0.8;line-height: normal;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(87, 107, 149);font-size: 14px;box-sizing: border-box !important;overflow-wrap: break-word !important;">MIT最新深度学习入门课,安排起来!</span></section><section style="margin: 5px 8px;padding-right: 0em;padding-left: 0em;max-width: 100%;min-height: 1em;font-family: sans-serif;letter-spacing: 0px;opacity: 0.8;line-height: normal;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(87, 107, 149);font-size: 14px;box-sizing: border-box !important;overflow-wrap: break-word !important;">有了这个神器,轻松用 Python 写个 App</span></section><section style="margin: 5px 8px;padding-right: 0em;padding-left: 0em;max-width: 100%;min-height: 1em;font-family: sans-serif;letter-spacing: 0px;opacity: 0.8;line-height: normal;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(87, 107, 149);font-size: 14px;box-sizing: border-box !important;overflow-wrap: break-word !important;">「最全」实至名归,NumPy 官方早有中文教程</span><br style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;" /></section><section style="margin: 5px 8px;padding-right: 0em;padding-left: 0em;max-width: 100%;min-height: 1em;font-family: sans-serif;letter-spacing: 0px;opacity: 0.8;line-height: normal;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;font-size: 14px;box-sizing: border-box !important;overflow-wrap: break-word !important;">10个必会的 PyCharm 技巧</span></section></section></section></section></section></section></section></section></section>
本篇文章来源于: 深度学习这件小事
本文为原创文章,版权归知行编程网所有,欢迎分享本文,转载请保留出处!
内容反馈