1
mingyun 2018-10-31 22:19:04 +08:00
Installation failed on Windows.
Failed building wheel for pyahocorasick
Running setup.py clean for pyahocorasick
Failed to build pyahocorasick
Installing collected packages: pyahocorasick, cnt.rulebase
Running setup.py install for pyahocorasick ... error
Complete output from command d:\python3\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\ADMINI~1\\AppData\\Local\\Temp\\pip-build-wot4whvz\\pyahocorasick\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\ADMINI~1\AppData\Local\Temp\pip-_6l5x87u-record\install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_ext
building 'ahocorasick' extension
error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": http://landinghub.visualstudio.com/visual-cpp-build-tools
|
2
huntzhan OP @mingyun
This error comes from the dependency `pyahocorasick`.
> error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": http://landinghub.visualstudio.com/visual-cpp-build-tools

Installing `Microsoft Visual C++ 14.0` should fix it. My own implementation should run on Windows.
|
3
NCZkevin 2018-11-01 00:18:12 +08:00 1
A Chinese-language tool with no Chinese documentation... And the word segmentation quality seems mediocre.
|
4
NCZkevin 2018-11-01 00:33:11 +08:00
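I looked through the source code, and the functionality doesn't seem complete yet. I use libraries like this a lot, so I'll star it for now and watch for updates.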
|
5
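huntzhan OP @NCZkevin I haven't released word segmentation (it's in progress but not open-sourced yet). You mean the sentence segmentation quality is poor, right?
If you've come across a library with better support for Chinese sentence segmentation, please recommend it. The core problem is that there is no labeled data for Chinese sentence segmentation, so I can only handle it with rules. |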
6
dezhou 2018-11-01 12:07:02 +08:00 via Android
Does sentence segmentation mean splitting on periods?
|
7
huntzhan OP @dezhou It uses a list to decide where a sentence ends; see [sentence_segmentation.const.EM_SENTENCE_ENDINGS](https://cnt-rulebase.readthedocs.io/en/latest/cnt.rulebase.rules.sentence_segmentation.html#cnt.rulebase.rules.sentence_segmentation.const.EM_SENTENCE_ENDINGS)
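
To make the "use a list to decide" idea concrete, here is a minimal sketch of list-based sentence splitting. It is not the code in `cnt.rulebase`: the `SENTENCE_ENDINGS` tuple and the `split_sentences` function are hypothetical names made up for illustration, and the actual contents of `EM_SENTENCE_ENDINGS` may well differ.

```python
# Hypothetical ending characters chosen for illustration only; this is NOT
# the actual content of EM_SENTENCE_ENDINGS in cnt.rulebase.
SENTENCE_ENDINGS = ("。", "！", "？", "!", "?")


def split_sentences(text, endings=SENTENCE_ENDINGS):
    """Split text after each run of sentence-ending characters."""
    ending_set = set(endings)
    sentences = []
    start = 0
    i = 0
    while i < len(text):
        if text[i] in ending_set:
            # Absorb consecutive endings (e.g. "？！") into the same sentence.
            while i + 1 < len(text) and text[i + 1] in ending_set:
                i += 1
            sentence = text[start:i + 1].strip()
            if sentence:
                sentences.append(sentence)
            start = i + 1
        i += 1
    # Keep any trailing text that has no ending mark.
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    for s in split_sentences("今天天气不错。你吃饭了吗？！好的"):
        print(s)
```

A plain membership check like this covers more than periods, but it would still mis-split cases such as endings inside quotation marks, which is where extra rules would probably be needed.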
|
8
huntzhan OP |