Python jieba分词简单实例
python extract_tags.py yttlj.txt -k 20
- import sys
- sys.path.append('../')
- import jieba
- import jieba.analyse
- from optparse import OptionParser
- USAGE = "usage: python extract_tags.py [file name] -k [top k]"
- parser = OptionParser(USAGE)
- parser.add_option("-k", dest="topK")
- opt, args = parser.parse_args()
- if len(args) < 1:
- print(USAGE)
- sys.exit(1)
- file_name = args[0]
- if opt.topK is None:
- topK = 10
- else:
- topK = int(opt.topK)
- content = open(file_name, 'rb').read()
- tags = jieba.analyse.extract_tags(content, topK=topK)
- print("\n".join(tags))
来源: https://www.oschina.net/code/snippet_105637_58640