Reference blog: http://www.cnblogs.com/jasonhaven/p/7355006.html
- import jieba
- import nltk
- nltk.download()
- from nltk.book import *
- text1
- text2
- text1.concordance('monstrous')
- text2.concordance("affection")
- text3.concordance("lived")
- text4
- text3.index("talk")
- text3.index("lived")
- text1.similar("monstrous")
- text2.similar("monstrous")
- text2.common_contexts(["monstrous","very"])
- text4.dispersion_plot(['citizens','democracy','freedom','duties','America'])
- text3.generate()
- text3.generate?
- help(text3.generate)
- dir(text3)
- len(text3)
- set(text3)
- sorted(set(text3))
- len(sorted(set(text3)))
- len(text3)
- len(text3)/len(set(text3))
- text3.count("smote")
- 100*text3.count("a")/len(text3)
- 100*text4.count("a")/len(text4)
- sentence1=['Call','me','Ishmael','.']
- sentence1
- sent1
- sent2
- sent3
- len(sent1)
- sent1.index('me')
- sent2
- sent2[-2:]
- '-'.join(sent2)
- ' '.join(sent2)
- s=' '.join(sent2)
- fdist1=FreqDist(text1)
- fdist1
- vocabulary1=list(fdist1.keys())
- vocabulary1
- vocabulary1[:50]
- type(fdist1)
- type(vocabulary1)
- vocabulary1
- len(vocabulary1)
- fdist1.freq?
- fdist1['whale']
- fdist1.plot?
- fdist1.plot(50,cumulative=True)
- fdist1.plot(20,cumulative=True)
- fdist1['.']
- fdist1.hapaxes()
- V=set(text1)
- long_words=[w for w in V if len(w)>15]
- long_words
- sorted(long_words)
- fdist5=FreqDist(text5)
- sorted([w for w in set(text5) if len(w)>7 and fdist5[w]>7])
- text4.collocations()
- text4.collocations?
- fdist=FreqDist([len(w) for w in text1])
- fdist
- fdist.keys()
- fdist.items()
- fdist.max()
- fdist[3]
- min(fdist, key=fdist.get)
- fdist.freq(3)
- fdist.freq?
- fdist.plot(50,cumulative=True)
- fdist.plot()
- sent=['she','shell','sea','by']
- [w for w in sent if w.startswith('sh')]
- [w for w in sent if len(w)>4]
- %hist
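
The counting steps in the session above (lexical diversity, the percentage of a text taken up by one word, and hapaxes) can be sketched without the NLTK corpora. This is a minimal sketch under assumptions: `tokens` is a toy token list standing in for `text3`, `collections.Counter` stands in for `FreqDist`, and the helper names `lexical_diversity` and `percentage` are illustrative rather than part of the session.

```python
from collections import Counter

# Toy token list standing in for an NLTK text such as text3 (assumption:
# any list of string tokens behaves the same way for these measures).
tokens = "in the beginning god created the heaven and the earth .".split()


def lexical_diversity(words):
    """Average number of times each distinct token is used."""
    return len(words) / len(set(words))


def percentage(word, words):
    """Share of the text taken up by one token, in percent."""
    return 100 * words.count(word) / len(words)


# Counter plays the role of FreqDist here: it maps each token to its count.
freq = Counter(tokens)

# Tokens that occur exactly once, comparable to fdist1.hapaxes().
hapaxes = sorted(w for w, n in freq.items() if n == 1)

print(lexical_diversity(tokens))
print(percentage("the", tokens))
print(freq.most_common(1))
print(hapaxes)
```

With the real corpora loaded, the same two helpers should accept `text3` or `text4` directly, since an NLTK text supports `len`, `set` and `count` as used in the session.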
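Work out which sense a word carries in a given context (word sense disambiguation). Consider the ambiguous words serve and dish.
Detect the subject and object of a verb, and determine what a pronoun or noun phrase refers to.
Determine how noun phrases relate to the verb.
Automatically solve language-understanding tasks such as question answering and machine translation.
The Turing test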
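
To make the first task above concrete, here is a toy sketch of sense disambiguation for serve and dish. It uses a simplified gloss-overlap heuristic in the spirit of the Lesk algorithm, not the method of any particular library. Everything in it is an assumption for illustration: the `SENSES` inventory, its labels and glosses are hand-written and hypothetical, and `guess_sense` is an invented helper name.

```python
# Hypothetical sense inventory: these glosses are illustrative only and are
# not taken from WordNet or any real dictionary.
SENSES = {
    "serve": {
        "food": "help someone with food or drink at a meal",
        "office": "hold an office or work for an organisation",
        "sport": "put the ball into play in tennis",
    },
    "dish": {
        "plate": "a plate or bowl for food",
        "course": "a course of a meal that is cooked and eaten",
        "antenna": "a device that receives satellite signals",
    },
}

# Small hand-picked stopword list so function words do not count as overlap.
STOPWORDS = {"a", "an", "the", "of", "or", "for", "in", "with",
             "that", "is", "and", "he", "she", "to"}


def content_words(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def guess_sense(word, sentence):
    """Pick the sense whose gloss shares the most words with the sentence."""
    context = content_words(sentence)
    scores = {
        label: len(context & content_words(gloss))
        for label, gloss in SENSES[word].items()
    }
    # On a tie, max() keeps the first sense listed in SENSES.
    return max(scores, key=scores.get)


print(guess_sense("serve", "he will serve the ball to start the tennis match"))
print(guess_sense("dish", "the satellite dish on the roof receives signals"))
```

A word-overlap count like this is fragile on real text. NLTK provides a WordNet-based version as `nltk.wsd.lesk`, which needs the WordNet data to be downloaded first.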
Source: http://www.jianshu.com/p/5b0a99710ebc