1. Download
https://github.com/scikit-learn/scikit-learn
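Official site: http://scikit-learn.org/stable/
2. Installation
According to the official documentation, numpy and scipy are required. I simply tried running the following in the source directory: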
sudo python setup.py install
An error occurred; importing sklearn produced the following message:
>>> import sklearn
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "sklearn/__init__.py", line 37, in <module>
    from . import __check_build
  File "sklearn/__check_build/__init__.py", line 46, in <module>
    raise_build_error(e)
  File "sklearn/__check_build/__init__.py", line 41, in raise_build_error
    %s""" % (e, local_dir, ''.join(dir_content).strip(), msg))
ImportError: No module named _check_build
___________________________________________________________________________
Contents of sklearn/__check_build:
__init__.py __init__.pyc _check_build.c
_check_build.pyx setup.py setup.pyc
___________________________________________________________________________
It seems that scikit-learn has not been built correctly.
If you have installed scikit-learn from source, please do not forget
to build the package before using it: run `python setup.py install` or
`make` in the source directory.
If you have used an installer, please check that it is suited for your
Python version, your operating system and your platform.
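One common situation that produces this message, though the post does not confirm it was the cause here, is starting Python inside the source checkout, so that the unbuilt `sklearn/` directory shadows any installed copy. The sketch below shows one way to check where a module would be loaded from without importing it. The helper names `locate_module` and `is_shadowed_by_cwd` are hypothetical, and the code assumes Python 3 rather than the Python 2.7 used in this post.

```python
import importlib.util
import os


def locate_module(name):
    """Return the origin a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def is_shadowed_by_cwd(name):
    """Return True if the module would be loaded from under the current directory."""
    origin = locate_module(name)
    if origin is None:
        return False
    cwd = os.path.realpath(os.getcwd())
    return os.path.realpath(origin).startswith(cwd + os.sep)


if __name__ == "__main__":
    # numpy and scipy are the prerequisites named in the post.
    for name in ("numpy", "scipy", "sklearn"):
        print(name, "->", locate_module(name), "| shadowed by cwd:", is_shadowed_by_cwd(name))
```

If `sklearn` resolves to a path under the current directory, running the same check from another directory (for example the home directory) shows whether an installed copy exists.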
While trying to reinstall numpy and scipy, I discovered that the Mac system already ships with many libraries, as listed below:
CoreGraphics/
OpenSSL/
PyObjC/
Twisted-12.2.0-py2.7.egg-info/
altgraph/
altgraph-0.10.1-py2.7.egg-info/
bdist_mpkg/
bdist_mpkg-0.4.4-py2.7.egg-info/
bonjour/
dateutil/
macholib/
macholib-1.5-py2.7.egg-info/
matplotlib/
modulegraph/
modulegraph-0.10.1-py2.7.egg-info/
mpl_toolkits/
numpy/
py2app/
py2app-0.7.1-py2.7.egg-info/
python_dateutil-1.5-py2.7.egg-info/
pytz/
pytz-2012d-py2.7.egg-info/
scipy/
setuptools/
setuptools-0.6c12dev_r88846-py2.7.egg-info/
twisted/
xattr/
xattr-0.6.4-py2.7.egg-info/
zope/
zope.interface-3.8.0-py2.7.egg-info/
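A listing like the one above can also be produced from Python itself. The post does not say which directory was listed, so the sketch below assumes the interpreter's default pure-Python install location as reported by `sysconfig`; the helper names `site_packages_dir` and `list_installed_entries` are hypothetical.

```python
import os
import sysconfig


def site_packages_dir():
    """Return the directory where this interpreter installs pure-Python packages."""
    return sysconfig.get_paths()["purelib"]


def list_installed_entries(directory):
    """Return the sorted entry names in a directory, or an empty list if it does not exist."""
    if not os.path.isdir(directory):
        return []
    return sorted(os.listdir(directory))


if __name__ == "__main__":
    target = site_packages_dir()
    print("site-packages:", target)
    for entry in list_installed_entries(target):
        print(entry)
```

Seeing `numpy` and `scipy` already present in such a listing is a hint that a second installation may conflict with the bundled copies.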
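Afterwards I tried several other approaches, including pip and easy_install, and each reported errors. I then deleted the old files under site-packages and reinstalled, which succeeded. (The initial failures may have been because I had not restarted the terminal and re-entered Python.)
3. Testing and learning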
➜  ~ python
Python 2.7.5 (default, Sep 12 2013, 21:33:34)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sklearn
>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> digits = datasets.load_digits()
>>> print(digits.data)
[[  0.   0.   5. ...,   0.   0.   0.]
 [  0.   0.   0. ...,  10.   0.   0.]
 [  0.   0.   0. ...,  16.   9.   0.]
 ...,
 [  0.   0.   1. ...,   6.   0.   0.]
 [  0.   0.   2. ...,  12.   0.   0.]
 [  0.   0.  10. ...,  12.   1.   0.]]
>>>
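The printed array is truncated, so it says little about the dataset's size or value range. As a small follow-up to the session above, the sketch below summarizes a 2-D array-like; the `summarize` helper is hypothetical and not part of scikit-learn, and the script falls back to a message if scikit-learn cannot be imported.

```python
def summarize(rows):
    """Return (n_samples, n_features, min_value, max_value) for a 2-D array-like."""
    n_samples = len(rows)
    n_features = len(rows[0]) if n_samples else 0
    values = [float(v) for row in rows for v in row]
    if not values:
        return n_samples, n_features, None, None
    return n_samples, n_features, min(values), max(values)


if __name__ == "__main__":
    try:
        from sklearn import datasets
    except ImportError:
        print("scikit-learn is not importable in this environment")
    else:
        digits = datasets.load_digits()
        # Each row of digits.data is one 8x8 image flattened into 64 features.
        print(summarize(digits.data))
```

The same helper works on `iris.data`, since it only relies on iterating over rows.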
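4. Next steps
I plan to work through the bundled examples and later write up a summary of the common machine learning algorithms; the examples are good learning material.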
http://scikit-learn.org/stable/auto_examples/feature_selection_pipeline.html
Source: http://lib.csdn.net/article/machinelearning/36313