I have recently been learning OCR (image text recognition) in Python, which requires cnocr.

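cnocr is a Chinese and English OCR toolkit for Python 3. It ships with several pretrained recognition models (the smallest is only 4.7M) and can be used right after installation.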

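cnocr mainly targets images of printed text with simple layouts, such as screenshots and scanned documents. Its built-in text detection and line-splitting module cannot locate text in complex layouts. To recognize text in scene images, it has to be combined with a separate scene-text detection engine, for example cnstd, which is also based on MXNet.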

However, when I tried to install cnocr, pip reported the pile of errors below.

D:\Python>pip install cnocr
Collecting cnocr
  Using cached cnocr-1.2.2-py3-none-any.whl (50 kB)
Collecting numpy<1.20.0,>=1.14.0
  Downloading numpy-1.19.5-cp39-cp39-win_amd64.whl (13.3 MB)
     |████████████████████████████████| 13.3 MB 3.2 MB/s
Collecting mxnet<1.7.0,>=1.5.0
  Using cached mxnet-1.6.0-py2.py3-none-win_amd64.whl (26.9 MB)
Collecting gluoncv<0.7.0,>=0.3.0
  Using cached gluoncv-0.6.0-py2.py3-none-any.whl (693 kB)
Requirement already satisfied: pillow>=5.3.0 in c:\python\python39\lib\site-packages (from cnocr) (8.1.2)
Requirement already satisfied: matplotlib in c:\python\python39\lib\site-packages (from gluoncv<0.7.0,>=0.3.0->cnocr) (3.3.4)
Collecting tqdm
  Using cached tqdm-4.61.1-py2.py3-none-any.whl (75 kB)
Requirement already satisfied: scipy in c:\python\python39\lib\site-packages (from gluoncv<0.7.0,>=0.3.0->cnocr) (1.6.1)
Collecting portalocker
  Using cached portalocker-2.3.0-py2.py3-none-any.whl (15 kB)
Requirement already satisfied: requests in c:\python\python39\lib\site-packages (from gluoncv<0.7.0,>=0.3.0->cnocr) (2.20.0)
Collecting graphviz<0.9.0,>=0.8.1
  Using cached graphviz-0.8.4-py2.py3-none-any.whl (16 kB)
Collecting requests
  Using cached requests-2.18.4-py2.py3-none-any.whl (88 kB)
Collecting numpy<1.20.0,>=1.14.0
  Using cached numpy-1.16.6.zip (5.1 MB)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in c:\python\python39\lib\site-packages (from requests->gluoncv<0.7.0,>=0.3.0->cnocr) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in c:\python\python39\lib\site-packages (from requests->gluoncv<0.7.0,>=0.3.0->cnocr) (2018.8.24)
Requirement already satisfied: urllib3<1.23,>=1.21.1 in c:\python\python39\lib\site-packages (from requests->gluoncv<0.7.0,>=0.3.0->cnocr) (1.22)
Collecting idna<2.7,>=2.5
  Using cached idna-2.6-py2.py3-none-any.whl (56 kB)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\python\python39\lib\site-packages (from matplotlib->gluoncv<0.7.0,>=0.3.0->cnocr) (1.3.1)
Requirement already satisfied: python-dateutil>=2.1 in c:\python\python39\lib\site-packages (from matplotlib->gluoncv<0.7.0,>=0.3.0->cnocr) (2.8.1)
Requirement already satisfied: cycler>=0.10 in c:\python\python39\lib\site-packages (from matplotlib->gluoncv<0.7.0,>=0.3.0->cnocr) (0.10.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.3 in c:\python\python39\lib\site-packages (from matplotlib->gluoncv<0.7.0,>=0.3.0->cnocr) (2.4.7)
Requirement already satisfied: six in c:\python\python39\lib\site-packages (from cycler>=0.10->matplotlib->gluoncv<0.7.0,>=0.3.0->cnocr) (1.15.0)
Requirement already satisfied: pywin32!=226 in c:\python\python39\lib\site-packages (from portalocker->gluoncv<0.7.0,>=0.3.0->cnocr) (301)
Using legacy 'setup.py install' for numpy, since package 'wheel' is not installed.
Installing collected packages: numpy, idna, tqdm, requests, portalocker, graphviz, mxnet, gluoncv, cnocr
  Attempting uninstall: numpy
    Found existing installation: numpy 1.21.0+mkl
    Uninstalling numpy-1.21.0+mkl:
      Successfully uninstalled numpy-1.21.0+mkl
    Running setup.py install for numpy ... error
    ERROR: Command errored out with exit status 1:
     command: 'c:\python\python39\python.exe' -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'G:\\TEMP\\pip-install-ygzt544u\\numpy_5130c6a65b9b44ddb1c573d3c629228c\\setup.py'"'"'; __file__=
'"'"'G:\\TEMP\\pip-install-ygzt544u\\numpy_5130c6a65b9b44ddb1c573d3c629228c\\setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools
 import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'G:\TEMP\pip-record-j2039wk4\install-record.txt' --si
ngle-version-externally-managed --compile --install-headers 'c:\python\python39\Include\numpy'
         cwd: G:\TEMP\pip-install-ygzt544u\numpy_5130c6a65b9b44ddb1c573d3c629228c\
    Complete output (271 lines):
    Running from numpy source directory.
    
    Note: if you need reliable uninstall behavior, then install
    with pip instead of using `setup.py install`:
    
      - `pip install .`       (from a git repo or downloaded source
                               release)
      - `pip install numpy`   (last NumPy release on PyPi)
    
    
    G:\TEMP\pip-install-ygzt544u\numpy_5130c6a65b9b44ddb1c573d3c629228c\numpy\distutils\misc_util.py:476: SyntaxWarning: "is" with a literal. Did you mean "=="?
      return is_string(s) and ('*' in s or '?' is s)
    blas_opt_info:
    blas_mkl_info:
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries mkl_rt not found in ['c:\\python\\python39\\lib', 'C:\\', 'c:\\python\\python39\\libs']
      NOT AVAILABLE
    
    blis_info:
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries blis not found in ['c:\\python\\python39\\lib', 'C:\\', 'c:\\python\\python39\\libs']
      NOT AVAILABLE
    
    openblas_info:
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries openblas not found in ['c:\\python\\python39\\lib', 'C:\\', 'c:\\python\\python39\\libs']
    get_default_fcompiler: matching types: '['gnu', 'intelv', 'absoft', 'compaqv', 'intelev', 'gnu95', 'g95', 'intelvem', 'intelem', 'flang']'
    customize GnuFCompiler
    Could not locate executable g77
    Could not locate executable f77
    customize IntelVisualFCompiler
    Could not locate executable ifort
    Could not locate executable ifl
    customize AbsoftFCompiler
    Could not locate executable f90
    customize CompaqVisualFCompiler
    Could not locate executable DF
    customize IntelItaniumVisualFCompiler
    Could not locate executable efl
    customize Gnu95FCompiler
    Could not locate executable gfortran
    Could not locate executable f95
    customize G95FCompiler
    Could not locate executable g95
    customize IntelEM64VisualFCompiler
    customize IntelEM64TFCompiler
    Could not locate executable efort
    Could not locate executable efc
    customize PGroupFlangCompiler
    Could not locate executable flang
    don't know how to compile Fortran code on platform 'nt'
      NOT AVAILABLE
    
    atlas_3_10_blas_threads_info:
    Setting PTATLAS=ATLAS
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries tatlas not found in ['c:\\python\\python39\\lib', 'C:\\', 'c:\\python\\python39\\libs']
      NOT AVAILABLE
    
    atlas_3_10_blas_info:
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries satlas not found in ['c:\\python\\python39\\lib', 'C:\\', 'c:\\python\\python39\\libs']
      NOT AVAILABLE
    
    atlas_blas_threads_info:
    Setting PTATLAS=ATLAS
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries ptf77blas,ptcblas,atlas not found in ['c:\\python\\python39\\lib', 'C:\\', 'c:\\python\\python39\\libs']
      NOT AVAILABLE
    
    atlas_blas_info:
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries f77blas,cblas,atlas not found in ['c:\\python\\python39\\lib', 'C:\\', 'c:\\python\\python39\\libs']
      NOT AVAILABLE
    
    accelerate_info:
      NOT AVAILABLE
    
    G:\TEMP\pip-install-ygzt544u\numpy_5130c6a65b9b44ddb1c573d3c629228c\numpy\distutils\system_info.py:639: UserWarning:
        Atlas (http://math-atlas.sourceforge.net/) libraries not found.
        Directories to search for the libraries can be specified in the
        numpy/distutils/site.cfg file (section [atlas]) or by setting
        the ATLAS environment variable.
      self.calc_info()
    blas_info:
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries blas not found in ['c:\\python\\python39\\lib', 'C:\\', 'c:\\python\\python39\\libs']
      NOT AVAILABLE
    
    G:\TEMP\pip-install-ygzt544u\numpy_5130c6a65b9b44ddb1c573d3c629228c\numpy\distutils\system_info.py:639: UserWarning:
        Blas (http://www.netlib.org/blas/) libraries not found.
        Directories to search for the libraries can be specified in the
        numpy/distutils/site.cfg file (section [blas]) or by setting
        the BLAS environment variable.
      self.calc_info()
    blas_src_info:
      NOT AVAILABLE
    
    G:\TEMP\pip-install-ygzt544u\numpy_5130c6a65b9b44ddb1c573d3c629228c\numpy\distutils\system_info.py:639: UserWarning:
        Blas (http://www.netlib.org/blas/) sources not found.
        Directories to search for the sources can be specified in the
        numpy/distutils/site.cfg file (section [blas_src]) or by setting
        the BLAS_SRC environment variable.
      self.calc_info()
      NOT AVAILABLE
    
    non-existing path in 'numpy\\distutils': 'site.cfg'
    lapack_opt_info:
    lapack_mkl_info:
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries mkl_rt not found in ['c:\\python\\python39\\lib', 'C:\\', 'c:\\python\\python39\\libs']
      NOT AVAILABLE
    
    openblas_lapack_info:
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries openblas not found in ['c:\\python\\python39\\lib', 'C:\\', 'c:\\python\\python39\\libs']
      NOT AVAILABLE
    
    openblas_clapack_info:
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries openblas,lapack not found in ['c:\\python\\python39\\lib', 'C:\\', 'c:\\python\\python39\\libs']
      NOT AVAILABLE
    
    atlas_3_10_threads_info:
    Setting PTATLAS=ATLAS
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries lapack_atlas not found in c:\python\python39\lib
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries tatlas,tatlas not found in c:\python\python39\lib
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries lapack_atlas not found in C:\
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries tatlas,tatlas not found in C:\
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries lapack_atlas not found in c:\python\python39\libs
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries tatlas,tatlas not found in c:\python\python39\libs
    <class 'numpy.distutils.system_info.atlas_3_10_threads_info'>
      NOT AVAILABLE
    
    atlas_3_10_info:
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries lapack_atlas not found in c:\python\python39\lib
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries satlas,satlas not found in c:\python\python39\lib
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries lapack_atlas not found in C:\
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries satlas,satlas not found in C:\
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries lapack_atlas not found in c:\python\python39\libs
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries satlas,satlas not found in c:\python\python39\libs
    <class 'numpy.distutils.system_info.atlas_3_10_info'>
      NOT AVAILABLE
    
    atlas_threads_info:
    Setting PTATLAS=ATLAS
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries lapack_atlas not found in c:\python\python39\lib
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries ptf77blas,ptcblas,atlas not found in c:\python\python39\lib
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries lapack_atlas not found in C:\
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries ptf77blas,ptcblas,atlas not found in C:\
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries lapack_atlas not found in c:\python\python39\libs
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries ptf77blas,ptcblas,atlas not found in c:\python\python39\libs
    <class 'numpy.distutils.system_info.atlas_threads_info'>
      NOT AVAILABLE
    
    atlas_info:
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries lapack_atlas not found in c:\python\python39\lib
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries f77blas,cblas,atlas not found in c:\python\python39\lib
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries lapack_atlas not found in C:\
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries f77blas,cblas,atlas not found in C:\
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries lapack_atlas not found in c:\python\python39\libs
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries f77blas,cblas,atlas not found in c:\python\python39\libs
    <class 'numpy.distutils.system_info.atlas_info'>
      NOT AVAILABLE
    
    lapack_info:
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    customize MSVCCompiler
      libraries lapack not found in ['c:\\python\\python39\\lib', 'C:\\', 'c:\\python\\python39\\libs']
      NOT AVAILABLE
    
    G:\TEMP\pip-install-ygzt544u\numpy_5130c6a65b9b44ddb1c573d3c629228c\numpy\distutils\system_info.py:639: UserWarning:
        Lapack (http://www.netlib.org/lapack/) libraries not found.
        Directories to search for the libraries can be specified in the
        numpy/distutils/site.cfg file (section [lapack]) or by setting
        the LAPACK environment variable.
      self.calc_info()
    lapack_src_info:
      NOT AVAILABLE
    
    G:\TEMP\pip-install-ygzt544u\numpy_5130c6a65b9b44ddb1c573d3c629228c\numpy\distutils\system_info.py:639: UserWarning:
        Lapack (http://www.netlib.org/lapack/) sources not found.
        Directories to search for the sources can be specified in the
        numpy/distutils/site.cfg file (section [lapack_src]) or by setting
        the LAPACK_SRC environment variable.
      self.calc_info()
      NOT AVAILABLE
    
    c:\python\python39\lib\distutils\dist.py:274: UserWarning: Unknown distribution option: 'define_macros'
      warnings.warn(msg)
    running install
    running build
    running config_cc
    unifing config_cc, config, build_clib, build_ext, build commands --compiler options
    running config_fc
    unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options
    running build_src
    build_src
    building py_modules sources
    creating build
    creating build\src.win-amd64-3.9
    creating build\src.win-amd64-3.9\numpy
    creating build\src.win-amd64-3.9\numpy\distutils
    building library "npymath" sources
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    error: Microsoft Visual C++ 14.0 is required. Get it with "Build Tools for Visual Studio": https://visualstudio.microsoft.com/downloads/
    ----------------------------------------
  Rolling back uninstall of numpy
  Moving to c:\python\python39\lib\site-packages\numpy-1.21.0+mkl.dist-info\
   from C:\Python\Python39\Lib\site-packages\~umpy-1.21.0+mkl.dist-info
  Moving to c:\python\python39\lib\site-packages\numpy\
   from C:\Python\Python39\Lib\site-packages\~umpy
  Moving to c:\python\python39\scripts\f2py.exe
   from G:\TEMP\pip-uninstall-xcfx15l4\f2py.exe
ERROR: Command errored out with exit status 1: 'c:\python\python39\python.exe' -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'G:\\TEMP\\pip-install-ygzt544u\\numpy_5130c6a65b9b44ddb1c573d3c
629228c\\setup.py'"'"'; __file__='"'"'G:\\TEMP\\pip-install-ygzt544u\\numpy_5130c6a65b9b44ddb1c573d3c629228c\\setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else
 io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'G:\TEMP\pip-record-
j2039wk4\install-record.txt' --single-version-externally-managed --compile --install-headers 'c:\python\python39\Include\numpy' Check the logs for full command output.

One part of the output stood out:

building library "npymath" sources
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    error: Microsoft Visual C++ 14.0 is required. Get it with "Build Tools for Visual Studio": https://visualstudio.microsoft.com/downloads/
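
With 271 lines of build output, the decisive line is easy to miss. The following is a minimal sketch, not part of the original troubleshooting, of how such a log could be scanned for the lowercase "error:" lines that the build step emits. The function name find_build_errors and the sample text are illustrative assumptions; the sample is copied from the excerpt above.

```python
import re

# Matches build-failure lines such as
# "    error: Microsoft Visual C++ 14.0 is required. ..."
# The match is case-sensitive, so pip's own uppercase "ERROR:" wrapper
# lines are skipped on purpose.
ERROR_LINE = re.compile(r"^\s*error:\s*(?P<message>.+)$")


def find_build_errors(log_text):
    """Return (line_number, message) pairs for every 'error:' line in a log."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        match = ERROR_LINE.match(line)
        if match:
            hits.append((number, match.group("message").strip()))
    return hits


# Hypothetical sample: the three lines quoted above.
sample = '''building library "npymath" sources
    No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
    error: Microsoft Visual C++ 14.0 is required. Get it with "Build Tools for Visual Studio": https://visualstudio.microsoft.com/downloads/
'''

for number, message in find_build_errors(sample):
    print(f"line {number}: {message}")
```

To use it on a real run, the pip output would first have to be saved to a file and read in as text.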

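This happens because the system lacks a usable C++ build toolchain. The log also shows why a compiler was needed in the first place: pip fell back to the source archive numpy-1.16.6.zip and used the legacy 'setup.py install' path, so NumPy had to be compiled locally. I did have Visual Studio installed, but only with the C# and C++ components and without the Build Tools, so to solve the problem I first installed the C++ Build Tools.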

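Alternatively, the tools can be installed as follows:

1. Open the official Microsoft Visual Studio download page: https://visualstudio.microsoft.com/zh-hans/downloads/
2. Scroll down to the “Tools for Visual Studio 2019” tab and download “Build Tools for Visual Studio 2019”.
3. Run the downloaded installer, tick “C++ build tools” on the “Workloads” tab, then tick “MSVC v141” and “MSVC v140” under “Installation details” on the right, and start the installation.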
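
Before retrying, it can help to confirm that the C++ toolset is actually present. The sketch below is an assumption-laden check rather than the method used in this post: it assumes vswhere.exe sits in its default location under the Visual Studio Installer folder, and it queries for the current x86/x64 C++ toolset component, which is not the same component as the “MSVC v140” or “MSVC v141” toolsets selected above. The function name find_msvc_install is made up for illustration.

```python
import os
import subprocess
from pathlib import Path


def find_msvc_install():
    """Return the install path of a Visual Studio product that includes the
    C++ x86/x64 toolset, or None if it cannot be determined."""
    program_files = os.environ.get("ProgramFiles(x86)")
    if not program_files:
        return None  # Not Windows, or the variable is unset.

    # Assumed default location of vswhere.exe.
    vswhere = (Path(program_files) / "Microsoft Visual Studio"
               / "Installer" / "vswhere.exe")
    if not vswhere.is_file():
        return None

    result = subprocess.run(
        [str(vswhere), "-latest", "-products", "*",
         "-requires", "Microsoft.VisualStudio.Component.VC.Tools.x86.x64",
         "-property", "installationPath"],
        capture_output=True, text=True, check=False,
    )
    path = result.stdout.strip()
    return path or None


location = find_msvc_install()
print(location if location else "C++ build tools not found (or not on Windows)")
```

A None result only means this particular query found nothing; it does not prove that no compiler is installed.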

After the installation finished, I ran pip install cnocr again and the problem was solved.
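
One way to double-check the result is to ask Python which versions ended up installed. This is a small sketch using the standard library's importlib.metadata (Python 3.8+). The helper name installed_versions is illustrative, and the package list is simply the set of names that appear in the pip log above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Package names taken from the pip output earlier in this post.
for name, version in installed_versions(
        ["cnocr", "mxnet", "gluoncv", "numpy"]).items():
    print(f"{name}: {version or 'not installed'}")
```

If the install succeeded, every entry should show a version number rather than "not installed".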

Last modified: July 1, 2021, 10:52 AM