Text Recognition with Tencent OCR

Using Tencent Cloud's intelligent text recognition (OCR) to recognize text in images

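A while ago, one of my projects needed to convert images into text, and after some consideration I chose Tencent Cloud's text recognition OCR. The integration had a few pitfalls, so I am writing them down here so that I do not forget them. I hope these notes also help anyone who needs to do the same thing.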

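OCR results

For examples of what the service can do, see the product page on the Tencent Cloud website: Text Recognition OCR (文字识别OCR)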

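Preparation: configuring Tencent Cloud OCR

Register an account

I registered and logged in directly with my QQ account. You can also follow the official Tencent Cloud tutorial: Register for Tencent Cloud (注册腾讯云)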

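Create a key

Create a new key. A dialog may warn you that this is unsafe and suggest creating a sub-user instead. That is up to you: create a sub-user if you want one, otherwise click Continue to carry on. Then, in the left-hand menu, choose 云API密钥 -> API密钥管理 (Cloud API keys -> API key management) and click 新建密钥 (New key). Record the resulting APPID, SecretId, and SecretKey, and substitute them wherever the project needs them.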

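Create a Bucket with 万象优图

In the Tencent Cloud menu, choose 万象优图 (link), click Bucket管理 (Bucket management), and then click 绑定Bucket (Bind Bucket) on the page
- A prompt says that the service needs to create a role
- Click 授权 (Authorize)
- Then choose 同意授权 (Agree to authorize)
- You are then asked to verify your identity; scanning the code with WeChat works, or you can pick one of the alternative verification methods
Click 绑定Bucket on the page again
- For the creation method, choose 新建 (Create new)
- Leave the project unchanged and use 默认项目 (Default project)
- Pick any name that satisfies the naming rules; there are no other restrictions. Remember this name, because the project needs it later
- The remaining options can be left as they are
Remember the bucket name once it has been created, and substitute it wherever the project needs it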
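
The four values collected during setup (APPID, SecretId, SecretKey, and the bucket name) have to reach the code somehow. One way to keep them out of the source is sketched below. The class name OcrConfig and the TENCENT_OCR_* key names are hypothetical and are not part of the original project, which simply substitutes the values into constants.

```java
import java.util.Map;

/** Hypothetical holder for the four values collected during setup. */
class OcrConfig {
    final String appId;
    final String secretId;
    final String secretKey;
    final String bucket;

    private OcrConfig(String appId, String secretId, String secretKey, String bucket) {
        this.appId = appId;
        this.secretId = secretId;
        this.secretKey = secretKey;
        this.bucket = bucket;
    }

    /** Reads the settings from any key/value source, for example System.getenv(). */
    static OcrConfig from(Map<String, String> source) {
        // The key names below are assumptions made for this sketch.
        return new OcrConfig(
                require(source, "TENCENT_OCR_APPID"),
                require(source, "TENCENT_OCR_SECRET_ID"),
                require(source, "TENCENT_OCR_SECRET_KEY"),
                require(source, "TENCENT_OCR_BUCKET"));
    }

    /** Fails fast with a clear message when a setting is missing or blank. */
    private static String require(Map<String, String> source, String key) {
        String value = source.get(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing required setting: " + key);
        }
        return value.trim();
    }
}
```

Calling OcrConfig.from(System.getenv()) at startup would then fail immediately if one of the values was forgotten, rather than producing a signature error later.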

Operation guide

If anything above is unclear, the operation guide on the Tencent Cloud website covers the same steps.

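Implementation

Generating the signature

For details, see the explanation on the Tencent Cloud website: Authentication signature (鉴权签名). I am using Java, so I used the official Java signing sample directly.
Copy the code from the official site into a Java file, then call that file's appSign method whenever a signature is needed.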
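
For readers who want a rough idea of what such a signing helper does, here is a minimal sketch of the general pattern: an HMAC-SHA1 over a key=value string, followed by Base64 over the digest plus the string itself. The field names and their order in plainText are my recollection of the documented format and should be treated as an assumption. The class and method names are hypothetical. In a real project, use the official sample as described above.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative only; not the official Sign class. */
class SignSketch {

    /** Builds the string to be signed. Field layout is an assumption, check the official docs. */
    static String plainText(String appId, String bucket, String secretId,
                            long nowSeconds, long validSeconds, int random) {
        return String.format("a=%s&b=%s&k=%s&t=%d&e=%d&r=%d",
                appId, bucket, secretId, nowSeconds, nowSeconds + validSeconds, random);
    }

    /** Returns Base64(HMAC-SHA1(secretKey, plainText) + plainText). */
    static String sign(String plainText, String secretKey) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secretKey.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
            byte[] plain = plainText.getBytes(StandardCharsets.UTF_8);
            byte[] digest = mac.doFinal(plain);

            // Concatenate the 20-byte digest and the plain text, then Base64-encode the result.
            byte[] combined = new byte[digest.length + plain.length];
            System.arraycopy(digest, 0, combined, 0, digest.length);
            System.arraycopy(plain, 0, combined, digest.length, plain.length);
            return Base64.getEncoder().encodeToString(combined);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not compute signature", e);
        }
    }
}
```

Passing the current time and the random number in as parameters keeps the sketch deterministic, which makes it easy to compare its output with the official sample for the same inputs.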

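Configuring the network request and calling the OCR recognition API

This was the step I found most tedious, because the request body has to be pieced together by hand. Note that the code as it stands recognizes local files.

The official documentation is here: OCR general printed text recognition (OCR-通用印刷体识别). If you run into errors, you can also look up the corresponding status code there to find the cause.

Method that configures the network connection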

/**
 * Configures the connection and writes the multipart/form-data request body.
 * @param path      path of the folder that contains the image
 * @param imageName file name of the image
 * @throws Exception if the connection cannot be opened or the image cannot be read
 */
private static HttpURLConnection handlerConnection(String path, String imageName) throws Exception {
    URL url = new URL(URL);
    // Open the HttpURLConnection
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    connection.setRequestMethod("POST");    // use the POST method
    connection.setDoOutput(true);           // allow writing a request body
    connection.setDoInput(true);            // allow reading the response
    connection.setUseCaches(false);         // disable caching

    // Set the request headers
    connection.setRequestProperty("Connection", "Keep-Alive");
    connection.setRequestProperty("Charset", "UTF-8");
    connection.setRequestProperty("Content-Type", "multipart/form-data; boundary=" + BOUNDARY);
    connection.setRequestProperty("authorization", sign());
    connection.setRequestProperty("host", HOST);
    System.out.println("Request headers set");

    // Get the connection's output stream
    DataOutputStream outputStream = new DataOutputStream(connection.getOutputStream());

    // Assemble the key/value form fields, then write them in one go
    StringBuilder formFields = new StringBuilder();
    formFields.append(LINE_END);
    formFields.append("--" + BOUNDARY + LINE_END
            + "Content-Disposition: form-data; name=\"appid\"" + LINE_END + LINE_END + APPID + LINE_END);
    formFields.append("--" + BOUNDARY + LINE_END
            + "Content-Disposition: form-data; name=\"bucket\"" + LINE_END + LINE_END + BUCKET + LINE_END);
    outputStream.write(formFields.toString().getBytes("UTF-8"));

    // Write the header of the image part
    String imagePartHeader = "--" + BOUNDARY + LINE_END +
            "Content-Disposition: form-data; name=\"image\"; filename=\"" + imageName + "\"" + LINE_END +
            "Content-Type: image/jpeg" + LINE_END + LINE_END;
    outputStream.write(imagePartHeader.getBytes("UTF-8"));

    // Stream the image file into the request body; the file stream is closed automatically
    String imagePath = path + File.separator + imageName;
    try (InputStream fileInputStream = getImgIns(imagePath)) {
        byte[] buffer = new byte[1024 * 2];
        int length;
        while ((length = fileInputStream.read(buffer)) != -1) {
            outputStream.write(buffer, 0, length);
        }
    }

    // Write the closing delimiter: a line break, then "--" + boundary + "--"
    byte[] endData = (LINE_END + "--" + BOUNDARY + "--" + LINE_END).getBytes("UTF-8");
    outputStream.write(endData);
    outputStream.flush();
    return connection;
}
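
Because the body is assembled by hand, it can help to build it in memory first and inspect it before sending anything. The sketch below mirrors the layout that handlerConnection writes (two text fields, one image part, a closing delimiter). The class MultipartSketch and its method names are hypothetical, and "\r\n" as the line ending is what multipart/form-data expects rather than something taken from the original project.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Builds a multipart/form-data body in memory so that it can be inspected. */
class MultipartSketch {
    static final String LINE_END = "\r\n";

    static byte[] build(String boundary, String appId, String bucket,
                        String fileName, byte[] image) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeText(out, textPart(boundary, "appid", appId));
        writeText(out, textPart(boundary, "bucket", bucket));

        // Header of the file part, followed by the raw image bytes
        writeText(out, "--" + boundary + LINE_END
                + "Content-Disposition: form-data; name=\"image\"; filename=\"" + fileName + "\"" + LINE_END
                + "Content-Type: image/jpeg" + LINE_END + LINE_END);
        out.write(image, 0, image.length);

        // Closing delimiter: line break, "--", the boundary, "--"
        writeText(out, LINE_END + "--" + boundary + "--" + LINE_END);
        return out.toByteArray();
    }

    /** One plain key/value part. */
    static String textPart(String boundary, String name, String value) {
        return "--" + boundary + LINE_END
                + "Content-Disposition: form-data; name=\"" + name + "\"" + LINE_END
                + LINE_END
                + value + LINE_END;
    }

    private static void writeText(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }
}
```

Printing the result of build(...) as a string for a tiny dummy image makes a missing "--" or line break obvious at a glance, and those are the mistakes that are hardest to diagnose from the server's error response.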

Helper methods used above

/**
 * Opens a file input stream for the given path.
 * @throws FileNotFoundException if the file does not exist
 */
private static InputStream getImgIns(String imagePath) throws FileNotFoundException {
    File file = new File(imagePath);
    return new FileInputStream(file);
}

/**
 * Reads the whole input stream and returns its content as a string.
 * @param is the stream to read; it is closed before returning
 * @return the stream content decoded as UTF-8
 * @throws IOException if reading fails
 */
public static String readInputStream(InputStream is) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    int length;
    byte[] buffer = new byte[1024];
    while ((length = is.read(buffer)) != -1) {
        baos.write(buffer, 0, length);
    }
    is.close();
    baos.close();
    return baos.toString("UTF-8");
}

/**
 * Generates the signature by calling the appSign method in the Sign file.
 * @return the generated signature, or null if signing failed
 */
public static String sign() {
    long expired = 10000;
    try {
        return Sign.appSign(APPID, SECRET_ID, SECRET_KEY, BUCKET, expired);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return null;
}

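Running the image recognition

/**
 * Uploads an image for recognition on a background thread.
 * @param path      path of the folder that contains the image
 * @param imageName file name of the image
 */
public void uploadImage(final String path, final String imageName) {
    new Thread() {
        @Override
        public void run() {
            try {
                // Configure the HttpURLConnection and write the request body
                HttpURLConnection connection = handlerConnection(path, imageName);
                // Connect; this call is ignored if the connection is already open
                connection.connect();
                // Read the response
                int responseCode = connection.getResponseCode();
                if (responseCode == HttpURLConnection.HTTP_OK) {
                    // Convert the response stream to a string
                    String result = readInputStream(connection.getInputStream());
                    System.out.println("Request succeeded: " + result);
                } else {
                    // getErrorStream() may return null when the server sends no error body
                    InputStream errorStream = connection.getErrorStream();
                    String errorMsg = errorStream == null ? "HTTP " + responseCode : readInputStream(errorStream);
                    System.out.println("Request failed: " + errorMsg);
                }
            } catch (Exception e) {
                e.printStackTrace();
                System.out.println("Network request failed with an exception: " + e.getMessage());
            }
        }
    }.start();
}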
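
uploadImage only prints the result on its background thread. If the caller needs the recognized text back, one option is to return a CompletableFuture instead of starting a raw Thread. This is a sketch of that alternative, not what the original project does. The class AsyncOcrSketch and the method recognizeAsync are hypothetical, and the Supplier stands in for "configure the connection, send the request, read the response".

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Runs a blocking request off the calling thread and hands the result back. */
class AsyncOcrSketch {

    /**
     * @param request a blocking call that performs the upload and returns the response text
     * @return a future that completes with the response, or with an error description
     */
    static CompletableFuture<String> recognizeAsync(Supplier<String> request) {
        return CompletableFuture.supplyAsync(request)
                .exceptionally(error -> {
                    // Failures usually arrive wrapped; report the underlying cause when present.
                    Throwable cause = error.getCause() != null ? error.getCause() : error;
                    return "request failed: " + cause.getMessage();
                });
    }
}
```

A caller could then write recognizeAsync(() -> doBlockingUpload()).thenAccept(System.out::println), where doBlockingUpload is a placeholder for the existing connection and read logic wrapped so that it returns the response string.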

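Closing notes

Source code: TencentOCRDemo (stars are welcome, thanks!)
Download with Git: git clone https://github.com/beichensky/TencentOCRDemo.git

That is roughly how to call Tencent Cloud's text recognition OCR from Java. If anything here is poorly explained or unclear, corrections are welcome.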