OCR tutorials

mangojang · May 3, 2022

Google Cloud Platform Node OCR VISION translate

Goal: extract text from an image and translate it into English.

OCR

Full name: Optical Character Recognition. It detects and extracts text from images.

The workflow below is based on the official Google Cloud OCR tutorial.

https://cloud.google.com/functions/docs/tutorials/ocr

Workflow overview

Create a Google Cloud service account and set up the environment
Cloud Storage API - bucket
Cloud Vision API - text extraction
Cloud Translation API - translate the extracted text
Save the text to a bucket

GCP (Google Cloud Platform) setup

Authenticate as a service account | Google Cloud

Create a Google Cloud account and a new project (free to use for the first 90 days).
Enable the APIs
- Cloud Functions
- Pub/Sub
- Cloud Storage
- Cloud Translation API
- Cloud Vision
Issue a service account key → click "Create key" and choose JSON → the JSON file is downloaded.

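⇒ Use the path of this JSON file to set the environment variable.
Set up user credentials
1. Set an environment variable
  - Applies globally
  ⭐ Set it in the form GOOGLE_APPLICATION_CREDENTIALS="KEY_PATH"
2. Set it in code (the approach used in this post)
  - Applies only to the app that uses it.
  - Pass projectId and keyFilename when constructing the client object.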
Install the project libraries

Based on Node.js

npm install @google-cloud/pubsub @google-cloud/storage @google-cloud/translate @google-cloud/vision
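The two credential options described above can be sketched as a small helper that decides what to pass to a client constructor. `resolveClientOptions` is a hypothetical name, not part of any Google library, and the project ID and key path are placeholders. The sketch assumes that a client created with an empty options object picks up the key file from `GOOGLE_APPLICATION_CREDENTIALS` on its own.

```javascript
// Hypothetical helper (not part of any Google library): choose the options
// object to pass to a client such as `new Storage(options)`.
function resolveClientOptions(explicit = {}, env = process.env) {
  const { projectId, keyFilename } = explicit;

  // Option 2: per-app credentials passed in code.
  if (projectId && keyFilename) {
    return { projectId, keyFilename };
  }

  // Option 1: global credentials from the environment variable.
  // An empty options object leaves the key lookup to the client library.
  if (env.GOOGLE_APPLICATION_CREDENTIALS) {
    return {};
  }

  throw new Error(
    'No credentials: pass projectId/keyFilename or set GOOGLE_APPLICATION_CREDENTIALS'
  );
}

// Placeholder values, for illustration only.
console.log(resolveClientOptions({ projectId: 'my-project', keyFilename: './key.json' }));
```

Keeping the decision in one place means every client in the app (Storage, Vision, Translation) is constructed with the same options.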

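Cloud Storage API - bucket

Cloud Storage is the storage service provided by Google Cloud; each storage container is called a bucket.
The official tutorial extracts text from images stored in a bucket.
To follow the tutorial you first need to create a bucket. (You can also skip the bucket and extract text directly from a local file.)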
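The bucket-or-local choice comes down to which image reference gets handed to the text detection call. A minimal sketch, assuming a hypothetical helper named `toImageSource` and placeholder bucket and file names:

```javascript
// Hypothetical helper: build the image reference used for text detection.
// With a bucket name it returns a Cloud Storage URI; without one it returns
// the filename unchanged so it is treated as a local path.
function toImageSource(bucketName, filename) {
  if (!filename) {
    throw new Error('filename is required');
  }
  return bucketName ? `gs://${bucketName}/${filename}` : filename;
}

console.log(toImageSource('my-bucket', 'images/sign.png')); // gs://my-bucket/images/sign.png
console.log(toImageSource(null, './images/sign.png'));      // ./images/sign.png
```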

Basic setup - declaring storage

// Authenticate in code
const projectId = 'your-project-id';
const keyFilename = 'path to the JSON key file'; // Relative to the app path, e.g. './' resolves to the app root

const {Storage} = require('@google-cloud/storage');
const storage = new Storage({ projectId, keyFilename });

Create a bucket

const createBucket = async function () {
  const bucketName = 'bucket name';
  await storage.createBucket(bucketName);

  console.log(`Bucket ${bucketName} created.`);
};

List buckets

Get all buckets

const getBucket = async function () {
  const [buckets] = await storage.getBuckets();

  console.log('Fetched buckets', buckets);
};

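List the files in a bucket

const getFiles = async function () {
  const bucketName = 'bucket name';
  // getFiles() returns a promise, so it must be awaited before destructuring
  const [files] = await storage.bucket(bucketName).getFiles();

  console.log('Fetched bucket files', files);
};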
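Once the file list is available, the entries that are worth sending to OCR can be picked out by name. The sketch below is one way to do that, assuming each entry has a `name` property, as the File objects returned by getFiles() do. `listImageNames`, the extension list, and the sample data are made up for illustration.

```javascript
// Hypothetical helper: keep only the names of files that look like images.
// `files` is assumed to be an array of objects with a `name` property.
const IMAGE_EXTENSIONS = ['.png', '.jpg', '.jpeg', '.gif', '.bmp', '.webp'];

function listImageNames(files) {
  return files
    .map(file => file.name)
    .filter(name => IMAGE_EXTENSIONS.some(ext => name.toLowerCase().endsWith(ext)));
}

// Stand-in data so the sketch runs without calling Cloud Storage.
const sampleFiles = [{ name: 'menu.PNG' }, { name: 'notes.txt' }, { name: 'signs/exit.jpg' }];
console.log(listImageNames(sampleFiles)); // [ 'menu.PNG', 'signs/exit.jpg' ]
```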

Cloud Vision API - text extraction

Adapted from the sample code in the official documentation.

const Vision = require('@google-cloud/vision');
const {Translate} = require('@google-cloud/translate').v2;

// Authenticate in code
const projectId = 'your-project-id';
const keyFilename = 'path to the JSON key file'; // Relative to the app path, e.g. './' resolves to the app root
const vision = new Vision.ImageAnnotatorClient({ projectId, keyFilename });
// detectText also detects the source language, so it needs a Translation client
const translate = new Translate({ projectId, keyFilename });

const detectText = async (bucketName, filename) => {
  console.log(`Looking for text in image ${filename}`);

  let textDetections;
  if (bucketName) {
    // 1-1. Read the file from the bucket
    [textDetections] = await vision.textDetection(
      `gs://${bucketName}/${filename}`
    );
  } else {
    // 1-2. Read the file from a local path
    [textDetections] = await vision.textDetection(filename);
  }
  // The first annotation holds the full text of the image
  const [annotation] = textDetections.textAnnotations;
  const text = annotation ? annotation.description : '';
  console.log('Extracted text from image:', text);

  let [translateDetection] = await translate.detect(text);
  if (Array.isArray(translateDetection)) {
    [translateDetection] = translateDetection;
  }
  console.log(
    `Detected language "${translateDetection.language}" for ${filename}`
  );

  const messageData = {
    text: text,
    filename: filename,
    lang: 'en', // Target language for the translation step
  };

  return messageData;
};

const processImage = async event => {
  const {bucket, name} = event;

  // bucket is optional: when it is missing, detectText reads `name` as a local path
  if (!name) {
    throw new Error(
      'Filename not provided. Make sure you have a "name" property in your request'
    );
  }

  const result = await detectText(bucket, name);
  console.log(`File ${name} processed.`);
  return result;
};

///// Caller //////
const callFunc = async () => {
  const bucketName = 'bucket name',
    fileName = 'local file path, or file path inside the bucket';

  const data = {
    bucket: bucketName, // When there is no bucket, the image is read from the local path
    name: fileName,
  };

  const result = await processImage(data);
  console.log('process', result); // result.text holds the extracted text
};
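The code above only reads the first entry of `textAnnotations`. As a rough sketch of why: in a text detection response the first annotation carries the whole detected text, and the following entries carry the individual words. `splitAnnotations` is a hypothetical helper, and the sample array is hand-written rather than real API output.

```javascript
// Assumed response shape: textAnnotations[0] holds the whole detected text,
// and the remaining entries hold the individual words.
function splitAnnotations(textAnnotations) {
  const [full, ...words] = textAnnotations || [];
  return {
    fullText: full ? full.description : '',
    words: words.map(word => word.description),
  };
}

// Hand-written sample, not real API output.
const sample = [
  { description: 'HELLO WORLD\n' },
  { description: 'HELLO' },
  { description: 'WORLD' },
];
const result = splitAnnotations(sample);
console.log(result.words); // [ 'HELLO', 'WORLD' ]
```

An image with no text yields an empty array, which is why the empty-string fallback matters.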

Cloud Translation API - text translation

Adapted from the sample code in the official documentation.

// Authenticate in code
const projectId = 'your-project-id';
const keyFilename = 'path to the JSON key file'; // Relative to the app path, e.g. './' resolves to the app root

const {Translate} = require('@google-cloud/translate').v2;
const translate = new Translate({ projectId, keyFilename });

exports.translateText = async event => {
  const pubsubData = event.data;
  const jsonStr = Buffer.from(pubsubData, 'base64').toString();
  const {text, filename, lang} = JSON.parse(jsonStr);

  if (!text) {
    // throw new Error(
    //   'Text not provided. Make sure you have a "text" property in your request'
    // );
    return {
      text: text,
      filename: filename,
      lang: lang,
    };
  }
  if (!filename) {
    throw new Error(
      'Filename not provided. Make sure you have a "filename" property in your request'
    );
  }
  if (!lang) {
    throw new Error(
      'Language not provided. Make sure you have a "lang" property in your request'
    );
  }

  console.log(`Translating text into ${lang}`);
  const [translation] = await translate.translate(text, lang);

  console.log('Translated text:', translation);

  const messageData = {
    text: translation,
    filename: filename,
    lang: lang,
  };
  console.log(messageData);
  // await publishResult(process.env.RESULT_TOPIC, messageData);
  // await publishResult('ocr-result', messageData);
  console.log(`Text translated to ${lang}`);
  return messageData;
};

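///// Caller //////
const callFunc = async () => {
  const data = {
    data: Buffer.from(
      JSON.stringify({
        text: 'text to translate',
        filename: 'local file path, or file path inside the bucket',
        lang: 'target language code', // e.g. English = 'en'
      })
    ).toString('base64'),
  };

  // translateText is attached to exports above, so call it through exports
  const translated = await exports.translateText(data);
  console.log('translated', translated); // translated.text holds the translated text
};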
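Both the function and its caller rely on the same envelope: a JSON object serialized to a string and then base64-encoded, the way Pub/Sub message data arrives in the original tutorial. A minimal round-trip sketch, with `encodePayload` and `decodePayload` as hypothetical helper names:

```javascript
// Hypothetical helpers mirroring the base64 JSON envelope used above.
function encodePayload(payload) {
  return Buffer.from(JSON.stringify(payload)).toString('base64');
}

function decodePayload(data) {
  return JSON.parse(Buffer.from(data, 'base64').toString());
}

// Non-ASCII text survives the round trip because both sides default to UTF-8.
const event = {
  data: encodePayload({ text: '안녕하세요', filename: 'hello.png', lang: 'en' }),
};
console.log(decodePayload(event.data).text); // 안녕하세요
```

If the functions are only ever called locally, this envelope could be dropped and the object passed directly. It is kept here to stay close to the tutorial.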

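Saving the translated text to a bucket

Adapted from the sample code in the official documentation.
If all you want is the translated text, this step is probably unnecessary; it is included because it is part of the tutorial.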

// Authenticate in code
const projectId = 'your-project-id';
const keyFilename = 'path to the JSON key file'; // Relative to the app path, e.g. './' resolves to the app root

const {Storage} = require('@google-cloud/storage');
const storage = new Storage({ projectId, keyFilename });

const renameImageForSave = (filename, lang) => {
  return `${filename}_to_${lang}.txt`;
};

exports.saveResult = async event => {
  const pubsubData = event.data;
  const jsonStr = Buffer.from(pubsubData, 'base64').toString();
  const {text, filename, lang} = JSON.parse(jsonStr);

  if (!text) {
    throw new Error(
      'Text not provided. Make sure you have a "text" property in your request'
    );
  }
  if (!filename) {
    throw new Error(
      'Filename not provided. Make sure you have a "filename" property in your request'
    );
  }
  if (!lang) {
    throw new Error(
      'Language not provided. Make sure you have a "lang" property in your request'
    );
  }

  console.log(`Received request to save file ${filename}`);

  const bucketName = event.result_bucket;
  // const bucketName = process.env.RESULT_BUCKET;
  const newFilename = renameImageForSave(filename, lang);
  const file = storage.bucket(bucketName).file(newFilename);

  console.log(`Saving result to ${newFilename} in bucket ${bucketName}`);

  await file.save(text);
  console.log('File saved.');
};

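///// Caller //////
const callFunc = async () => {
  const data = {
    data: Buffer.from(
      JSON.stringify({
        text: 'translated text to save',
        filename: 'local file path, or file path inside the bucket',
        lang: 'target language code', // e.g. English = 'en'
      })
    ).toString('base64'),
    // saveResult reads result_bucket from the event itself, not from the base64 payload
    result_bucket: 'name of the bucket for results', // The bucket must already exist
  };

  await exports.saveResult(data);
  console.log('saved'); // The file is written as `${filename}_to_${lang}.txt`
};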
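Putting the three steps together, the overall flow (detect → translate → save) could be wired roughly as below. This is a sketch under assumptions, not the tutorial's implementation: `runPipeline` is a hypothetical name, and the step functions are injected so the flow can be tried with stubs before the real Vision, Translation, and Storage calls are plugged in.

```javascript
// Hypothetical wiring of the three steps. Each step is passed in as a function
// so the flow can run without any Google Cloud client.
async function runPipeline(steps, { bucket, name, lang = 'en' }) {
  const { detect, translate, save } = steps;

  const text = await detect(bucket, name);
  if (!text) {
    // Nothing was detected, so there is nothing to translate or save.
    return { filename: name, lang, text: '', saved: false };
  }

  const translated = await translate(text, lang);
  // Same naming scheme as renameImageForSave in the section above.
  await save(`${name}_to_${lang}.txt`, translated);
  return { filename: name, lang, text: translated, saved: true };
}

// Stub steps standing in for the Vision, Translation and Storage calls.
const saved = {};
const stubSteps = {
  detect: async () => '안녕하세요',
  translate: async (text, lang) => `[${lang}] ${text}`,
  save: async (filename, text) => {
    saved[filename] = text;
  },
};

runPipeline(stubSteps, { name: 'hello.png' }).then(result => console.log(result, saved));
```

In a real run, `detect` would wrap the text detection call, `translate` the translation call, and `save` the `file.save` call from the sections above.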