OCR

OCR extracts text from images, scans, and document screenshots.

After recognition, you can copy the result, export it as Markdown, PDF, or Word, or package multiple formats together for download.

What OCR Can Do

Feature | Description
Image text recognition | Extracts text from images, screenshots, and scans.
Document layout recognition | Better for tables, formulas, stamps, and mixed text-image layouts.
Multiple services | Supports Baidu PaddleOCR, Microsoft Azure Vision, and Google Vision.
Copy results | Copy recognized text after processing.
Export files | Export Markdown, PDF, and Word.
Batch packaging | After recognizing multiple files, download results as a package.

Configure OCR Services First

Open:

System Settings -> Other Settings -> OCR

Fill in credentials for the services you want to use:

Service | What To Enter | Best For
Baidu PaddleOCR | PaddleOCR Token | Recommended first choice. Good for documents, images, tables, and mixed layouts.
Microsoft Azure Vision | Azure Vision Endpoint and Azure Vision API Key | Useful if you already use Microsoft cloud services.
Google Vision | Google Vision API Key. Service account JSON is only used for quota query. | Useful if you use Google Cloud services.

Save after filling in credentials.

You can configure only one service for initial testing. You do not need all three.
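The credential table above can be read as a simple rule: a service is usable once every field it requires has a value. The sketch below illustrates that rule only. The setting keys (`paddleocr_token`, `azure_endpoint`, and so on) are hypothetical names chosen for this example, not ImgBed's actual configuration keys.

```python
# Hypothetical setting keys for illustration; ImgBed's real keys may differ.
REQUIRED_FIELDS = {
    "Baidu PaddleOCR": ["paddleocr_token"],
    "Microsoft Azure Vision": ["azure_endpoint", "azure_api_key"],
    "Google Vision": ["google_api_key"],
}


def usable_services(settings: dict) -> list:
    """Return the services whose required credentials are all non-empty."""
    ready = []
    for service, fields in REQUIRED_FIELDS.items():
        # A blank or whitespace-only value counts as missing.
        if all(str(settings.get(name, "")).strip() for name in fields):
            ready.append(service)
    return ready


if __name__ == "__main__":
    # Only the PaddleOCR token is filled in, which is enough for a first test.
    print(usable_services({"paddleocr_token": "example-token"}))
```

Azure Vision needs both the endpoint and the key, so filling in only one of them leaves the service unavailable.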

Google Vision Setup

Google setup has two parts:

Goal | Requirement
Use OCR | Enable Cloud Vision API, then create an API Key.
Query usage | Create a service account, grant Monitoring Viewer, then download the service account JSON.

Use Google for OCR

  1. Open Google Cloud Console.
  2. Go to APIs & Services.
  3. Open Library, search for Cloud Vision API, and enable it.
  4. Return to Credentials.
  5. Create an API Key.
  6. Open the API Key and copy it.
  7. Paste it into Google Vision API Key in ImgBed.
  8. Save.

You can then choose Google Vision in the OCR dialog.
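To confirm that a new API key works before relying on it in ImgBed, you can call the Cloud Vision REST endpoint directly. The following is a minimal sketch, assuming the public `images:annotate` endpoint with a `TEXT_DETECTION` feature. It is not necessarily the request ImgBed sends, and the function names are made up for this example.

```python
import base64
import json
import urllib.parse
import urllib.request

VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_request(image_bytes: bytes) -> dict:
    """Build the JSON body for a single-image text detection request."""
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }


def extract_text(response: dict) -> str:
    """Pull the recognized text out of an images:annotate response."""
    first = (response.get("responses") or [{}])[0]
    if "error" in first:
        raise RuntimeError(first["error"].get("message", "Vision API error"))
    # An image without any text yields no fullTextAnnotation at all.
    return first.get("fullTextAnnotation", {}).get("text", "")


def recognize(image_path: str, api_key: str) -> str:
    """Send one image to Cloud Vision and return the recognized text."""
    with open(image_path, "rb") as f:
        body = json.dumps(build_request(f.read())).encode("utf-8")
    url = VISION_URL + "?" + urllib.parse.urlencode({"key": api_key})
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return extract_text(json.load(resp))
```

Calling `recognize("scan.png", "YOUR_API_KEY")` with a real image and key should return the recognized text. An HTTP error at this point usually means the key is wrong or Cloud Vision API is not enabled for the project.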

Query Google Usage

Quota query is not required for recognition.

It only shows roughly how many Google Vision calls were used in the last 30 days.

  1. In Google Cloud Console, open IAM & Admin.
  2. Open Service Accounts.
  3. Create a service account, such as vision-monitor.
  4. Grant it the Monitoring Viewer role.
  5. Open the service account details and create a key.
  6. Choose JSON.
  7. Download the generated JSON file.
  8. Return to ImgBed and import it under service account JSON (optional).
  9. After import succeeds, click quota query.

After import, ImgBed shows the project that owns the service account. When you query usage, ImgBed reads Google monitoring data and shows the recent call count.
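If the import fails, it can help to check the downloaded file yourself. The sketch below assumes the standard layout of a Google service account key file (`type`, `project_id`, `client_email`, `private_key`) and assumes the project ImgBed displays corresponds to `project_id`. The function name is hypothetical.

```python
import json

# Fields present in a standard Google service account key file.
EXPECTED_KEYS = ("type", "project_id", "client_email", "private_key")


def inspect_service_account(raw_json: str) -> str:
    """Validate a service account key file and return its project ID."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("key file must contain a JSON object")
    missing = [key for key in EXPECTED_KEYS if not data.get(key)]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    if data["type"] != "service_account":
        # An API key file or OAuth client file will not work here.
        raise ValueError("not a service account key file")
    return data["project_id"]
```

A file that passes this check can still fail the quota query if the service account lacks the Monitoring Viewer role, because the role is granted in Google Cloud and is not stored in the key file.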

In short:

Item | Purpose
Google Vision API Key | Performs OCR recognition.
Service account JSON | Queries how many Google Vision calls were used.
Monitoring Viewer role | Allows the service account to read usage data.

Get a Baidu PaddleOCR Token

Baidu PaddleOCR requires an access token.

Open the API call window on the Baidu PaddleOCR page, click to get a token, then copy it.

Return to ImgBed, paste it into PaddleOCR Token, and save.

Start Recognition

In File Management, select an image or document screenshot and click OCR.

In the dialog, choose the recognition service and model.

Common PaddleOCR model choices:

Model | Best For
PP-StructureV3 | Recommended default. Good for documents, tables, formulas, stamps, and mixed layouts.
PP-OCRv5 | Simple images, ordinary text, and lightweight recognition.
PaddleOCR-VL | Multilingual, complex images, and chart-like content.
PaddleOCR-VL-1.5 | More complex document pages and layout recovery.

If you are unsure, start with PP-StructureV3.

Advanced Options

Option | Description
Orientation correction | Use when the image is rotated or skewed.
Document flattening | Use for photographed documents with curvature or tilt.
Layout detection | Use when you want to preserve headings, paragraphs, tables, and image structure.
Chart recognition | Use when the image contains charts or complex structures.
Beautify Markdown | Makes exported Markdown easier to read.

For regular screenshots, keep options minimal. For document scans, enable the document-related options, such as orientation correction, document flattening, and layout detection.

View Results

After recognition finishes, the dialog shows the result.

You can copy it directly or choose export formats.

For document pages, an exported PDF can preserve the page appearance while keeping the text searchable. This is useful for archiving scans and finding content later.

Choosing an Export Format

Format | Best For
Markdown (.md) | Notes, documentation systems, and later editing.
PDF (.pdf) | Preserving page appearance and scanned document results.
Word (.docx) | Continued layout editing, text modification, and handoff to others.
Export all | Saves multiple formats and the original image, suitable for important archives.

If you only need text, export Markdown.

If you need page appearance, use PDF or Word.

Word Output

Exported Word documents can be opened and edited with office software.

Depending on the document, the Word output may include recognized images, headings, and paragraphs.

Recognition quality depends on original image clarity, model choice, and document complexity.

Best File Types for OCR

File Type | Recommendation
Clear screenshots | Recognize directly.
Scans | Prefer PP-StructureV3.
Photographed documents | Enable orientation correction and document flattening.
Tables, formulas, stamps | Prefer structured models.
Simple short text images | PP-OCRv5 is usually enough.

Clearer images with straighter text usually produce better results.
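The recommendations above amount to a small lookup from file type to model and options. The sketch below expresses them that way. The file type keys and the `Recommendation` class are hypothetical names for this example, and where the table names no model, the sketch assumes the documented default, PP-StructureV3.

```python
from dataclasses import dataclass

DEFAULT_MODEL = "PP-StructureV3"  # documented default when unsure


@dataclass(frozen=True)
class Recommendation:
    model: str = DEFAULT_MODEL
    orientation_correction: bool = False
    document_flattening: bool = False


# Hypothetical file type keys mapped to the table's recommendations.
RECOMMENDATIONS = {
    "clear_screenshot": Recommendation(),  # recognize directly, default model assumed
    "scan": Recommendation(model="PP-StructureV3"),
    "photographed_document": Recommendation(
        orientation_correction=True, document_flattening=True
    ),
    "table_formula_stamp": Recommendation(model="PP-StructureV3"),
    "simple_text": Recommendation(model="PP-OCRv5"),
}


def recommend(file_type: str) -> Recommendation:
    """Return the suggested model and options, falling back to the default."""
    return RECOMMENDATIONS.get(file_type, Recommendation())


if __name__ == "__main__":
    print(recommend("photographed_document"))
```

Unknown file types fall back to PP-StructureV3 with no extra options, which matches the advice to start with that model when unsure.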

Common Cases

Case | Meaning
Recognition fails | Check that the service token or key has been saved.
Recognition is slow | Complex documents and large images take longer.
Table is incomplete | Try a structured model.
Text has mistakes | Blur, glare, and skew increase recognition errors. Try a clearer image.
Word output contains many images | Structured models may preserve some recognized images. This is normal.

Google Quota Query Fails

Check:

  1. Service account JSON has been imported.
  2. The service account has the Monitoring Viewer role.
  3. Cloud Vision API is enabled for the project.

If you only need OCR and not usage query, you can ignore the service account JSON and only fill in Google Vision API Key.

Quick Flow

Open System Settings
-> Open Other Settings
-> Fill OCR service credentials
-> Save
-> Return to File Management
-> Select a file and click OCR
-> Choose a model
-> Wait for recognition
-> Copy results or export Markdown / PDF / Word

Released as user documentation for CloudFlare ImgBed.