History: OCR Indexing

Source of version: 4

! {{page}}
Since ((Tiki20)), ((file galleries)) can use "Optical Character Recognition" (OCR) to extract the text from image-based files uploaded to Tiki, and feed that text into the ((search index)).


Tiki relies on Tesseract (https://github.com/tesseract-ocr/tesseract), so you need to install it by following https://tesseract-ocr.github.io/tessdoc/Installation.html

If you are using WikiSuite, Tesseract is installed by default: https://wikisuite.org/Differences-between-Virtualmin-and-WikiSuite

((Server Check)) helps you confirm that Tesseract is working and available to Tiki.
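
For a quick check outside Tiki, the following is a minimal sketch of how one might confirm from the command line that Tesseract is installed and see which languages it has. It is not what ((Server Check)) does internally. The function name tesseract_report is hypothetical, and the parsing assumes the usual output of the tesseract --version and tesseract --list-langs commands, which may differ between Tesseract releases.

```python
import shutil
import subprocess


def tesseract_report(binary="tesseract"):
    """Return path, version line and languages, or None if the binary is not on $PATH.

    Hypothetical helper for illustration; not part of Tiki.
    """
    path = shutil.which(binary)
    if path is None:
        return None

    version = subprocess.run([path, "--version"], capture_output=True, text=True)
    langs = subprocess.run([path, "--list-langs"], capture_output=True, text=True)

    # Some older builds write this information to stderr instead of stdout.
    version_text = (version.stdout or version.stderr).strip()
    version_line = version_text.splitlines()[0] if version_text else ""

    # Assumption: the first line is a header ("List of available languages ...").
    lang_lines = (langs.stdout or langs.stderr).splitlines()
    languages = [line.strip() for line in lang_lines[1:] if line.strip()]

    return {"path": path, "version": version_line, "languages": languages}


if __name__ == "__main__":
    report = tesseract_report()
    if report is None:
        print("tesseract was not found on $PATH")
    else:
        print(report["path"])
        print(report["version"])
        print(", ".join(report["languages"]))
```

If the script reports that tesseract was not found, the web server user may have a different $PATH than your shell, so the result in ((Server Check)) remains the reference.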

!! Required Preferences
To enable OCR indexing in Tiki, make sure to activate the following preference:
* ocr_enable: Enables Tiki to extract and index text from supported file types.

!! Optional Preferences
You can further customize OCR behavior with these optional settings:
* ocr_every_file: If enabled, Tiki will attempt OCR on all supported files, regardless of other criteria.
* ocr_file_level: Allows users to override the default OCR language settings on a per-file basis.

!! Additional Customization
Tiki also offers several advanced customization options:
* Display OCR status per file.
* Set custom paths for the tesseract and pdfimages binaries, for cases where they are not found via the system $PATH.
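
The binary lookup in the last item can be pictured with a minimal sketch: use an explicitly configured path when one is given, otherwise search the system $PATH. This only illustrates the idea and is not Tiki's code. The function resolve_binary and its custom_path parameter are hypothetical names.

```python
import os
import shutil


def resolve_binary(name, custom_path=None):
    """Return the executable to use for `name`, or None if none is usable.

    Hypothetical helper: an explicit custom path wins over the $PATH search.
    """
    if custom_path:
        # A configured path must point at an existing, executable file.
        if os.path.isfile(custom_path) and os.access(custom_path, os.X_OK):
            return custom_path
        return None
    # No custom path configured: search the directories listed in $PATH.
    return shutil.which(name)


if __name__ == "__main__":
    for binary in ("tesseract", "pdfimages"):
        print(binary, "->", resolve_binary(binary))
```

In this sketch a configured path that is wrong returns None instead of quietly falling back to $PATH, so a misconfiguration is visible at once. Whether Tiki behaves the same way is not stated here.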

Alias names for this page:
(alias(OCR)) | (alias(OCRIndexing)) | (alias(Optical Character Recognition))

History

Information	Version
Thu 29 May, 2025 20:29 UTC Sammy Ndabo	16
Thu 29 May, 2025 20:27 UTC Sammy Ndabo	15
Thu 29 May, 2025 20:19 UTC Sammy Ndabo	14
Thu 29 May, 2025 20:17 UTC Sammy Ndabo	13
Thu 29 May, 2025 20:10 UTC Sammy Ndabo	12
Thu 29 May, 2025 19:58 UTC Sammy Ndabo	11
Thu 29 May, 2025 19:56 UTC Sammy Ndabo	10
Thu 29 May, 2025 19:55 UTC Sammy Ndabo	9
Thu 29 May, 2025 19:54 UTC Sammy Ndabo image Plugin modified by editor.	8
Thu 29 May, 2025 19:49 UTC Sammy Ndabo	7
Thu 29 May, 2025 19:47 UTC Sammy Ndabo	6
Thu 29 May, 2025 18:14 UTC Sammy Ndabo	5
Thu 29 May, 2025 18:00 UTC Sammy Ndabo Improve the OCR indexing documentation with new details about the OCR feature	4
Wed 28 Feb, 2024 14:29 UTC Marc Laporte	3
Sun 12 May, 2019 17:08 UTC Xavi (as xavidp - admin)	2
Sun 12 May, 2019 17:07 UTC Xavi (as xavidp - admin) minimum doc better than nothing?	1