Parser Service
Extracts text from 7 file formats.
Supported Formats
| Format | Library |
|---|---|
| PDF | PyMuPDF (fitz) |
| Word (.docx) | python-docx |
| PowerPoint (.pptx) | python-pptx |
| Excel (.xlsx) | openpyxl |
| Images (.png/.jpg) | Tesseract OCR |
| Markdown (.md) | Raw text |
| Text (.txt) | Python built-in |
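
The format-to-library mapping above could be wired up as a dispatch on file extension. The following is a minimal sketch, not the service's actual code: the names `EXTRACTORS` and `extract_text` are hypothetical, the `.pdf` extension and the `pytesseract` and Pillow bindings for Tesseract are assumptions, and the third-party libraries are imported lazily so that only the one needed for a given file has to be installed.

```python
from pathlib import Path


def _read_plain(path):
    # Markdown and plain text are returned as raw text via Python built-ins.
    return Path(path).read_text(encoding="utf-8")


def _read_pdf(path):
    import fitz  # PyMuPDF

    with fitz.open(str(path)) as doc:
        return "\n".join(page.get_text() for page in doc)


def _read_docx(path):
    from docx import Document  # python-docx

    return "\n".join(p.text for p in Document(str(path)).paragraphs)


def _read_pptx(path):
    from pptx import Presentation  # python-pptx

    lines = []
    for slide in Presentation(str(path)).slides:
        for shape in slide.shapes:
            if shape.has_text_frame:
                lines.append(shape.text_frame.text)
    return "\n".join(lines)


def _read_xlsx(path):
    from openpyxl import load_workbook

    wb = load_workbook(str(path), read_only=True, data_only=True)
    rows = []
    for ws in wb.worksheets:
        for row in ws.iter_rows(values_only=True):
            # Empty cells come back as None; render them as empty strings.
            rows.append("\t".join("" if v is None else str(v) for v in row))
    wb.close()
    return "\n".join(rows)


def _read_image(path):
    # Assumption: Tesseract is reached through pytesseract plus Pillow.
    import pytesseract
    from PIL import Image

    with Image.open(str(path)) as img:
        return pytesseract.image_to_string(img)


# Hypothetical dispatch table: file extension -> extractor function.
EXTRACTORS = {
    ".pdf": _read_pdf,
    ".docx": _read_docx,
    ".pptx": _read_pptx,
    ".xlsx": _read_xlsx,
    ".png": _read_image,
    ".jpg": _read_image,
    ".md": _read_plain,
    ".txt": _read_plain,
}


def extract_text(path):
    """Return the text of `path`, choosing an extractor by file extension."""
    suffix = Path(path).suffix.lower()
    try:
        extractor = EXTRACTORS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file format: {suffix or path}") from None
    return extractor(path)
```

Calling `extract_text("report.docx")` would route to python-docx, while an unlisted extension raises `ValueError` rather than guessing at a parser. Keeping the mapping in one dictionary means supporting a new format only requires adding an entry.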