PDF-CSV
- Role
- Python Developer
- Technologies
- Python, Fitz (PyMuPDF), csv, os
- Description
- Process PDF files, extract text content, clean it, and organize information into structured CSV files based on scene descriptions and related objects.
Description: PDF-CSV is a Python script that processes PDF files, extracts their text, cleans it, and organizes the result into structured CSV files based on the scene descriptions and related objects found in each PDF. It uses the Fitz library (PyMuPDF) for PDF processing, os for file handling, and csv for CSV generation. The client needed a script that could process multiple PDF files efficiently, pull out the scene descriptions and their associated objects, clean up the extracted text, and write the organized data to CSV files for further analysis or reuse.
Project Overview:
- Develop a Python script to process PDF files, extract text content, clean it, and organize information into CSV files.
- Use the Fitz library (imported as fitz, part of PyMuPDF) for PDF processing, os for file handling, and csv for CSV file generation.
- Identify scene descriptions and related objects in PDFs and structure the extracted data into CSV files.
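The steps above can be sketched roughly as follows. This is a minimal illustration, not the delivered script: the function names are hypothetical, and the input layout is an assumption (a line starting with "Scene:" opens a scene, and each following line starting with "- " names one related object). The real PDFs may be laid out differently, in which case only parse_scenes would need to change. The PyMuPDF calls used are fitz.open and page.get_text.

```python
import csv
import io
import re


def extract_pdf_text(pdf_path):
    """Return the plain text of every page in a PDF, joined by newlines."""
    import fitz  # PyMuPDF; imported lazily so the helpers below work without it

    with fitz.open(pdf_path) as doc:
        return "\n".join(page.get_text() for page in doc)


def clean_text(raw):
    """Collapse runs of spaces/tabs, strip each line, and drop empty lines."""
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


def parse_scenes(text):
    """Yield (scene, object) pairs from cleaned text.

    ASSUMPTION (hypothetical layout): a line starting with "Scene:" opens a
    scene, and each later line starting with "- " is one related object.
    Scenes that list no objects produce no rows.
    """
    scene = None
    for line in text.splitlines():
        if line.lower().startswith("scene:"):
            scene = line.split(":", 1)[1].strip()
        elif scene is not None and line.startswith("- "):
            yield scene, line[2:].strip()


def write_rows(rows, out_file):
    """Write a header plus one CSV row per (scene, object) pair."""
    writer = csv.writer(out_file)
    writer.writerow(["scene", "object"])
    writer.writerows(rows)
```

For a real run, the text would come from extract_pdf_text and the output file would be opened with newline="" as the csv module recommends, for example: write_rows(parse_scenes(clean_text(extract_pdf_text("input.pdf"))), open("output.csv", "w", newline="")), where both file names are placeholders.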
Technologies Used:
- Python
- Fitz
- CSV
- os
Role: Python Developer
Client Requirement:
- Efficiently process multiple PDF files.
- Extract relevant information such as scene descriptions and associated objects.
- Clean up extracted text and output organized data into CSV files.
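One way to meet the multiple-file requirement is a batch loop that writes one CSV per PDF and keeps going when a single file fails. The sketch below is an assumption about how that could look, not the project's actual code: find_pdfs, process_folder, and the to_rows callback are hypothetical names, and to_rows stands in for whatever cleaning and scene parsing the script applies to the extracted text.

```python
import csv
import os


def extract_pdf_text(pdf_path):
    """Return the plain text of every page in a PDF, joined by newlines."""
    import fitz  # PyMuPDF; imported lazily so the batch logic is testable without it

    with fitz.open(pdf_path) as doc:
        return "\n".join(page.get_text() for page in doc)


def find_pdfs(input_dir):
    """Return sorted paths of the .pdf files directly inside input_dir."""
    return sorted(
        os.path.join(input_dir, name)
        for name in os.listdir(input_dir)
        if name.lower().endswith(".pdf")
    )


def process_folder(input_dir, output_dir, to_rows, extract=extract_pdf_text):
    """Write one CSV per PDF; return (written_csv_paths, failures).

    to_rows is a hypothetical callback that turns extracted text into an
    iterable of (scene, object) rows. failures holds (pdf_path, message)
    pairs for files that could not be read or parsed.
    """
    os.makedirs(output_dir, exist_ok=True)
    written, failed = [], []
    for pdf_path in find_pdfs(input_dir):
        stem = os.path.splitext(os.path.basename(pdf_path))[0]
        csv_path = os.path.join(output_dir, stem + ".csv")
        try:
            rows = list(to_rows(extract(pdf_path)))
        except Exception as exc:  # one bad PDF should not stop the batch
            failed.append((pdf_path, str(exc)))
            continue
        with open(csv_path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(["scene", "object"])
            writer.writerows(rows)
        written.append(csv_path)
    return written, failed
```

Passing the extractor as a parameter is a design choice made here for illustration: it lets the batch logic be exercised with a stub, and it leaves room to swap in a different extraction routine without touching the loop.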
A live project demo and proof of work are available on request!