Files in AI projects
When working with Python for AI, you’ll constantly work with data files. Your data might come as:
- CSV files - Spreadsheet data from Excel or databases
- JSON files - API responses and configuration data
- XML files - Structured data from various systems
- Text files - Raw text for processing
- Parquet files - Efficient data storage format
Common libraries for files
Each file type has specialized libraries:
CSV files:
- pandas - Best for data analysis (recommended)
- csv module - Built-in, for simple operations
JSON files:
- json module - Built-in, handles all JSON operations
- pandas - Can read/write JSON with DataFrames
Other formats:
- xml.etree - Built-in XML parsing
- openpyxl - Excel files (.xlsx)
- PyPDF2 - PDF files
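To make the built-in options concrete, here is a minimal sketch using only the csv and json modules. The sample rows and column names (product, units, price) are made up for illustration. Note that the csv module returns every value as a string, so numeric columns have to be converted by hand; that is one reason pandas is usually more convenient for analysis.

```python
import csv
import io
import json

# Hypothetical sample data, made up for illustration.
CSV_TEXT = "product,units,price\nwidget,3,2.50\ngadget,1,10.00\n"

# csv.DictReader yields one dict per row; every value is a string.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Convert numeric columns by hand, since the csv module does no type inference.
for row in rows:
    row["units"] = int(row["units"])
    row["price"] = float(row["price"])

# json.dumps serializes the list of dicts to a JSON string.
json_text = json.dumps(rows, indent=2)
print(json_text)

# json.loads parses the string back into Python lists and dicts.
restored = json.loads(json_text)
```

The same pattern works with real files: pass an open file object to csv.DictReader instead of io.StringIO, and use json.dump / json.load to write and read files directly.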
Working with our sales data
Let’s work with our CSV file and convert it to different formats. First, install pandas with pip install pandas. If you get an error, try pip3 install pandas or install it through VS Code’s terminal.
analyzer.py:
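A minimal sketch of what analyzer.py could look like, assuming the sales data has date, product, units, and price columns. Those column names, the sales.csv file name, SAMPLE_CSV, and the analyze function are all hypothetical. The script loads the CSV with pandas, adds a revenue column, and writes the result as JSON. It creates a small made-up sample file first so it runs on its own; point csv_path at your real file instead.

```python
# analyzer.py (sketch)
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample rows standing in for the real sales file.
SAMPLE_CSV = (
    "date,product,units,price\n"
    "2024-01-05,widget,3,2.50\n"
    "2024-01-06,gadget,1,10.00\n"
    "2024-01-06,widget,2,2.50\n"
)


def analyze(csv_path, json_path):
    """Load sales rows, add a revenue column, and save the result as JSON."""
    df = pd.read_csv(csv_path)
    df["revenue"] = df["units"] * df["price"]
    # orient="records" writes a list of objects, one per row.
    df.to_json(json_path, orient="records", indent=2)
    return df


if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())
    csv_path = workdir / "sales.csv"
    csv_path.write_text(SAMPLE_CSV, encoding="utf-8")

    df = analyze(csv_path, workdir / "sales.json")
    print(df)
    print("Revenue per product:")
    print(df.groupby("product")["revenue"].sum())
```

Other output formats follow the same shape, for example df.to_excel (which needs openpyxl) or df.to_parquet (which needs pyarrow or fastparquet).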
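File format comparison
Different formats have different uses: CSV suits flat, spreadsheet-style tables; JSON suits nested data such as API responses and configuration; XML suits structured data exchanged between systems; plain text suits raw input for processing; and Parquet suits efficient storage.
Loading different file types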
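One way to load several formats is to pick a parser based on the file extension. The sketch below uses only the standard library so it runs without extra installs; load_file is a hypothetical helper name, and the set of supported extensions is an assumption you can extend.

```python
import csv
import json
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def load_file(path):
    """Load a file with the standard-library parser that matches its extension."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        # newline="" is what the csv docs recommend when opening CSV files.
        with path.open(newline="", encoding="utf-8") as f:
            return list(csv.DictReader(f))  # list of dicts, values are strings
    if suffix == ".json":
        with path.open(encoding="utf-8") as f:
            return json.load(f)  # dicts, lists, strings, numbers
    if suffix == ".xml":
        return ET.parse(str(path)).getroot()  # root Element of the tree
    if suffix == ".txt":
        return path.read_text(encoding="utf-8")  # the whole file as one string
    raise ValueError(f"Unsupported file type: {suffix}")


if __name__ == "__main__":
    # Made-up demo file so the sketch runs on its own.
    demo = Path(tempfile.mkdtemp()) / "config.json"
    demo.write_text('{"model": "demo", "epochs": 3}', encoding="utf-8")
    print(load_file(demo))
```

Each branch returns a different Python type, so callers need to know which format they asked for.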
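With pandas installed, each tabular format loads in a single call: pd.read_csv for CSV, pd.read_json for JSON, pd.read_excel for Excel (which needs openpyxl for .xlsx files), and pd.read_parquet for Parquet (which needs pyarrow or fastparquet).
Learn more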
To dive deeper into file handling:
- Pandas documentation - Comprehensive data handling
- Python JSON module - Official JSON docs
- Real Python file I/O - Detailed tutorial
- CSV module docs - Built-in CSV handling
Organizing code
Split your code into reusable functions