I am working on a GridDB-based application that needs to bulk-import data from CSV files through the Python API (griddb_python). On top of that, the schema of the incoming data can evolve over time, so I need a strategy for updating container schemas dynamically without disrupting ongoing ingestion. As a starting point, I load the data with pandas and open a connection to the cluster:
```python
import pandas as pd
import griddb_python as griddb

df = pd.read_csv('data.csv')
factory = griddb.StoreFactory.get_instance()
store = factory.get_store(host='239.0.0.1', port=31999, cluster_name='myCluster',
                          username='admin', password='admin')  # placeholder connection settings
```
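For reference, after connecting I create the container and currently insert one row at a time, which I suspect is the main bottleneck. The container name, column list, and types below are placeholders standing in for my real schema:

```python
# Placeholder schema roughly matching my CSV: an integer row key plus two columns.
info = griddb.ContainerInfo('sensor_data',
                            [['id', griddb.Type.INTEGER],
                             ['name', griddb.Type.STRING],
                             ['value', griddb.Type.DOUBLE]],
                            griddb.ContainerType.COLLECTION, True)
container = store.put_container(info)

# Current approach: one put() call per CSV row, which is slow for large files.
for row in df.values.tolist():
    container.put(row)
```

I did find multi_put in the API reference and assume that something like container.multi_put(df.values.tolist()) would batch these writes more efficiently, but the reference says little about batch sizing, transaction behavior, or memory usage for large imports.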
I have two specific questions:
- What is the most efficient method for performing bulk data import into GridDB using the Python API? Are there specific functions or best practices to minimize import time and resource usage?
- How can I manage dynamic schema evolution (such as adding new columns or modifying existing ones) without disrupting the continuous data ingestion process? My tentative attempt is sketched after this list.
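Regarding the second question, the only mechanism I have found so far is calling put_container again with an extended column list and the modifiable flag set, which (as far as I understand the reference) appends the new column and leaves it empty for existing rows. Here is a rough sketch of what I tried; the added unit column is hypothetical:

```python
# Tentative schema evolution: re-register the same container with one column
# appended at the end. Passing modifiable=True is what I understand permits
# the in-place schema change.
new_columns = [['id', griddb.Type.INTEGER],
               ['name', griddb.Type.STRING],
               ['value', griddb.Type.DOUBLE],
               ['unit', griddb.Type.STRING]]  # newly added column (hypothetical)
new_info = griddb.ContainerInfo('sensor_data', new_columns,
                                griddb.ContainerType.COLLECTION, True)
container = store.put_container(new_info, True)  # modifiable=True
```

What I cannot tell from the documentation is whether this is safe to run while other clients are actively writing, and what happens to writers that still submit rows using the old column list.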
Additional Context:
I have reviewed the GridDB Python API Reference and the documentation on container creation, but I need a more detailed strategy that combines bulk import with schema evolution, including how to handle data migration when a schema change cannot be applied in place.
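For such cases (for example, changing an existing column's type), the only fallback I can think of is a copy-based migration into a second container, sketched below. The container names, batch size, and None default are placeholders, and new_columns is the extended column list from the previous snippet; I would especially like to know whether something like this can run without pausing ingestion:

```python
# Fallback migration sketch: stream rows out of the old container and into a
# new one created with the revised schema, filling the added column with None.
old = store.get_container('sensor_data')
new = store.put_container(griddb.ContainerInfo('sensor_data_v2', new_columns,
                                               griddb.ContainerType.COLLECTION, True))

rs = old.query('select *').fetch()
batch = []
while rs.has_next():
    batch.append(rs.next() + [None])  # None as the default for the added column
    if len(batch) >= 1000:            # flush in chunks to bound memory use
        new.multi_put(batch)
        batch = []
if batch:
    new.multi_put(batch)  # flush the final partial batch
```

Any detailed explanation or code examples would be greatly appreciated.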