A reusable, schema-agnostic importer that loads CSV node/relationship files into Neo4j using APOC.
- Node files: `*_nodes.csv` (see the sample files below)
  - Required column: `id`
  - All other columns are imported as node properties
- Relationship files: `*_relationships.csv`
  - Required columns (any of these patterns): `start_id`/`end_id` or `source_id`/`target_id`
  - All other columns are imported as relationship properties
- Relationship type and node label are derived from the filename (prefix numbers are ignored):
  - `001_Person_nodes.csv` -> label `Person`
  - `101_authored_relationships.csv` -> relationship type `AUTHORED`
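For illustration, a minimal node file and relationship file following these conventions might look like this (the file names, ids, and property columns are made up for the example):

`001_Person_nodes.csv`:

```csv
id,name
p1,Ada Example
p2,Grace Example
```

`101_authored_relationships.csv`:

```csv
start_id,end_id,year
p1,pub9,2021
```

Here `pub9` would be the `id` of a node from a hypothetical `002_Publication_nodes.csv`.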
The importer is configured through environment variables:

- `NEO4J_URI` (default: `bolt://neo4j:7687`)
- `NEO4J_USER` (default: `neo4j`)
- `NEO4J_PASSWORD` (default: `neo4j`)
- `NEO4J_DATABASE` (default: `neo4j`)
- `IMPORT_DIR` (default: `/data/import`)
- `CYPHER_DIR` (default: `/data/cypher`)
- `LOG_DIR` (default: `/data/logs`)
Optional schema hints:

- `REL_LABEL_MAP_JSON` - JSON mapping of relationship type -> `[startLabel, endLabel]`
  - Example: `{"authored": ["Person", "Publication"]}`
- `DUAL_LABELS_JSON` - JSON mapping of node label -> `[labelA, labelB]` to attach two labels
  - Example: `{"OrgUnit": ["OrgUnit", "Institution"]}`
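Both hints are ordinary environment variables holding JSON, so they can simply be exported before running the importer (the values here are the examples above):

```bash
export REL_LABEL_MAP_JSON='{"authored": ["Person", "Publication"]}'
export DUAL_LABELS_JSON='{"OrgUnit": ["OrgUnit", "Institution"]}'
```

The same values can also be set in the `environment:` section of the importer service in `docker-compose.yml`.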
You can place a `config.yaml` in the repo root (or set `CONFIG_PATH`) instead of exporting env vars.
Environment variables still override values from the file.
Example (see `config.yaml.example`):
```yaml
neo4j_uri: "bolt://localhost:7687"
neo4j_user: "neo4j"
neo4j_password: "your_password"
neo4j_database: "neo4j"
import_dir: "./import"
cypher_dir: "./cypher"
log_dir: "./logs"
```

To run the full stack:

```bash
docker compose up --build
```

This builds the importer image, starts Neo4j with APOC enabled, and runs the importer automatically.
Place CSVs in `./import` and optional Cypher scripts in `./cypher`. Logs go to `./logs`.
Update credentials and database name in `docker-compose.yml` if needed.
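For orientation, a compose file along these lines is enough to run the stack; the service names, image tag, and APOC-enabling mechanism below are assumptions for this sketch, not a copy of the repo's `docker-compose.yml`:

```yaml
# Sketch only: adjust names, versions, and credentials to match the repo's compose file.
services:
  neo4j:
    image: neo4j:5
    environment:
      NEO4J_AUTH: neo4j/your_password
      NEO4J_PLUGINS: '["apoc"]'   # how the official Neo4j 5 image enables APOC
    ports:
      - "7474:7474"
      - "7687:7687"

  importer:
    build: .
    depends_on:
      - neo4j
    environment:
      NEO4J_URI: bolt://neo4j:7687
      NEO4J_PASSWORD: your_password
    volumes:
      - ./import:/data/import
      - ./cypher:/data/cypher
      - ./logs:/data/logs
```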
To let others reproduce the import, share one of these:
- Compose bundle (recommended): share the repo (or a zip) with `docker-compose.yml`, `import/`, and `cypher/`. They run `docker compose up --build`.
- Single self-contained image: bake `import/` and `cypher/` into the image at build time (sketched below) so running the image always imports that data.

Note: an image alone does not include your CSVs unless you bake them in or mount them as volumes.
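A rough sketch of the baked-in image option; the base image tag below is hypothetical, and the repo's own Dockerfile may already copy these directories:

```dockerfile
# Sketch: extend the importer image (tag is hypothetical) and bake the data in.
FROM neo4j-csv-importer:latest
COPY import/ /data/import/
COPY cypher/ /data/cypher/
```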
- Put your `*_nodes.csv` and `*_relationships.csv` files in `import/` and any `.cypher` scripts in `cypher/`.
- Run: `docker compose up --build`
- Check logs in `logs/` and the container output for import status.
- If `id` is missing or blank in node files, UUIDs are generated.
- If no label map is provided, relationships are matched by `id` only (illustrated below).
- APOC must be enabled in Neo4j (see the example compose file).
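To make the label-map behaviour concrete, here is a hand-written Cypher sketch of the two matching strategies for an `authored` relationship; the ids are made up, and the importer's actual generated statements may differ:

```cypher
// With REL_LABEL_MAP_JSON = {"authored": ["Person", "Publication"]}
// the endpoints are looked up by label and id:
MATCH (a:Person {id: "p1"}), (b:Publication {id: "pub9"})
MERGE (a)-[:AUTHORED]->(b);

// Without a label map, endpoints are matched by id alone, across all labels:
MATCH (a {id: "p1"}), (b {id: "pub9"})
MERGE (a)-[:AUTHORED]->(b);
```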