MongoDB’s import utility – mongoimport
– introduced a new parameter that allows you to import CSV data as an array.
The --useArrayIndexFields
parameter interprets natural numbers in fields as array indexes when importing CSV or TSV files.
Example
Suppose we have a CSV file called tags.csv
that looks like this:
tags.0,tags.1,tags.2,tags.3
html,css,sql,xml
We can import that using the --useArrayIndexFields
parameter, which will cause mongoimport
to interpret the numbers in the column headers as array indexes.
Example:
mongoimport --db=krankykranes --type=csv --headerline --useArrayIndexFields --file=tags.csv
Output:
2021-01-03T20:55:44.284+1000 no collection specified 2021-01-03T20:55:44.284+1000 using filename 'tags' as collection 2021-01-03T20:55:44.297+1000 connected to: mongodb://localhost/ 2021-01-03T20:55:44.330+1000 1 document(s) imported successfully. 0 document(s) failed to import.
Seeing as I didn’t specify a collection name, it created a collection called tags
(based on the name of the file), then imported my document.
Let’s jump over to the mongo shell and check the collection.
db.tags.find()
Result:
{ "_id" : ObjectId("5ff1a2b0300ed79d9836882f"), "tags" : [ "html", "css", "sql", "xml" ] }
We can see that the CSV data has been imported as a JSON array.
Here it is again but with nicer formatting, which might make the array easier to see.
db.tags.find().pretty()
Result:
{ "_id" : ObjectId("5ff1a2b0300ed79d9836882f"), "tags" : [ "html", "css", "sql", "xml" ] }
Combined with Other CSV Data
The CSV file in the previous example only consisted of array data. But we can include other data in the file too.
Here’s an example of a CSV file called articles.csv
that contains other data.
_id,title,body,tags.0,tags.1,tags.2,tags.3 1.0,Web,blah,html,css,sql,xml 2.0,Animals,blah 2,cats,dogs 3.0,Plants,blah 3,trees 4.0,Oceans,blah 4
Let’s import that file:
mongoimport --db=krankykranes --type=csv --headerline --useArrayIndexFields --file=articles.csv
Output:
2021-01-03T21:14:38.286+1000 no collection specified 2021-01-03T21:14:38.287+1000 using filename 'articles' as collection 2021-01-03T21:14:38.336+1000 connected to: mongodb://localhost/ 2021-01-03T21:14:38.407+1000 4 document(s) imported successfully. 0 document(s) failed to import.
Now let’s jump over to the mongo shell and check the collection.
db.articles.find()
Result:
{ "_id" : 1, "title" : "Web", "body" : "blah", "tags" : [ "html", "css", "sql", "xml" ] } { "_id" : 3, "title" : "Plants", "body" : "blah 3", "tags" : [ "trees" ] } { "_id" : 4, "title" : "Oceans", "body" : "blah 4" } { "_id" : 2, "title" : "Animals", "body" : "blah 2", "tags" : [ "cats", "dogs" ] }
Let’s pretty it up a little.
db.articles.find().pretty()
Result:
{ "_id" : 1, "title" : "Web", "body" : "blah", "tags" : [ "html", "css", "sql", "xml" ] } { "_id" : 3, "title" : "Plants", "body" : "blah 3", "tags" : [ "trees" ] } { "_id" : 4, "title" : "Oceans", "body" : "blah 4" } { "_id" : 2, "title" : "Animals", "body" : "blah 2", "tags" : [ "cats", "dogs" ] }
So we can see that the documents have been created as perfect JSON/BSON documents, and arrays have been created for the relevant data.
Notice that no array has been created for document 4. This is because no array data was provided in the CSV file. So rather than create a field with an empty array, it didn’t create the field or the array.