DuckDB is a lightweight, high-performance database system designed for analytical workloads. As with any database management system (DBMS), one of the first things we normally want to do when we launch it is to create or open a database.
You can’t create a database in DuckDB with the SQL CREATE DATABASE statement (unless you’re using a tool that allows you to do so). DuckDB works differently.
In this article, we look at various options for creating a database in DuckDB.
Specify the Database Name at Startup
One of the simplest ways to create a database in DuckDB is to specify the database name directly when starting DuckDB. When you run the DuckDB command in your terminal or application, provide the desired database file name. For example:
duckdb my_database.db
This command creates a new database file named my_database.db in the current working directory if it doesn’t already exist. If the file does exist, DuckDB will open the existing database.
Attach a Database to an Open Session
If DuckDB is already running without a specific database, you can attach a new database during the session. To do this, use the ATTACH SQL statement:
ATTACH 'new_database.db' AS new_db;
This creates the database file new_database.db (if necessary) and attaches it to the session. The alias new_db can then be used to reference it within the current session.
Note the single quotes around the database name.
To create the database in another directory, either include the path, or navigate to that directory first.
Use the .open Command in the DuckDB Shell
DuckDB’s command-line shell provides a .open command to create or connect to a database. Once you start the shell, you can type the following:
.open my_database.db
This creates a new database file named my_database.db if it doesn’t already exist or connects to the file if it does.
Creating an In-Memory Database
DuckDB also supports creating in-memory databases. This is useful for temporary data processing that doesn’t need persistence. We can create an in-memory database by either omitting the database file name or by passing :memory: as the file name.
When running in in-memory mode, nothing is saved to disk. Therefore, when the process finishes, all data is lost.
Create an In-Memory Database at Startup
Here’s an example of creating an in-memory database when we launch the DuckDB CLI:
duckdb
Alternatively, we can do this:
duckdb ':memory:'
When started without a file name, DuckDB uses an in-memory database, and nothing is written to disk unless you attach or open a database file or explicitly export the data.
Create an In-Memory Database During an Open Session
Here’s an example of doing it in an open session:
ATTACH ':memory:' AS TempDB;
Here I specified :memory: as the database file and TempDB as its alias.
Using an API
In addition to the above methods, you can use DuckDB programmatically through its various language bindings, such as Python or R.
For example, to create an in-memory database in Python:
import duckdb
con = duckdb.connect()
To create a persistent database in Python:
import duckdb
con = duckdb.connect("my_database.db")
See the list of client APIs on the DuckDB website for more information.