What is a Vector Database?

A vector database is a specialized database designed to store, index, and query high-dimensional vectors. These are arrays of numbers that represent data in mathematical space. Unlike traditional databases that store text, numbers, or structured data, vector databases work with embeddings, which are numerical representations of complex data like text, images, audio, or video that capture their semantic meaning.

These databases solve the specific problem of finding similar items based on meaning rather than exact matches. Traditional databases excel at finding exact matches or simple comparisons. For example “find all users named Bella” or “find products under $50.” Vector databases excel at similarity searches. These could look something like “find images similar to this one” or “find documents with similar meaning to this query,” even when the exact words or pixels are different.

Read more

Using Subqueries with SQL Server’s DATEDIFF() Function

While SQL Server’s DATEDIFF() function is relatively straightforward when you’re comparing two known dates, it becomes more flexible when the dates you compare are sourced directly from your tables. Instead of hardcoding dates, you can embed subqueries directly into the DATEDIFF() function to dynamically retrieve the dates you need.

This approach can be especially useful when you’re working with aggregate functions like MIN() and MAX(), or when you need to pull specific dates based on certain conditions. The subqueries execute first, return their date values, and DATEDIFF() uses those results to perform the calculation.

Read more

PIVOT vs GROUP BY in SQL Server

If you’re new to SQL or just starting to explore data transformation techniques, you might be confused about when to use PIVOT versus GROUP BY. They may seem similar at first. After all, both aggregate data, both can summarize information, and both produce condensed result sets. But they serve fundamentally different purposes and produce very different output structures.

Understanding the distinction between these two operations will help you write more effective queries. GROUP BY aggregates data vertically, keeping your results in a row-based format. PIVOT transforms data horizontally, turning row values into column headers. Let’s break down exactly what each does and when you’d use one over the other.

Read more

SQL Server SUBSTRING(): A Complete Guide to Extracting Strings

The SUBSTRING() function in SQL Server lets you pull out specific portions of a string. Whether you’re cleaning data, formatting output, or parsing complex text fields, this function is one you’ll use constantly.

If you’ve ever needed to extract an area code from a phone number, grab the domain from an email address, or parse product codes into their component parts, SUBSTRING() is your go-to tool. It’s pretty straightforward, and once you understand how it works, you’ll find yourself using it all the time.

Read more

What is a Multi-Model Database?

A multi-model database is a database system that supports multiple data models within a single, integrated backend. Instead of being limited to one way of organizing data (like relational tables, documents, or graphs) a multi-model database lets you store and query different types of data using the most appropriate model for each use case.

Traditional database systems typically specialize in one data model. A relational database like MySQL organizes everything into tables with rows and columns. A document database like MongoDB stores JSON-like documents. A graph database like Neo4j focuses on nodes and relationships. With a multi-model database, you get several of these capabilities in one system, storing relational data, documents, key-value pairs, and graphs side by side.

Read more

Installing the Netflix Sample Database in SQL Server

The Netflix sample database is a sample database that can be used for learning and practicing SQL. It uses data from the Netflix Engagement Report and the Netflix Global Top 10 weekly list.

This guide covers the installation process and gets you running queries quickly. For more background on what this database is and why it’s useful for learning, see my introduction to the Netflix sample database.

Read more

Fix Error 3234 “Logical file is not part of database” When Restoring a Database in SQL Server

If you’re getting error 3234 that reads something like “Logical file ‘AdventureWorksLT_Data’ is not part of database ‘AdventureWorksLT2025’. Use RESTORE FILELISTONLY to list the logical file names.“, you’re referencing the wrong logical file names when restoring a database.

This issue can happen when you try to map the logical file names to a new location, but you get those logical file names wrong.

Fortunately there’s an easy fix. It involves looking up the actual logical file names, then modifying your RESTORE DATABASE statement accordingly.

Read more

What is a Snowflake Schema?

In relational databases, a snowflake schema is a variation of the star schema where dimension tables are normalized into multiple related tables. Instead of keeping all dimensional attributes in a single wide table, you break them down into a hierarchy of tables that branch out like (you guessed it) a snowflake.

It’s basically what happens when you apply database normalization principles to a star schema’s dimension tables while keeping the central fact table intact.

Read more