Database Internals

Databases aren't magic — let's open the hood.

$350/mo

Mon / Wed, 19:00 (UTC+2)

start:

Aug 3, 2026

25 classes, 3 months

what's inside

How do you build a database management system? Why are there so many kinds of DBMSs — and why do new ones keep appearing every year? How do primary keys, joins, and order by actually work? How is data stored on disk, and what optimizations do modern systems apply? If you want answers to these and other questions about building a DBMS from the ground up, this course is for you.

Throughout the program, we'll walk through the complete journey of building a database — from query parsing to execution and scaling. You'll discover the algorithms and concepts at the heart of modern database systems, and put that knowledge into practice by building your own.

This course is valuable for anyone who wants to go deeper into systems programming, as well as for DBMS users who want to understand how databases work internally and learn to optimize them more effectively. The best learning experience comes with Rust or C++, as they allow you to go further into database architecture and get a real feel for memory management and performance in practice — though this is a recommendation, not a requirement.

Curriculum

it will be hot

FOR ENGINEERS

History of databases

The path from file systems to modern data storage

• The first DBMS models: Hierarchical, Network
• The revolution of the relational model and its impact on the industry
• The emergence of SQL, the development of object-oriented and NoSQL databases

Data storage basics

How data is organized on disk and in memory, and why it affects performance

• Pages and blocks as basic storage units
• Organizing records in pages: fixed and variable length
• Row-store and its applications
• Data fragmentation and defragmentation methods

Indexes. B-tree and LSM

Data structures to speed up searches, inserts, and range scans

• B-Tree and B+Tree: structure, search, insertion and deletion algorithms
• LSM trees: how log-structured merge trees work
• Write and read amplification, compaction mechanisms
• Using Bloom filters to speed up searches
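
To give a feel for the last item, here is a minimal Bloom filter sketch in Python. It is an illustration only, not course material: the class name `BloomFilter`, the default sizes, and the choice of salted BLAKE2b hashes are all assumptions made for this example. The structure answers "definitely not present" or "possibly present", which is what lets an LSM-based engine skip files that cannot contain a key.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: k hash positions over an m-bit array (illustrative only)."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Derive k positions by salting one hash function with the index i.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                key, digest_size=8, salt=i.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


bf = BloomFilter()
for key in (b"user:1", b"user:2", b"user:3"):
    bf.add(key)
```

A key that was added is always reported as possibly present. A key that was never added is usually rejected, but false positives are possible, and their rate depends on the bit-array size and the number of hash functions.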

Columnar and hybrid data stores. Compression

Efficient storage of analytical data

• Row-oriented vs column-oriented storage
• Advantages of the columnar approach for analytical queries
• Data compression methods: RLE, dictionary encoding, delta encoding
• Hybrid HTAP systems (Hybrid Transactional/Analytical Processing)
• PAX format — combining the advantages of row and column approaches
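
The three compression methods named above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function names are hypothetical, values are held in plain lists, and real columnar engines work on packed binary buffers instead.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encoding: collapse each run of equal values into (value, count)."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    return [value for value, count in pairs for _ in range(count)]


def dictionary_encode(values):
    """Dictionary encoding: replace each value with a small integer code."""
    codes_by_value = {}
    codes = []
    for value in values:
        # A new value gets the next free code; a known value reuses its code.
        codes.append(codes_by_value.setdefault(value, len(codes_by_value)))
    return list(codes_by_value), codes


def delta_encode(values):
    """Delta encoding: keep the first value, then differences between neighbors."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    out, total = [], 0
    for delta in deltas:
        total += delta
        out.append(total)
    return out


countries = ["UA", "UA", "UA", "PL", "PL", "UA"]
timestamps = [100, 101, 103, 106]

print(rle_encode(countries))         # [('UA', 3), ('PL', 2), ('UA', 1)]
print(dictionary_encode(countries))  # (['UA', 'PL'], [0, 0, 0, 1, 1, 0])
print(delta_encode(timestamps))      # [100, 1, 2, 3]
```

All three work best on a single column at a time, where neighboring values tend to repeat or change slowly. That is one reason compression is usually discussed together with columnar storage.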

Relational model. Relational algebra

Formal foundations of relational DBMSs and operations underlying SQL

• Relations, attributes, tuples, and keys
• Primary and foreign keys. Ensuring data integrity
• Basic operations of relational algebra: selection, projection, union, difference, Cartesian product
• Joins: inner, outer, natural
• Algebraic properties: commutativity and associativity. Relational algebra as the basis for query optimization

Query planning and optimization

Logical plans (relational algebra trees)

• Building an Abstract Syntax Tree (AST)
• Heuristic optimizations: selection and projection pushdown
• Cost-based optimization: using statistics and estimating selectivity
• Join strategies: nested loop, hash join, sort-merge join

Query execution. Plan vectorization. SQL compilation

Row-at-a-time vs batch-at-a-time approach

• Volcano (iterator) execution model
• Vectorization: SIMD and block processing of values
• JIT compilation of SQL queries
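
As a small illustration of the row-at-a-time Volcano model listed above, the sketch below wires three operators into a plan where each `next()` call pulls one row from the child operator. The operator classes, the `execute` helper, and the sample `USERS` table are hypothetical names chosen for this example. Rows are plain dictionaries, which a real engine would not use.

```python
class Scan:
    """Leaf operator: returns rows of an in-memory table one at a time."""

    def __init__(self, rows):
        self.rows = rows
        self.pos = 0

    def open(self):
        self.pos = 0

    def next(self):
        if self.pos >= len(self.rows):
            return None  # end of stream
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self):
        pass


class Filter:
    """Passes through only the rows that satisfy the predicate."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def open(self):
        self.child.open()

    def next(self):
        # Keep pulling from the child until a row matches or the input ends.
        while (row := self.child.next()) is not None:
            if self.predicate(row):
                return row
        return None

    def close(self):
        self.child.close()


class Project:
    """Keeps only the requested columns of each row."""

    def __init__(self, child, columns):
        self.child = child
        self.columns = columns

    def open(self):
        self.child.open()

    def next(self):
        row = self.child.next()
        if row is None:
            return None
        return {col: row[col] for col in self.columns}

    def close(self):
        self.child.close()


def execute(plan):
    """Drive the root operator until it is exhausted and collect the rows."""
    plan.open()
    result = []
    while (row := plan.next()) is not None:
        result.append(row)
    plan.close()
    return result


USERS = [
    {"name": "Olha", "age": 34},
    {"name": "Taras", "age": 27},
    {"name": "Ivan", "age": 41},
]

# Roughly: SELECT name FROM users WHERE age > 30
plan = Project(Filter(Scan(USERS), lambda row: row["age"] > 30), ["name"])
print(execute(plan))  # [{'name': 'Olha'}, {'name': 'Ivan'}]
```

Every row costs one `next()` call per operator. A vectorized engine amortizes that overhead by passing batches of values between operators instead of single rows.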

Data types. Type system. Type casting

Basic types: numeric, text, temporal, boolean, binary

• Handling NULL values
• Explicit and implicit type casting
• Rules of precedence in expressions
• User-defined types
• Semi-structured data types (JSON, XML, etc.)

Transactions and concurrency control

ACID transaction properties

• Concurrency issues: dirty reads, phantom reads, and others
• Transaction isolation levels: Read Committed, Repeatable Read, Serializable, etc.
• 2PL (two-phase locking) and deadlock detection
• MVCC (multi-version concurrency control) and snapshot isolation

Open-source databases. Modern architectures and current challenges

Using ML for indexing, planning, tuning

• Serverless databases
• Separation of compute and storage layers
• Automation and self-driving DBMS
• Lakehouse architecture, agentic DBMS and Iceberg storage

Project presentations and research review

You've written your own DBMS. Time to present it :)

Instructor

Denys Tsyomenko

Founding Engineer @Embucket

Former Software Engineer @CaspianDB @SingleStore @DataRobot @Microsoft

University lecturer @Kyiv School of Economics

Ready? Take the first step

reviews

What alumni say

Senior Software Engineer @ICC Chess Club

Yevhen Dudnik

I came to the course with specific, fairly deep questions, but it turned out that they barely scratched the surface. Be prepared to learn things about databases that you didn't know and that are hardly written about anywhere. The course is very interesting, and I recommend it.

.Net Developer @FlexBricks

Dmytro Avilov

I liked the course. I gained a lot of knowledge, and I now understand better how databases work and where to look when solving performance problems.

Android Developer @Competo LLC

Anatolii Kokuliuk

There are a lot of topics in the course, and you can get stuck in each one for a long time. Even as someone who is not particularly interested in databases, I found the course 100% engaging. Building storage, a vocabulary, a pipeline, execution: these are super interesting topics that go far beyond databases.

Java Software Engineer @Intapp

Mykola Pikuza

The course material is deep, practical, and very well structured, but what really makes it special is the community and the instructor. Denys is a great instructor who explains complex database concepts simply and systematically. He is always attentive to detail and willing to go over difficult points as many times as needed.

Backend Engineer @Preply

Denys Ralko

What I liked most was the depth of the topics covered. I knew many things as facts before, but I didn't understand why they worked the way they did. The course gave me the opportunity to understand the reasons and mechanics behind these decisions.

Senior Director of Engineering @Pindrop

Volodymyr Shulha

I joined the course because data volumes keep growing — and so does the variety of database types. Many projects already rely on several different databases for different tasks. I wanted to develop a deeper understanding of how different types of databases are structured and which use cases each one is best suited for. The course material is extensive and high quality: finding this much information on your own and structuring it into a coherent whole would have been a real challenge.

format that works

Constant feedback in Slack.

No superficial slides — just deep dives into real production challenges.

Certificates earned through real results: completed assignments, active discussions, measurable progress.

communication that drives you

Twice weekly on Zoom: Mondays and Wednesdays at 7:00 PM (UTC+2), 1.5 hours each. All lectures are recorded for later review. Taught in Ukrainian, with supplementary materials in English.

Slack is our hub for discussions, clever test cases, and top company referrals.

environment that energizes

We screen carefully — you'll learn among strong, motivated peers. Skip homework? You're out.

Your instructor is always available. They'll explain until it clicks — whether that's a third code review or staying late after lecture.

That's how we work: learn and grow stronger together.

What awaits you?

have fun and dive deep
