Database Internals
Databases aren't magic — let's open the hood.
350 $/mo
Mon / Wed, 19:00 (UTC+2)
Start: Aug 3, 2026
25 classes, 3 months

what's inside
How do you build a database management system? Why are there so many kinds of DBMSs, and why do new ones keep appearing every year? How do primary keys, joins, and ORDER BY actually work? How is data stored on disk, and what optimizations do modern systems apply? If you want answers to these and other questions about building a DBMS from the ground up, this course is for you.
Throughout the program, we'll walk through the complete journey of building a database — from query parsing to execution and scaling. You'll discover the algorithms and concepts at the heart of modern database systems, and put that knowledge into practice by building your own.
This course is valuable for anyone who wants to go deeper into systems programming, as well as for DBMS users who want to understand how databases work internally and learn to optimize them more effectively. The best learning experience comes with Rust or C++, as they allow you to go further into database architecture and get a real feel for memory management and performance in practice — though this is a recommendation, not a requirement.
Curriculum
it will be hot
FOR ENGINEERS
History of databases
The path from file systems to modern data storage
• The first DBMS models: Hierarchical, Network
• The revolution of the relational model and its impact on the industry
• The emergence of SQL, the development of object-oriented and NoSQL databases
Data storage basics
How data is organized on disk and in memory, and why it affects performance
• Pages and blocks as basic storage units
• Organizing records in pages: fixed and variable length
• Row-store and its applications
• Data fragmentation and defragmentation methods
Indexes. B-tree and LSM
Data structures to speed up searches, inserts, and range scans
• B-Tree and B+Tree: structure, search, insertion and deletion algorithms
• LSM trees: how log-structured merges work
• Write/Read amplification and compaction mechanisms
• Using Bloom filters to speed up searches
Columnar and hybrid data stores. Compression
Efficient storage of analytical data
• Row-oriented vs column-oriented storage
• Advantages of the columnar approach for analytical queries
• Data compression methods: RLE, dictionary encoding, delta encoding
• Hybrid HTAP systems (Hybrid Transactional/Analytical Processing)
• PAX format — combining the advantages of row and column approaches
Relational model. Relational algebra
Formal foundations of relational DBMSs and operations underlying SQL
• Relations, attributes, tuples, and keys
• Primary and foreign keys. Ensuring data integrity
• Basic operations of relational algebra: selection, projection, union, difference, Cartesian product
• Joins: inner, outer, natural
• Properties of algebra — commutativity, associativity. The role of algebra as a basis for query optimization
Query planning and optimization
Logical plans (relational algebra trees)
• Building an Abstract Syntax Tree (AST)
• Heuristic optimizations: selection and projection pushdown
• Cost-based optimization: using statistics and evaluating selectivity
• Join strategies: nested loop, hash join, sort-merge join
Query execution. Plan vectorization. SQL compilation
Row-at-a-time vs batch-at-a-time approach
• Volcano (iterator) execution model
• Vectorization: SIMD and block processing of values
• JIT compilation of SQL queries
Data types. Type system. Type casting
Basic types: numeric, text, date/time, boolean, binary
• Handling NULL values
• Explicit and implicit type casting
• Rules of precedence in expressions
• User-defined types
• Semi-structured data types (JSON, XML, etc.)
Transactions and concurrency control
ACID transaction properties
• Concurrency issues: dirty reads, phantom reads, and others
• Transaction isolation levels: Read Committed, Repeatable Read, Serializable, etc.
• 2PL (two-phase locking) and deadlock detection
• MVCC (multi-version concurrency control) and snapshot isolation
Open-source databases. Modern architecture and current challenges
Using ML for indexing, planning, tuning
• Serverless databases
• Separation of compute and storage layers
• Automation and self-driving DBMS
• Lakehouse architecture, agentic DBMS and Iceberg storage
Project presentations and research review
You've written your own DBMS. Time to present it :)
Instructor

Denys Tsyomenko
Founding Engineer @Embucket
Former Software Engineer @CaspianDB @SingleStore @DataRobot @Microsoft
University lecturer @Kyiv School of Economics
Ready? Take the first step
reviews
What alumni say

Senior Software Engineer @ICC Chess Club
Yevhen Dudnik
I came to the course with specific, fairly deep questions, but it turned out that my questions were still in the shallows. Be prepared to learn things about databases that you didn't know and that aren't even written about anywhere. The course is very interesting, and I recommend it.

.Net Developer @FlexBricks
Dmytro Avilov
I liked the course. I gained a lot of knowledge, and I began to understand a little better how databases work and which way to look when solving performance problems.

Android Developer @Competo LLC
Anatolii Kokuliuk
There are a lot of topics in the course, and you can get stuck in each one for a long time. Even though I'm not particularly interested in databases, I find the course 100% engaging. Building storage, a vocabulary, a pipeline, execution: these are super interesting topics that go far beyond databases.

Java Software Engineer @Intapp
Mykola Pikuza
The course material is deep, practical, and very well structured, but what really makes it special is the community and the instructor. Denys is a great instructor who explains complex database concepts simply and systematically. He is always attentive to detail and willing to go over difficult points as many times as needed.

Backend Engineer @Preply
Denys Ralko
What I liked most was the depth of the topics covered. I knew a lot of things as facts before, but I didn't understand why they worked the way they did. The course gives me the opportunity to understand the reasons and mechanics behind these decisions.

Senior Director of Engineering @Pindrop
Volodymyr Shulha
I joined the course because data volumes keep growing — and so does the variety of database types. Many projects already rely on several different databases for different tasks. I wanted to develop a deeper understanding of how different types of databases are structured and which use cases each one is best suited for. The course material is extensive and high quality: finding this much information on your own and structuring it into a coherent whole would have been a real challenge.
format that works
Constant feedback in Slack.
No superficial slides — just deep dives into real production challenges.
Certificates earned through real results: completed assignments, active discussions, measurable progress.
communication that drives you
Twice weekly on Zoom: Mondays and Wednesdays at 7:00 PM (UTC+2), 1.5 hours each. All lectures are recorded for later review. Taught in Ukrainian, with supplementary materials in English.
Slack is our hub for discussions, clever test cases, and top company referrals.
environment that energizes
We screen carefully — you'll learn among strong, motivated peers. Skip homework? You're out.
Your instructor is always available. They'll explain until it clicks — whether that's a third code review or staying late after lecture.
That's how we work: learn and grow stronger together.