Database
MySQL, SQL queries, modelling, and indexing.
Q. SQL vs NoSQL — when to use each? easy ›
SQL databases (MySQL, PostgreSQL) store data in structured tables with a fixed schema. They enforce relationships, support complex queries, and guarantee strong consistency through ACID transactions. Choose SQL when your data is relational and predictable — e.g., users, orders, payments.
NoSQL databases (MongoDB, DynamoDB, Redis) offer flexible schemas and are designed for horizontal scaling. They trade some consistency guarantees for speed and flexibility. Choose NoSQL when you need:
- Rapidly evolving schemas (e.g., early-stage prototypes)
- High write throughput at scale (e.g., event logging, IoT)
- Document-shaped data that doesn’t fit neatly into tables
In practice many production systems use both — SQL for transactional data and NoSQL for caching, search, or analytics.
Q. Explain the types of SQL JOINs. medium ›
- INNER JOIN — only rows matching in both tables.
- LEFT JOIN — all left rows + matching right (NULLs if none).
- RIGHT JOIN — all right rows + matching left.
- FULL OUTER JOIN — all rows from both (MySQL emulates with UNION).
SELECT u.name, o.total
FROM users u
LEFT JOIN orders o ON o.user_id = u.id; Q. What is an index? What's the trade-off? medium ›
An index is a data structure (usually a B-tree) that speeds up reads by avoiding full table scans — like a book’s index. Trade-off: it speeds up SELECT/WHERE/JOIN but slows down INSERT/UPDATE/DELETE and uses storage. Index columns you frequently filter or join on.
Q. What is database normalization? medium ›
Normalization is the process of organizing tables to reduce redundancy and prevent update anomalies.
- 1NF (First Normal Form) — every column holds atomic (indivisible) values; no repeating groups.
- 2NF — meets 1NF and has no partial dependency (every non-key column depends on the whole primary key, not just part of it).
- 3NF — meets 2NF and has no transitive dependency (non-key columns depend only on the primary key, not on other non-key columns).
Example of a 3NF violation:
| student_id | student_name | dept_id | dept_name |
|---|---|---|---|
| 1 | Alice | 10 | CS |
dept_name depends on dept_id, not on student_id. Fix by moving department info to its own table.
Denormalization is the deliberate reverse — adding redundancy (e.g., storing a computed total) to speed up reads at the cost of extra write complexity.
Q. What is SQL injection and how do you prevent it? hard ›
An attack where malicious input is concatenated into a query to alter its logic. Prevent it with parameterized queries / prepared statements (and ORMs that use them), plus input validation and least-privilege DB accounts. Never build SQL by string concatenation.
// safe — parameterized
db.query("SELECT * FROM users WHERE email = ?", [email]); Q. Primary key vs foreign key vs unique key. easy ›
- Primary Key — uniquely identifies every row in a table. Only one per table. Cannot be
NULL. - Foreign Key — a column that references the primary key of another table, enforcing referential integrity (you can’t insert an order for a non-existent user).
- Unique Key — enforces uniqueness on a column (like email), but a table can have multiple unique keys and they allow one NULL (behavior varies by RDBMS).
CREATE TABLE users (
id INT PRIMARY KEY, -- primary key
email VARCHAR(255) UNIQUE, -- unique key
dept INT,
FOREIGN KEY (dept) REFERENCES departments(id) -- foreign key
);
A primary key is implicitly unique, but a unique key is not automatically the primary key.
Q. What are ACID properties? medium ›
ACID is a set of guarantees that database transactions provide:
- Atomicity — a transaction is all-or-nothing. If any part fails, the entire transaction rolls back.
- Consistency — a transaction moves the database from one valid state to another, respecting all constraints and rules.
- Isolation — concurrent transactions do not interfere with each other. The result is the same as if they ran sequentially (exact behavior depends on the isolation level).
- Durability — once a transaction is committed, the data survives crashes, power loss, or restarts (typically via write-ahead logs).
BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 500 WHERE id = 1;
UPDATE accounts SET balance = balance + 500 WHERE id = 2;
COMMIT;
-- If either UPDATE fails, both are rolled back (Atomicity).
NoSQL databases often relax one or more ACID properties in favor of availability and partition tolerance (see the CAP theorem).
Q. WHERE vs HAVING; difference and order of execution. medium ›
- WHERE filters individual rows before grouping.
- HAVING filters groups after
GROUP BYhas been applied.
-- WHERE: filter rows before grouping
SELECT department, COUNT(*) AS cnt
FROM employees
WHERE status = 'active'
GROUP BY department
HAVING cnt > 5; -- HAVING: filter groups after grouping
Logical order of SQL execution:
FROM(andJOIN)WHEREGROUP BYHAVINGSELECTORDER BYLIMIT
You cannot use column aliases defined in
SELECTinside aWHEREclause (becauseSELECTruns later), but some databases like MySQL allow aliases inHAVING.
Q. Find the 2nd highest salary — write the query. hard ›
Approach 1 — LIMIT / OFFSET (MySQL, PostgreSQL):
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT 1 OFFSET 1;
Approach 2 — Subquery:
SELECT MAX(salary) AS second_highest
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
Approach 3 — DENSE_RANK window function (works for Nth highest):
SELECT salary
FROM (
SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
FROM employees
) ranked
WHERE rnk = 2;
Use
DENSE_RANKoverROW_NUMBERwhen duplicate salaries should share the same rank.DISTINCTin Approach 1 handles ties but OFFSET can still skip values unexpectedly if you’re not careful.