• Level Up Coding
  • Posts
  • LUC #59: Optimize Your Searches with Effective Database Indexing

LUC #59: Optimize Your Searches with Effective Database Indexing

Plus, how chaos engineering works, blue-green deployment explained, and what is Terraform and how does it work

This week’s issue brings you:

READ TIME: 4 MINUTES

Thank you to our partner Kickresume who keeps this newsletter free to the reader.

Understanding the Various Approaches to Effective Database Indexing

Most databases require some form of indexing to keep up with performance benchmarks. Searching through a database is much simpler when the data is correctly indexed, which improves the system's overall performance.

A database index is a lot like the index on the back of a book.

It saves you time and energy by allowing you to easily find what you're looking for without having to flick through every page.

Database indexes work the same way.

An index is a key-value pair where the key is used to search for data instead of the corresponding indexed column(s), and the value is a pointer to the relevant row(s) in the table.

To get the most out of your database, you should use the right index type for the job.

One of the most commonly used indexing structures is the B-tree, where keys are sorted and organized in a hierarchical tree structure.

When searching data, the tree is traversed down to the leaf node that contains the appropriate key and pointer to the relevant rows in the table.

B-tree is most commonly used because of its efficiency in storing and searching through ordered data. Their balanced structure means that all keys can be accessed in the same number of steps, making performance consistent.

Hash indexes are best used when you are searching for an exact value match.

The key component of a hash index is the hash function.

When searching for a specific value, the search value is passed through a hash function which returns a hash value. That hash value tells the database where the key and pointers are located in the hash table.

For indexing columns with a low set of unique values, bitmap indexing can be used.

With bitmap indexing, each bitmap represents a unique value. A bitmap indicates the presence or absence of a value in a dataset, using 1’s and 0’s. For existing values, the position of the 1 in the bitmap shows the location of the row in the table.

Bitmap indexes are very effective in handling complex queries where multiple columns are used.

When you are indexing a table, make sure to carefully select the columns to be indexed based on the most frequently used columns in WHERE clauses.

composite index may be used when multiple columns are often used in a WHERE clause together.

With a composite index, a combination of two or more columns are used to create a concatenated key. The keys are then stored based on the index strategy, such as the options mentioned above.

Indexing can be a double-edged sword. It significantly speeds up queries, but it also takes up storage space and adds overhead to operations. Balancing performance and optimal storage is crucial to get the most out of your database without introducing inefficiencies.

What is Terraform, and How Does it Work (Recap)

Terraform, an open-source innovation from HashiCorp, transforms infrastructure management through a high-level configuration language, enabling quick, automated deployments.

Terraform ensures secure, predictable infrastructure updates with features like:

Infrastructure as code — shifts management to code for efficiency and version control.

Execution plans — offers previews before changes, adding predictability.

Resource graph — visualizes dependencies for coherent changes.

Change automation — reduces errors by automating deployments.

Terraform is more flexible than other IaC tools like AWS CloudFormation and Ansible, supporting multiple cloud services and facilitating seamless updates between development and operations.

It enhances efficiency, predictability, and scalability, bridging the gap between development and operations.

Blue-Green Deployment Explained (Recap)

Blue-green deployment is renowned for "zero downtime" and involves maintaining two identical production environments, called "blue" and "green."

At any given time, only one environment is live, (let's say blue),serving all production traffic. When a new version of the software is ready, it's deployed to the inactive environment (green).

After comprehensive testing, users switch to the green environment, making it live while the blue becomes idle for the next update, minimizing downtime and deployment failures.

A high-level overview of how blue-ereen deployment is implemented:

1) Set up two identical environments.
2) Deploy the new version to the inactive environment.
3) Test extensively.
4) Switch traffic.
5) Monitor and roll back if necessary.

What is Chaos Engineering, and How Does it Work (Recap)

Chaos engineering improves system resilience by deliberately injecting faults into systems in a controlled manner, allowing engineering teams to proactively detect and address potential failures before they occur.

Grounded in five fundamental principles:

Hypothesis-driven approach —Define normal system behavior and establish metrics reflecting the system's steady state. This sets the stage for understanding the impact of introduced variables.

Real-world event simulation — Introduce variables simulating real-world disruptions like network outages or traffic spikes, mimicking likely disruptions to provide valuable insights into system coping mechanisms.

Experiments in production — Testing in controlled settings is valuable, but real-world conditions reveal unforeseen vulnerabilities. Production tests should be carefully monitored to ensure accuracy without significant user impact. If experiments in production make sense, proceed prudently, mitigating blast radius and user impact.

Automation at scale —As systems scale, manually conducting experiments becomes impractical. Automated tools and scripts allow systematic, efficient, and comprehensive testing across various parts of the system.

Minimizing impact —Start with smaller experiments to limit the 'blast radius' and gradually scale up. The goal is to learn and improve without compromising system stability, balancing resilience testing with operational integrity.

That wraps up this week’s issue of Level Up Coding’s newsletter!

Join us again next week where we’ll explore and visually distill more important engineering concepts.