This week we took a fairly deep dive into indexing, the strategies behind it, and the advantages and disadvantages of each. This is by far the densest topic that we’ve covered so far. Again, I have to refer back to my personal experience dealing with databases. I’ve often been called out by DBA for either adding too many, not enough or simply for querying indexes in a not proper way. It certainly makes sense now why indexes are effective, but also why keeping them can be expensive. You index in excess and you’re essentially killing your disk space with data you simply cannot use. Not enough, and you will be hitting the database for a scan of data you need. But then, while reading becomes trivial with indexes, writing can be expensive, having to balance b-trees, or update sparse tables.
So what is the right solution? The reality is that you still need to solve the main problem of locating data fast in the database. So the trade-off of each approach mainly depend on the type of data, and the quantity to write. Sparse index over large amounts of data that can be spread over several sectors is a great choice, while for systems with little writes just having pointers on an ordered linked list can be an easy solution that is very effective.
Of all topics so far, this has been one that took me a while to click. The video was perhaps the best addition. Having a clear explanation of the context of b-trees and b+-trees was essential for success, and while the reading was good to get an idea of what is needed, ultimately having a voice tell you how it all works, was the best approach for me.