Week 22 – Transactions and Locks

Weekly Learning

Interesting week going over transaction and locks. All very new material for me, and I’ve had to read and reread some of it to make sure I’m actually understanding it. The assignment itself wasn’t that hard, but there were both typos and inconsistencies that  make it difficult to get through it. Of all the weeks,  and all the assignments, this has been the most taxing one because of that.

Transactions and locks, and MVCC are subjects I had never studied, or worked with. I’m glad those are solved problems, and that the only part I need to worry about is when/how they should be used which is already hard on its own. They’re not particularly interesting, but I understand why they’re needed. The relevance of keeping data consistent across reads and write.

Understanding that was perhaps the best part of the assignment. Creating the transaction and rollback mechanisms was fun, despite the different setbacks with the unclear assignment. Still, I do think that as much as I’ve learned to like databases and try to understand how they work, DBA work is just not that interesting to me. Mostly because of this type of issues. As developer, I’ll need to learn why this things work the way they do, and how to leverage them on my code. But I would not want to sit down and solve for this sort of optimization or problem in a daily basis.

Still, I’m happy I’ve learned about it. I think transactions, locks, concurrency are very important subjects. Understanding the trade-offs and the performance implications that go along with it are essential for software development, specially in today’s world when the programs we write demand to be both fast, and accurate. ACID as a paradigm is becoming too important to dismiss or gloss over.

Posted in CST-363, CSUMB | Comments Off on Week 22 – Transactions and Locks

Week 21 – Indexes and B-Trees

Weekly Learning

This week we took a fairly deep dive into indexing, the strategies behind it, and the advantages and disadvantages of each. This is by far the densest topic that we’ve covered so far. Again, I have to refer back to my personal experience dealing with databases. I’ve often been called out by DBA for either adding too many, not enough or simply for querying indexes in a not proper way. It certainly makes sense now why indexes are effective, but also why keeping them can be expensive. You index in excess and you’re essentially killing your disk space with data you simply cannot use. Not enough, and you  will be hitting the database for a scan of data you need.  But then, while reading becomes trivial with indexes, writing can be expensive, having to balance b-trees, or update sparse tables.

So what is the right solution? The reality is that you still need to solve the main problem of locating data fast in the database. So the trade-off of each approach mainly depend on the type of data, and the quantity to write. Sparse index over large amounts of data that can be spread over several sectors is a great choice, while for systems with little writes just having pointers on an ordered linked list can be an easy solution that is very effective.

Of all topics so far, this has been one that took me a while to click. The video was perhaps the best addition. Having a clear explanation of the context of b-trees and b+-trees was essential for success, and while the reading was good to get an idea of what is needed, ultimately having a voice tell you how it all works, was the best approach for me.

Posted in CST-363, CSUMB | Comments Off on Week 21 – Indexes and B-Trees

Week 20 – Project 1 Part 2

Weekly Learning

This week we worked on the exam. One of the interesting things about the exam is that despite it being open book, it is still a challenge. I’ve been very careful to craft each query and double check against each data point to make sure I am getting the answer I believe it best fits the question, but some of the queries are very complex and make me second guess myself.

We have also been working on the OLAP part of the project. I have to say that I had never enjoyed database work until now. Before, most of my queries were just basic selects, and getting data was more of a chore. Having practice more both with the designed, and how data is composed, I have started to enjoy writing queries that extract data. Where JOINS were a total mystery for me before, they are now almost second nature. I have been doing a lot of playing around with JOINS, GROUP BY, subqueries of virtual tables, and other fun features of myself.

The OLAP assignment has been a great opportunity to also start shifting my thinking into what data should be collected and processed. I realized because this is all fake data, our questions are probably naive. However, I have a good opportunity to compare it against the questions that my coworkers ask, or the assumptions they make when they request new features. Looking at how I would organize the data, and trying to understand what of my data is valuable is a very important step.

I guess I can only compare it to how people say that half of being good at a job such as programing, is knowing how to google. In this case, I’m getting much better at knowing what questions need to be asked, and how my data can either answer it, or what would need to change to make sure it can answer those questions.

Overall, I can honestly say I came to the class expecting it to be one of the classes that you dread to take, and push through, to being one of the classes I’ve most enjoyed.

Posted in CST-363, CSUMB | Comments Off on Week 20 – Project 1 Part 2

Week 19 – Project 1 – Database Design

Weekly Learning

This has been a very demanding week. Project one has been a challenge with both the creation of the database, as well as creating an application to properly visualize the data. I think we started working on it a little later than we should and that meant we had to pull some longer hours getting everything ready. However, I am pretty satisfied with the result with how little time we actually had to accomplish it.

Victor and I decided to go with a game database. This is fairly unconventional since it does not necessarily have as much transactional data as other application would. But, I think it presented an interesting challenge for both of us. Game have a lot of very interesting Business Intelligence data working behind them, as players move through different levels, or value one item over another when making a purchase. Understanding what your users do is important for any business, and games are no different. This data is a bit more abstract than a reservation system would be, since it must be derived from activity rather than numbers, but it’s equally important to break down those behaviors and make sense of them so the operators of the game can make better decisions that will capitalize and monetize the users.

I also spent a bit of time doing research on OLAP and data cubes. I have worked in the past with BI. But this is the first time I have actually sat down to understand how this data is constructed, processed and broken down to be useful on the backend. In the past I have just been a consumer of the data, and that was good enough to get me through the day, but as I have been digging through the topic, I have found it to be very interesting. The analogy of the hot kitchen is incredibly apt with some of the requests I have gotten in the past. Building the datasets was an ugly process but generated reports that could be very useful to some of my customers.

Posted in CST-363, CSUMB | Comments Off on Week 19 – Project 1 – Database Design

Week 18 – Database Design

Weekly Learning

Database Design is a very interesting topic. My personal experience designing databases has been very limited professionally. Normally, I have gone by what the data looks like and designing around it. Comparing this to how the chapter describes database design, I think the overall outcomes have not been very far from what is expected, but data but this has been certainly based on good tribal knowledge and mentorship. Having read and worked on the cases for normalization, I think I have a much better understanding of why those designs worked. More importantly, I understand why normalization is so important, and why we often sacrifice it to make data more accessible.

We’re also starting to deep our toes into Data Science and how sql supports aggregation. Since I am very much interested in the monetization and targeting strategies, this is a topic that is very dear to me. I often have to look for ways to make sense of purchase trends and patterns, and look for ways to visualize that data so users can look into it. In the past, I’ve used code to aggregate and process the data, or used specialized tools like Druid. When data gets to my hands, it’s often already gone through the ETL process, so i’m very much looking forward to the next couple of weeks to better understand the whole process. But using GROUP BY and other aggregators, already helps me!

Posted in CST-363, CSUMB | Comments Off on Week 18 – Database Design