
Some database theory

For reasons at work I am currently looking into how storing and retrieving data from databases works. I was introduced to an acronym that I hadn't come across before: ACID. It stands for Atomicity, Consistency, Isolation, and Durability.

  • Atomicity - requires each transaction to be "all or nothing". A transaction may consist of several writes to the database, but if any of them fails or is interrupted, the entire transaction must be undone so that the database is left in a valid state.
  • Consistency - requires that every transaction brings the database from one valid state to another. No transaction may corrupt the database.
  • Isolation - ensures that executing transactions concurrently results in the same system state as if they had been executed sequentially. Read that sentence again, it is a tricky one: the database must be able to run transactions concurrently in such a way that the result looks as if they had been run one after another.
  • Durability - ensures that a transaction, once committed, remains committed even if the power is lost, something crashes, or some other error occurs.
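
To make the "all or nothing" part concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the CHECK constraint and the make_db, balances and transfer helpers are made up for illustration. The transfer credits the receiver first, so when the debit fails the credit has to be rolled back as well.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def balances(conn):
    """Return the current balances as a dict, ordered by account name."""
    rows = conn.execute("SELECT name, balance FROM accounts ORDER BY name")
    return dict(rows.fetchall())


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction (two writes)."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception escapes from it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This write violates the CHECK constraint when src is too poor.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
    except sqlite3.IntegrityError:
        return False  # the credit to dst above has been rolled back too
    return True


conn = make_db()
ok = transfer(conn, "alice", "bob", 500)  # alice only has 100
print(ok, balances(conn))  # False {'alice': 100, 'bob': 50}
```

The failed transfer leaves both balances untouched, even though the first write had already been executed when the second one failed.
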
One technique for providing the A and the D (atomicity and durability) is write-ahead logging (WAL). With this technique, all modifications to the database are written to a log before they are applied. The log usually contains both redo and undo information. When recovering from a power loss, the log can be compared with the actual contents of the database to determine where in the execution the power was lost. That information is then used to decide whether to roll back or finish any interrupted transactions, or to leave the database in the state it was in when the power was lost.
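
The idea can be sketched with a toy key-value store. This is only an illustration under simplifying assumptions, not how any real database engine does it: TinyStore and recover are made-up names, a Python list stands in for the log file that a real system would flush to disk before touching the data, transactions are assumed not to overwrite each other's uncommitted keys, and None is reserved to mean "key did not exist".

```python
class TinyStore:
    """Toy store: every write is logged before it is applied."""

    def __init__(self):
        self.log = []   # stands in for the durable log file
        self.data = {}  # stands in for the database contents

    def write(self, txid, key, value):
        # Log first, with both undo (old value) and redo (new value) info...
        self.log.append(("write", txid, key, self.data.get(key), value))
        # ...and only then apply the change to the data.
        self.data[key] = value

    def commit(self, txid):
        self.log.append(("commit", txid))


def recover(data, log):
    """Repair `data` after a crash, using nothing but the log."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Redo pass: replay every logged write in order, so that `data`
    # reaches the state it had at the moment of the crash.
    for rec in log:
        if rec[0] == "write":
            _, _, key, _old, new = rec
            data[key] = new

    # Undo pass: walk the log backwards and restore the old value for
    # every write that belongs to a transaction that never committed.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, key, old, _new = rec
            if old is None:
                data.pop(key, None)
            else:
                data[key] = old
    return data


store = TinyStore()
store.write("t1", "a", 1)
store.commit("t1")
store.write("t2", "a", 2)  # t2 is interrupted before it can commit
store.write("t2", "b", 5)

# Simulate a crash in which the data was lost but the log survived.
print(recover({}, store.log))  # {'a': 1}
```

The committed transaction t1 is redone and the interrupted transaction t2 is rolled back, which is the "if and how to roll back" decision described above.
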

That pretty much sums up what I have been learning about databases today. More reading on the topic tomorrow (and hopefully some implementation).
