Metronome's Cacophony (6/14) - concurrency in Go - Sharing state

May 1, 2018 · 7 min read · from-java-to-go go concurrency ·

Asynchronous approach (ctd)

Our Goroutine-based solutions so far purposefully didn't share much state between Goroutines. I didn't want to pile it on you straight from the start. The one small exception was the WaitGroup solution, which let us synchronise the completion of all Goroutines with program termination, but that was easy because we used the proper tool for the job.

Based on the previous post's conclusion, we now want to share the volume value between a Goroutine and the main timing loop.

Hmmm... sharing can be fun. However, in concurrent system design, sharing access to state is challenging, error-prone and often inefficient. I am talking about sharing access to mutable state in particular.

Mutable state is state that changes during its lifecycle. Take, for example, a list of all people who are currently alive. If you try to do some analysis on it, it may be stale by the time you reach a conclusion. Trying to sort it by account balance is nearly impossible, as not only are keys added and removed, but the values change all the time too. Although, assuming descending order, the keys of the first entries could be treated as nearly immutable. It is an extreme example, but I hope it shows how hard it is to work with data like that.

Immutability is the opposite of mutability. Immutable state does not change after creation. Think of the year the UK joined the EU (for the first time). No matter how many hands it goes through, this fact stays the same. We can even extend it to a whole chronicle of the 20th century (assuming the regime puts facts before its agenda). You should be able to pass this chronicle dataset around without much worry about it mutating. You can have multiple readers progressing through it simultaneously, at whatever pace, and each would look at exactly the same data. You can slice it by whatever aspect or sort it, and it will render a predictable result.
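To make the "multiple readers" point concrete, here is a minimal sketch. The chronicle slice and the helper names (latestYear, readConcurrently) are my own illustrative assumptions, and the years are just sample data. The important part is that the slice is filled in once, before any Goroutine starts, and nobody writes to it afterwards.

```go
package main

import (
	"fmt"
	"sync"
)

// chronicle is treated as immutable: it is populated once, before any
// Goroutine starts, and never written to afterwards.
// The years are illustrative sample data only.
var chronicle = []int{1914, 1918, 1939, 1945, 1969, 1973, 1989}

// latestYear scans the whole chronicle and returns the largest year.
func latestYear(years []int) int {
	latest := years[0]
	for _, y := range years {
		if y > latest {
			latest = y
		}
	}
	return latest
}

// readConcurrently lets `readers` Goroutines scan the chronicle at the same
// time and collects what each of them saw.
func readConcurrently(readers int) []int {
	results := make([]int, readers)
	var wg sync.WaitGroup
	for i := 0; i < readers; i++ {
		wg.Add(1)
		go func(slot int) {
			defer wg.Done()
			// Each Goroutine writes only to its own slot, so the results
			// slice is not contended either.
			results[slot] = latestYear(chronicle)
		}(i)
	}
	wg.Wait()
	return results
}

func main() {
	fmt.Println(readConcurrently(4)) // prints [1989 1989 1989 1989]
}
```

Every reader arrives at the same answer, at whatever pace it runs, because none of them can change what the others see.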

That's way more pleasant and safe to work with.

Have a look at this simple example:

 1package main
 2
 3import (
 4   "fmt"
 5   "sync"
 6)
 7
 8var (
 9   balance = 0
10   financialGroup sync.WaitGroup
11)
12
13func processTransaction(amount int) {
14   balance = balance + amount
15   financialGroup.Done()
16}
17
18func main() {
19
20   for i := 0; i < 10000; i++ {
21      financialGroup.Add(1)
22      go processTransaction(5)
23   }
24
25   financialGroup.Wait()
26   fmt.Printf("Final balance: %d", balance)
27}

We start with a balance of 0. Next, we use a Goroutine to credit 5, and we repeat that 10000 times. The resulting balance is surely 50000, right?

Final balance: 47055
Process finished with exit code 0

Whoops-a-daisy! Where's the money gone?

Note: you will most likely get a different result. If you see the correct amount, keep re-running the program. You may need to increase the number of iterations to give bad luck a chance.

Look at line 14, where we increase the balance. The issue we are observing is caused by performing a non-atomic operation (more about this later) on shared state. It may be counterintuitive if you are new to programming, but even though adding amount to balance sits on one line, it is not necessarily a single operation after compilation. Compilation translates the source code into low-level operations understood by the processor, and this one line will take a few of them to complete. The compiled sequence may be:

Move balance value to register
Perform sum in that register
Move sum back from register to balance

If you were to run the above example synchronously (just remove the go keyword in line 22), you would get the correct result every time, even though the increment in line 14 will most likely be compiled to exactly the same set of operations.
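As a rough illustration, the three steps listed above can be spelled out by hand in Go. This is only a sketch: the local variable named register stands in for a CPU register, and creditSequentially is a helper name I made up for this example. It is not what the compiler literally emits.

```go
package main

import "fmt"

var balance = 0

// processTransaction performs the credit as three explicit steps.
// The local variable stands in for a CPU register (an illustration only).
func processTransaction(amount int) {
	register := balance          // 1. move balance value to register
	register = register + amount // 2. perform sum in that register
	balance = register           // 3. move sum back from register to balance
}

// creditSequentially applies `times` credits one after another,
// without starting any Goroutines.
func creditSequentially(times, amount int) int {
	balance = 0
	for i := 0; i < times; i++ {
		processTransaction(amount)
	}
	return balance
}

func main() {
	fmt.Printf("Final balance: %d\n", creditSequentially(10000, 5)) // prints Final balance: 50000
}
```

Run one after another, the three steps never overlap, so splitting the operation does no harm.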

Note: if you want to experiment with compiling Go source code to assembly, have a look at Compiler Explorer.

What's the difference then?

Without running a Goroutine, the whole flow is sequential. Flow proceeds from line to line and you can easily predict where the control is going to go after any line.

When running a function in a Goroutine, the code inside the function's body gains some independence. It is as if you set it free to do its own thing. It does not have to wait for anything that happens in the main function or in any other Goroutine. Go will decide for you how many threads need to be created and will multiplex the Goroutines onto them. That should allow for sensible utilisation of system resources. The processor will then do a bit of one Goroutine and a bit of another until all of them complete, and on a multi-core machine several Goroutines can run truly in parallel. It all happens very fast (it takes about 5ms to execute the above code on my machine).
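If you are curious what the runtime is working with, the standard runtime package lets you peek. The sketch below (schedulerSnapshot is a name I invented for it) parks a number of Goroutines and then asks how many Goroutines exist and what GOMAXPROCS is set to. GOMAXPROCS caps how many threads may execute Go code at the same time, and passing 0 only reads the current value. The numbers depend on your machine, so I am not showing an expected output.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// schedulerSnapshot starts n Goroutines that block until released and reports
// how many Goroutines existed at that moment, together with GOMAXPROCS.
func schedulerSnapshot(n int) (goroutines, maxProcs int) {
	release := make(chan struct{})
	var started, finished sync.WaitGroup
	for i := 0; i < n; i++ {
		started.Add(1)
		finished.Add(1)
		go func() {
			defer finished.Done()
			started.Done() // signal that this Goroutine is running
			<-release      // park until the snapshot has been taken
		}()
	}
	started.Wait()
	goroutines = runtime.NumGoroutine()
	maxProcs = runtime.GOMAXPROCS(0) // 0 queries the value without changing it
	close(release)
	finished.Wait()
	return goroutines, maxProcs
}

func main() {
	g, p := schedulerSnapshot(1000)
	fmt.Printf("goroutines alive: %d, GOMAXPROCS: %d\n", g, p)
}
```

You should see far more Goroutines than GOMAXPROCS, which is the multiplexing at work.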

The above final balance discrepancy is the result of multiple Goroutines operating on the same area of memory without any synchronisation between them. For example:

| Thread A | `balance` variable | Thread B |
|---|---|---|
| thread picks up goroutine | 0 | |
| move `balance` (0) to register (0) | 0 | thread picks up goroutine |
| add 5 to register (5) | 0 | move `balance` (0) to register (0) |
| move register to `balance` | 5 | add 5 to register (5) |
| | 5 | move register to `balance` |

and we ended up with a balance of 5 after applying two credits of 5 units each.
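The interleaving from the table above can be replayed deterministically in a single Goroutine. In this sketch, registerA and registerB are plain local variables standing in for the registers of the two threads, and replayLostUpdate is just an illustrative name. Nothing here is concurrent; the statements are simply written in the unlucky order.

```go
package main

import "fmt"

// replayLostUpdate executes the steps of two credits of 5 in the order shown
// in the table, using two local variables in place of per-thread registers.
func replayLostUpdate() int {
	balance := 0
	var registerA, registerB int

	registerA = balance       // A: move balance (0) to register
	registerA = registerA + 5 // A: add 5 to register (5)
	registerB = balance       // B: move balance (0) to register, A has not written back yet
	balance = registerA       // A: move register to balance, balance is now 5
	registerB = registerB + 5 // B: add 5 to register (5)
	balance = registerB       // B: move register to balance, overwriting A's 5 with another 5

	return balance
}

func main() {
	fmt.Println(replayLostUpdate()) // prints 5
}
```

Thread B read the balance before Thread A wrote its result back, so one of the two credits is lost.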

Note: This is a symptom of another phenomenon, called a race condition. It occurs when the program's behaviour is compromised by the order in which its tasks execute. Go's toolchain can detect some of these problems for you: build or run the program with the -race flag (for example go run -race) and the race detector will report the data races it observes while the program runs. Also, if you run the code that supports the metronome's engine examples for this series, you will see a warning about a race condition. That is because some of the demos purposefully cause a race condition to illustrate the issue.

Java developers: coming from a Java (or C) background, you may wonder what the equivalent of the volatile keyword is in Go. There isn't one. You have to use Go's concurrency primitives in order to safely share memory between threads. Even better, use channels to shift the design paradigm, and you will not have to share memory. More about this later.
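As a small preview of one such primitive, here is a sketch that uses the standard sync/atomic package for the balance example. It is not a drop-in replacement for volatile, and it is only one of several options. The function name creditAtomically and the switch to int64 are my own choices for this illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// creditAtomically applies `times` credits of `amount`, each from its own
// Goroutine, using an atomic add instead of a plain read-modify-write.
func creditAtomically(times int, amount int64) int64 {
	var balance int64
	var wg sync.WaitGroup
	for i := 0; i < times; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// The load, add and store happen as one indivisible operation.
			atomic.AddInt64(&balance, amount)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&balance)
}

func main() {
	fmt.Printf("Final balance: %d\n", creditAtomically(10000, 5)) // prints Final balance: 50000
}
```

Because no other Goroutine can slip in between the read and the write, no credit gets lost.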

Conclusion

Sharing state in a concurrent model is boggy ground. Here's why you have to be alert to it:

sharing state calls for extra design decisions, and it requires extra code and caution to synchronise access to the state and use it correctly,
badly synchronised state will result in performance issues, inconsistencies, leaking resources and systems being down, but you won't necessarily know when, under what circumstances and how the symptoms will manifest themselves, or what the severity may be,
it can make your application expose the most sensitive details (e.g. private messages, financial records or confidential details of a criminal investigation), which, as you can imagine, can have a devastating effect on people and companies,
trying to replicate failures resulting from incorrectly synchronised state (for testing purposes) is sometimes not feasible: a high number of factors can spoil the test conditions and potentially prevent the error from happening when you want to diagnose it and verify that the fix worked,
it often depends on the developer's experience and due diligence to care about state sharing, as functional requirements wouldn't normally mention anything about it (it is, quite rightly, considered to be an implementation detail or an architectural decision at best),
even if you properly synchronise state access, the performance of the application may drop, as synchronisation means your Goroutines now have to wait for the right time to do their job; we say the contention between threads is increasing,
synchronising will reduce the readability and maintainability of your code, as there is more code to write and to read.
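To give a feel for the extra code and the waiting mentioned in the list above, here is a minimal sketch of the balance example guarded by a sync.Mutex. The account type and the creditWithMutex helper are hypothetical names made up for this illustration, and the series covers these tools properly in later posts.

```go
package main

import (
	"fmt"
	"sync"
)

// account is a hypothetical type that keeps the balance together with the
// mutex that guards it.
type account struct {
	mu      sync.Mutex
	balance int
}

// credit adds amount to the balance. Only one Goroutine at a time can hold
// the lock; all the others wait here, and that waiting is the contention.
func (a *account) credit(amount int) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.balance += amount
}

// current returns the balance, taking the same lock as the writers do.
func (a *account) current() int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.balance
}

// creditWithMutex applies `times` credits of `amount`, each from its own Goroutine.
func creditWithMutex(times, amount int) int {
	var acc account
	var wg sync.WaitGroup
	for i := 0; i < times; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			acc.credit(amount)
		}()
	}
	wg.Wait()
	return acc.current()
}

func main() {
	fmt.Printf("Final balance: %d\n", creditWithMutex(10000, 5)) // prints Final balance: 50000
}
```

The result is correct now, but compare the amount of code with the original listing, and note that every credit has to queue for the same lock.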

Real monster, feels very organic as well.

The best option would be to avoid sharing state, and that's what you can read in Go's source code:

Share memory by communicating; don't communicate by sharing memory.

-- Go's atomic package source

That's the goal I'd like to lead you towards in this series.
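Here is a first taste of what that could look like for the balance example. It is a sketch under my own assumptions, not the design the series ends up with: a single Goroutine owns the balance, and everyone else sends credits to it over a channel. The names creditViaChannel, credits and result are invented for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// creditViaChannel applies `times` credits of `amount`. The balance lives in
// exactly one Goroutine; the others communicate with it instead of touching it.
func creditViaChannel(times, amount int) int {
	credits := make(chan int)
	result := make(chan int)

	// The owner is the only Goroutine that ever reads or writes balance.
	go func() {
		balance := 0
		for c := range credits {
			balance += c
		}
		result <- balance // credits was closed, report the total
	}()

	var wg sync.WaitGroup
	for i := 0; i < times; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			credits <- amount // communicate instead of sharing memory
		}()
	}
	wg.Wait()
	close(credits) // no more senders, lets the owner finish its loop
	return <-result
}

func main() {
	fmt.Printf("Final balance: %d\n", creditViaChannel(10000, 5)) // prints Final balance: 50000
}
```

There is no shared variable left to protect, because the balance never leaves the Goroutine that owns it.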

However, for the purpose of learning about concurrency primitives we need to go through it, as sometimes you will have to communicate by sharing memory.

In the next sections I will show you how to use Go's tools to share state with a chance of success.
