1. Overview#
When Google initially wrote Golang, it was to address the high concurrency needs of its internal business, and one of Golang's major features is high concurrency. This article introduces the principles, concepts, and technical points related to Golang's high concurrency.
I will first introduce some concepts, such as: parallelism and concurrency, processes, threads, and goroutines, along with their differences. Then I will introduce goroutines and channels in Golang, which are key to achieving high concurrency in Golang. After that, I will discuss select, timers, runtime, and synchronization locks, and finally introduce the concurrency advantages of Go, its concurrency model, and Go's scheduler.
2. Parallelism and Concurrency#
If you have studied operating systems, you should be familiar with parallelism and concurrency.
Parallelism:
At the same time, multiple instructions are executed simultaneously on multiple processors.
Concurrency:
At the same time, only one instruction can be executed, but multiple process instructions are executed in a rapidly rotating manner (different rotation algorithms may apply based on different situations).
Differences between parallelism and concurrency:
- Parallelism exists in multi-processor systems, while concurrency can exist in both single-processor and multi-processor systems.
- Parallelism requires the program to execute multiple operations simultaneously, while concurrency only requires the program to pretend to execute multiple operations simultaneously (executing one operation per time slice and then rotating through multiple operations).
3. Processes, Threads, Goroutines#
Process:
A program execution environment that contains computer instructions, user data, and system data, as well as other types of resources it is allowed to acquire.
Thread:
A smaller and lighter entity compared to a process. A thread is created by a process and contains its own control flow and stack. The difference between processes and threads is that a process is an executing binary file, while a thread is a subset of a process.
Goroutine:
A goroutine is the smallest unit of concurrent execution in a Go program. Unlike Unix processes, goroutines are not autonomous entities. The main advantage of goroutines is that they are very lightweight, allowing thousands to run without issues. Goroutines are lighter than threads and require a process environment to exist. When creating a goroutine, a process is needed, and this process must have at least one thread. Goroutines are a type of user-space lightweight thread, and their scheduling is entirely controlled by the user. Switching between goroutines only requires saving the context of the task, with no kernel overhead. The thread stack space is typically 2M, while the minimum stack space for a goroutine is 2K.
4. Goroutine#
Having introduced the concept of goroutines (hereafter referred to as goroutines), let's look at the actual syntax for goroutines.
In Go, you can start a new goroutine by using the go
keyword followed by the function name or a complete anonymous function definition. When a function is called with the go
keyword, it returns immediately, and the function runs in the background as a goroutine, while the rest of the program continues to execute.
Creating a goroutine
package main
import (
"fmt"
"time"
)
func main() {
go function()
go func() {
for i := 10; i < 20; i++ {
fmt.Print(i, " ")
}
}()
time.Sleep(1 * time.Second)
}
func function() {
for i := 0; i < 10; i++ {
fmt.Print(i)
}
fmt.Println()
}
You may notice that the output above is not fixed (the main function may finish early). We can use the sync package to solve this problem.
package main
import (
"flag"
"fmt"
"sync"
)
func main() {
n := flag.Int("n", 20, "Number of goroutines")
flag.Parse()
count := *n
fmt.Printf("Going to create %d goroutines.\n", count)
var waitGroup sync.WaitGroup // Define a variable of type sync.WaitGroup
fmt.Printf("%#v\n", waitGroup)
for i := 0; i < count; i++ { // Use a for loop to create the required number of goroutines
waitGroup.Add(1) // Each call increases the counter in the sync.WaitGroup variable to prevent any race conditions
go func(x int) {
defer waitGroup.Done() // Decrease the sync.WaitGroup variable by one
fmt.Printf("%d ", x)
}(i)
}
fmt.Printf("%#v\n", waitGroup)
waitGroup.Wait() // The sync.Wait call will block until the counter in the sync.WaitGroup variable is 0, ensuring all goroutines complete execution
fmt.Println("\nExiting...")
}
5. Channel#
Channels are a communication mechanism in Go that allows data transmission between goroutines.
Some explicit rules:
- Each channel only allows the exchange of data of a specified type, which is the element type of the channel.
- To ensure the channel operates normally, there must be a method to receive data from the channel.
You can declare a channel using the chan
keyword, and you can close the channel using the close()
function.
When using a channel as a function parameter, you can specify it as a one-way channel.
5.1 Writing to a Channel#
package main
import (
"fmt"
"time"
)
func main() {
c := make(chan int)
go writeToChannel(c, 10)
time.Sleep(1 * time.Second)
}
func writeToChannel(c chan int, x int) {
fmt.Println(x)
c <- x
close(c)
fmt.Println(x)
}
5.2 Receiving Data from a Channel#
package main
import (
"fmt"
"time"
)
func main() {
c := make(chan int)
go writeToChannel(c, 10)
time.Sleep(1 * time.Second)
fmt.Println("Read:", <-c)
time.Sleep(1 * time.Second)
_, ok := <-c
if ok {
fmt.Println("Channel is open!")
}else {
fmt.Println("Channel is closed!")
}
}
func writeToChannel(c chan int, x int) {
fmt.Println("l", x)
c <- x
close(c)
fmt.Println("2", x)
}
5.3 Passing a Channel as a Function Parameter#
package main
import (
"fmt"
//"time"
)
func main() {
c := make(chan bool, 1)
for i := 0; i < 10; i++ {
go Go(c, i)
}
<-c
}
func Go(c chan bool, index int) {
sum := 0
for i := 0; i < 1000000; i++ {
sum += i
}
fmt.Println(sum)
c <- true
}
6. Select#
The select statement in Go looks like a switch statement for channels. In fact, select allows a goroutine to wait for multiple communication operations, so the main benefit of using select is that it can handle multiple channels and perform non-blocking operations.
Note: The biggest issue with using channels and select is deadlock. To resolve deadlock issues, synchronization locks will be introduced later.
package main
import(
"fmt"
"math/rand"
"os"
"strconv"
"time"
)
func main() {
rand.Seed(time.Now().Unix())
createNumber := make(chan int)
end := make(chan bool)
if len(os.Args) != 2 {
fmt.Println("Please give me an integer!")
return
}
n, _ := strconv.Atoi(os.Args[1])
fmt.Printf("Going to create %d random numbers.\n", n)
go gen(0, 2*n, createNumber, end)
for i := 0; i < n; i++ {
fmt.Printf("%d ", <-createNumber)
}
time.Sleep(5 * time.Second) // Give enough time for the time.After() function in gen() to return and activate the select branch
fmt.Println("Exiting...")
end <- true // Activate the case->end branch in the select statement inside gen() to terminate the program and execute related code
}
func gen(min, max int, createNumber chan int, end chan bool) {
for {
select {
case createNumber <- rand.Intn(max-min) + min:
case <- end:
close(end)
return
case <- time.After(4 * time.Second): // The time.After function returns after a specified time, thus unlocking the select statement when other channels are blocked
fmt.Println("\ntime.After()!") // This case can be treated as a default branch
}
}
}
Note: The select statement does not require a default branch.
The select statement does not evaluate in order, as all channels are checked simultaneously.
If no channels in the select statement are ready, the select statement will block until a channel is ready, at which point the Go runtime will make a random selection among those ready channels, ensuring fairness.
The greatest advantage of select is that it can connect, orchestrate, and manage multiple channels.
When channels connect goroutines, select connects those channels that connect the goroutines.
7. Timers#
The select statement also uses timers, so what is a timer?
A timer is a mechanism that executes a task at a future point in time by setting it.
There are two types of timers:
- One-time delay mode
- Interval mode that executes repeatedly after a certain period
Timers in Go are quite comprehensive, and all APIs are in the time package.
7.1 Delay Mode#
There are two types of delayed execution: time.After and time.Sleep.
7.1.1 time.After#
package main
import (
"fmt"
"time"
)
func main() {
fmt.Println("1")
timeAfterTrigger := time.After(1 * time.Second)
<-timeAfterTrigger
fmt.Println("2")
}
The time package provides several pre-defined constants of type int.
const (
Nanosecond Duration = 1
Microsecond = 1000 * Nanosecond
Millisecond = 1000 * Microsecond
Second = 1000 * Millisecond
Minute = 60 * Second
Hour = 60 * Minute
)
7.1.2 time.Sleep#
package main
import (
"fmt"
"time"
)
func main() {
fmt.Println("1")
time.Sleep(1 * time.Second)
fmt.Println("2")
}
The difference between the two is that time.Sleep blocks the current goroutine, while time.After is implemented based on channels and can be passed between different goroutines.
7.2 Interval Mode#
Interval mode can be divided into two types: one that ends after executing N times, and another that executes continuously without stopping.
7.2.1 time.NewTicker#
package main
import (
"fmt"
"time"
)
func main() {
fmt.Println("1")
count := 0
timeTicker := time.NewTicker(1 * time.Second)
for {
<-timeTicker.C
fmt.Println("Output 2 every second")
count++
if count >= 5 {
timeTicker.Stop()
}
}
}
7.2.2 time.Tick#
package main
import (
"fmt"
"time"
)
func main() {
t := time.Tick(1 * time.Second)
for {
<-t
fmt.Println("Output once every second")
}
}
7.3 Controlling Timers#
Timers provide Stop and Reset methods.
- The Stop method stops the timer.
- The Reset method changes the timer's interval.
7.3.1 time.Stop#
package main
import (
"fmt"
"time"
)
func main() {
timer := time.NewTimer(time.Second * 6)
go func() {
<-timer.C
fmt.Println("Time's up")
}()
timer.Stop()
}
7.3.2 time.Reset#
package main
import (
"fmt"
"time"
)
func main() {
fmt.Println("1")
count := 0
timeTicker := time.NewTicker(1 * time.Second)
for {
<-timeTicker.C
fmt.Println("2")
count++
if count >= 3 {
timeTicker.Reset(2 * time.Second)
}
}
}
8. Runtime#
The runtime is the infrastructure needed for Go language execution, such as controlling goroutine functionality, debugging, and support for pprof, trace, and race detection, memory allocation, system operations, and CPU-related operations encapsulation (signal handling, system calls, register operations, atomic operations, etc.), as well as the implementation of built-in types like map, channel, string, and reflection.
Unlike the runtime in Java and Python, which is a virtual machine, Go's runtime is compiled together with user code into a single executable file.
Development history of runtime:
9. Synchronization Locks#
The biggest issue raised in the previous sections regarding channels and select is deadlock. This section introduces the solution to deadlock issues—synchronization locks.
There are two ways to implement synchronization locks in Go: atomic locks and mutex locks.
9.1 Atomic Locks#
You can send messages to all goroutines using a certain signal.
package main
import (
"fmt"
"sync"
"sync/atomic"
"time"
)
var (
shotdown int64 // This flag notifies multiple goroutines of the status
wg sync.WaitGroup
)
func main() {
wg.Add(2)
go doWork("A")
go doWork("B")
time.Sleep(1 * time.Second)
atomic.StoreInt64(&shotdown, 1) // Modify
wg.Wait()
}
func doWork(s string) {
defer wg.Done()
for {
fmt.Printf("Doing homework %s\n", s)
time.Sleep(2 * time.Second)
if atomic.LoadInt64(&shotdown) == 1 { // Read
fmt.Printf("Shutting down homework %s\n", s)
break
}
}
}
9.2 Mutex Locks#
Using a mutex, you can encapsulate a critical section to allow only a single goroutine to execute.
package main
import (
"fmt"
"runtime"
"sync"
)
var (
counter int
wg sync.WaitGroup
mutex sync.Mutex // Define the critical section
)
func main() {
wg.Add(2)
go incCount(1)
go incCount(2)
wg.Wait()
fmt.Printf("Final Counter: %d\n", counter)
}
func incCount(i int) {
defer wg.Done()
for count := 0; count < 2; count++ {
mutex.Lock()
{
value := counter
runtime.Gosched()
value++
counter = value
}
mutex.Unlock()
}
}
10. Advantages of Go Concurrency#
Go language's built-in higher-level API for concurrent programming is based on the CSP (communicating sequential processes) model. This means that explicit locks can be avoided, as Go uses channels to send and receive data safely for synchronization, greatly simplifying the writing of concurrent programs.
In general, a typical desktop computer running a dozen or twenty threads can become overloaded, but the same machine can easily handle hundreds, thousands, or even tens of thousands of goroutines competing for resources.
11. Go Concurrency Model#
Go implements two forms of concurrency:
- Multi-threaded shared memory (communicating through shared memory)
- CSP (communicating sequential processes) concurrency model (sharing memory through communication)
Do not communicate by sharing memory; instead, share memory by communicating.
Threads in Java, C++, and Python communicate through shared memory.
Go's CSP concurrency model is implemented through goroutines and channels.
Example of using goroutines and channels together:
package main
import (
"fmt"
)
// Write Data
func writeData(intChan chan int) {
for i := 1; i <= 50; i++ {
// Put data
intChan<- i //
fmt.Println("writeData ", i)
}
close(intChan) // Close
}
// Read Data
func readData(intChan chan int, exitChan chan bool) {
for {
v, ok := <-intChan
if !ok {
break
}
fmt.Printf("readData read data=%v\n", v)
}
// After reading data, the task is complete
exitChan<- true
close(exitChan)
}
func main() {
// Create two channels
intChan := make(chan int, 10)
exitChan := make(chan bool, 1)
go writeData(intChan)
go readData(intChan, exitChan)
for {
_, ok := <-exitChan
if !ok {
break
}
}
}
12 Go Scheduler#
The Go scheduler uses three structures:
G:
G represents a goroutine. Each goroutine corresponds to a G structure, which stores the goroutine's running stack, state, and task function, and can be reused.
M:
M represents a kernel thread, which is the actual resource that performs the computation. After binding to a valid P, it enters the scheduling loop; the scheduling loop mechanism roughly retrieves from the Global queue, the Local queue of P, and the wait queue.
P:
P represents a logical processor, indicating the context of scheduling. It can be thought of as a local scheduler that allows Go code to run on a separate thread. This is key to mapping Go from an N:1 scheduler to an M scheduler.
For G, P is equivalent to a CPU core; G can only be scheduled when bound to P.
For M, P provides the relevant execution environment (Context), such as memory allocation state (mcache), task queue (G), etc.
The number of P determines the maximum number of G that can run in parallel in the system (provided that the number of physical CPU cores >= number of P).
The number of P is determined by the user-set GoMAXPROCS, but regardless of how large GoMAXPROCS is set, the maximum number of P is 256.
Using the classic whack-a-mole model to illustrate the relationship between the three:
The task of the mole is: there are several bricks on the construction site, and the mole uses a cart to transport the bricks to the fire source.
13. Conclusion#
This article introduced some knowledge related to Golang concurrency, starting from basic concepts such as parallelism, concurrency, processes, threads, and goroutines, to practical uses of Golang concurrency, including goroutines, channels, select, timers, and synchronization locks. It also briefly introduced runtime and concluded with Go's scheduler model.