Explore the Go runtime engine

Author: Lu Liang  Date: 2018-11-17

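The Go runtime engine is one of the most interesting parts of the Go language. Below we use Go's hello_world program for a first analysis, exploring its startup sequence, the functions it calls, and the source files involved.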

hello_world source code.

This is the source code of hello_world. The only change from the usual program is a one-minute sleep before printing “hello, world”.

➜  more hello_world.go
package main

import (
	"fmt"
	"time"
)

func main() {
	time.Sleep(time.Duration(60) * time.Second)
	fmt.Printf("hello, world\n")
}

entry point of hello_world.

Since we compile this program on Linux, the hello_world binary is in ELF format. We can get the entry point with the readelf command. The following output shows that the entry point is “0x452590”.

➜  file ./hello_world
./hello_world: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped

➜ readelf -h ./hello_world
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x452590
Start of program headers: 64 (bytes into file)
Start of section headers: 456 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 7
Size of section headers: 64 (bytes)
Number of section headers: 23
Section header string table index: 3
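
The same header field can also be read from Go itself. The following is a minimal sketch using the standard debug/elf package; the helper name entryPoint and the default path ./hello_world are assumptions made for illustration, not part of the original walkthrough.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// entryPoint opens the ELF file at path and returns the entry point
// address stored in its header (the field readelf reports as the
// entry point address).
func entryPoint(path string) (uint64, error) {
	f, err := elf.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()
	return f.Entry, nil
}

func main() {
	// Assumption: the binary sits at ./hello_world unless a path is
	// given as the first command-line argument.
	path := "./hello_world"
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	entry, err := entryPoint(path)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("entry point: %#x\n", entry)
}
```

Run against the binary analyzed here, this should report the same address that readelf shows.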

disassemble the binary to find the sequence of execution.

At the entry point, we find that the first called function is ‘_rt0_amd64_linux’.

➜ readelf -s ./hello_world|grep 452590
2185: 0000000000452590 18 FUNC GLOBAL DEFAULT 1 _rt0_amd64_linux

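Let’s continue by disassembling the binary with objdump, starting from the entry point. _rt0_amd64_linux loads argc and argv into registers and then jumps to the symbol main at 0x4525b0. This main is a two-instruction stub, not the main function of our program: it simply jumps to runtime.rt0_go.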

➜  objdump -d --start-address=0x452590 ./hello_world |more

./hello_world: file format elf64-x86-64


Disassembly of section .text:

0000000000452590 <_rt0_amd64_linux>:
452590: 48 8d 74 24 08 lea 0x8(%rsp),%rsi
452595: 48 8b 3c 24 mov (%rsp),%rdi
452599: 48 8d 05 10 00 00 00 lea 0x10(%rip),%rax # 4525b0 <main>
4525a0: ff e0 jmpq *%rax
4525a2: cc int3
4525a3: cc int3
4525a4: cc int3
4525a5: cc int3
4525a6: cc int3
4525a7: cc int3
4525a8: cc int3
4525a9: cc int3
4525aa: cc int3
4525ab: cc int3
4525ac: cc int3
4525ad: cc int3
4525ae: cc int3
4525af: cc int3

00000000004525b0 <main>:
4525b0: 48 8d 05 c9 c7 ff ff lea -0x3837(%rip),%rax # 44ed80 <runtime.rt0_go>
4525b7: ff e0 jmpq *%rax
4525b9: cc int3
4525ba: cc int3
4525bb: cc int3
4525bc: cc int3
4525bd: cc int3
4525be: cc int3
4525bf: cc int3
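
The target addresses that objdump prints after the # sign can be checked by hand: a RIP-relative operand is the address of the next instruction plus a signed 32-bit displacement stored in little-endian order. The sketch below redoes that arithmetic for the two lea instructions in the listing; ripTarget is a helper name introduced here for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// ripTarget computes the address referenced by a RIP-relative operand.
// next is the address of the following instruction, and disp holds the
// four little-endian displacement bytes taken from the encoding.
func ripTarget(next uint64, disp []byte) uint64 {
	d := int32(binary.LittleEndian.Uint32(disp)) // reinterpret as signed
	return uint64(int64(next) + int64(d))
}

func main() {
	// lea 0x10(%rip),%rax at 0x452599 is 7 bytes long, so next is 0x4525a0.
	fmt.Printf("%#x\n", ripTarget(0x4525a0, []byte{0x10, 0x00, 0x00, 0x00})) // 0x4525b0 <main>

	// lea -0x3837(%rip),%rax at 0x4525b0 is 7 bytes long, so next is 0x4525b7.
	fmt.Printf("%#x\n", ripTarget(0x4525b7, []byte{0xc9, 0xc7, 0xff, 0xff})) // 0x44ed80 <runtime.rt0_go>
}
```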

Let’s go to 0x44ed80 to see what runtime.rt0_go does. The assembly shows that it performs some hardware checks (using CPUID) and then calls the following functions in order.

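  • runtime.args (runtime/runtime1.go) -> stores argc and argv in static global variables
  • runtime.osinit (runtime/os_linux.go) -> sets runtime.ncpu
  • runtime.schedinit (runtime/proc.go) -> initializes memory management, sets the number of procs to use according to GOMAXPROCS, and so on
  • runtime.newproc (runtime/proc.go) -> creates a new goroutine and puts runtime.main on the run queue
  • runtime.mstart (runtime/proc.go) -> calls the scheduling function schedule, which runs the main goroutine from the run queue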
➜  objdump -d --start-address=0x44ed80 ./hello_world |more

./hello_world: file format elf64-x86-64


Disassembly of section .text:

000000000044ed80 <runtime.rt0_go>:
44ed80: 48 89 f8 mov %rdi,%rax
44ed83: 48 89 f3 mov %rsi,%rbx
44ed86: 48 83 ec 27 sub $0x27,%rsp
44ed8a: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
44ed8e: 48 89 44 24 10 mov %rax,0x10(%rsp)
44ed93: 48 89 5c 24 18 mov %rbx,0x18(%rsp)
44ed98: 48 8d 3d c1 fc 0d 00 lea 0xdfcc1(%rip),%rdi # 52ea60 <runtime.g0>
44ed9f: 48 8d 9c 24 68 00 ff lea -0xff98(%rsp),%rbx
44eda6: ff
44eda7: 48 89 5f 10 mov %rbx,0x10(%rdi)
44edab: 48 89 5f 18 mov %rbx,0x18(%rdi)
44edaf: 48 89 1f mov %rbx,(%rdi)
44edb2: 48 89 67 08 mov %rsp,0x8(%rdi)
44edb6: 31 c0 xor %eax,%eax
44edb8: 0f a2 cpuid
44edba: 89 c6 mov %eax,%esi
44edbc: 83 f8 00 cmp $0x0,%eax
44edbf: 0f 84 01 01 00 00 je 44eec6 <runtime.rt0_go+0x146>
44edc5: 81 fb 47 65 6e 75 cmp $0x756e6547,%ebx
44edcb: 75 1e jne 44edeb <runtime.rt0_go+0x6b>
44edcd: 81 fa 69 6e 65 49 cmp $0x49656e69,%edx
44edd3: 75 16 jne 44edeb <runtime.rt0_go+0x6b>
44edd5: 81 f9 6e 74 65 6c cmp $0x6c65746e,%ecx
44eddb: 75 0e jne 44edeb <runtime.rt0_go+0x6b>
44eddd: c6 05 6e da 0f 00 01 movb $0x1,0xfda6e(%rip) # 54c852 <runtime.isIntel>
44ede4: c6 05 6b da 0f 00 01 movb $0x1,0xfda6b(%rip) # 54c856 <runtime.lfenceBeforeRdtsc>
44edeb: b8 01 00 00 00 mov $0x1,%eax
44edf0: 0f a2 cpuid
44edf2: 89 05 b4 da 0f 00 mov %eax,0xfdab4(%rip) # 54c8ac <runtime.processorVersionInfo>
44edf8: f7 c2 00 00 00 04 test $0x4000000,%edx
44edfe: 0f 95 05 5c da 0f 00 setne 0xfda5c(%rip) # 54c861 <runtime.support_sse2>
44ee05: f7 c1 00 02 00 00 test $0x200,%ecx
44ee0b: 0f 95 05 52 da 0f 00 setne 0xfda52(%rip) # 54c864 <runtime.support_ssse3>
44ee12: f7 c1 00 00 08 00 test $0x80000,%ecx
44ee18: 0f 95 05 43 da 0f 00 setne 0xfda43(%rip) # 54c862 <runtime.support_sse41>
44ee1f: f7 c1 00 00 10 00 test $0x100000,%ecx
44ee25: 0f 95 05 37 da 0f 00 setne 0xfda37(%rip) # 54c863 <runtime.support_sse42>
44ee2c: f7 c1 00 00 80 00 test $0x800000,%ecx
44ee32: 0f 95 05 27 da 0f 00 setne 0xfda27(%rip) # 54c860 <runtime.support_popcnt>
44ee39: f7 c1 00 00 00 02 test $0x2000000,%ecx
44ee3f: 0f 95 05 13 da 0f 00 setne 0xfda13(%rip) # 54c859 <runtime.support_aes>
44ee46: f7 c1 00 00 00 08 test $0x8000000,%ecx
44ee4c: 0f 95 05 0c da 0f 00 setne 0xfda0c(%rip) # 54c85f <runtime.support_osxsave>
44ee53: f7 c1 00 00 00 10 test $0x10000000,%ecx
44ee59: 0f 95 05 fa d9 0f 00 setne 0xfd9fa(%rip) # 54c85a <runtime.support_avx>
44ee60: 83 fe 07 cmp $0x7,%esi
44ee63: 7c 3d jl 44eea2 <runtime.rt0_go+0x122>
44ee65: b8 07 00 00 00 mov $0x7,%eax
44ee6a: 31 c9 xor %ecx,%ecx
44ee6c: 0f a2 cpuid
44ee6e: f7 c3 08 00 00 00 test $0x8,%ebx
44ee74: 0f 95 05 e1 d9 0f 00 setne 0xfd9e1(%rip) # 54c85c <runtime.support_bmi1>
44ee7b: f7 c3 20 00 00 00 test $0x20,%ebx
44ee81: 0f 95 05 d3 d9 0f 00 setne 0xfd9d3(%rip) # 54c85b <runtime.support_avx2>
44ee88: f7 c3 00 01 00 00 test $0x100,%ebx
44ee8e: 0f 95 05 c8 d9 0f 00 setne 0xfd9c8(%rip) # 54c85d <runtime.support_bmi2>
44ee95: f7 c3 00 02 00 00 test $0x200,%ebx
44ee9b: 0f 95 05 bc d9 0f 00 setne 0xfd9bc(%rip) # 54c85e <runtime.support_erms>
44eea2: 80 3d b6 d9 0f 00 01 cmpb $0x1,0xfd9b6(%rip) # 54c85f <runtime.support_osxsave>
44eea9: 75 0d jne 44eeb8 <runtime.rt0_go+0x138>
44eeab: 31 c9 xor %ecx,%ecx
44eead: 0f 01 d0 xgetbv
44eeb0: 83 e0 06 and $0x6,%eax
44eeb3: 83 f8 06 cmp $0x6,%eax
44eeb6: 74 0e je 44eec6 <runtime.rt0_go+0x146>
44eeb8: c6 05 9b d9 0f 00 00 movb $0x0,0xfd99b(%rip) # 54c85a <runtime.support_avx>
44eebf: c6 05 95 d9 0f 00 00 movb $0x0,0xfd995(%rip) # 54c85b <runtime.support_avx2>
44eec6: 48 8b 05 5b f3 0d 00 mov 0xdf35b(%rip),%rax # 52e228 <_cgo_init>
44eecd: 48 85 c0 test %rax,%rax
44eed0: 74 26 je 44eef8 <runtime.rt0_go+0x178>
44eed2: 48 89 f9 mov %rdi,%rcx
44eed5: 48 8d 35 c4 1b 00 00 lea 0x1bc4(%rip),%rsi # 450aa0 <setg_gcc>
44eedc: ff d0 callq *%rax
44eede: 48 8d 0d 7b fb 0d 00 lea 0xdfb7b(%rip),%rcx # 52ea60 <runtime.g0>
44eee5: 48 8b 01 mov (%rcx),%rax
44eee8: 48 05 70 03 00 00 add $0x370,%rax
44eeee: 48 89 41 10 mov %rax,0x10(%rcx)
44eef2: 48 89 41 18 mov %rax,0x18(%rcx)
44eef6: eb 2f jmp 44ef27 <runtime.rt0_go+0x1a7>
44eef8: 48 8d 3d e9 00 0e 00 lea 0xe00e9(%rip),%rdi # 52efe8 <runtime.m0+0x88>
44eeff: e8 8c 3d 00 00 callq 452c90 <runtime.settls>
44ef04: 64 48 c7 04 25 f8 ff movq $0x123,%fs:0xfffffffffffffff8
44ef0b: ff ff 23 01 00 00
44ef11: 48 8b 05 d0 00 0e 00 mov 0xe00d0(%rip),%rax # 52efe8 <runtime.m0+0x88>
44ef18: 48 3d 23 01 00 00 cmp $0x123,%rax
44ef1e: 74 07 je 44ef27 <runtime.rt0_go+0x1a7>
44ef20: 89 04 25 00 00 00 00 mov %eax,0x0
44ef27: 48 8d 0d 32 fb 0d 00 lea 0xdfb32(%rip),%rcx # 52ea60 <runtime.g0>
44ef2e: 64 48 89 0c 25 f8 ff mov %rcx,%fs:0xfffffffffffffff8
44ef35: ff ff
44ef37: 48 8d 05 22 00 0e 00 lea 0xe0022(%rip),%rax # 52ef60 <runtime.m0>
44ef3e: 48 89 08 mov %rcx,(%rax)
44ef41: 48 89 41 30 mov %rax,0x30(%rcx)
44ef45: fc cld
44ef46: e8 85 68 fe ff callq 4357d0 <runtime.check>
44ef4b: 8b 44 24 10 mov 0x10(%rsp),%eax
44ef4f: 89 04 24 mov %eax,(%rsp)
44ef52: 48 8b 44 24 18 mov 0x18(%rsp),%rax
44ef57: 48 89 44 24 08 mov %rax,0x8(%rsp)
44ef5c: e8 6f 62 fe ff callq 4351d0 <runtime.args>
44ef61: e8 aa 62 fd ff callq 425210 <runtime.osinit>
44ef66: e8 d5 ac fd ff callq 429c40 <runtime.schedinit>
44ef6b: 48 8d 05 b6 ae 07 00 lea 0x7aeb6(%rip),%rax # 4c9e28 <runtime.mainPC>
44ef72: 50 push %rax
44ef73: 6a 00 pushq $0x0
44ef75: e8 36 14 fe ff callq 4303b0 <runtime.newproc>
44ef7a: 58 pop %rax
44ef7b: 58 pop %rax
44ef7c: e8 7f cb fd ff callq 42bb00 <runtime.mstart>
44ef81: c7 04 25 f1 00 00 00 movl $0xf1,0xf1
44ef88: f1 00 00 00
44ef8c: c3 retq
44ef8d: cc int3
44ef8e: cc int3
44ef8f: cc int3

000000000044ef90 <runtime.asminit>:
44ef90: c3 retq
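
The three constants compared against %ebx, %edx, and %ecx after the first cpuid are the CPU vendor string, and the test masks that follow each select one feature bit. The sketch below decodes them with constants copied from the listing; vendorString and featureBit are helper names introduced here for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// vendorString rebuilds the 12-byte vendor ID from the values that
// CPUID leaf 0 returns in EBX, EDX, and ECX (in that order).
func vendorString(ebx, edx, ecx uint32) string {
	b := make([]byte, 12)
	binary.LittleEndian.PutUint32(b[0:], ebx)
	binary.LittleEndian.PutUint32(b[4:], edx)
	binary.LittleEndian.PutUint32(b[8:], ecx)
	return string(b)
}

// featureBit returns the index of the single bit set in a test mask.
func featureBit(mask uint32) int {
	return bits.TrailingZeros32(mask)
}

func main() {
	// Immediates from the three cmp instructions after the first cpuid.
	fmt.Println(vendorString(0x756e6547, 0x49656e69, 0x6c65746e)) // GenuineIntel

	// Masks from the test instructions after the second cpuid (leaf 1).
	fmt.Println("support_sse2 is EDX bit", featureBit(0x4000000)) // 26
	fmt.Println("support_avx is ECX bit", featureBit(0x10000000)) // 28
}
```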

We can also use dlv to debug the binary step by step. With dlv, we can verify the sequence of execution and identify the source files.

use dlv to debug ‘hello_world’ step by step.

So far, we have the following table for the startup of the Go runtime engine.

|----------------------------------------------------|
| Functions               | File                     |
|----------------------------------------------------|
| 1. _rt0_amd64_linux     | rt0_linux_amd64.s        |
|----------------------------------------------------|
| 2. runtime.rt0_go       | asm_amd64.s, stubs.go    |
|----------------------------------------------------|
| 3. runtime.args         | runtime1.go              |
|----------------------------------------------------|
| 4. runtime.osinit       | os_linux.go              |
|----------------------------------------------------|
| 5. runtime.schedinit    | proc.go, runtime2.go     |
|    runtime.newproc      |                          |
|    runtime.mstart       |                          |
|----------------------------------------------------|
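
The results of these startup steps are visible from ordinary Go code, because they all complete before main.main runs. The following sketch reads three of them through public APIs; the startupState type and the snapshot function are names invented for this example.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// startupState collects values the runtime has already prepared by the
// time main.main starts.
type startupState struct {
	Args       []string
	NumCPU     int
	GOMAXPROCS int
}

// snapshot reads the values through the public os and runtime APIs.
func snapshot() startupState {
	return startupState{
		Args:       os.Args,               // built from the argc/argv saved by runtime.args
		NumCPU:     runtime.NumCPU(),      // the CPU count recorded by runtime.osinit
		GOMAXPROCS: runtime.GOMAXPROCS(0), // the proc count set up by runtime.schedinit; 0 only queries
	}
}

func main() {
	s := snapshot()
	fmt.Println("args:      ", s.Args)
	fmt.Println("NumCPU:    ", s.NumCPU)
	fmt.Println("GOMAXPROCS:", s.GOMAXPROCS)
}
```

Running it with a different GOMAXPROCS environment value changes the last line but not NumCPU, which matches the split between osinit and schedinit described above.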

Thoughts.

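If we compare the runtime engines of Go and Spark, some interesting points emerge. One is a programming language and the other is a computing framework, yet both were proposed to address computing bottlenecks and concurrency problems.

Go is the thorough revolutionary: with goroutine support at the language level and the CSP concurrency model, it squeezes everything it can out of the CPU. In particular, the way the Go runtime monitors system calls and schedules goroutines efficiently is eye-opening and impressive. The language design is also bold, blending the essence of procedural, object-oriented, and functional programming.

Spark is a parallel computing framework and a very clever pragmatist: it uses Scala and the Actor concurrency model to solve big-data computing problems efficiently. It also takes full advantage of the JVM and the existing resources of the Java world. For runtime scheduling, it analyzes the Spark job and completes the job's tasks concurrently.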

main function in proc.go

main
-- getg
-- systemstack (asm_amd64.s)
-- newm (sysmon)
-- allocm
-- acquirep
-- wirep
-- mcommoninit
-- mpreinit (?)
-- malg (allocate new g, with a stack)
-- newm1
-- funcPC
-- msanwrite
-- asmcgocall
-- newosproc
-- lockOSThread
-- dolockOSThread
-- runtime_init
-- gcenable
-- startTemplateThread
-- newm
-- cgocall
-- main_init
-- unlockOSThread
-- dounlockOSThread
-- main_main
-- racefini
-- Gosched
-- checkTimeouts
-- mcall
-- gosched_m
-- traceGoSched
-- goschedImpl
-- dumpgstatus
-- readgstatus
-- casgstatus
-- dropg
-- setMNoWB
-- setGNoWB
-- globrunqput
-- schedule
-- stoplockedm
-- releasep
-- handoffp
-- runqempty
-- startm
-- gcBlackenEnabled
-- gcMarkWorkAvailable
-- pidleput
-- notesleep
-- acquirep
-- execute
-- gogo
-- procyield
-- osyield
-- gopark
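
The Gosched branch of the tree above (Gosched, mcall, gosched_m, goschedImpl, globrunqput, schedule) can be exercised from user code. The following is a small sketch rather than a trace of the runtime: yieldUntilDone is a name made up for this example, and the number of yields it reports depends on the machine and the scheduler.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// yieldUntilDone starts a worker goroutine and keeps yielding the calling
// goroutine with runtime.Gosched until the worker has finished. It returns
// how many times the caller yielded.
func yieldUntilDone() int {
	var done int32
	go func() {
		atomic.StoreInt32(&done, 1) // the worker only signals completion
	}()

	yields := 0
	for atomic.LoadInt32(&done) == 0 {
		runtime.Gosched() // requeue this goroutine and let the scheduler pick another
		yields++
	}
	return yields
}

func main() {
	fmt.Println("yields before the worker finished:", yieldUntilDone())
}
```

Each call to runtime.Gosched leaves the caller runnable instead of blocking it. That is the difference from the gopark path at the end of the tree, where a goroutine waits until something wakes it.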
