This post has two main purposes:
My talk doesn’t have any slides; I think it’s better to spend my time walking through the code to run and showing how I use the profiler with the basic implementation I’ve created for this purpose.
To follow along, you first have to make sure that you have installed on your machine:
pprof (the tool Go offers out of the box to profile your programs)
NOTE that I used a Linux machine to run it; I haven’t done anything that looks platform-specific, but bear in mind that you may run into issues on other operating systems.
Get the code by cloning this gist.
The implementation has a test for each function, to make sure that they work as expected; some benchmarks, which we’ll use to profile the 8 mentioned functions; and a basic server with 2 endpoints to show how the
net/http/pprof package exposes profiling information over HTTP just by being imported.
I’ve also added a
Makefile as a helper to run most of the commands that I need to profile the benchmarks and the basic HTTP server.
To profile benchmarks, we run them with some flags that make them write profile data files; we also have to keep the compiled test binary, because pprof needs it afterwards to match the profile metrics with the function names and the source code lines those metrics refer to.
So we do something like
go test -bench . -benchmem -memprofile mem.out -cpuprofile cpu.out
The previous command runs the benchmarks and outputs memory and CPU profile data; the test binary is also generated, with an auto-generated name; pass
-o mybin to give it the name that you would like.
From the output printed to our terminal, we already know which flavor of each function set is the fastest. With the profiling data, though, we can also see, for each function, the cost of its internal operations. For these simple functions that isn’t worth much, as they only have a few instructions; but in large programs with many nested function calls, we can see the cost of each call, drill into the most expensive ones, and find where the bottlenecks are, to see if we can replace those implementations with more optimized ones.
The profiling data can be analyzed using
pprof tool, executing something like
go tool pprof mybin cpu.out.
When we run pprof in that way, we get into a “repl” where we can execute commands to read the different information the profile data file contains and see some of it next to the lines of the implementation it refers to.
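For example, inside the repl, `top` shows the most expensive functions, `list` annotates a function’s source with per-line costs, and `web` renders the call graph in a browser (the function name below is a hypothetical placeholder, not one from the gist):

```
(pprof) top 5
(pprof) list ConcatStrings
(pprof) web
```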
Another way, which is probably the most practical, is to generate a graph of the function calls with the cost of each one (or of the most costly ones, if there are too many); note that pprof relies on Graphviz to render graphs, so it has to be installed. We can get the graph by executing something like
go tool pprof -svg -output cpu.svg mybin cpu.out
And then we are ready to open
cpu.svg and see where our bottlenecks are.
A service is an application that, ideally, runs forever, exposing an interface that other applications (clients) can call to perform the exposed operations. Due to this behavior, a service has to offer an interface to fetch the profile data, or it has to export that data at defined intervals.
Golang has the
net/http/pprof package, which automatically exposes a few HTTP endpoints to collect profile data if the service already runs an HTTP server; otherwise, we can create an HTTP server just for that purpose.
The exposed endpoints are listed under the
/debug/pprof/ path, and to collect the profile data we execute
pprof pointing at the specific path, for example
go tool pprof http://localhost:8000/debug/pprof/profile
go tool pprof http://localhost:8000/debug/pprof/heap
go tool pprof http://localhost:8000/debug/pprof/block
Each of those calls generates a profile data file, like the ones we got from the benchmarks, and with those you can use
pprof in the same way to analyze the data.
This is all for now!
Thanks for reading!