Perf Profiling on Linux

· Harshil Jani

#computers #linux #perf #profiling

Perf confused me when I first met it a few days ago, and I suspect it confuses many other people too. So this is going to be a very beginner-friendly write-up. If the title of the article means nothing to you, don't worry: by the end you will know something that many people never get around to learning. Let's start with some terminology taken from Wikipedia.

Profiling : In software engineering, profiling is a form of dynamic program analysis that measures, for example, the space or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls.

Perf : perf is a performance analyzing tool in Linux, available from Linux kernel version 2.6.31 in 2009. Userspace controlling utility, named perf, is accessed from the command line and provides a number of subcommands; it is capable of statistical profiling of the entire system.

Kernel : The kernel is a computer program at the core of a computer’s operating system and generally has complete control over everything in the system. It is the portion of the operating system code that is always resident in memory, and facilitates interactions between hardware and software components.

Whoa! Does that feel like too many technicalities? That is exactly why the rest of this blog tries to make things less technical and more practical.

Imagine that the Linux operating system is a girl named Becky, and that Becky's heart is the Linux kernel. Becky has a stomach ache, so Dr. Weird asks her to go to the laboratory for some medical tests. Think of those medical tests as the programs, commands, and software that run on the Linux operating system and are handled by the kernel. Each test tracks particular conditions and helps in analyzing Becky's health. Blood count, platelets, RBC, WBC, oxygen level, and so on are a few of the parameters that help track down the reason for her stomach ache. To summarize the analogy: Becky is the operating system, her heart is the kernel, the medical tests are the programs and commands run on the system, and the measured parameters are the numbers that tell you what is going on inside.

So, basically, perf is a tool to analyse performance. It is also known as perf_events, after the kernel interface it is built on. It can help you understand where your CPU time is going. If you manage to save even a small percentage of your CPU usage, you can make a great impact: lower CPU consumption means lower power consumption, and so less electricity, and you help save the environment. XD, we drifted far away from the topic. But yes, improving the performance of software is very important in industry.

Cool! Now you know what perf is. Let's get into how it works and how to get some information out of it.

First, you have to install the perf package from your command line. The exact package depends on your kernel version and on the package manager you prefer, so I leave this to you as an activity. In case you get stuck, believe me, a simple Google search will suffice. It is also recommended to run perf as root, since some counters are restricted for unprivileged users.
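Before moving on, it can help to confirm that the install worked. The following is a minimal sketch in Python, assuming only that the tool is installed under the name `perf`; the function name `perf_environment` is made up for this illustration and is not part of perf.

```python
import os
import shutil


def perf_environment():
    """Report where perf lives on PATH and whether we are running as root.

    `perf_environment` is a hypothetical helper written for this post.
    """
    perf_path = shutil.which("perf")  # None when perf is not on PATH
    # os.geteuid exists on Linux and other Unix systems; guard it anyway.
    is_root = hasattr(os, "geteuid") and os.geteuid() == 0
    return {"perf_path": perf_path, "is_root": is_root}


if __name__ == "__main__":
    env = perf_environment()
    if env["perf_path"] is None:
        print("perf was not found on PATH; install it with your package manager")
    else:
        print("perf found at", env["perf_path"])
    print("running as root:", env["is_root"])
```

If perf is missing, the package to install depends on your distribution, as mentioned above.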


# perf Basic Workflow

perf stat : this command reports CPU counter statistics for the specific command you run under it. Let's say you want to see what happens on the system when you open the manual page of the perf command.

Write : perf stat man perf

     Performance counter stats for 'man perf':

                 93.15 msec task-clock                #    0.026 CPUs utilized
                   174      context-switches          #    1.868 K/sec
                    19      cpu-migrations            #  203.972 /sec
                 5,167      page-faults               #   55.470 K/sec
          24,60,86,205      cycles                    #    2.642 GHz
           2,41,56,647      stalled-cycles-frontend   #    9.82% frontend cycles idle
           7,00,98,079      stalled-cycles-backend    #   28.49% backend cycles idle
          24,71,58,494      instructions              #    1.00  insn per cycle
                                                      #    0.28  stalled cycles per insn
           5,96,20,188      branches                  #  640.045 M/sec
              6,78,023      branch-misses             #    1.14% of all branches

           3.609929615 seconds time elapsed

           0.051910000 seconds user
           0.044520000 seconds sys

This is what my system printed. At first look you might be confused about what all of it means. Suppose Becky is a British girl with no exposure to Russian: what would happen if you asked her to read a Russian newspaper? Even so, Becky can notice that the top of the page contains some numbers and guess that they are probably the date of the newspaper. That depends entirely on her observation skills, and the same goes for yours. If you look at the output carefully, you can see frequencies, rates, event names, and so on. But what are they? Well, no one can explain everything, so you have to research each specific thing on your own. For example, say you want to know what branch-misses means. Open a search engine (Google) and type it in. You will find something like this:

When there is a conditional branch along the way, there are two possible paths the program can follow, and the processor cannot know which one is correct until the condition for that instruction has actually been calculated. Rather than wait, it guesses a path and starts working on it. Each wrong guess is counted as a branch-miss, and the work done on the wrong path is thrown away.
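To make the idea of guessing a branch concrete, here is a toy simulation in Python. It is only a sketch: it models a simple two-bit saturating-counter predictor, which is a textbook scheme and not necessarily what your CPU uses, and the function name `count_branch_misses` is invented for this post. It counts the wrong guesses for one imaginary branch, `if v >= 128:`, evaluated over the same data in random order and in sorted order.

```python
import random


def count_branch_misses(outcomes):
    """Count wrong guesses made by a toy two-bit saturating-counter predictor.

    outcomes: iterable of booleans, True meaning the branch was taken.
    """
    state = 2  # states 0-1 predict "not taken", states 2-3 predict "taken"
    misses = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            misses += 1
        # Nudge the counter toward the real outcome, staying within 0..3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses


random.seed(0)
values = [random.randrange(256) for _ in range(10_000)]

# The branch being modelled is `if v >= 128:` inside a loop over the data.
random_order = [v >= 128 for v in values]
sorted_order = [v >= 128 for v in sorted(values)]

print("misses with random order:", count_branch_misses(random_order))
print("misses with sorted order:", count_branch_misses(sorted_order))
```

With sorted data the outcome of the branch changes only once, so the toy predictor is almost always right. With random data it is wrong far more often. That gap is the kind of thing the branch-misses counter is measuring.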

branch-misses is just one example. You can look up the other counters the same way, one at a time, and learn more and more.
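Once you know what the raw counters mean, the figures printed after the `#` sign are mostly simple ratios of them. The sketch below, in Python, recomputes a few of them from the numbers in the `perf stat man perf` output above. The dictionary keys and the function name `derived_metrics` are made up for this illustration, and the formulas are my reading of the output, not perf's source code.

```python
# Raw values copied from the `perf stat man perf` output shown earlier.
counters = {
    "task_clock_msec": 93.15,
    "cycles": 246_086_205,
    "stalled_cycles_frontend": 24_156_647,
    "stalled_cycles_backend": 70_098_079,
    "instructions": 247_158_494,
    "branches": 59_620_188,
    "branch_misses": 678_023,
}
elapsed_seconds = 3.609929615


def derived_metrics(c, elapsed):
    """Recompute some of the right-hand-side figures from raw counters."""
    busy_seconds = c["task_clock_msec"] / 1000.0  # CPU time actually used
    return {
        "cpus_utilized": busy_seconds / elapsed,
        "clock_ghz": c["cycles"] / busy_seconds / 1e9,
        "insn_per_cycle": c["instructions"] / c["cycles"],
        "frontend_idle_pct": 100.0 * c["stalled_cycles_frontend"] / c["cycles"],
        "backend_idle_pct": 100.0 * c["stalled_cycles_backend"] / c["cycles"],
        "branch_miss_pct": 100.0 * c["branch_misses"] / c["branches"],
    }


metrics = derived_metrics(counters, elapsed_seconds)
for name, value in metrics.items():
    print(f"{name:>20}: {value:.3f}")
```

Comparing these values with the column on the right of the output is a quick way to check that you have understood each counter.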

You can pass arguments to your perf commands to choose which events they count. For example, the -e option of perf stat selects particular events. Picking events this way lets you focus on one particular action or situation.
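As a small illustration of choosing events, the Python sketch below builds a `perf stat -e ...` command line and runs it only if perf is installed. The helper names `build_perf_stat_cmd` and `run_if_available` are hypothetical, and the event names are simply the ones that appear in the output earlier in this post. Whether each of them is available depends on your hardware and kernel.

```python
import shutil
import subprocess


def build_perf_stat_cmd(events, command):
    """Return the argument list for `perf stat -e <events> -- <command>`."""
    return ["perf", "stat", "-e", ",".join(events), "--", *command]


def run_if_available(cmd):
    """Run the command only when perf is on PATH; return True if it was run."""
    if shutil.which("perf") is None:
        print("perf is not installed, skipping:", " ".join(cmd))
        return False
    subprocess.run(cmd, check=False)  # perf prints its report to the terminal
    return True


if __name__ == "__main__":
    cmd = build_perf_stat_cmd(["cycles", "instructions", "branch-misses"], ["ls"])
    print(" ".join(cmd))
    run_if_available(cmd)
```

Keeping the event list short makes the report easier to read when you only care about one thing, such as branch behaviour.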


Now you know the basic commands, the workflow of perf, and how to run it. The internet is wide, my friend. To go further with perf you can always look up the resources that are available. Let me mention a few of them here.

Brendan Gregg's perf analysis for Netflix

Perf wiki page

I think this will suffice as an introduction to perf. If you are working on a project that is used by a large number of users, then performance analysis deserves to be one of your top priorities.

I really appreciate those of you who stuck with the write-up until here, and my condolences to all the Beckys in the world, to whom I apparently gave stomach aches just to explain my understanding of perf. Readers, you are welcome to criticize me wherever you feel things went wrong, or even to appreciate what is correct. Thank you!