Kepler Performance Analysis #391
Replies: 3 comments
I updated the method to collect CPU frequency using BPF instead of reading kernel files. This update brings a big performance improvement.
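For context, here is a minimal sketch (not Kepler's actual code) of the kind of per-CPU sysfs read that the BPF-based collection replaces, assuming the standard Linux cpufreq layout under /sys/devices/system/cpu:

```go
// Minimal sketch: reading the current frequency of every CPU from sysfs.
// Each call opens and parses one file per CPU, which is the per-sample
// overhead a BPF-based approach avoids.
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// readCPUFreqsFromSysfs returns the current frequency (kHz) per CPU by reading
// scaling_cur_freq under the standard cpufreq sysfs layout.
func readCPUFreqsFromSysfs() (map[int]uint64, error) {
	paths, err := filepath.Glob("/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq")
	if err != nil {
		return nil, err
	}
	freqs := make(map[int]uint64, len(paths))
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			continue // CPU may be offline
		}
		khz, err := strconv.ParseUint(strings.TrimSpace(string(data)), 10, 64)
		if err != nil {
			continue
		}
		// Extract the CPU index from ".../cpuN/cpufreq/scaling_cur_freq".
		cpuDir := filepath.Base(filepath.Dir(filepath.Dir(p)))
		id, err := strconv.Atoi(strings.TrimPrefix(cpuDir, "cpu"))
		if err != nil {
			continue
		}
		freqs[id] = khz
	}
	return freqs, nil
}

func main() {
	freqs, err := readCPUFreqsFromSysfs()
	if err != nil {
		panic(err)
	}
	fmt.Println(freqs)
}
```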
The next steps are to improve the …
Kepler Performance Analysis
Goal
Issue #365 pointed to a possible performance regression after changing metrics from per-pod to per-container.
This discussion aims to shed light on Kepler's performance.
To understand performance, let's look at three different code versions: v0.3 (per pod metrics), the current code before the GPU update, and the current code with the GPU update.
For the analysis, we are going to use the go pprof tool to profile CPU and memory (heap) usage for 60s.
CPU Analysis
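Before the per-version results, here is a hedged sketch of how such 60s CPU and heap profiles can be collected, assuming Kepler is built with Go's standard net/http/pprof handlers enabled; the localhost:6060 address is an assumption, not Kepler's actual configuration:

```go
// Sketch: expose Go's standard pprof HTTP endpoints so CPU and heap profiles
// can be fetched for a fixed window (e.g. 60s).
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on the default mux
)

func main() {
	// With this listener running, a 60s CPU profile can be fetched with:
	//   go tool pprof -seconds 60 http://localhost:6060/debug/pprof/profile
	// and allocation counts from a heap profile with:
	//   go tool pprof -alloc_objects http://localhost:6060/debug/pprof/heap
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}
```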
v0.3 (per pod metrics)

go tool pprof cpu.prof
before gpu update

go tool pprof cpu.prof
with gpu update

go tool pprof cpu.prof
CPU Analysis Comments
We can see that the time spent in the runtime.cgocall function has increased after the update. In fact, we see a lot of runtime activity, which often indicates Garbage Collector (GC) activity.
This is also the conclusion from issue #381: the GC can be the root cause of the performance degradation.
There are several reasons why the GC can become more intensive, and the main one is that it must free the heap memory of many objects.
Heap escape is a known issue in our code; we have a test that always fails, and we need to improve this in the future.
However, we need to understand which functions are creating the most objects and what is happening in the code.
We'll discuss this in the next section.
Note that to mitigate the GC problem we can reduce the amount of heap memory we allocate, and we also have the option of making the GC run less often via the GOGC variable, as sketched below. However, raising GOGC increases program memory usage.
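As an illustration of the GOGC trade-off, the GC target can be raised either with the GOGC environment variable or programmatically; the value 200 below is an arbitrary example, not a recommendation for Kepler:

```go
// Illustration only: relaxing the GC target to trade memory for less frequent
// collections. GOGC=200 (or debug.SetGCPercent(200)) lets the heap grow to
// roughly 3x the live set before a collection instead of the default 2x
// (GOGC=100), at the cost of higher memory usage.
package main

import (
	"fmt"
	"runtime/debug"
)

func main() {
	// Equivalent to starting the process with GOGC=200.
	old := debug.SetGCPercent(200)
	fmt.Printf("GC target changed from %d%% to 200%%\n", old)
}
```

Whether the extra memory is worth it for Kepler would need to be validated with the same pprof measurements.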
Memory analysis
v0.3 (per pod metrics)


go tool pprof -alloc_objects mem.prof
before gpu update


go tool pprof -alloc_objects mem.prof
with gpu update


go tool pprof -alloc_objects mem.prof
MEM Analysis Comments
We can see in the results that the functions allocating the most objects are getCPUCoreFrequency and kubelet ListPods.
In a high-level analysis, ListPods appears to be the critical function that is creating more objects after the upgrade, creating up to 3.2x more objects than in version v0.3.
ListPods is called when the pod information is not in the cache, so we need to call the kubelet API to get this information.
Notice that in ListPods, we iterate through the list of all pods to find the pods with the target ID. Therefore, this function becomes more expensive when the system has more pods running.
While we need a deeper look into this, the issue could be related to the cache, so we may need to improve it (see the sketch below).
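As a rough illustration of the caching idea (hypothetical names, not Kepler's actual API), a lookup keyed by container ID could fall back to the full kubelet pod list only on a cache miss:

```go
// Hypothetical sketch: keep a map from container ID to pod info and only fall
// back to a full kubelet pod list on a cache miss. listPodsFromKubelet is a
// placeholder, not Kepler's actual API.
package main

import "sync"

type PodInfo struct {
	Name      string
	Namespace string
}

type podCache struct {
	mu    sync.RWMutex
	byCID map[string]PodInfo // container ID -> pod info
}

// listPodsFromKubelet stands in for the expensive call to the kubelet API that
// returns all pods on the node together with their container IDs.
func listPodsFromKubelet() (map[string]PodInfo, error) {
	// ... call the kubelet /pods endpoint and flatten containers -> pods ...
	return map[string]PodInfo{}, nil
}

// Lookup returns the pod for a container ID, refreshing the whole cache only
// when the ID is unknown. This avoids iterating the full pod list on every
// metric update, which is what makes ListPods expensive on busy nodes.
func (c *podCache) Lookup(containerID string) (PodInfo, bool, error) {
	c.mu.RLock()
	info, ok := c.byCID[containerID]
	c.mu.RUnlock()
	if ok {
		return info, true, nil
	}
	fresh, err := listPodsFromKubelet()
	if err != nil {
		return PodInfo{}, false, err
	}
	c.mu.Lock()
	c.byCID = fresh
	c.mu.Unlock()
	info, ok = fresh[containerID]
	return info, ok, nil
}

func main() {
	c := &podCache{byCID: map[string]PodInfo{}}
	if info, ok, err := c.Lookup("abc123"); err == nil && ok {
		_ = info
	}
}
```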
Another solution we discussed earlier is the introduction of a new component/proxy that will watch for resources (pods, jobs, etc.). This proxy will act as a cache and will transmit event updates to Kepler instances, where events will be filtered by node name.
The main reason to create a proxy for the apiserver is to avoid overloading it with List operations when multiple Kepler instances are restarted. With this solution, we could remove ListPods and ensure consistency in the code, as we would be able to promptly identify deleted pods.
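To illustrate the watch-and-filter idea (this is not the proposed proxy itself, and the names here are assumptions), a client-go shared informer restricted to the local node with a field selector would deliver pod add/delete events without repeated List calls:

```go
// Sketch of the watch-based idea: subscribe to pod events filtered by node
// name instead of listing all pods on every lookup.
package main

import (
	"log"
	"os"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	nodeName := os.Getenv("NODE_NAME") // assumption: injected via the downward API

	// Only watch pods scheduled on this node, so apiserver load and local
	// memory scale with the node, not the cluster.
	factory := informers.NewSharedInformerFactoryWithOptions(
		client, 10*time.Minute,
		informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
			opts.FieldSelector = "spec.nodeName=" + nodeName
		}),
	)

	podInformer := factory.Core().V1().Pods().Informer()
	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			pod := obj.(*corev1.Pod)
			log.Printf("pod added: %s/%s", pod.Namespace, pod.Name)
		},
		DeleteFunc: func(obj interface{}) {
			// Deleted pods are reported promptly, addressing the consistency
			// concern mentioned above.
			if pod, ok := obj.(*corev1.Pod); ok {
				log.Printf("pod deleted: %s/%s", pod.Namespace, pod.Name)
			}
		},
	})

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	select {} // keep running
}
```

A standalone proxy would move this watch out of each Kepler instance, but the node-name filtering idea is the same.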
Latency analysis
The logs show how long Kepler took to update all metrics; num is the number of pods or containers, depending on the code version.
v0.3 (per pod metrics)
with gpu update

Latency Analysis Comments
Sometimes the updated code is taking longer to update all the metrics. We need a deeper analysis to fully understand why.
However, the number of containers is probably not the reason for the performance degradation, as the difference in num is minimal.
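For reference, a minimal sketch (hypothetical function names) of how such a latency log line can be produced:

```go
// Sketch: time the full metric-update pass and report the duration together
// with the number of containers (or pods) processed.
package main

import (
	"log"
	"time"
)

// updateAllMetrics stands in for the per-interval metric update; it returns
// how many containers (or pods) were processed.
func updateAllMetrics() int {
	time.Sleep(50 * time.Millisecond) // placeholder work
	return 42
}

func main() {
	start := time.Now()
	num := updateAllMetrics()
	log.Printf("updated all metrics in %v, num=%d", time.Since(start), num)
}
```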
Signed-off-by: Marcelo Amaral [email protected]