Reducing memory allocation from client-side stats#10705
Reducing memory allocation from client-side stats#10705
Conversation
Introduced DDCache-s around UTF8BytesString construction UTF8BytesString are advantageous for serialization, but that only applies to the key instance that is actually serialized Most key instances here are being created to do a look-up into the map. so the UTF8BytesString creation was extra work. This change is intended as a quick fix. I think there's still a lot of room for improvement by restructuring the code further, but that is left for a future PR. As is this changer reduces the impact on application throughput by 5-20% depending on the heap size.
BenchmarksStartupParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 64 metrics, 7 unstable metrics. Startup time reports for insecure-bankgantt
title insecure-bank - global startup overhead: candidate=1.60.0-SNAPSHOT~eae7a8957f, baseline=1.61.0-SNAPSHOT~dee7cdf0a9
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.064 s) : 0, 1063893
Total [baseline] (8.879 s) : 0, 8878667
Agent [candidate] (1.066 s) : 0, 1065734
Total [candidate] (8.851 s) : 0, 8850627
section iast
Agent [baseline] (1.225 s) : 0, 1224986
Total [baseline] (9.582 s) : 0, 9581541
Agent [candidate] (1.234 s) : 0, 1234135
Total [candidate] (9.604 s) : 0, 9604109
gantt
title insecure-bank - break down per module: candidate=1.60.0-SNAPSHOT~eae7a8957f, baseline=1.61.0-SNAPSHOT~dee7cdf0a9
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.192 ms) : 0, 1192
crashtracking [candidate] (1.202 ms) : 0, 1202
BytebuddyAgent [baseline] (632.643 ms) : 0, 632643
BytebuddyAgent [candidate] (630.42 ms) : 0, 630420
AgentMeter [baseline] (29.354 ms) : 0, 29354
AgentMeter [candidate] (29.35 ms) : 0, 29350
GlobalTracer [baseline] (257.391 ms) : 0, 257391
GlobalTracer [candidate] (258.806 ms) : 0, 258806
AppSec [baseline] (31.513 ms) : 0, 31513
AppSec [candidate] (31.905 ms) : 0, 31905
Debugger [baseline] (58.52 ms) : 0, 58520
Debugger [candidate] (59.001 ms) : 0, 59001
Remote Config [baseline] (582.775 µs) : 0, 583
Remote Config [candidate] (589.277 µs) : 0, 589
Telemetry [baseline] (8.669 ms) : 0, 8669
Telemetry [candidate] (8.71 ms) : 0, 8710
Flare Poller [baseline] (7.823 ms) : 0, 7823
Flare Poller [candidate] (9.663 ms) : 0, 9663
section iast
crashtracking [baseline] (1.203 ms) : 0, 1203
crashtracking [candidate] (1.198 ms) : 0, 1198
BytebuddyAgent [baseline] (795.062 ms) : 0, 795062
BytebuddyAgent [candidate] (802.275 ms) : 0, 802275
AgentMeter [baseline] (11.316 ms) : 0, 11316
AgentMeter [candidate] (11.567 ms) : 0, 11567
GlobalTracer [baseline] (247.127 ms) : 0, 247127
GlobalTracer [candidate] (248.446 ms) : 0, 248446
IAST [baseline] (25.153 ms) : 0, 25153
IAST [candidate] (25.282 ms) : 0, 25282
AppSec [baseline] (26.342 ms) : 0, 26342
AppSec [candidate] (26.529 ms) : 0, 26529
Debugger [baseline] (62.682 ms) : 0, 62682
Debugger [candidate] (62.414 ms) : 0, 62414
Remote Config [baseline] (521.286 µs) : 0, 521
Remote Config [candidate] (523.13 µs) : 0, 523
Telemetry [baseline] (14.643 ms) : 0, 14643
Telemetry [candidate] (14.766 ms) : 0, 14766
Flare Poller [baseline] (4.911 ms) : 0, 4911
Flare Poller [candidate] (4.939 ms) : 0, 4939
Startup time reports for petclinicgantt
title petclinic - global startup overhead: candidate=1.60.0-SNAPSHOT~eae7a8957f, baseline=1.61.0-SNAPSHOT~dee7cdf0a9
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.058 s) : 0, 1057903
Total [baseline] (11.177 s) : 0, 11176799
Agent [candidate] (1.056 s) : 0, 1056193
Total [candidate] (10.994 s) : 0, 10994269
section appsec
Agent [baseline] (1.247 s) : 0, 1246516
Total [baseline] (11.208 s) : 0, 11208306
Agent [candidate] (1.249 s) : 0, 1248818
Total [candidate] (11.173 s) : 0, 11173253
section iast
Agent [baseline] (1.246 s) : 0, 1245596
Total [baseline] (11.423 s) : 0, 11422697
Agent [candidate] (1.228 s) : 0, 1228115
Total [candidate] (11.359 s) : 0, 11359352
section profiling
Agent [baseline] (1.194 s) : 0, 1194338
Total [baseline] (11.064 s) : 0, 11063629
Agent [candidate] (1.186 s) : 0, 1186109
Total [candidate] (11.065 s) : 0, 11064794
gantt
title petclinic - break down per module: candidate=1.60.0-SNAPSHOT~eae7a8957f, baseline=1.61.0-SNAPSHOT~dee7cdf0a9
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.185 ms) : 0, 1185
crashtracking [candidate] (1.194 ms) : 0, 1194
BytebuddyAgent [baseline] (628.836 ms) : 0, 628836
BytebuddyAgent [candidate] (626.279 ms) : 0, 626279
AgentMeter [baseline] (29.052 ms) : 0, 29052
AgentMeter [candidate] (29.069 ms) : 0, 29069
GlobalTracer [baseline] (257.152 ms) : 0, 257152
GlobalTracer [candidate] (256.75 ms) : 0, 256750
AppSec [baseline] (31.537 ms) : 0, 31537
AppSec [candidate] (31.372 ms) : 0, 31372
Debugger [baseline] (59.276 ms) : 0, 59276
Debugger [candidate] (59.152 ms) : 0, 59152
Remote Config [baseline] (585.929 µs) : 0, 586
Remote Config [candidate] (583.034 µs) : 0, 583
Telemetry [baseline] (8.585 ms) : 0, 8585
Telemetry [candidate] (8.647 ms) : 0, 8647
Flare Poller [baseline] (5.712 ms) : 0, 5712
Flare Poller [candidate] (7.164 ms) : 0, 7164
section appsec
crashtracking [baseline] (1.178 ms) : 0, 1178
crashtracking [candidate] (1.195 ms) : 0, 1195
BytebuddyAgent [baseline] (657.548 ms) : 0, 657548
BytebuddyAgent [candidate] (659.661 ms) : 0, 659661
AgentMeter [baseline] (11.993 ms) : 0, 11993
AgentMeter [candidate] (12.063 ms) : 0, 12063
GlobalTracer [baseline] (259.02 ms) : 0, 259020
GlobalTracer [candidate] (259.23 ms) : 0, 259230
AppSec [baseline] (177.801 ms) : 0, 177801
AppSec [candidate] (177.753 ms) : 0, 177753
Debugger [baseline] (65.588 ms) : 0, 65588
Debugger [candidate] (65.595 ms) : 0, 65595
Remote Config [baseline] (566.643 µs) : 0, 567
Remote Config [candidate] (568.511 µs) : 0, 569
Telemetry [baseline] (8.982 ms) : 0, 8982
Telemetry [candidate] (8.89 ms) : 0, 8890
Flare Poller [baseline] (3.606 ms) : 0, 3606
Flare Poller [candidate] (3.589 ms) : 0, 3589
IAST [baseline] (23.989 ms) : 0, 23989
IAST [candidate] (24.002 ms) : 0, 24002
section iast
crashtracking [baseline] (1.21 ms) : 0, 1210
crashtracking [candidate] (1.186 ms) : 0, 1186
BytebuddyAgent [baseline] (809.56 ms) : 0, 809560
BytebuddyAgent [candidate] (796.86 ms) : 0, 796860
AgentMeter [baseline] (11.932 ms) : 0, 11932
AgentMeter [candidate] (11.346 ms) : 0, 11346
GlobalTracer [baseline] (249.874 ms) : 0, 249874
GlobalTracer [candidate] (247.774 ms) : 0, 247774
AppSec [baseline] (26.852 ms) : 0, 26852
AppSec [candidate] (26.391 ms) : 0, 26391
Debugger [baseline] (63.83 ms) : 0, 63830
Debugger [candidate] (63.212 ms) : 0, 63212
Remote Config [baseline] (547.029 µs) : 0, 547
Remote Config [candidate] (541.262 µs) : 0, 541
Telemetry [baseline] (14.894 ms) : 0, 14894
Telemetry [candidate] (14.778 ms) : 0, 14778
Flare Poller [baseline] (4.933 ms) : 0, 4933
Flare Poller [candidate] (4.872 ms) : 0, 4872
IAST [baseline] (25.619 ms) : 0, 25619
IAST [candidate] (25.097 ms) : 0, 25097
section profiling
ProfilingAgent [baseline] (94.661 ms) : 0, 94661
ProfilingAgent [candidate] (94.542 ms) : 0, 94542
crashtracking [baseline] (1.172 ms) : 0, 1172
crashtracking [candidate] (1.171 ms) : 0, 1171
BytebuddyAgent [baseline] (689.723 ms) : 0, 689723
BytebuddyAgent [candidate] (684.188 ms) : 0, 684188
AgentMeter [baseline] (8.75 ms) : 0, 8750
AgentMeter [candidate] (8.624 ms) : 0, 8624
GlobalTracer [baseline] (218.089 ms) : 0, 218089
GlobalTracer [candidate] (216.546 ms) : 0, 216546
AppSec [baseline] (32.449 ms) : 0, 32449
AppSec [candidate] (32.246 ms) : 0, 32246
Debugger [baseline] (63.466 ms) : 0, 63466
Debugger [candidate] (60.827 ms) : 0, 60827
Remote Config [baseline] (593.696 µs) : 0, 594
Remote Config [candidate] (592.268 µs) : 0, 592
Telemetry [baseline] (9.866 ms) : 0, 9866
Telemetry [candidate] (12.252 ms) : 0, 12252
Flare Poller [baseline] (4.368 ms) : 0, 4368
Flare Poller [candidate] (4.314 ms) : 0, 4314
Profiling [baseline] (95.246 ms) : 0, 95246
Profiling [candidate] (95.108 ms) : 0, 95108
LoadParameters
See matching parameters
SummaryFound 1 performance improvements and 1 performance regressions! Performance is the same for 18 metrics, 16 unstable metrics.
Request duration reports for insecure-bankgantt
title insecure-bank - request duration [CI 0.99] : candidate=1.60.0-SNAPSHOT~eae7a8957f, baseline=1.61.0-SNAPSHOT~dee7cdf0a9
dateFormat X
axisFormat %s
section baseline
no_agent (1.18 ms) : 1168, 1191
. : milestone, 1180,
iast (3.13 ms) : 3087, 3172
. : milestone, 3130,
iast_FULL (5.888 ms) : 5828, 5947
. : milestone, 5888,
iast_GLOBAL (3.517 ms) : 3469, 3566
. : milestone, 3517,
profiling (2.11 ms) : 2090, 2130
. : milestone, 2110,
tracing (1.814 ms) : 1798, 1830
. : milestone, 1814,
section candidate
no_agent (1.194 ms) : 1182, 1206
. : milestone, 1194,
iast (3.218 ms) : 3173, 3263
. : milestone, 3218,
iast_FULL (5.65 ms) : 5594, 5705
. : milestone, 5650,
iast_GLOBAL (3.538 ms) : 3484, 3593
. : milestone, 3538,
profiling (2.077 ms) : 2058, 2095
. : milestone, 2077,
tracing (1.873 ms) : 1856, 1891
. : milestone, 1873,
Request duration reports for petclinicgantt
title petclinic - request duration [CI 0.99] : candidate=1.60.0-SNAPSHOT~eae7a8957f, baseline=1.61.0-SNAPSHOT~dee7cdf0a9
dateFormat X
axisFormat %s
section baseline
no_agent (17.768 ms) : 17591, 17945
. : milestone, 17768,
appsec (18.895 ms) : 18701, 19089
. : milestone, 18895,
code_origins (17.882 ms) : 17703, 18060
. : milestone, 17882,
iast (17.991 ms) : 17811, 18170
. : milestone, 17991,
profiling (20.004 ms) : 19801, 20206
. : milestone, 20004,
tracing (17.712 ms) : 17537, 17886
. : milestone, 17712,
section candidate
no_agent (18.233 ms) : 18045, 18421
. : milestone, 18233,
appsec (19.474 ms) : 19274, 19673
. : milestone, 19474,
code_origins (17.749 ms) : 17573, 17926
. : milestone, 17749,
iast (17.766 ms) : 17591, 17941
. : milestone, 17766,
profiling (18.865 ms) : 18673, 19058
. : milestone, 18865,
tracing (18.756 ms) : 18566, 18946
. : milestone, 18756,
DacapoParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 10 metrics, 2 unstable metrics. Execution time for tomcatgantt
title tomcat - execution time [CI 0.99] : candidate=1.60.0-SNAPSHOT~eae7a8957f, baseline=1.61.0-SNAPSHOT~dee7cdf0a9
dateFormat X
axisFormat %s
section baseline
no_agent (1.469 ms) : 1457, 1480
. : milestone, 1469,
appsec (3.787 ms) : 3564, 4010
. : milestone, 3787,
iast (2.245 ms) : 2176, 2314
. : milestone, 2245,
iast_GLOBAL (2.298 ms) : 2229, 2368
. : milestone, 2298,
profiling (2.519 ms) : 2349, 2688
. : milestone, 2519,
tracing (2.051 ms) : 1997, 2104
. : milestone, 2051,
section candidate
no_agent (1.469 ms) : 1458, 1481
. : milestone, 1469,
appsec (3.793 ms) : 3573, 4014
. : milestone, 3793,
iast (2.248 ms) : 2179, 2317
. : milestone, 2248,
iast_GLOBAL (2.293 ms) : 2224, 2363
. : milestone, 2293,
profiling (2.095 ms) : 2038, 2152
. : milestone, 2095,
tracing (2.055 ms) : 2001, 2108
. : milestone, 2055,
Execution time for biojavagantt
title biojava - execution time [CI 0.99] : candidate=1.60.0-SNAPSHOT~eae7a8957f, baseline=1.61.0-SNAPSHOT~dee7cdf0a9
dateFormat X
axisFormat %s
section baseline
no_agent (15.455 s) : 15455000, 15455000
. : milestone, 15455000,
appsec (15.129 s) : 15129000, 15129000
. : milestone, 15129000,
iast (18.18 s) : 18180000, 18180000
. : milestone, 18180000,
iast_GLOBAL (17.878 s) : 17878000, 17878000
. : milestone, 17878000,
profiling (14.979 s) : 14979000, 14979000
. : milestone, 14979000,
tracing (15.246 s) : 15246000, 15246000
. : milestone, 15246000,
section candidate
no_agent (15.569 s) : 15569000, 15569000
. : milestone, 15569000,
appsec (14.776 s) : 14776000, 14776000
. : milestone, 14776000,
iast (17.858 s) : 17858000, 17858000
. : milestone, 17858000,
iast_GLOBAL (17.873 s) : 17873000, 17873000
. : milestone, 17873000,
profiling (15.531 s) : 15531000, 15531000
. : milestone, 15531000,
tracing (15.289 s) : 15289000, 15289000
. : milestone, 15289000,
|
| public final class MetricKey { | ||
| static final DDCache<String, UTF8BytesString> RESOURCE_CACHE = DDCaches.newFixedSizeCache(32); | ||
| static final DDCache<String, UTF8BytesString> SERVICE_CACHE = DDCaches.newFixedSizeCache(8); | ||
| static final DDCache<String, UTF8BytesString> SERVICE_SOURCE_CACHE = DDCaches.newFixedSizeCache(4); |
There was a problem hiding this comment.
This should be already filled with UTF8BytesString. It can take values of the instrumentation name (that's should also be UTF8BytesString) so I suggest removing caching for this specific one. 4, is also too small
There was a problem hiding this comment.
The lookup logic accounts for receiving a UTF8BytesString and in that case bypasses the cache entirely.
Also, I erred on the side of keeping the caches small. I just want to eliminate most of the allocation - not all of the allocation.
That said, I may have misunderstood what "serviceSource" is so maybe that one should be larger. If it is instrumentations, then maybe we should increase it to 16. (The thinking being that only a few instrumentations are typically active in a given system.)
There was a problem hiding this comment.
If it can only be a UTF8BytesString, we could also tighten the type in the constructor. Although in that case, the JIT would dead code eliminate using the cache anyway (except for the initial allocation).
| static final DDCache<String, UTF8BytesString> TYPE_CACHE = DDCaches.newFixedSizeCache(8); | ||
| static final DDCache<String, UTF8BytesString> KIND_CACHE = DDCaches.newFixedSizeCache(8); | ||
| static final DDCache<String, UTF8BytesString> HTTP_METHOD_CACHE = DDCaches.newFixedSizeCache(8); | ||
| static final DDCache<String, UTF8BytesString> HTTP_ENDPOINT_CACHE = DDCaches.newFixedSizeCache(32); |
There was a problem hiding this comment.
Would it be possible to widen those a bit? Also, for the http endpoint, I'm wondering if we should just do that caching earlier to make other serialisation benefitting of this (i.e. in EndpointResolver)
There was a problem hiding this comment.
I erred on the side of making the caches small.
The thinking is any object reuse is an improvement over what we were doing, but consuming too much memory could be harmful.
| } | ||
|
|
||
| static UTF8BytesString utf8(DDCache<String, UTF8BytesString> cache, CharSequence charSeq) { | ||
| if ( charSeq == null ) { |
There was a problem hiding this comment.
There are things that are supposing to check nullity of this. Now if we replace with empty is not more the same semantic and this will break some code. I suggest to let the caller return the default value so it won't break the existing logic
There was a problem hiding this comment.
Nevermind - I see that I did accidentally change the semantics for method & endpoint. I'll fix that.
I kept the semantics that already existed. If you look at the prior code, null was replaced with EMPTY, so I did the same.
There was a problem hiding this comment.
well there are things that are OK to return empty but others needs to be literally null (i.e. service source, httpMethod, httpEndpoint)
There was a problem hiding this comment.
Yes, I realized after my initial reply. I'll double check all of them.
There was a problem hiding this comment.
Since the split of EMPTY vs null was about 50 / 50, I decided to just restore the null checks in the constructor.
- fix null handling for http method & endpoint - increase service source cache size
Since it about a 50/50 split between cases that use EMPTY vs null, when a null is passed I decided to just put the null handling back into the constructor. That makes the choice more explicit, and makes the PR easier to review / compare to the prior logic
| public final class MetricKey { | ||
| static final DDCache<String, UTF8BytesString> RESOURCE_CACHE = DDCaches.newFixedSizeCache(32); | ||
| static final DDCache<String, UTF8BytesString> SERVICE_CACHE = DDCaches.newFixedSizeCache(8); | ||
| static final DDCache<String, UTF8BytesString> SERVICE_SOURCE_CACHE = |
There was a problem hiding this comment.
That field does not need to be cached, for two main reasons:
- It is already either a
UTF8BytesStringornull. In fact, it comes from thecomponent()method, which in helpers typically returns aUTF8BytesString. - Its cardinality matches that of the instrumentations, the size of the cache is too small.
The constructor calling UTF8BytesString.create() may look confusing at first glance, but in this case it does not allocate a new object — it simply returns the existing instance. It was kept for consistency with the usual pattern.
Unless there are strong counterarguments, I would suggest removing the cache for this field.
What Does This Do
Introduces DDCache-s around UTF8BytesString construction
Motivation
UTF8BytesString are advantageous for serialization, but that only applies to the MetricKey instance that is actually serialized
Most MetricKey instances are only created to do a look-up into the map. so the UTF8BytesString creation was just extra work
This change reduces GC time by 7% in span creation stress test.
This change reduces impact on application throughput by 5-20% depending on heap size.
Additional Notes
This change is intended as a quick fix. I think there's still a lot of room for improvement by restructuring the code further, but that is left for a future PR.
Contributor Checklist
type:and (comp:orinst:) labels in addition to any other useful labelsclose,fix, or any linking keywords when referencing an issueUse
solvesinstead, and assign the PR milestone to the issueJira ticket: [PROJ-IDENT]
Note: Once your PR is ready to merge, add it to the merge queue by commenting
/merge./merge -ccancels the queue request./merge -f --reason "reason"skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.