String vs StringBuffer vs StringBuilder and String.replaceAll vs String.replace

Tan Quang

Ranch Hand

Posts: 284

posted 3 years ago

As far as I know:
String is immutable.
StringBuffer is synchronous (slow).
StringBuilder is asynchronous (fast).
As far as I understand, string concatenation with the "plus" operator is turned into StringBuilder calls at compile time. For example:

Will become:
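
A minimal sketch of the transformation being described, using hypothetical method and variable names. It reflects what javac 8 and earlier emit; from Java 9 onward javac instead emits an invokedynamic call to StringConcatFactory (JEP 280), so the StringBuilder form below is only an approximation there.

```java
// Hypothetical names, for illustration only.
class ConcatDesugar {
    // What you write in source code.
    static String withPlus(String first, String second) {
        return first + " " + second;
    }

    // Roughly what javac 8 and earlier generate for the method above.
    static String withBuilder(String first, String second) {
        return new StringBuilder().append(first).append(" ").append(second).toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("Hello", "World"));
        System.out.println(withBuilder("Hello", "World"));
    }
}
```

Both methods return the same text; the difference is only in how the compiler expresses the concatenation.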

Also according to this question: Difference between StringBuilder and StringBuffer. Performance will be String < StringBuffer < StringBuilder.
But in my test, StringBuffer is faster than StringBuilder. This is the code I use:

public class Main {
    public static final String TIP_MICRO_OF_SYSTEM = "[SYSTEM] ";
    public static final String TIP_ITEM_NOTICE_BY_FIFTTEEN_FLASH_END = "All the essence of heaven and earth gathered here, helping [%p] have created [%e +%s] successfully. [%p] so divine!";

    public static void main(String[] args) {
        // Test 1: three chained replaceAll() calls on the template
        long startTime = System.currentTimeMillis();
        String con = new StringBuilder(TIP_MICRO_OF_SYSTEM).append(TIP_ITEM_NOTICE_BY_FIFTTEEN_FLASH_END.replaceAll("%p", "aaa").replaceAll("%s", String.valueOf(100)).replaceAll("%e", "BBB")).toString();
        System.out.println("String replace is: " + (System.currentTimeMillis() - startTime) + "ms");

        // Test 2: String concatenation with += in a loop
        startTime = System.currentTimeMillis();
        String str = "Hello";
        for (int i = 0; i < 100000; i++) {
            str += "Word";
        }
        System.out.println("String is: " + (System.currentTimeMillis() - startTime) + "ms");

        // Test 3: StringBuffer.append() in a loop
        startTime = System.currentTimeMillis();
        StringBuffer buffer = new StringBuffer("Hello");
        for (int i = 0; i < 100000; i++) {
            buffer.append("Word");
        }
        System.out.println("StringBuffer is: " + (System.currentTimeMillis() - startTime) + "ms");

        // Test 4: StringBuilder.append() in a loop
        startTime = System.currentTimeMillis();
        StringBuilder builder = new StringBuilder("Hello");
        for (int i = 0; i < 100000; i++) {
            builder.append("Word");
        }
        System.out.println("StringBuilder is: " + (System.currentTimeMillis() - startTime) + "ms");
    }
}

Result:

String replace is: 2ms
String is: 3300ms
StringBuffer is: 4ms
StringBuilder is: 6ms

Why is StringBuffer faster than StringBuilder in this case?
_______________________________________________________________________________
The next question, according to these two questions: Replace all occurrences of substring in a string - which is more efficient in Java? and String.replaceAll is considerably slower than doing the job yourself.
As can be seen, String.replaceAll() and String.replace() both use internal regex and String.replace() will always generate a new string every time it is called. And on top of that, both deliver poor performance, although String.replace() is still rated as delivering better performance than String.replaceAll() - not to mention third-party libraries like StringUtils.replace() or StringBuilder.replace() (which takes start and end indexes rather than a search string).
So, is that the only difference between these two methods? In simple replacement cases (as shown in the example above), which method gives better performance?
According to this answer: Apache StringUtils vs Java implementation of replace()
It seems that there is no performance difference between String.replace() and StringUtils.replace()?

Mike Simmons

Master Rancher

Posts: 5291

posted 3 years ago

Tan Quang wrote:Why is StringBuffer faster than StringBuilder in this case?

I don't believe it is. First, your test case is far too short to measure a meaningful value. You should run the code many more times to get meaningful results. Second, when I run the code myself, I get faster results for StringBuilder, not StringBuffer. Maybe it depends on which JDK version you're using. If you're still using JDK 8 or whatever, it may be that there was some bug in the code that was fixed a long time ago - I don't know.

Tan Quang wrote:The next question, according to these two questions: Replace all occurrences of substring in a string - which is more efficient in Java? and String.replaceAll is considerably slower than doing the job yourself.
As can be seen, String.replaceAll() and String.replace() both use internal regex

No. String.replace() doesn't use any regular expression. It treats the first argument as a literal. No regex involved. You may be confusing this method with other methods like replaceAll() and replaceFirst(). They did a bad job naming these methods, because they're not consistent. I always have to double-check the documentation to see which ones use regex and which don't.
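
A small sketch of the literal-versus-regex difference, using a made-up input string. The class and method names are hypothetical; the String and Pattern calls are the standard JDK ones.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the String/Pattern calls are real JDK API.
class ReplaceVsReplaceAll {
    // replace() treats "." as a literal dot.
    static String literal(String s) {
        return s.replace(".", "-");
    }

    // replaceAll() treats "." as a regex that matches any character.
    static String regex(String s) {
        return s.replaceAll(".", "-");
    }

    // Quoting the pattern makes replaceAll() behave like a literal replacement.
    static String quotedRegex(String s) {
        return s.replaceAll(Pattern.quote("."), "-");
    }

    public static void main(String[] args) {
        System.out.println(literal("a.b.c"));     // a-b-c
        System.out.println(regex("a.b.c"));       // -----
        System.out.println(quotedRegex("a.b.c")); // a-b-c
    }
}
```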

Tan Quang wrote:and String.replace() will always generate a new string every time it is called.

So will replaceAll(). So will any String method, pretty much.

Tan Quang wrote:And on top of that, both deliver poor performance

Compared to what? They don't do the same thing. Getting the optimal performance out of code with multiple replacements can be complicated - not every method is optimized for all situations.

Tan Quang wrote:although String.replace() is still rated as delivering better performance than String.replaceAll()

Probably because it doesn't use any regex. If you don't need regex, don't use it, it slows you down for simple cases. But if you need it (and it can be very useful), then use it.

You also have a test case at the beginning that is labeled as testing String replace(), but the code is using replaceAll(). It's also doing completely different things than your other test cases, so it doesn't meaningfully compare to anything else.

Tim Holloway

Saloon Keeper

Posts: 29101

posted 3 years ago

Tan Quang wrote:
StringBuffer is synchronous (slow).
StringBuilder is asynchronous (fast).

No. StringBuffer is synchronized, not synchronous. That is, its services are managed by a lock so that it is thread-safe. StringBuilder is the same as StringBuffer except that its services do not implement the Java synchronized locking. Synchronous and asynchronous are better off used to describe threads than resources.

StringBuilder is faster than StringBuffer because the act of testing, acquiring and releasing a lock is extra logic that StringBuilder does not have. Less logic, faster time for the same overall algorithm, as a rule. As Mike said, your sampling size isn't large enough to show the difference accurately, though.

Where the StringBuffer gets a lot slower is when multiple threads are all trying to use the same StringBuffer at the same time and perforce some threads must sleep while the current thread uses the resource.

As to why StringBuffer and Vector were implemented with blocking abilities and StringBuilder and java List came later, that's a mystery to me. Perhaps they were used in that capacity somewhere in the core JVM.

The problem with getting rid of the "undesirables" is that sooner or later someone will decide that YOU are an undesirable.

Mike Simmons

Master Rancher

Posts: 5291

posted 3 years ago

Tim Holloway wrote:Where the StringBuffer gets a lot slower is when multiple threads are all trying to use the same StringBuffer at the same time and perforce some threads must sleep while the current thread uses the resource.

That's pretty rare, though - why would anyone need to do that? Though if they do, then (a) they do need some sort of locking, but (b) StringBuffer probably isn't doing the locking at the right level of granularity.

Tim Holloway wrote:As to why StringBuffer and Vector were implemented with blocking abilities and StringBuilder and java List came later, that's a mystery to me. Perhaps they were used in that capacity somewhere in the core JVM.

I think they were excited about the idea of having "synchronized" as a catch-all solution to threading problems, and thought it was a good idea. Over time they realized that the implementations in StringBuffer and Vector were badly thought out, as far as thread safety, allowing people to think they were writing "thread-safe" code when in fact they were not. Incidentally they also made the code slower. So they promoted new non-thread-safe versions instead. When people need thread safety, they're better off putting it in themselves, usually at a higher level.
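
One way to read "putting it in themselves, at a higher level" is sketched below: a plain StringBuilder guarded by a lock that covers a whole compound operation. The class, its methods, and the names "Ann" and "Bob" are hypothetical and chosen only for illustration.

```java
// Hypothetical example: thread safety applied per logical operation,
// not per append() call.
class GreetingLog {
    private final StringBuilder text = new StringBuilder();

    // All three appends run under one lock, so no other writer
    // can slip in between them.
    synchronized void addGreeting(String name) {
        text.append("Hello ").append(name).append('\n');
    }

    synchronized String snapshot() {
        return text.toString();
    }

    // Two threads write concurrently; every line still comes out whole.
    static String fillFromTwoThreads(int perThread) {
        GreetingLog log = new GreetingLog();
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < perThread; i++) log.addGreeting("Ann");
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < perThread; i++) log.addGreeting("Bob");
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log.snapshot();
    }

    public static void main(String[] args) {
        System.out.print(fillFromTwoThreads(3));
    }
}
```

The order of the lines varies from run to run, but each line is always a complete greeting.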

Tim Holloway

Saloon Keeper

Posts: 29101

posted 3 years ago

Mike Simmons wrote:
That's pretty rare, though - why would anyone need to do that?

Beats the heck out of me. I suppose there would be a benefit if you had one big character buffer that you built all your strings in and didn't want the overhead that making a new buffer or builder for each String creation would require in a multi-threaded environment. Other than that… ???

Mike Simmons

Master Rancher

Posts: 5291

posted 3 years ago

Yeah, I have a hard time imagining a situation where that would ever work, really. Aside from the fact that you probably wouldn't see any performance benefit at all even in a single-threaded case, the synchronization of StringBuffer is at the wrong level. One thread could be calling

while another is calling

and the result could be "Hello Goodbye, Blue SkyWorld!" Which goes back to why I hate StringBuffer and Vector, they synchronize at too low a level to be useful in most cases. You can occasionally share a mutable Hashtable in a thread-safe manner without additional synchronization. Maybe. But even that is rare.
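
A sketch of how that interleaving can arise, assuming the two calls were a pair of appends in one thread and a single append in the other. The class name and the exact calls are assumptions; only the three strings come from the result quoted above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo: each append() is synchronized on its own,
// but the pair of appends in the first thread is not atomic.
class InterleavedAppends {
    // The only three results the two threads below can produce.
    static final List<String> POSSIBLE = Arrays.asList(
        "Hello World!Goodbye, Blue Sky",
        "Goodbye, Blue SkyHello World!",
        "Hello Goodbye, Blue SkyWorld!");

    static String run() {
        StringBuffer shared = new StringBuffer();
        Thread first = new Thread(() -> {
            shared.append("Hello ");  // lock taken and released here
            shared.append("World!");  // and again here
        });
        Thread second = new Thread(() -> {
            shared.append("Goodbye, Blue Sky");
        });
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Most runs will print one of the two tidy orderings; the mixed one is rare but permitted, which is exactly the problem.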

Tan Quang

Ranch Hand

Posts: 284

posted 3 years ago

Mike Simmons wrote:I don't believe it is. First, your test case is far too short to measure a meaningful value. You should run the code many more times to get meaningful results. Second, when I run the code myself, I get faster results for StringBuilder, not StringBuffer. Maybe it depends on which JDK version you're using. If you're still using JDK 8 or whatever, it may be that there was some bug in the code that was fixed a long time ago - I don't know.

You are right. I didn't test them on my own computer; I tested them with online compiler tools: jdoodle (JDK 17.0.1), tutorialspoint (unknown JDK version), programiz (unknown JDK version), and onlinegdb (unknown JDK version).
Results: StringBuilder was faster than StringBuffer on jdoodle (JDK 17.0.1) and tutorialspoint (unknown JDK version), and the results were the opposite on programiz (unknown JDK version) and onlinegdb (unknown JDK version).

Mike Simmons wrote:No. String.replace() doesn't use any regular expression. It treats the first argument as a literal. No regex involved. You may be confusing this method with other methods like replaceAll() and replaceFirst(). They did a bad job naming these methods, because they're not consistent. I always have to double-check the documentation to see which ones use regex and which don't.

Mike Simmons wrote:So will replaceAll(). So will any String method, pretty much.

Mike Simmons wrote:Compared to what? They don't do the same thing. Getting the optimal performance out of code with multiple replacements can be complicated - not every method is optimized for all situations.

Mike Simmons wrote:Probably because it doesn't use any regex. If you don't need regex, don't use it, it slows you down for simple cases. But if you need it (and it can be very useful), then use it.

Mike Simmons wrote:You also have a test case at the beginning that is labeled as testing String replace(), but the code is using replaceAll(). It's also doing completely different things than your other test cases, so it doesn't meaningfully compare to anything else.

According to the String.java source code (lines 2209 and 2227), String.replaceAll() and String.replace() both call replaceAll() internally; the difference is that String.replace() uses Pattern.LITERAL for the first argument.
So, when replacing all occurrences of a string that contains no regex constructs (\d, \s,...), as in my example code, String.replace() would give better performance than String.replaceAll(), right? And if so, which is better: String.replace() or StringUtils.replace()?
When multiple strings need to be replaced, as in the example code, String.replaceAll() and String.replace() will do better than StringUtils.replace(). I haven't thought of a StringUtils.replace() implementation for this case! Perhaps create an array of strings to search for and an array of replacement strings, then use StringUtils.replaceEach() (but that is not StringUtils.replace())?
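
For the multi-placeholder case, here is a JDK-only sketch that fills all placeholders in one pass instead of chaining three calls. It is not StringUtils.replaceEach(); the class name, the `fill` method, and the `%[pes]` pattern are assumptions made for this example, and whether it is actually faster would need to be measured.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: one scan over the template, one lookup per placeholder.
class PlaceholderReplace {
    // Matches the three placeholders used in the example template: %p, %e, %s.
    private static final Pattern PLACEHOLDER = Pattern.compile("%[pes]");

    static String fill(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        // appendReplacement(StringBuffer, ...) exists in every Java version.
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Unknown placeholders are left as they are.
            String value = values.getOrDefault(m.group(), m.group());
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("%p", "aaa");
        values.put("%e", "BBB");
        values.put("%s", "100");
        // prints: [aaa] created [BBB +100]. [aaa] so divine!
        System.out.println(fill("[%p] created [%e +%s]. [%p] so divine!", values));
    }
}
```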

Tan Quang

Ranch Hand

Posts: 284

posted 3 years ago

Tim Holloway wrote:No. StringBuffer is synchronized, not synchronous. That is, its services are managed by a lock so that it is thread-safe. StringBuilder is the same as StringBuffer except that its services do not implement the Java synchronized locking. Synchronous and asynchronous are better off used to describe threads than resources.

StringBuilder is faster than StringBuffer because the act of testing, acquiring and releasing a lock is extra logic that StringBuilder does not have. Less logic, faster time for the same overall algorithm, as a rule. As Mike said, your sampling size isn't large enough to show the difference accurately, though.

Where the StringBuffer gets a lot slower is when multiple threads are all trying to use the same StringBuffer at the same time and perforce some threads must sleep while the current thread uses the resource.

As to why StringBuffer and Vector were implemented with blocking abilities and StringBuilder and java List came later, that's a mystery to me. Perhaps they were used in that capacity somewhere in the core JVM.

Yes, you're right, I confused synchronized with synchronous! As for whether it is used in the core of the JVM, I'm not sure...

Stephan van Hulst

Bartender

Posts: 15743

posted 3 years ago

As I already mentioned in one of your other threads, you may not draw conclusions from any benchmark written using currentTimeMillis(), nanoTime() or any timer classes.

If you didn't use a microbenchmark framework to get your timings, they are wrong.

Since you seem to be very preoccupied with micro-optimizations, at least use the proper tools to test them. I strongly recommend jmh.

Tim Cooke

Marshal

Posts: 6298

posted 3 years ago

JMH seconded for microbenchmarking. Do not attempt to roll your own, ever.

Tim Driven Development | Test until the fear goes away

Campbell Ritchie

Marshal

Posts: 82158

posted 3 years ago

But why would anybody want to do micro‑benchmarking on such code in the first place?

Tim Holloway

Saloon Keeper

Posts: 29101

posted 3 years ago

Campbell Ritchie wrote:But why would anybody want to do micro‑benchmarking on such code in the first place?

Because they work on the JVM development teams tuning the String classes at Oracle, IBM, OpenJPA, et al.?

But as for the rest of us, probably not.

Mike Simmons

Master Rancher

Posts: 5291

posted 3 years ago

Re: replace() using replaceAll(), interesting. That code seems to be from Java 1.6 or 1.7, since it has methods added in 1.6 but not 1.8. I was looking at code from JDK 18 and JDK 11, and it does not use replaceAll() internally. I would guess that they found they could optimize it better that way. If you're still using Java 8 you should look at the source for Java 8. But this also points to another issue when measuring performance - it can vary from Java version to Java version, and also from OS to OS and computer to computer.

Tan Quang

Ranch Hand

Posts: 284

posted 3 years ago

Stephan van Hulst wrote:As I already mentioned in one of your other threads, you may not draw conclusions from any benchmark written using currentTimeMillis(), nanoTime() or any timer classes.

If you didn't use a microbenchmark framework to get your timings, they are wrong.

Since you seem to be very preoccupied with micro-optimizations, at least use the proper tools to test them. I strongly recommend jmh.

Tim Cooke wrote:JMH seconded for microbenchmarking. Do not attempt to roll your own, ever.

Going by everyone's advice, it seems I can no longer trust these timing figures; they are unreliable anyway.
So StringBuffer being faster than StringBuilder in my test code is (perhaps, or surely) due to the difference between Java versions?! I'm not sure about this, though, because I checked the Java version used on onlinegdb with this code:
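
A version check of this kind can be as small as the sketch below, which assumes the standard `java.version` and `java.vendor` system properties are what was printed. The class and method names are hypothetical.

```java
// Hypothetical helper; the two system property keys are standard JDK ones.
class ShowJavaVersion {
    static String describe() {
        return System.getProperty("java.version")
            + " (" + System.getProperty("java.vendor") + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```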

Although the Java version there is 11.0.4, StringBuffer is still faster than StringBuilder in the example code I gave.
So, my conclusion on the first question: concatenating strings using the "plus" operator, although easier to read, yields the worst performance. Therefore, it is necessary to avoid string concatenation using the "plus" operator, and it is advisable to use StringBuffer or StringBuilder (on a case-by-case basis) for better performance?!
_____________________________________________________________________________________________
Anyway, using String.replaceAll() is not advisable: it is too expensive and performs poorly in simple string replacement cases like the current one.
I have read this question and answer: Commons Lang StringUtils.replace performance vs String.replace
They did not rely only on micro-benchmarking figures to draw conclusions; they also analyzed the internal source code and revision histories of both methods. Based on that answer, String.replace() was greatly improved in Java 9 and outperforms StringUtils.replace() in later versions of Java. What that answer omits is that the changes are from OpenJDK, not Oracle JDK. Oracle JDK has long been closed source, and its changes, although partially publicized, are still quite "ambiguous", so it is impossible to be sure whether String.replace() is really improved in Oracle JDK as it is in OpenJDK!?
I'm wondering whether it's necessary to write a separate method to replace the string. Anyway, the improved String.replace() method is only available from Java 9; Java 8 and below still perform worse than StringUtils.replace(). According to loukili's answer, and the JMH-based micro-benchmark results for that code in qxo's answer that follows it, it's probably pretty good for simple string replacements on Java 8 and below (but not "friendly" when many places need to be replaced, as in my example code (3 positions)).

Tan Quang

Ranch Hand

Posts: 284

posted 3 years ago

Mike Simmons wrote:Re: replace() using replaceAll(), interesting. That code seems to be from Java 1.6 or 1.7, since it has methods added in 1.6 but not 1.8. I was looking at code from JDK 18 and JDK 11, and it does not use replaceAll() internally. I would guess that they found they could optimize it better that way. If you're still using Java 8 you should look at the source for Java 8. But this also points to another issue when measuring performance - it can vary from Java version to Java version, and also from OS to OS and computer to computer.

Yes, String.replace() has improved quite a bit since Java 9, but that's in OpenJDK; the situation with Oracle JDK is pretty "ambiguous"!

Mike Simmons

Master Rancher

Posts: 5291

posted 3 years ago

Tan Quang wrote:Although the Java version there is 11.0.4, StringBuffer is still faster than StringBuilder in the example code I gave.

The example you gave is still far too short to give meaningful results. I just tested it with JDK 8 and 11 using 100000000 repetitions, and StringBuilder was the clear winner on my machine - though as I repeated it more and more, the results got closer and closer. I agree with everyone encouraging you to use jmh - your tests are close to meaningless, as it is.

Tan Quang wrote:So, my conclusion on the first question: concatenating strings using the "plus" operator, although easier to read, yields the worst performance. Therefore, it is necessary to avoid string concatenation using the "plus" operator, and it is advisable to use StringBuffer or StringBuilder (on a case-by-case basis) for better performance?!

Well, that's stated far too generally to be true in all cases. There are many cases where concatenating with + is equally fast as the others. Maybe sometimes faster. But it depends how it's used in code. The one big thing to avoid is using + inside a loop in cases where you're appending to a string that keeps growing in size. The fundamental problem is that the intermediate result is forced to be a String each time - which means that you're going to be spending a lot of time copying characters into new char[] arrays each time you append. You get basically the same performance from each of these:
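
A sketch of three loop forms that behave this way, assuming a five-character string is appended on each pass. The class and method names are hypothetical and the exact snippets may have differed; what the three share is that every iteration produces a brand-new, longer String.

```java
// Hypothetical demo: three spellings of the same quadratic pattern.
class GrowingStringLoops {
    static String plusEquals(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += "Hello";
        }
        return s;
    }

    static String plus(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s = s + "Hello";
        }
        return s;
    }

    // A StringBuilder is used, but a new one is created on every iteration
    // and immediately turned back into a String, so nothing is saved.
    static String builderPerIteration(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s = new StringBuilder(s).append("Hello").toString();
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(plusEquals(3));
        System.out.println(plus(3));
        System.out.println(builderPerIteration(3));
    }
}
```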

The problem is not concatenation with +, but the fact that you're creating a new immutable object (String) on each iteration of the loop, and (importantly) it's getting bigger each time. That's what makes this bad performance.

Conversely, if you are not doing this in a loop, and if the result is not getting bigger each time, then it probably doesn't matter whether you use + or StringBuilder. There's no reason to scare people off from using + in general - but be very careful when it's in a loop.

As for your concerns about OpenJDK and what's in Oracle's code... sigh. You can check the actual source code by looking at src.zip, or using a decent IDE like IntelliJ which will show you a decompiled version. You can also worry about many, many other valid Java releases out there, if you like. There are a lot. I think you'll find that in many cases, improvements from OpenJDK make it into other versions as well.

Tim Holloway

Saloon Keeper

Posts: 29101

posted 3 years ago

Mike Simmons wrote:
The problem is not concatenation with +, but the fact that you're creating a new immutable object (String) on each iteration of the loop, and (importantly) it's getting bigger each time. That's what makes this bad performance.

I don't know if there's any particular badness about making larger and larger String instances as such. When you discard a String in favor of a new String, whether larger or smaller, it's not like there's going to be immediate storage reclamation. The garbage collection process has its own schedule.

The concatenation operator ("+") and concatenation method calls are basically the same thing except that one is defined in the language and the other is an explicit method call. If the compiler can do concatenations at compile-time, it will do so in either event. This is an optimization technique known as constant folding.

As I recall, yes, there was some penalty for using "+" in early Java releases, but it was repaired long ago.

You probably won't suffer too much building strings from explicitly concatenating 2 or 3 other strings. But if it gets more complex than that - and especially if type conversion (say Integer-to-String) is involved, use a StringBuilder.

And always remember. Optimization isn't doing what you "know" is efficient, it's doing what you've measured to be efficient. And if it isn't enough difference to measure, don't bother. Your efficiency is more important than the machine's.

Mike Simmons

Master Rancher

Posts: 5291

posted 3 years ago

Tim Holloway wrote:I don't know if there's any particular badness about making larger and larger String instances as such. When you discard a String in favor of a new String, whether larger or smaller, it's not like there's going to be immediate storage reclamation. The garbage collection process has its own schedule.

Ignoring garbage collection for the moment, the key issue is that in the three code examples I showed, each String is being built by adding something to the old String. Each time that happens, that means a new String has a char[] array whose contents are (mostly) copied from the previous String. In the code I showed, that's 5 characters copied the first time... and 10 the second... and 15 the third... and 20 the fourth... and 25 the fifth... which adds up quite a bit. It may be double that each time, if you are actually creating a new StringBuilder and then a new String with each step. Regardless, it's O(N^2) performance overall, for a case where it was clearly possible to get O(N) performance by using a single StringBuilder and not re-copying a new one each time:

Sure, some copying of arrays occurs internally as the StringBuilder gets resized now and then. But it's done smartly, such that overall performance is still O(N). It doesn't have to re-copy all the content on every iteration. Because that would be silly.
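
A sketch of the single-builder version, together with the copy count from the arithmetic above (5 + 10 + 15 + ... for five-character appends). The class and method names are hypothetical.

```java
// Hypothetical demo: one StringBuilder reused across the whole loop.
class SingleBuilderLoop {
    static String build(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append("Hello"); // appends in place; no new String per iteration
        }
        return sb.toString(); // one final copy
    }

    // Characters copied by the repeated-concatenation version:
    // 5 + 10 + 15 + ... + 5n, which grows with n squared.
    static long charsCopiedByRepeatedConcat(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += 5L * i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(build(3));                       // HelloHelloHello
        System.out.println(charsCopiedByRepeatedConcat(5)); // 75
    }
}
```

For five iterations the repeated-concatenation form copies 75 characters to produce a 25-character result, and the gap widens quickly as the loop count grows.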

Campbell Ritchie

Marshal

Posts: 82158

posted 3 years ago

Tim Holloway wrote:. . . I don't know if there's any particular badness about making larger and larger String instances as such. When you discard a String in favor of a new String, whether larger or smaller, it's not like there's going to be immediate storage reclamation. The garbage collection process has its own schedule. . . .

But once the String gets to sizes like 10⁸ characters, GC will be necessary every few runs of the loop

Optimization isn't doing what you "know" is efficient, it's doing what you've measured to be efficient. . . .

And isn't one of the best ways to mess up your performance to try to be too clever about optimisations?

Tim Holloway

Saloon Keeper

Posts: 29101

posted 3 years ago

Mike Simmons wrote:the key issue is that in the three code examples I showed, each String is being built by adding something to the old String.

Well, yes. I was talking about string concatenation in general, not making a rule and computing the performance of a specific case. Also note that I'm not assuming that the text copying is necessarily the major consumer of resources. Just constructing a String is no minor feat, regardless of the length of its contents.

Nor am I assuming that GC requires a certain memory threshold to kick off. Last time I heard, GC was running as a gradual and ongoing process, not as a stop-everything-and-recover atomic operation, as it infamously did in the Microsoft Basic multi-tasking demo app for the Commodore Amiga. GC has progressed a long ways since 1985.

Actually the case of constructing a mega-string by repeatedly concatenating the same smaller string over and over isn't exactly the most common case. The main place you'd see it might be something like creating a long separator line for printed reports, except that Java isn't employed all that often for printed reports. Further, consider what the following code might produce:

Now consider this:

The second case would almost certainly be optimised at compile time into a single assignment of a constructed String of 100 stars --- constant folding. But the first example could result in the exact same thing depending on how determined the compiler was, thanks to another operation known as loop unrolling. Loop unrolling takes advantage of the fact that, for a small fixed number of iterations, the iterate-test-and-branch is overhead that can be eliminated in favor of simply replicating the loop body multiple times with no testing or branching.

Campbell Ritchie

Marshal

Posts: 82158

posted 3 years ago

Tim Holloway wrote: . . . The second case would almost certainly be optimised at compile time into a single assignment of a constructed String of 100 stars --- constant folding. . . . .

Agree; it would actually be executed at compile time and a String of 100 stars created.
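
Constant folding can be observed directly, because a folded constant expression is the very same interned object as the equivalent literal. The sketch below uses three stars instead of 100 and hypothetical names. It shows only what javac does: javac leaves the loop as a loop, and anything the JIT compiler does with it at run time is a separate matter that this check does not reveal.

```java
// Hypothetical demo of compile-time constant folding.
class ConstantFoldingDemo {
    // A constant expression: javac folds it into the single literal "***".
    static final String FOLDED = "*" + "*" + "*";

    // True because both sides refer to the same interned constant.
    static boolean foldedIsSameObjectAsLiteral() {
        return FOLDED == "***";
    }

    // Not a constant expression: the String is created at run time.
    static String builtInLoop() {
        String s = "";
        for (int i = 0; i < 3; i++) {
            s += "*";
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(foldedIsSameObjectAsLiteral()); // true
        System.out.println(builtInLoop().equals("***"));   // true: same characters
        System.out.println(builtInLoop() == "***");        // false: a new object
    }
}
```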

Stephan van Hulst

Bartender

Posts: 15743

posted 3 years ago

Tan Quang wrote:StringBuffer being faster than StringBuilder in my test code is (perhaps, or surely) due to the difference between Java versions?!

No. It's probable that by the time execution reaches the code that tests StringBuffer, the JVM has warmed up enough that it appears this code runs faster and it has nothing to do with using StringBuffer over StringBuilder. Swap the two around and see if it changes anything. It's exactly for this reason I told you to use jmh, because it deals with situations like this.

So, my conclusion on the first question: concatenating strings using the "plus" operator, although easier to read, yields the worst performance. Therefore, it is necessary to avoid string concatenation using the "plus" operator, and it is advisable to use StringBuffer or StringBuilder (on a case-by-case basis) for better performance?!

Again, you're drawing this conclusion prematurely. It's likely that using the + operator will give you worse performance than using a StringBuilder, when used in a loop with many iterations. Does that mean you need to avoid it at all costs? No.

1) If you're not building a string inside a loop, you might as well use +, because the performance benefits of using StringBuilder only really become apparent when you are performing many edits to get the final result.

2) Even if you are building a string in a loop using the + operator, the performance penalty may not be apparent because that portion of your code is not called often. If performance is a worry, use a profiler to identify hot paths in your code before you start optimizing.

Anyway, using String.replaceAll() is not advisable: it is too expensive and performs poorly in simple string replacement cases like the current one.

Who cares? Stop worrying. Start profiling. You're wasting so much time worrying about stuff that might be inconsequential. Also, as you've already seen, improvements may be made to methods that you previously considered "too expensive", and rules you've made for yourself may no longer hold.

Seriously, the next time you write an application and worry about the performance of a small part of your code, run a profiler and see how much time is spent there in total.

________________________________________________________________________________________

Now, for those interested, I wrote a couple of benchmarks in jmh to test the performance characteristics of +, StringBuilder.append() and StringBuffer.append().

import java.util.concurrent.*;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.*;
import org.openjdk.jmh.runner.options.*;

@Fork          (1)
@BenchmarkMode (Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State         (Scope.Thread)
public class Benchmark1_AppendOnce {

  private String initialValue = Constants.INITIAL_VALUE;

  private String stringToAppend = Constants.STRING_TO_APPEND;

  @Benchmark
  public StringBuilder test0_establishBaseline() {
    return new StringBuilder(initialValue);
  }

  @Benchmark
  public String test1_useConcatenationOnce() {
    return initialValue + stringToAppend;
  }

  @Benchmark
  public StringBuilder test2_useStringBuilderOnce() {
    return new StringBuilder(initialValue).append(stringToAppend);
  }

  @Benchmark
  public StringBuffer test3_useStringBufferOnce() {
    return new StringBuffer(initialValue).append(stringToAppend);
  }

  public static void main(String... args) throws RunnerException {
    var options = new OptionsBuilder()
      .include(Benchmark1_AppendOnce     .class.getSimpleName())
      .include(Benchmark2_AppendManyTimes.class.getSimpleName())
      .build();

    new Runner(options).run();
  }
}

Benchmark             Mode  Cnt    Score     Error  Units
establishBaseline     avgt    5  102,926 ±  10,111  ns/op
useConcatenationOnce  avgt    5   98,797 ±   9,018  ns/op
useStringBuilderOnce  avgt    5  100,541 ±   5,908  ns/op
useStringBufferOnce   avgt    5  105,161 ±   3,027  ns/op

As you can see, for a single invocation, the execution time for both string concatenation and calling the append() methods is almost identical to the baseline. That means that most execution time is used for allocation of the builder/buffer that is used for the concatenation.

"But Stephan, isn't it unfair to create a builder/buffer object in all benchmarks, except in useConcatenationOnce()?"

No. String concatenation requires execution time to create the resulting String object. The comparison is fair because I didn't call toString() on the builder/buffer used in the other benchmarks, something that you likely would have to do in an actual application.

Conclusion: For concatenating two strings once, performance-wise IT REALLY DOESN'T MATTER. So just use + for clarity.

Let's take a look at repeated concatenation:

import java.util.concurrent.*;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.*;

@Fork          (1)
@BenchmarkMode (Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State         (Scope.Thread)
public class Benchmark2_AppendManyTimes {

  private static final int ITERATIONS = 10_000;

  private String initialValue = Constants.INITIAL_VALUE;

  private String stringToAppend = Constants.STRING_TO_APPEND;

  @Benchmark
  @OperationsPerInvocation(ITERATIONS)
  public void test0_estabishBaseline(Blackhole blackhole) {
    for (var i = 0; i < ITERATIONS; i++) {
      blackhole.consume(initialValue);
    }
  }

  @Benchmark
  @OperationsPerInvocation(ITERATIONS)
  public String test1_useConcatenationManyTimes(Blackhole blackhole) {
    var stringToAppendTo = initialValue;

    for (var i = 0; i < ITERATIONS; i++) {
      blackhole.consume(stringToAppendTo += stringToAppend);
    }

    return stringToAppendTo;
  }

  @Benchmark
  @OperationsPerInvocation(ITERATIONS)
  public StringBuilder test2_useStringBuilderManyTimes(Blackhole blackhole) {
    var builderToAppendTo = new StringBuilder(initialValue);

    for (var i = 0; i < ITERATIONS; i++) {
      blackhole.consume(builderToAppendTo.append(stringToAppend));
    }

    return builderToAppendTo;
  }

  @Benchmark
  @OperationsPerInvocation(ITERATIONS)
  public StringBuffer test3_useStringBufferManyTimes(Blackhole blackhole) {
    var bufferToAppendTo = new StringBuffer(initialValue);

    for (var i = 0; i < ITERATIONS; i++) {
      blackhole.consume(bufferToAppendTo.append(stringToAppend));
    }

    return bufferToAppendTo;
  }
}

Benchmark                  Mode  Cnt     Score     Error  Units
estabishBaseline           avgt    5     0,193 ±   0,069  ns/op
useConcatenationManyTimes  avgt    5  2947,531 ± 111,243  ns/op
useStringBuilderManyTimes  avgt    5    12,964 ±   1,157  ns/op
useStringBufferManyTimes   avgt    5    27,201 ±   5,807  ns/op

Here we can see that using string concatenation inside a loop is significantly less efficient than appending to a builder/buffer. We also see that using StringBuffer is roughly twice as expensive as using StringBuilder, which is expected because access to StringBuffer is synchronized.

HOWEVER, we also see that repeated string concatenation costs roughly 30 milliseconds for 10,000 concatenations. So unless your loop is executed in a hot path of your application, you probably won't even notice it if you use + instead of append().
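The arithmetic behind that estimate, as a small sketch. The scores are copied from the result table above, reading the comma as a decimal separator; because the benchmarks use @OperationsPerInvocation(ITERATIONS), each score is the average cost of one loop iteration, so the cost of the whole loop is the score times 10,000. The class name LoopCost and the method totalMillis are made up for this example.

```java
class LoopCost {

    // Converts an average cost per operation (in nanoseconds) into a total in milliseconds.
    static double totalMillis(double nanosPerOp, int operations) {
        return nanosPerOp * operations / 1_000_000.0;
    }

    public static void main(String[] args) {
        int iterations = 10_000; // ITERATIONS in Benchmark2_AppendManyTimes

        System.out.printf("concatenation: %.2f ms%n", totalMillis(2947.531, iterations));
        System.out.printf("StringBuilder: %.2f ms%n", totalMillis(12.964, iterations));
        System.out.printf("StringBuffer:  %.2f ms%n", totalMillis(27.201, iterations));
    }
}
```

That works out to roughly 29.5 ms for the + loop, against about 0.13 ms for StringBuilder and 0.27 ms for StringBuffer.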

Conclusion: USE A PROFILER.

Finally, here is the full output from jmh:

# JMH version: 1.35
# VM version: JDK 18.0.1.1, OpenJDK 64-Bit Server VM, 18.0.1.1+2-6
# VM invoker: C:\Program Files\OpenJDK\jdk-18.0.1.1\bin\java.exe
# VM options: <none>
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.example.performance.Benchmark1_AppendOnce.test0_establishBaseline

# Run progress: 0,00% complete, ETA 00:13:20
# Fork: 1 of 1
# Warmup Iteration   1: 107,704 ns/op
# Warmup Iteration   2: 102,598 ns/op
# Warmup Iteration   3: 110,039 ns/op
# Warmup Iteration   4: 106,790 ns/op
# Warmup Iteration   5: 103,450 ns/op
Iteration   1: 105,113 ns/op
Iteration   2: 104,959 ns/op
Iteration   3: 102,229 ns/op
Iteration   4: 103,613 ns/op
Iteration   5: 98,718 ns/op

Result "com.example.performance.Benchmark1_AppendOnce.test0_establishBaseline":
  102,926 ±(99.9%) 10,111 ns/op [Average]
  (min, avg, max) = (98,718, 102,926, 105,113), stdev = 2,626
  CI (99.9%): [92,815, 113,037] (assumes normal distribution)

# Run progress: 12,50% complete, ETA 00:11:42
# Fork: 1 of 1
# Warmup Iteration   1: 102,408 ns/op
# Warmup Iteration   2: 101,282 ns/op
# Warmup Iteration   3: 101,895 ns/op
# Warmup Iteration   4: 98,684 ns/op
# Warmup Iteration   5: 98,802 ns/op
Iteration   1: 98,174 ns/op
Iteration   2: 97,126 ns/op
Iteration   3: 102,908 ns/op
Iteration   4: 97,570 ns/op
Iteration   5: 98,207 ns/op

Result "com.example.performance.Benchmark1_AppendOnce.test1_useConcatenationOnce":
  98,797 ±(99.9%) 9,018 ns/op [Average]
  (min, avg, max) = (97,126, 98,797, 102,908), stdev = 2,342
  CI (99.9%): [89,780, 107,815] (assumes normal distribution)

# Run progress: 25,00% complete, ETA 00:10:02
# Fork: 1 of 1
# Warmup Iteration   1: 101,870 ns/op
# Warmup Iteration   2: 97,198 ns/op
# Warmup Iteration   3: 100,735 ns/op
# Warmup Iteration   4: 98,022 ns/op
# Warmup Iteration   5: 101,944 ns/op
Iteration   1: 101,735 ns/op
Iteration   2: 100,758 ns/op
Iteration   3: 98,711 ns/op
Iteration   4: 102,252 ns/op
Iteration   5: 99,251 ns/op

Result "com.example.performance.Benchmark1_AppendOnce.test2_useStringBuilderOnce":
  100,541 ±(99.9%) 5,908 ns/op [Average]
  (min, avg, max) = (98,711, 100,541, 102,252), stdev = 1,534
  CI (99.9%): [94,634, 106,449] (assumes normal distribution)

# Run progress: 37,50% complete, ETA 00:08:22
# Fork: 1 of 1
# Warmup Iteration   1: 108,755 ns/op
# Warmup Iteration   2: 106,913 ns/op
# Warmup Iteration   3: 105,012 ns/op
# Warmup Iteration   4: 106,025 ns/op
# Warmup Iteration   5: 106,332 ns/op
Iteration   1: 105,833 ns/op
Iteration   2: 105,052 ns/op
Iteration   3: 105,536 ns/op
Iteration   4: 105,537 ns/op
Iteration   5: 103,848 ns/op

Result "com.example.performance.Benchmark1_AppendOnce.test3_useStringBufferOnce":
  105,161 ±(99.9%) 3,027 ns/op [Average]
  (min, avg, max) = (103,848, 105,161, 105,833), stdev = 0,786
  CI (99.9%): [102,135, 108,188] (assumes normal distribution)

# Run progress: 50,00% complete, ETA 00:06:41
# Fork: 1 of 1
# Warmup Iteration   1: 0,192 ns/op
# Warmup Iteration   2: 0,202 ns/op
# Warmup Iteration   3: 0,238 ns/op
# Warmup Iteration   4: 0,195 ns/op
# Warmup Iteration   5: 0,196 ns/op
Iteration   1: 0,191 ns/op
Iteration   2: 0,206 ns/op
Iteration   3: 0,205 ns/op
Iteration   4: 0,199 ns/op
Iteration   5: 0,163 ns/op

Result "com.example.performance.Benchmark2_AppendManyTimes.test0_estabishBaseline":
  0,193 ±(99.9%) 0,069 ns/op [Average]
  (min, avg, max) = (0,163, 0,193, 0,206), stdev = 0,018
  CI (99.9%): [0,124, 0,261] (assumes normal distribution)

# Run progress: 62,50% complete, ETA 00:05:01
# Fork: 1 of 1
# Warmup Iteration   1: 3128,551 ns/op
# Warmup Iteration   2: 3168,543 ns/op
# Warmup Iteration   3: 2996,954 ns/op
# Warmup Iteration   4: 2983,545 ns/op
# Warmup Iteration   5: 3047,775 ns/op
Iteration   1: 2987,531 ns/op
Iteration   2: 2965,426 ns/op
Iteration   3: 2916,090 ns/op
Iteration   4: 2940,614 ns/op
Iteration   5: 2927,992 ns/op

Result "com.example.performance.Benchmark2_AppendManyTimes.test1_useConcatenationManyTimes":
  2947,531 ±(99.9%) 111,243 ns/op [Average]
  (min, avg, max) = (2916,090, 2947,531, 2987,531), stdev = 28,889
  CI (99.9%): [2836,288, 3058,774] (assumes normal distribution)

# Run progress: 75,00% complete, ETA 00:03:20
# Fork: 1 of 1
# Warmup Iteration   1: 13,632 ns/op
# Warmup Iteration   2: 12,601 ns/op
# Warmup Iteration   3: 13,455 ns/op
# Warmup Iteration   4: 12,679 ns/op
# Warmup Iteration   5: 13,391 ns/op
Iteration   1: 12,674 ns/op
Iteration   2: 13,401 ns/op
Iteration   3: 12,730 ns/op
Iteration   4: 13,123 ns/op
Iteration   5: 12,890 ns/op

Result "com.example.performance.Benchmark2_AppendManyTimes.test2_useStringBuilderManyTimes":
  12,964 ±(99.9%) 1,157 ns/op [Average]
  (min, avg, max) = (12,674, 12,964, 13,401), stdev = 0,300
  CI (99.9%): [11,807, 14,121] (assumes normal distribution)

# Run progress: 87,50% complete, ETA 00:01:40
# Fork: 1 of 1
# Warmup Iteration   1: 29,979 ns/op
# Warmup Iteration   2: 29,033 ns/op
# Warmup Iteration   3: 29,366 ns/op
# Warmup Iteration   4: 27,712 ns/op
# Warmup Iteration   5: 30,098 ns/op
Iteration   1: 27,517 ns/op
Iteration   2: 29,664 ns/op
Iteration   3: 26,433 ns/op
Iteration   4: 25,797 ns/op
Iteration   5: 26,593 ns/op

Result "com.example.performance.Benchmark2_AppendManyTimes.test3_useStringBufferManyTimes":
  27,201 ±(99.9%) 5,807 ns/op [Average]
  (min, avg, max) = (25,797, 27,201, 29,664), stdev = 1,508
  CI (99.9%): [21,393, 33,008] (assumes normal distribution)

# Run complete. Total time: 00:13:23

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.

NOTE: Current JVM experimentally supports Compiler Blackholes, and they are in use. Please exercise
extra caution when trusting the results, look into the generated code to check the benchmark still
works, and factor in a small probability of new VM bugs. Additionally, while comparisons between
different JVMs are already problematic, the performance difference caused by different Blackhole
modes can be very significant. Please make sure you use the consistent Blackhole mode for comparisons.

Benchmark                                                   Mode  Cnt     Score     Error  Units
Benchmark1_AppendOnce.test0_establishBaseline               avgt    5   102,926 ±  10,111  ns/op
Benchmark1_AppendOnce.test1_useConcatenationOnce            avgt    5    98,797 ±   9,018  ns/op
Benchmark1_AppendOnce.test2_useStringBuilderOnce            avgt    5   100,541 ±   5,908  ns/op
Benchmark1_AppendOnce.test3_useStringBufferOnce             avgt    5   105,161 ±   3,027  ns/op
Benchmark2_AppendManyTimes.test0_estabishBaseline           avgt    5     0,193 ±   0,069  ns/op
Benchmark2_AppendManyTimes.test1_useConcatenationManyTimes  avgt    5  2947,531 ± 111,243  ns/op
Benchmark2_AppendManyTimes.test2_useStringBuilderManyTimes  avgt    5    12,964 ±   1,157  ns/op
Benchmark2_AppendManyTimes.test3_useStringBufferManyTimes   avgt    5    27,201 ±   5,807  ns/op

Stephan van Hulst

Bartender

Posts: 15743

368

posted 3 years ago

Ah yes, for completeness, here is the Constants class I used:

Tan Quang

Ranch Hand

Posts: 284

posted 3 years ago

Stephan van Hulst wrote: Who cares? Stop worrying. Start profiling. You're wasting so much time worrying about stuff that might be inconsequential. Also, as you've already seen, improvements may be made to methods that you previously considered "too expensive", and rules you've made for yourself may no longer hold.

In theory, when replacing a simple string that contains no regex syntax, like the one in my example code (line 3), the options are:

StringUtils.replace(): Will probably provide the best performance. But it's not "friendly" in cases where there are multiple placeholders to replace, as in this example code.

String.replace(): This is the best option. Whether on Java 8 and below (where the internal source code is not improved yet and still uses replaceAll() with Pattern.LITERAL) or on newer versions (where the internal source code has been improved), it is still fast and offers better performance than String.replaceAll().

String.replaceAll(): This is the "to avoid" option. On Java 8 and below it is quite similar to String.replace(), but since it does not use Pattern.LITERAL, the performance will not be as good (at least when compared to String.replace()).

Is this true?
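Apart from speed, the two methods also differ in meaning, which a small sketch can show. The class and method names below are made up for illustration. For a search text like "%p", which contains no regex metacharacters, both give the same result; they diverge as soon as the search text contains a character such as ".".

```java
class ReplaceVsReplaceAll {

    // replace() treats the search text as literal characters.
    static String withReplace(String input, String target, String replacement) {
        return input.replace(target, replacement);
    }

    // replaceAll() treats the search text as a regular expression.
    static String withReplaceAll(String input, String regex, String replacement) {
        return input.replaceAll(regex, replacement);
    }

    public static void main(String[] args) {
        String template = "[%p] created [%e +%s]";

        // Same result: "%p" means the same thing as literal text and as a regex.
        System.out.println(withReplace(template, "%p", "aaa"));    // [aaa] created [%e +%s]
        System.out.println(withReplaceAll(template, "%p", "aaa")); // [aaa] created [%e +%s]

        // Different result: as a regex, "." matches any character.
        System.out.println(withReplace("1.5", ".", ","));    // 1,5
        System.out.println(withReplaceAll("1.5", ".", ",")); // ,,,
    }
}
```

So for plain placeholder substitution, String.replace() is also the safer choice, independent of any performance difference.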

Tan Quang

Ranch Hand

Posts: 284

posted 3 years ago

Mike Simmons wrote:The example you gave is still far too short to give meaningful results. I just tested it with JDK 8 and 11 using 100000000 repetitions, and StringBuilder was the clear winner on my machine - though as I repeated it more and more, the results got closer and closer. I agree with everyone encouraging you to use jmh - your tests are close to meaningless, as it is.

In the case of this code:

public class Main {
    public static final String TIP_MICRO_OF_SYSTEM = "[SYSTEM] ";
    public static final String TIP_ITEM_NOTICE_BY_ONE_FLASH_END = "All the essence of heaven and earth gathered here, helping [%p] have created [%e +%s] successfully. [%p] so divine!"; // 1
    ... // skip line from TWO to FOURTEEN
    public static final String TIP_ITEM_NOTICE_BY_FIFTTEEN_FLASH_END = "All the essence of heaven and earth gathered here, helping [%p] have created [%e +%s] successfully. [%p] is legendary!"; // 15
    public static void main(String[] args) {
        byte lvl = getLevel();
        StringBuilder content = new StringBuilder(TIP_MICRO_OF_SYSTEM);
        if(lvl == 1) {
            content.append(Tip.TIP_ITEM_NOTICE_BY_ONE_FLASH_END.replace("%p", getPlayer().getName()).replace("%s", String.valueOf(lvl)).replace("%e", getEquipment().getName()));
        }
        else if(lvl == 2) {
            content.append(Tip.TIP_ITEM_NOTICE_BY_TWO_FLASH_END.replace("%p", getPlayer().getName()).replace("%s", String.valueOf(lvl)).replace("%e", getEquipment().getName()));
        }
        ...
        else if(lvl == 15) {
            content.append(Tip.TIP_ITEM_NOTICE_BY_FIFTTEEN_FLASH_END.replace("%p", getPlayer().getName()).replace("%s", String.valueOf(lvl)).replace("%e", getEquipment().getName()));
        }
        /* String printStr = content.toString();
        System.out.println(printStr); */
        System.out.println(content.toString());
    }
}

If I remember correctly, a String is immutable: once a string is created it is stored in memory, and when the same string is needed again, the one already in memory is used directly; a new one is created only if no such string is in memory yet.

But in the above case, the strings are almost the same, differing only in where they are replaced (and the text at the end). Should I call content.toString() directly as in line 21, or assign it to a variable first and then use that as in lines 19, 20?

Should I use StringBuilder (line 8) and then StringBuilder.append() in this case? Or should I use String directly and concatenate the strings with the "plus" operator?

Paul Clapham

Marshal

Posts: 28532

114

posted 3 years ago

Part of the performance equation is to not write complicated code when simple code would be equivalent. Doing that makes it hard to see what needs to be optimized and what doesn't.

The purpose of the code you posted is simply to concatenate two Strings, although it's hard to realize that. So there is no reason to consider anything except simple String concatenation. Here's my simplified version of what you posted:

Mike Simmons

Master Rancher

Posts: 5291

posted 3 years ago

One nice thing about using replaceAll() is that it can give you the opportunity to replace everything you want in a single pass:

(Here I pretended there was some difference between the different TIP_ITEM_NOTICE_BY_[ x ]_FLASH_END constants. As was probably intended - though Paul correctly observes that they're all the same in the code shown.)

The point is, there's just one replaceAll(), not three consecutive ones. Is that faster? Maybe, maybe not - the overhead of using a Pattern may still be greater. But the more variables you want to replace, the more it makes sense to do them all in one pass. Is it easier to read? That may depend on taste. Again, I think it scales nicely as you add more variables to replace (if there are any).

Admittedly, this is the replaceAll() in Matcher, not one in String. It works similarly, but it's designed to work with a MatchResult on each replacement.
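A self-contained sketch of the single-pass idea, assuming Java 9 or newer for Matcher.replaceAll(Function). This is not the code from the post: the class TemplateFiller, the method fill and the Map of values are stand-ins invented for this example, replacing the getPlayer(), getEquipment() and getLevel() calls used in the thread.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateFiller {

    // Compiled once and reused; matches %p, %e, %s and any other %<lowercase letter>.
    private static final Pattern VARIABLE = Pattern.compile("%([a-z])");

    static String fill(String template, Map<String, String> values) {
        return VARIABLE.matcher(template).replaceAll(match -> {
            // Variables without a value are left as they are.
            String value = values.getOrDefault(match.group(1), match.group());
            // The returned text is interpreted as a replacement string, so any
            // '$' or '\' in a player or item name has to be escaped.
            return Matcher.quoteReplacement(value);
        });
    }

    public static void main(String[] args) {
        String template = "helping [%p] have created [%e +%s] successfully. [%p] so divine!";
        Map<String, String> values = Map.of("p", "aaa", "e", "BBB", "s", "100");

        System.out.println(fill(template, values));
        // helping [aaa] have created [BBB +100] successfully. [aaa] so divine!
    }
}
```

The quoteReplacement() call is the one detail that is easy to miss: without it, a name containing "$" would be read as a group reference.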

Tim Holloway

Saloon Keeper

Posts: 29101

215

posted 3 years ago

Paul Clapham wrote:Part of the performance equation is to not write complicated code when simple code would be equivalent. Doing that makes it hard to see what needs to be optimized and what doesn't.

Also, the compiler understands simple common cases. If you code gnarly performance hacks, it may backfire because the compiler's optimizers may not be able to tune them. Recall what I said is possible with the "100 star" example.

But in the end: Don't optimize unless you have to!!! Hardware time is cheap these days. Developer time is not. We're no longer having to fit apps into 16K on a mainframe whose processor speed is slower than an Apple Watch. Management is not amused when you're tinkering for no obvious profit. Or, as I once told one clever fellow, "It's not like you can collect all those nanoseconds you saved and put them in a jar for a rainy day".

If you DO get yelled at for poorly performing software, THEN profile it in as close to the errant environment as you can. DON'T rely on the wisdom and benchmarks we're reporting here, because a different environment may be involved and the previously-listed optimizations could actually make things worse. Optimize ONLY the parts that are actually hurting. And again, don't get too clever, or you'll end up in a fight with the compiler.

And above all, keep in mind that optimization is best applied from the top. As someone who has spent years in high-performance, high-reliability environments I can say from experience that a wise algorithm selection can blow away clever statement coding tricks by orders of magnitude.

The problem with getting rid of the "undesirables" is that sooner or later someone will decide that YOU are an undesirable.

Tan Quang

Ranch Hand

Posts: 284

posted 3 years ago

Paul Clapham wrote:Part of the performance equation is to not write complicated code when simple code would be equivalent. Doing that makes it hard to see what needs to be optimized and what doesn't.

The purpose of the code you posted is simply to concatenate two Strings, although it's hard to realize that. So there is no reason to consider anything except simple String concatenation. Here's my simplified version of what you posted:

public class Main {
    public static final String TIP_MICRO_OF_SYSTEM = "[SYSTEM] ";
    public static final String[] TIP_ITEM_NOTICE_FLASH_END = {
        "All the essence of heaven and earth gathered here, helping [%p] have created [%e +%s] successfully. [%p] so divine!", // 1
        // skip line from TWO to FOURTEEN
        "All the essence of heaven and earth gathered here, helping [%p] have created [%e +%s] successfully. [%p] is legendary!" // 15
    };

    public static void main(String[] args) {
        byte lvl = getLevel();
        String content = "";
        if (lvl >= 1 && lvl <= 15) {
            content = TIP_MICRO_OF_SYSTEM + TIP_ITEM_NOTICE_FLASH_END[lvl - 1].replace("%p", getPlayer().getName()).replace("%s", String.valueOf(lvl)).replace("%e", getEquipment().getName());
        }
        System.out.println(content);
    }
}

Oh that's right, I hadn't thought of the string array before, because it was a bit difficult to read and a bit difficult to determine the index (in the case of my old code).

Tan Quang

Ranch Hand

Posts: 284

posted 3 years ago

Mike Simmons wrote:One nice thing about using replaceAll() is that it can give you the opportunity to replace everything you want in a single pass:
public static void main(String[] args) {
    System.out.print(TIP_MICRO_OF_SYSTEM);
    System.out.println(customizeTemplate(getEndTemplate()));
}

private static String getEndTemplate() {
    return switch (getLevel()) {
        case 1 -> TIP_ITEM_NOTICE_BY_ONE_FLASH_END;
        case 2 -> TIP_ITEM_NOTICE_BY_TWO_FLASH_END;
        [...]
        case 15 -> TIP_ITEM_NOTICE_BY_FIFTEEN_FLASH_END;
        default -> throw new IllegalArgumentException("Unsupported level: " + getLevel());
    };
}

private static final Pattern templateVariablePattern = Pattern.compile("%([a-z])");

private static String customizeTemplate(String template) {
    return templateVariablePattern.matcher(template).replaceAll(mr -> switch (mr.group(1)) {
        case "p" -> getPlayer().getName();
        case "e" -> getEquipment().getName();
        case "s" -> String.valueOf(getLevel());
        default -> mr.group(0); // no replacement
    });
}
(Here I pretended there was some difference between the different TIP_ITEM_NOTICE_BY_[ x ]_FLASH_END constants. As was probably intended - though Paul correctly observes that they're all the same in the code shown.)

The point is, there's just one replaceAll(), not three consecutive ones. Is that faster? Maybe, maybe not - the overhead of using a Pattern may still be greater. But the more variables you want to replace, the more it makes sense to do them all in one pass. Is it easier to read? That may depend on taste. Again, I think it scales nicely as you add more variables to replace (if there are any).

Admittedly, this is the replaceAll() in Matcher, not one in String. It works similarly, but it's designed to work with a MatchResult on each replacement.

I feel Paul's way looks simpler and shorter. But anyway, thank you for your proposal.
P.S.: It's not that the strings are exactly the same; they differ slightly in the last sentence of each string.

Tim Holloway

Saloon Keeper

Posts: 29101

215

posted 3 years ago

Pattern matching is a fascinating thing.

Typically, you have a source pattern that is compiled before use. You can do a one-shot compile-and-go and that's less coding, but since compiling is overhead, if you intend to do repeated matches, a one-time compile is better.
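A small sketch of the two styles, using a whitespace-collapsing regex chosen only for illustration (the class and method names are made up as well). The documentation of String.replaceAll(regex, repl) states that it yields the same result as Pattern.compile(regex).matcher(str).replaceAll(repl), so the one-shot form pays for a compile on every call.

```java
import java.util.regex.Pattern;

class CompileOnce {

    // One-shot compile-and-go: less code, but the regex is compiled on each call.
    static String oneShot(String input) {
        return input.replaceAll("\\s+", " ");
    }

    // One-time compile: the Pattern is built once and reused for every call.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static String precompiled(String input) {
        return WHITESPACE.matcher(input).replaceAll(" ");
    }

    public static void main(String[] args) {
        String input = "a  b \t c";
        System.out.println(oneShot(input));     // a b c
        System.out.println(precompiled(input)); // a b c
    }
}
```

Whether the saved compile is noticeable depends on how often the method is called; for a handful of calls the one-shot form is perfectly fine.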

The actual match operation is done by running the match pattern as instructions to the matcher which is a finite-state machine. In other words, a specialized bytecode interpreter dedicated to running the matches. It's quite efficient.

I did a quick peek at some of the class sources. In OpenJDK7, the String replaceAll() method actually sets up a Matcher and returns its replaceAll() results. In OpenJDK8, the Matcher's replaceAll() allocates a StringBuffer and replaces into it. Other JRE's may be doing things differently, so Your Mileage May Vary.

Yes, that's what I said. A StringBuffer, not a StringBuilder. Presumably to ensure that the match is atomic.

Paul Clapham

Marshal

Posts: 28532

114

posted 3 years ago

Tan Quang wrote:Oh that's right, I hadn't thought of the string array before, because it was a bit difficult to read and a bit difficult to determine the index (in the case of my old code).

If you're going to have large amounts of text like that then it may be better to use a Properties file. This is essentially a Map<String, String> which is backed by a text file. When you need to change the text or add new text entries then you can just modify the text file, a much simpler process than adding more code to your program.
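A rough sketch of how that could look, under a few assumptions: the key scheme tip.flash.<level>, the class name TipTemplates and the shortened template texts are all invented for this example, and the file contents are inlined as a string so the sketch runs on its own. A real program would read an actual file instead.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class TipTemplates {

    // Stand-in for the contents of a text file (for example a hypothetical tips.properties).
    static final String FILE_CONTENTS = String.join("\n",
        "tip.flash.1=helping [%p] have created [%e +%s] successfully. [%p] so divine!",
        "tip.flash.15=helping [%p] have created [%e +%s] successfully. [%p] is legendary!");

    static Properties load() throws IOException {
        Properties tips = new Properties();
        // With a real file, a Reader for that file would be passed to load() here.
        try (StringReader reader = new StringReader(FILE_CONTENTS)) {
            tips.load(reader);
        }
        return tips;
    }

    // Looks up the template for the level and fills in the placeholders.
    static String tipFor(Properties tips, int level, String player, String equipment) {
        String template = tips.getProperty("tip.flash." + level, "");
        return template.replace("%p", player)
                       .replace("%s", String.valueOf(level))
                       .replace("%e", equipment);
    }

    public static void main(String[] args) throws IOException {
        Properties tips = load();
        System.out.println("[SYSTEM] " + tipFor(tips, 15, "aaa", "BBB"));
    }
}
```

Adding a level then means adding one line to the text file, with no new constant and no new branch in the code.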

Mike Simmons

Master Rancher

Posts: 5291

posted 3 years ago

Tim Holloway wrote:I did a quick peek at some of the class sources. In OpenJDK7, the String replaceAll() method actually sets up a Matcher and returns its replaceAll() results. In OpenJDK8, the Matcher's replaceAll() allocates a StringBuffer and replaces into it. Other JRE's may be doing things differently, so Your Mileage May Vary.

Yes, that's what I said. A StringBuffer, not a StringBuilder. Presumably to ensure that the match is atomic.

I don't think so - like most StringBuffers and StringBuilders, it's used as a local variable only, and so there's no way to access it from any other thread. Synchronization is wasted there. Only a minor waste, since the lock will never be contended, but still a waste. I suspect it's just a case where the person who wrote the method was used to using StringBuffer and just did what they were used to. It looks like by OpenJDK 11 someone corrected it to use StringBuilder.
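To make the point about the local variable concrete, here is a sketch of a find-and-append loop written against the public Matcher API. It is not the JDK source, only an approximation of that kind of loop; the class name ReplaceLoop is made up, and the StringBuilder overloads of appendReplacement() and appendTail() assume Java 9 or newer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceLoop {

    static String replaceAll(Pattern pattern, String input, String replacement) {
        Matcher matcher = pattern.matcher(input);
        // Local variable: no other thread can ever reach it, so the
        // synchronization a StringBuffer would add buys nothing here.
        StringBuilder result = new StringBuilder();
        while (matcher.find()) {
            // Copies the text before the match, then the replacement.
            matcher.appendReplacement(result, replacement);
        }
        // Copies whatever is left after the last match.
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceAll(Pattern.compile("%p"), "[%p] and [%p]", "aaa"));
        // [aaa] and [aaa]
    }
}
```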

Tan Quang

Ranch Hand

Posts: 284

posted 3 years ago

Paul Clapham wrote: If you're going to have large amounts of text like that then it may be better to use a Properties file. This is essentially a Map<String, String> which is backed by a text file. When you need to change the text or add new text entries then you can just modify the text file, a much simpler process than adding more code to your program.

Oh ... I will think about switching to properties files once the code gets too bulky, because there is a lot of code like this.

Tan Quang

Ranch Hand

Posts: 284

posted 3 years ago

Tim Holloway wrote:Pattern matching is a fascinating thing.

Typically, you have a source pattern that is compiled before use. You can do a one-shot compile-and-go and that's less coding, but since compiling is overhead, if you intend to do repeated matches, a one-time compile is better.

The actual match operation is done by running the match pattern as instructions to the matcher which is a finite-state machine. In other words, a specialized bytecode interpreter dedicated to running the matches. It's quite efficient.

I did a quick peek at some of the class sources. In OpenJDK7, the String replaceAll() method actually sets up a Matcher and returns its replaceAll() results. In OpenJDK8, the Matcher's replaceAll() allocates a StringBuffer and replaces into it. Other JRE's may be doing things differently, so Your Mileage May Vary.

Yes, that's what I said. A StringBuffer, not a StringBuilder. Presumably to ensure that the match is atomic.

That's right, on Java <= 8, both String.replace() and String.replaceAll() use replaceAll() internally. This was only improved in Java 9 and above.
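For reference, a sketch of what the regex-based route looks like for a literal replacement. It mirrors the approach described in this thread for Java 8's String.replace() (a Pattern compiled with Pattern.LITERAL plus Matcher.replaceAll()), written here with the public API rather than copied from the JDK; the class and method names are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralReplace {

    // Literal replacement done through the regex engine:
    // Pattern.LITERAL switches off all regex metacharacters in the target,
    // quoteReplacement() switches off '$' and '\' in the replacement.
    static String replaceViaRegex(String input, String target, String replacement) {
        return Pattern.compile(target, Pattern.LITERAL)
                      .matcher(input)
                      .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String template = "[%e +%s]";
        // "+" would be a quantifier in a normal regex; here it is plain text.
        System.out.println(replaceViaRegex(template, "+", "$"));
        System.out.println(template.replace("+", "$"));
    }
}
```

Both lines print the same text. The result is identical to String.replace(); what changed between Java versions is only how much work is done to get there.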

Tan Quang

Ranch Hand

Posts: 284

posted 3 years ago

Mike Simmons wrote:I don't think so - like most StringBuffers and StringBuilders, it's used as a local variable only, and so there's no way to access it from any other thread. Synchronization is wasted there. Only a minor waste, since the lock will never be contended, but still a waste. I suspect it's just a case where the person who wrote the method was used to using StringBuffer and just did what they were used to. It looks like by OpenJDK 11 someone corrected it to use StringBuilder.

True, their use of StringBuffer there is unnecessary.
As described in the StackOverflow question I linked above, a report was submitted asking to change the source code of String.replace(), because using both regex and replaceAll() was "unnecessary" and "expensive". They later revised the source code of String.replace() to be quite similar to the source code of StringUtils.replace() but used StringBuffer instead of StringBuilder.
Both the String.replace() and String.replaceAll() methods have changed a lot from Java 9 onwards, which is sad for Java 8.
