Originally Posted on https://chetangupta.net/strings/
In any programming language, computation always involves numeric (int, float, double, etc.), boolean, and string data. These are often described as core or primitive data types, and you need them even when creating complex structures or classes. Knowing them in detail helps you design better code and choose the right combination of data to get accurate output from your functions and programs.
In this article we will dive deep into the concept of Strings: how they work in Java and what operations you can perform on them. So, without further ado, let's get started!
What is a string in Java?
The first questions that should pop into your mind are: what is a string, and what does it look like? The answers are quite simple. Just as in English, a string is nothing but a collection of characters (letters and symbols) put together. It doesn't matter whether the string makes sense or not; as long as it is a group of characters, it is a string. For example, “This is a sample String” and even “zxczxvc adsad qwe” are both strings.
Different programming languages represent strings in different ways. Some write them as ‘hello world’ and others as “hello world”; the point to note is that a string can be delimited by single quotes or double quotes. Also, by design, a language may not support a string type directly; it might support a character as the primitive type, with strings formed by combining characters. In languages that have a separate character type, such as Java, a character is written in single quotes and a string in double quotes.
Anyway, enough with the background. Coming back to the language we are here to learn, which is (ta-da!) Java: String here is not a primitive data type, even though it comes packaged with the language. Java does support a primitive character type, char. Let's look at an example :
Example of Character type :
char thisIsa = 'a';
char thisIsA = 'A';
Just for your information, thisIsa is not equal to thisIsA. There is a lot to learn about characters, but we won't dig deep into them here (you can learn more about them from here → chetangupta.net).
Example of String type :
String thisIsA = "A";
String thisIsa = "a";
The same goes here: thisIsa is not equal to thisIsA, as Strings are case sensitive. Hope you now have the gist and some background about strings…
If so, riddle me this?
char thisIsA = 'A';
String thisIsAlsoA = "A";
Since we have said that a String is a sequence of characters, will the String variable thisIsAlsoA be equal to thisIsA? If you are curious, think it over; if you can't work it out, stick around till the end of the article…
How to create a string object?
Let’s go a layer deeper into the String creation process in Java. Creating a String using double quotes, as we have done so far, is called creating a String literal.
Example :
String a = "A";
String greet = "Hello World!";
String pokemon = "Pikachu";
The other way is to create a String object explicitly. Since object creation in Java is done using the new operator, we create the string like this: String apple = new String("apple");
Some more example :
String a = new String("A");
String greet = new String("Hello World!");
String pokemon = new String("Pikachu");
You might wonder what the difference is, and why you would create a String with new when it looks lengthier. To answer that, you need to understand what is actually happening under the hood in the Java memory model.
When we create a String literal, the JVM looks in the String pool to check whether the same value already exists. If it is found, the JVM returns a reference to the existing object; if not, it creates a new String object with the given value and stores it in the String pool.
P.S : To learn more about String Pool checkout chetangupta.net.
When we create an object using the new operator, two things happen: the literal value we passed is checked and placed in the String pool as usual, but a separate String object is also created in heap memory outside the pool. This method therefore does more work, and the other thing to keep in mind is that it constructs a new object every time.
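The difference between pooled literals and objects made with new can be observed by comparing references. Here is a minimal sketch, assuming a hypothetical helper called sameObject (it is not part of the article's original examples) that wraps the == reference check:

```java
public class StringPoolDemo {
    // Hypothetical helper: true only when both variables point at the very same object.
    static boolean sameObject(String left, String right) {
        return left == right; // reference comparison, not a content comparison
    }

    public static void main(String[] args) {
        String literalOne = "Pikachu";
        String literalTwo = "Pikachu";              // reuses the pooled object
        String constructed = new String("Pikachu"); // always a fresh heap object

        System.out.println(sameObject(literalOne, literalTwo));           // true
        System.out.println(sameObject(literalOne, constructed));          // false
        System.out.println(literalOne.equals(constructed));               // true, same characters
        System.out.println(sameObject(literalOne, constructed.intern())); // true, intern() returns the pooled copy
    }
}
```

This is also why string contents should be compared with equals() rather than ==.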
Okay, knowing this background is good, but how does it explain the use case for the new operator?
Let’s understand :
A String literal is a String object, but a String object is not always a String literal. A literal represents a fixed value, i.e. a constant. Note, however, that every String in Java is immutable, whether it came from a literal or from new String(...): once created, its contents cannot be modified. We will discuss immutability in more detail later. In practice new String(...) is rarely needed; it only matters when you explicitly want a distinct object rather than the pooled one.
Thus, whenever we are dealing with text that requires a lot of updates, neither a String literal nor new String(...) is the right tool; we need a mutable type.
Always prefer StringBuffer or StringBuilder for strings that require lots of modifications. Go for StringBuilder in single-threaded code: it is the newer of the two (added in Java 5), offers the same methods, and is faster because it is not synchronized, whereas StringBuffer is synchronized for use across threads.
Example :
StringBuilder greetBuilder = new StringBuilder("");
greetBuilder.append("hello");
greetBuilder.append(" ");
greetBuilder.append("World");
greetBuilder.append("!");
String greet = greetBuilder.toString();
System.out.println(greet); // output : hello World!
Hopefully you now have an understanding of the role of String objects vs. String literals.
Strings Operations and Methods :
Now here comes the interesting part: let’s explore the operations we can perform on the String data type. In the snippets that follow, println(...) is shorthand for System.out.println(...).
Length of String:
If you want to know the length of a string, i.e. the total number of characters in the String, use the length() method on the string. Example :
String apple = "Apple";
println(apple.length()); // output --> 5
Character at Index :
Since a String is a sequence of characters, we can get the character at a given index using charAt(index). Indexes start at 0.
Example :
// if we want to get `l` from string apple then
String apple = "Apple";
println(apple.charAt(3)); // output --> `l`
Note : Strings are immutable, i.e. you can read the value at an index but cannot update it; there is no setCharAt(index)-like method on String.
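If you do need to change a single character, one option is to go through a StringBuilder, which does have setCharAt. The sketch below assumes a hypothetical helper named withCharAt, introduced here only for illustration; it returns a new String and leaves the original untouched:

```java
public class SetCharDemo {
    // Hypothetical helper: returns a copy of text with the character at index replaced.
    static String withCharAt(String text, int index, char replacement) {
        StringBuilder builder = new StringBuilder(text); // mutable copy of the characters
        builder.setCharAt(index, replacement);           // StringBuilder allows in-place edits
        return builder.toString();                       // back to an immutable String
    }

    public static void main(String[] args) {
        String apple = "Apple";
        String changed = withCharAt(apple, 0, 'a');

        System.out.println(changed); // apple
        System.out.println(apple);   // Apple (the original is unchanged)
    }
}
```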
Concatenation/Adding of Strings :
We can add two strings together into a single string, which is called concatenation, using the concat() method.
Note : the concatenated string we get from the operation is a new string, because Strings are immutable.
Example :
String apple = "Apple";
String orange = "Orange";
println(apple.concat(orange)); // output --> AppleOrange
println(orange.concat(apple)); // output --> OrangeApple
SubString :
Substring means taking a part of a string based on given indexes. There are two overloaded methods for this :
substring(int startIndex) : String : returns the string from the start index to the end of the string
substring(int startIndex, int endIndex) : String : returns the string from the start index up to, but not including, the end index
Example :
String orange = "Orange";
println(orange.substring(/*StartIndex*/4)); // output --> ge
println(orange.substring(/*StartIndex*/1,/*EndIndex*/5)); // output --> rang
Contains :
We use contains() to check whether a String contains another string. Note: its parameter type is CharSequence, so it accepts a String as well as other character sequences such as a StringBuilder (learn more about that from here: chetangupta.net).
Example :
String orange = "Orange";
println(orange.contains("range")); // output --> true
println(orange.contains("apple")); // output --> false
Join Strings :
Join, as the name suggests, joins strings together, but you might wonder what the difference is between join and concat.
- Just look at the parameters they receive: join accepts any number of strings, whereas concat works with only two strings at a time.
- Join lets you provide a delimiter/separator, i.e. the symbol to place between the strings being merged.
- Unlike concat, join is a static method on the String class.
Example :
String apple = "Apple";
String orange = "Orange";
String berry = "StrawBerries";
String separator = "|";
println(String.join(separator,apple,orange,berry));
// output --> Apple|Orange|StrawBerries
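String.join also has an overload that takes a collection instead of individual arguments, which is handy when the number of strings is only known at runtime. A small sketch, assuming a hypothetical helper named joinFruits that is not part of the original examples:

```java
import java.util.Arrays;
import java.util.List;

public class JoinListDemo {
    // Hypothetical helper: joins every element of the list with the given separator.
    static String joinFruits(List<String> fruits, String separator) {
        return String.join(separator, fruits); // overload accepting an Iterable
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("Apple", "Orange", "StrawBerries");

        System.out.println(joinFruits(fruits, "|"));  // Apple|Orange|StrawBerries
        System.out.println(joinFruits(fruits, ", ")); // Apple, Orange, StrawBerries
    }
}
```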
Comparing two Strings:
When we perform a comparison, we are concerned with whether two items are equal, whether the first item is greater than (>) the second, or whether the first item is smaller than (<) the second.
In programming, a comparison function returns an integer value to represent these scenarios:
if item1 == item2 then returns 0,
if item1 > item2 then returns a positive number (often 1),
if item1 < item2 then returns a negative number (often -1),
For Strings, the comparison is lexicographic (dictionary order), i.e.
if string1 == string2 then returns 0
if string1 > string2 then returns a positive number (string 1 comes later in dictionary order)
if string1 < string2 then returns a negative number (string 1 comes earlier in dictionary order)
Java gives us two methods for comparing strings:
compareTo() : compares two strings and is case sensitive.
compareToIgnoreCase() : compares two strings and is case insensitive.
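Let's see an example. Note that in Java the non-zero result is not necessarily 1 or -1: for "Apple" versus "apple" it is the difference between the first pair of differing characters, 'A' (65) and 'a' (97), hence -32.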
// case sensitive comparison
String apple = "Apple";
println(apple.compareTo("Apple")); // output --> 0
println(apple.compareTo("apple")); // output --> -32
// case-insensitive comparison, reusing the apple variable declared above
println(apple.compareToIgnoreCase("Apple")); // output --> 0
println(apple.compareToIgnoreCase("apple")); // output --> 0
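Comparison results are what sorting is built on. The sketch below shows how the two orderings sort the same array differently; the helper names sign, sortedNatural and sortedIgnoringCase are hypothetical and only serve this illustration:

```java
import java.util.Arrays;

public class CompareDemo {
    // Hypothetical helper: collapses any comparison result to -1, 0 or 1.
    static int sign(int comparison) {
        return Integer.signum(comparison);
    }

    // Sorts a copy using compareTo (case sensitive, upper case sorts before lower case).
    static String[] sortedNatural(String[] input) {
        String[] copy = input.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Sorts a copy using the same ordering as compareToIgnoreCase.
    static String[] sortedIgnoringCase(String[] input) {
        String[] copy = input.clone();
        Arrays.sort(copy, String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sign("Apple".compareTo("apple"))); // -1
        System.out.println(sign("apple".compareTo("Apple"))); // 1

        String[] fruits = {"banana", "apple", "Cherry"};
        System.out.println(Arrays.toString(sortedNatural(fruits)));      // [Cherry, apple, banana]
        System.out.println(Arrays.toString(sortedIgnoringCase(fruits))); // [apple, banana, Cherry]
    }
}
```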
Casing in String :
We can control the casing of a String's characters using two methods :
toUpperCase() : converts all of the characters in the string to upper case,
toLowerCase() : similarly, converts all of the characters in the string to lower case.
String apple = "ApPlE";
println(apple); // output --> ApPlE
println(apple.toUpperCase()); // output --> APPLE
println(apple.toLowerCase()); // output --> apple
Stripping White Space :
String values can often have white space at the beginning or at the end, and we can use the trim() method to remove it. Note : trim() only strips leading and trailing white space; spaces between characters inside the string are left alone.
String appleTrimmed = " apple ".trim();
println(appleTrimmed); // output --> apple
println(appleTrimmed.length()); // output --> 5
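To see that only the outer white space is removed, here is a minimal sketch; the helper name tidy is hypothetical and simply wraps trim():

```java
public class TrimDemo {
    // Hypothetical helper: strips leading and trailing white space only.
    static String tidy(String raw) {
        return raw.trim();
    }

    public static void main(String[] args) {
        String padded = "  red apple  ";
        String tidied = tidy(padded);

        System.out.println(tidied);          // red apple
        System.out.println(tidied.length()); // 9, the inner space is still there
        System.out.println(padded.length()); // 13, the original is unchanged
    }
}
```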
Replace Character in String :
Use replace() if you want to replace an old character, or sequence of characters, with a new one. Every occurrence is replaced.
Note : Keep in mind that replacing a character in a string is not the same as setting the character at some index in a string. The rule that Strings are immutable applies here too, so when you replace a character you get an entirely new string with the replacement applied.
Example :
String apple = "apple";
println(apple.replace("a","pinea")); // output --> pineapple
String Concatenation using “+” operator
We have already seen two ways to join strings, concat and join, but a very common pattern you will encounter for concatenating strings is the “+” operator.
For two strings, the “+” operator gives the same result as the concat function. Example :
String apple = "apple";
String orange = "Orange";
String berry = "StrawBerries";
println(apple.concat(orange).concat(berry)); // output --> appleOrangeStrawBerries
println(apple + orange + berry); // output --> appleOrangeStrawBerries
Which one should we use?
In our opinion, when comparing “concat()” and “+”, the “+” operator is more readable, but be careful when using it with numeric data.
Example :
int orangeCount = 5;
int berryCount = 5;
println("fruits count:" + orangeCount + berryCount);
// wrong output --> fruits count:55
println("fruits count:" + (orangeCount + berryCount));
// right output --> fruits count:10
But as we suggested, “+” has the same issue as concat: since Strings are immutable, every step creates a new string. When you are building a string through many appends, for example inside a loop, a better and faster way is to use StringBuilder.
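As a rough illustration of that advice, the sketch below builds a string inside a loop with a single StringBuilder instead of repeated “+”. The helper name repeatWithBuilder is hypothetical and not from the original article:

```java
public class BuilderLoopDemo {
    // Hypothetical helper: appends word to one builder the requested number of times.
    static String repeatWithBuilder(String word, int times) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < times; i++) {
            builder.append(word); // mutates the builder, no intermediate Strings
        }
        return builder.toString(); // one final immutable String
    }

    public static void main(String[] args) {
        System.out.println(repeatWithBuilder("ab", 3)); // ababab
    }
}
```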
Creating Format Strings
We have covered a lot of the core functionality of strings, and we can clearly see a use case in which we append strings to other strings, floats, integers, etc. What if we could put placeholder values in a string and have them replaced at runtime with the values we want?
This is what the String.format() function does. We create a string with placeholders/specifiers and populate them with values; such strings are also called template strings.
We use different specifiers for different data types. The most commonly used ones are :
- %b for boolean,
- %s for Strings,
- %d for integers (whole numbers only)
- %f for floating point numbers
and there are others, which you can read more about here : chetangupta.net
Example :
println(String.format("%s has %d apple, and %d oranges", "John Doe", 3, 2));
// output --> John Doe has 3 apple, and 2 oranges
Note : values are placed into the template positions in the order in which they appear in the function's parameters.
Compared to concat or the plus operator, this method is useful when you are working with very long strings. Performance-wise it is not as fast as StringBuilder, because it needs to parse the entire format string and its parameters before generating the result.
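The other specifiers work the same way. Here is a small sketch using %s, %f and %b together; the helper name report is hypothetical, and Locale.ROOT is passed only so that the decimal separator is a dot regardless of the machine's settings:

```java
import java.util.Locale;

public class FormatDemo {
    // Hypothetical helper: fills a template with a name, a score and a flag.
    static String report(String name, double score, boolean passed) {
        // %.2f keeps two digits after the decimal point
        return String.format(Locale.ROOT, "%s scored %.2f, passed: %b", name, score, passed);
    }

    public static void main(String[] args) {
        System.out.println(report("John Doe", 91.456, true));
        // John Doe scored 91.46, passed: true
    }
}
```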
Escape character in Strings
While working with a long string, if we want to break it into two lines, the most obvious thing that comes to mind is to press enter and continue the string on the next line. But if you try that, the compiler will greet you with a syntax error, for instance :
String message = "
I want to make
very lengthy String!
";
To get this kind of behaviour you need to use an escape sequence: put \n in the string wherever you want a new line.
String message = "I want to make\nvery lengthy String!";
/*
* Output :
* I want to make
* very lengthy String!
*/
Escaped Character :
\t → add horizontal tab
\n → add new line
\' → add single quotes in String
\" → add double quotes in String
\\ → add backslash in the String
Examples :
String message = "Hello\tWorld!";
println(message); // output --> Hello    World! (the words are separated by a tab)
message = "Hello\nWorld!";
println(message);
/*
* Output -->
* Hello
* World!
*/
message = "Hello\'World\'!";
println(message); // output --> Hello'World'!
message = "Hello\"World!\"";
println(message); // output --> Hello"World!"
message = "Hello\\World!\\";
println(message); // output --> Hello\World!\
Java Strings are Immutable
If you have read the entire article thoroughly, you have heard us repeat this statement again and again. That is because it is the main mechanism by which Java Strings function.
The key reasons for making String immutable are caching, performance, and safety.
String is the most widely used data structure in any program, and the JVM optimizes the amount of memory allocated for Strings by storing only one copy of each String literal in the String pool.
Caching String literals and reusing them saves a lot of heap space, because different String variables refer to the same object in the String pool; this saves a crucial memory resource and improves the performance of the application in general.
An added advantage is that immutability makes Strings thread-safe: in a concurrent environment they can be shared and read by multiple threads safely, because no thread can change them.
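Immutability is easy to observe: every method that appears to change a String actually hands back a new one. A minimal sketch, assuming a hypothetical helper named shout that is not part of the original article:

```java
public class ImmutabilityDemo {
    // Hypothetical helper: returns an upper-cased copy with an exclamation mark appended.
    static String shout(String text) {
        return text.toUpperCase() + "!"; // builds new Strings, never edits text
    }

    public static void main(String[] args) {
        String greeting = "hello";
        String loud = shout(greeting);

        System.out.println(loud);     // HELLO!
        System.out.println(greeting); // hello (the original is untouched)
    }
}
```

Because the caller's value can never change behind its back, the same String can be handed to any number of methods or threads without copying it first.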
Bonus : Riddle Solution
char thisIsA = 'A';
String thisIsAlsoA = "A";
Is thisIsAlsoA equal to thisIsA? This was the riddle we asked.
If you thought yes, then sorry mate, you are wrong! Conceptually you are right, but the issue is that Java is a statically typed language, not a dynamically typed one, so types play a major role here. Since String and char are different types, the comparison fails: thisIsAlsoA == thisIsA does not even compile, and thisIsAlsoA.equals(thisIsA) returns false. It wouldn’t have failed in a language like JavaScript, which has no separate character type.
One more thing… a String is a collection of chars, right? We can convert our String into a character array using toCharArray(). Since an array is a collection, that justifies our statement programmatically, and if we compare the element at the first index of the array with the character, they are equal. Nice!
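The riddle can be checked in code. This is a sketch rather than the only way to do it; the helper name firstCharMatches is hypothetical and introduced just for this illustration:

```java
public class RiddleDemo {
    // Hypothetical helper: compares the first character of text with expected.
    static boolean firstCharMatches(String text, char expected) {
        char[] characters = text.toCharArray(); // the String as an array of char
        return characters.length > 0 && characters[0] == expected;
    }

    public static void main(String[] args) {
        char thisIsA = 'A';
        String thisIsAlsoA = "A";

        // thisIsAlsoA == thisIsA would not compile: String and char are incomparable types.
        System.out.println(thisIsAlsoA.equals(thisIsA));                 // false, the char is boxed to a Character
        System.out.println(firstCharMatches(thisIsAlsoA, thisIsA));      // true
        System.out.println(thisIsAlsoA.equals(String.valueOf(thisIsA))); // true, both sides are Strings now
    }
}
```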
Conclusion
Let's have a small recap :
- We learned that a String is a sequence of characters
- In Java we can create a String in two ways: String literals and String objects
- There are various operations we can perform on strings: concat, length, compare, and many more
- What the special escape characters in Strings are
- Most importantly, why Strings are immutable!
We hope you have learned a lot and untangled the mysteries and doubts around Java Strings. Until next time, happy hacking!