Why Intern? | Overflow |
Interning and substring | Under The Hood |
Interning and the void String | Manual Interning |
The Intern Gotcha | Learning More |
Intern and new | Links |
Intern and garbage Collection |
Empty Strings resulting from String.substring are not automatically interned either. Because of this, on older JVMs where a substring shares its parent's character array (before Java 7 update 6), the resulting empty substring can still indefinitely encumber a long base String, preventing it from being garbage collected.
String s = new String( "hello" ); instead of
String s = "hello"; This is the opposite of interning. You are deliberately creating a duplicate, distinct (identically valued but definitely not interned) hello String object. There are two legitimate uses for doing that:
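This brings up yet another related question. Is s == s.substring( 0 ) compelled to be false? No. The String documentation makes no promise either way, and in OpenJDK substring( 0 ) simply returns s itself, so there the comparison comes out true.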
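A quick way to see what your own JVM does with s.substring( 0 ) is a check like the following sketch. The class and method names are made up for illustration. No expected output is shown for the identity test, since it reflects how the library happens to implement substring rather than a documented guarantee.

```java
public class SubstringIdentity {
    /** True if substring(0) hands back the very same object as s. */
    public static boolean sameObject(String s) {
        return s == s.substring(0);
    }

    /** True if substring(0) has the same characters as s; this always holds. */
    public static boolean sameValue(String s) {
        return s.equals(s.substring(0));
    }

    public static void main(String[] args) {
        String s = "hello";
        System.out.println("same object: " + sameObject(s));
        System.out.println("same value:  " + sameValue(s));
    }
}
```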
One other place you will see new String used legitimately is
String password = new String( jpassword.getPassword() ); getPassword returns a char[], so it is not the silliness it first appears to be. It returns a char[] rather than a String to permit you to zero out the char array after use in high-security situations.
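Once the characters are wrapped in a String, that copy cannot be erased, so code that really cares about security tends to work with the char[] directly. Here is a minimal sketch of that idea. PasswordWipe and checkAndWipe are hypothetical names, and the arrays in main merely stand in for what JPasswordField.getPassword() would return.

```java
import java.util.Arrays;

public class PasswordWipe {
    /**
     * Compares the typed password with the expected one, then overwrites
     * the typed characters so they do not linger in memory.
     */
    public static boolean checkAndWipe(char[] typed, char[] expected) {
        try {
            return Arrays.equals(typed, expected);
        } finally {
            Arrays.fill(typed, '\0'); // wipe the caller's array in place
        }
    }

    public static void main(String[] args) {
        // Stand-in for the char[] that JPasswordField.getPassword() returns.
        char[] typed = { 's', 'e', 'c', 'r', 'e', 't' };
        // Demo only: a real expected password would not come from a String literal.
        boolean ok = checkAndWipe(typed, "secret".toCharArray());
        System.out.println(ok);               // true
        System.out.println(typed[0] == '\0'); // true, the array was cleared
    }
}
```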
Consider a piece of code like this: String s = new String( "Hello" ); The compiler puts the literal "Hello" in the class file in such a way that it will become an interned String when the class is loaded. When you stupidly use new String you create a new String on the heap, one with an address different from the interned version. (In Oracle’s JVM through Java 6, the interned Strings are stored in a special pool of RAM called the perm gen, where the JVM also loads classes and stores natively compiled code; from Java 7 on they live in the ordinary heap. Either way, the interned Strings behave no differently than had they been stored in the ordinary object heap.) Had you written sensible code like this: String s = "Hello"; you would not have created a duplicate String object. You would not have defeated the interning. s would point directly to the interned String "Hello".
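The difference between the literal and the new String copy can be seen with ==, which compares object identity. The following is a small sketch; the class name LiteralVersusNew and the method compare are invented for this example.

```java
import java.util.Arrays;

public class LiteralVersusNew {
    /** Returns four comparisons: identity, equality, after intern, literal vs literal. */
    public static boolean[] compare() {
        String literal = "Hello";          // refers to the interned constant
        String copy = new String("Hello"); // forces a second object on the heap
        return new boolean[] {
            literal == copy,          // different objects
            literal.equals(copy),     // same characters
            literal == copy.intern(), // intern hands back the interned constant
            literal == "Hello"        // same literal, same object
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compare())); // [false, true, true, true]
    }
}
```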
With JDK 1.2+, an interned String can be garbage collected if there are no more references to it and it is not a compile-time constant. This means if you programmatically recreate the String (e.g. with a StringBuilder) and re-intern it, a new, different String object with a different identityHashCode will become the unique master String object. This quirk does not cause any practical problems: when you compare two interned Strings containing the same characters with ==, the result is still always true.
java.lang.OutOfMemoryError: String intern table overflow means you have too many interned Strings. Some older JVMs may limit you to 64K Strings, which leaves perhaps 50,000 for your application. The IBM (International Business Machines) Java 1.1.8 JRE (Java Runtime Environment) has this limit. This is an Error, not an Exception, if you want to catch it. Here is the source for a simple Java program called InternTest.
Also be aware interning inhibits garbage collection of interned Strings.
The collection of Strings registered in this HashMap is sometimes called the String pool. However, they are ordinary Objects and live on the heap just like any other (perhaps in an optimised way, since interned Strings tend to be long lived). The String Object lives on the heap and a reference to it lives in the HashMap. There is no separate pool of interned String objects.
Whenever a String is interned, it is looked up in the HashMap to see if it exists already. If so, the user gets passed a reference to the master copy. Normally he will use that copy in preference to his. His duplicate copy will then likely soon have no references to it and will eventually be garbage collected. If the String has never been seen before, a reference to it will be added to the HashMap and intern will hand him a reference to his own String, now registered as the unique master. Note that the intern process does not make a copy of the String; it just keeps references to the unique master copies.
All the Strings, interned and ordinary, live on the heap. When there are no references left to a String except the intern HashMap registry reference, it becomes eligible for garbage collection, since intern keeps only a weak reference to it.
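The lookup and the weak registry reference described above can be imitated in ordinary Java. The following is only a sketch of the idea, built on WeakHashMap and WeakReference; it is not how any particular JVM implements String.intern, and the class name WeakInterner is hypothetical.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

public class WeakInterner {
    // Keys are held weakly, and each value is only a weak reference back to
    // its own key, so the registry alone never keeps a String alive.
    private final Map<String, WeakReference<String>> registry = new WeakHashMap<>();

    /** Returns the registered master equal to s, registering s itself if there is none. */
    public synchronized String intern(String s) {
        WeakReference<String> ref = registry.get(s);
        if (ref != null) {
            String master = ref.get();
            if (master != null) {
                return master; // seen before: hand back the master copy
            }
        }
        registry.put(s, new WeakReference<>(s)); // first sighting: s becomes the master
        return s;
    }

    public static void main(String[] args) {
        WeakInterner interner = new WeakInterner();
        String a = new String("Victoria");
        String b = new String("Victoria");
        System.out.println(a == b);                                   // false
        System.out.println(interner.intern(a) == interner.intern(b)); // true
    }
}
```

No copy of the String is ever made; the registry only stores references, which matches the behaviour the text describes.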
When you say new String, it is not automatically interned. Thus there may then be duplicates on the heap. If you later use intern on that String, those duplicates won’t be cleaned up. Only when you intern all copies of a String and discard references to the uninterned versions do you maintain but a single copy.
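A common slip is to call intern and throw away its result, which leaves every variable pointing at its own duplicate. The sketch below, with the made-up class name InternReturnValue, shows that only reassigning the references collapses the copies into one.

```java
import java.util.Arrays;

public class InternReturnValue {
    /** Returns identity comparisons before intern, after an ignored intern, and after reassignment. */
    public static boolean[] demo() {
        String a = new String("Vancouver");
        String b = new String("Vancouver");
        boolean before = (a == b);   // two separate heap objects

        a.intern();                  // result discarded: a still points at its own copy
        b.intern();
        boolean ignored = (a == b);  // nothing has changed

        a = a.intern();              // now both variables point at the master copy
        b = b.intern();
        boolean after = (a == b);

        return new boolean[] { before, ignored, after };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // [false, false, true]
    }
}
```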
However, in the most recent JVMs, the interned String cache is now usually implemented in soft-reference fashion, so that interned Strings may become eligible for garbage collection as soon as they are no longer strongly referenced. Here is how you might manage the deduplicating interning process yourself, similar to the way the JVM does it.
For example, let us assume you were reading a CSV file of names and addresses and storing it internally in a Collection of some sort. Since many people live in the same city, RAM will soon become cluttered with hundreds of duplicate String object copies of the names of local cities.
Create a HashMap (not a HashSet) to look up by city a master String object for each city. Every time you get a city, you look it up in the HashMap. If it is there, replace your reference with a reference to the master copy. Your String object duplicate will then become eligible for garbage collection. If it is not in the HashMap, add the city String to the HashMap.
When you are finished adding cities, you can discard the HashMap. The master city Strings you put in the HashMap will still exist, will still be unique, and will still behave as if they had been interned with String.intern, except that those without any other references will become eligible for garbage collection.
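The manual scheme above might look something like this in code. It is a minimal sketch: CityDeduper and dedup are invented names, and the array in main stands in for city fields parsed from a CSV file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CityDeduper {
    // Maps each city name to the one master String object kept for it.
    private final Map<String, String> masters = new HashMap<>();

    /** Returns the master copy for city, registering city itself if it is new. */
    public String dedup(String city) {
        String master = masters.putIfAbsent(city, city);
        return master != null ? master : city;
    }

    public static void main(String[] args) {
        // new String forces distinct objects, the way separately parsed fields would be.
        String[] parsed = {
            new String("Victoria"), new String("Vancouver"), new String("Victoria")
        };
        CityDeduper deduper = new CityDeduper();
        List<String> cities = new ArrayList<>();
        for (String city : parsed) {
            cities.add(deduper.dedup(city));
        }
        System.out.println(parsed[0] == parsed[2]);         // false: duplicates as parsed
        System.out.println(cities.get(0) == cities.get(2)); // true: one master copy kept
    }
}
```

A HashMap is used rather than a HashSet because a HashSet can tell you that an equal String is present but offers no method to hand back the stored instance.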
This page is posted at http://mindprod.com/jgloss/interned.html
Canadian Mind Products