Today, while performance-tuning my application with JProfiler, I noticed that a synchronized HashMap was behaving ridiculously slowly in a multi-threaded environment. That shocked me, because we perform mostly read operations and very few writes on the map, so there is little real need for a synchronized HashMap (Collections.synchronizedMap(map)); still, if a write does happen on an unsynchronized map, it could create a problem in my application.
That left me with only two options: go to an unsynchronized map, or use the Java concurrency API that has been available since Java 1.5. I decided to give the latter a try and did some research on Google. I was overwhelmed by the support for ConcurrentHashMap, so I tried replacing the synchronized HashMap in my application with a ConcurrentHashMap. The swap went through without a hitch and, to my surprise, the bottlenecks JProfiler had been showing me earlier were gone.
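For anyone wanting to try the same swap, here is a minimal sketch of what it looks like (the String key and value types are just placeholders for whatever your map actually holds):

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Before: every call on the returned map synchronizes on a single lock.
    Map<String, String> before = Collections.synchronizedMap(new HashMap<>());

    // After: a drop-in replacement for most uses; reads do not block, and
    // writes only contend when they hit the same part of the table.
    Map<String, String> after = new ConcurrentHashMap<>();

    // One caveat to check before swapping: unlike synchronizedMap over a
    // HashMap, ConcurrentHashMap rejects null keys and null values.
    // before.put("key", null);  // allowed
    // after.put("key", null);   // throws NullPointerException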
Below are my findings from researching ConcurrentHashMap.
ConcurrentHashMap is a largely overlooked class. Not many people know about it, and not many care to use it. The class offers a very robust and fast way of synchronizing access to a Map.
I have read a few comparisons of HashMap and ConcurrentHashMap on the web. Let me just say that they are totally misguided. There is no meaningful way to compare the two: one offers thread-safe access to a map, while the other offers no synchronization whatsoever.
What most of us fail to notice is that while our applications, web applications especially, work fine during the development and testing phases, they usually go belly up under heavy (or even moderately heavy) load. This is because we expect our HashMaps to behave a certain way, but under concurrent load they misbehave.
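You can see that misbehaviour for yourself by hammering a plain HashMap from two threads. This throwaway sketch (the class name and loop counts are mine, purely for illustration) routinely loses entries, and on older JDKs a concurrent resize could even send a thread into an infinite loop:

    import java.util.HashMap;
    import java.util.Map;

    public class HashMapUnderLoad {
        public static void main(String[] args) throws InterruptedException {
            Map<Integer, Integer> map = new HashMap<>();  // no synchronization at all

            Runnable writer = () -> {
                for (int i = 0; i < 100_000; i++) {
                    map.put(i, i);                        // races with the other thread
                }
            };

            Thread t1 = new Thread(writer);
            Thread t2 = new Thread(writer);
            t1.start(); t2.start();
            t1.join();  t2.join();

            // Both threads wrote the same 100,000 keys, so the size should be
            // exactly 100000; racing updates on the same buckets routinely
            // make it come out smaller (or worse).
            System.out.println(map.size());
        }
    }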
Hashtables offer concurrent access to their entries, with one small caveat: the entire map is locked to perform any sort of operation.
While this overhead is negligible in a web application under normal load, under heavy load it can lead to delayed response times and overtaxing of your server for no good reason.
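The whole-map lock is easy to demonstrate: every public method of Hashtable synchronizes on the table itself, so a thread holding that monitor stalls every other caller. A contrived sketch (the class name and sleep duration are mine):

    import java.util.Hashtable;

    public class HashtableLockDemo {
        public static void main(String[] args) throws InterruptedException {
            Hashtable<String, String> table = new Hashtable<>();
            table.put("key", "value");

            // A reader that just polls the table a few times.
            Thread reader = new Thread(() -> {
                for (int i = 0; i < 5; i++) {
                    table.get("key");   // blocks while main holds the monitor
                    System.out.println("read " + i);
                }
            });

            synchronized (table) {
                reader.start();
                // Every Hashtable method synchronizes on the table itself, so
                // the reader cannot complete a single get() until this block
                // releases the monitor.
                Thread.sleep(1000);
            }
            reader.join();
        }
    }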
This is where ConcurrentHashMap steps in. It offers all the thread-safety of Hashtable with performance almost as good as a plain HashMap, and it accomplishes this through a fairly simple mechanism.
Instead of a map-wide lock, the collection maintains 16 locks by default, each of which guards one segment of the map, i.e. a portion of its buckets. This effectively means that 16 threads can modify the collection at the same time, as long as they are all working on different segments. In fact, almost no operation on this collection locks the entire map; even size() first tries to compute its answer without locking, and only falls back to locking all the segments if it cannot get a consistent count.
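The number of stripes is tunable. The three-argument constructor takes a concurrencyLevel hint, which in the pre-Java-8 implementation directly sets the number of segments (the capacity and load factor below are just example numbers):

    import java.util.concurrent.ConcurrentHashMap;

    ConcurrentHashMap<String, String> map =
            new ConcurrentHashMap<>(64,     // initial capacity
                                    0.75f,  // load factor
                                    32);    // concurrency level: 32 segments
                                            // instead of the default 16

Setting the level much higher than the number of threads that actually write concurrently just wastes space and time, which is why 16 is a sensible default for most applications.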
Retrieval operations on a ConcurrentHashMap do not block in the common case. The one exception is reading an entry whose value appears as null, which can only mean the entry is still being initialized by another thread; in that case the map locks the relevant segment and looks the value up again, in case the entry was fully published right after the unsynchronized read.
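The read path of the pre-Java-8 implementation makes this concrete. What follows is my simplified paraphrase of the segment's get logic, not the actual JDK source:

    // Simplified paraphrase of the pre-Java-8 Segment.get() (not the real source).
    V get(Object key, int hash) {
        HashEntry<K, V> e = getFirst(hash);       // unsynchronized read of the bucket head
        while (e != null) {
            if (e.hash == hash && key.equals(e.key)) {
                V v = e.value;                    // the value field is volatile
                if (v != null)
                    return v;                     // common case: no locking at all
                return readValueUnderLock(e);     // rare: entry mid-construction,
                                                  // re-read under the segment lock
            }
            e = e.next;
        }
        return null;                              // a miss never locks either
    }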
Removal operations do carry a bit of overhead. Because the links between entries are immutable in this implementation, a removal clones the portion of the chain that precedes the removed element and joins those clones to the untouched portion that follows it. And since each entry's value field is volatile, a thread already traversing the bucket that reaches the removed element sees its value as null and knows to ignore it.
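Again as a paraphrase of the pre-Java-8 source rather than the real thing, the chain restructuring looks roughly like this (the next links are final, which is why the predecessors must be cloned rather than re-pointed):

    // Paraphrased sketch of the pre-Java-8 removal of entry e from its bucket.
    // Entries after e are reused as-is; entries before e are cloned in
    // reverse order, each clone pointing at the chain built so far.
    HashEntry<K, V> newFirst = e.next;            // keep the tail untouched
    for (HashEntry<K, V> p = first; p != e; p = p.next)
        newFirst = new HashEntry<>(p.key, p.hash, newFirst, p.value);
    tab[index] = newFirst;                        // publish the rebuilt chain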
Traversal of a ConcurrentHashMap does not synchronize on the entire map either. In fact, traversal does not synchronize at all except under one condition: the internal linked-list implementation is aware of changes to the underlying collection, and if it detects any such changes during traversal it synchronizes on the bucket it is traversing and re-reads the values. This ensures that the values received are fresh while keeping locking minimal, if it happens at all.
Iterators over a ConcurrentHashMap behave a little differently from those offered by other collections. They are not fail-fast, in the sense that they do not throw a ConcurrentModificationException. They also do not guarantee that elements added after the iterator's creation will be seen. They do, however, guarantee that updates and removals will be reflected correctly in their behaviour, and that no element is returned more than once during traversal.
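Here is a small, self-contained demo of that weak consistency (the class name and map contents are mine, purely for illustration):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class WeaklyConsistentIteration {
        public static void main(String[] args) {
            Map<String, Integer> map = new ConcurrentHashMap<>();
            map.put("a", 1);
            map.put("b", 2);
            map.put("c", 3);

            for (String key : map.keySet()) {
                // Mutating a plain HashMap mid-iteration would throw
                // ConcurrentModificationException; a ConcurrentHashMap
                // iterator carries on without complaint.
                map.remove("b");
                map.put("d", 4);   // may or may not be seen by this loop
                System.out.println(key);
            }
        }
    }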
Happy Coding :)