Java Objects and Collections

Everything in Java apart from primitive data types are Objects, every exception, every event, every array extends from java.lang Object.

toString() method

When you want a to read something more meaningful about an object of your own class you can override the toString() method.

overriding toString() example

public class myTest() {
   String classInfo = "This is my new class that overrides the toString method";

   public String toString() {
      return ("This is some information about myTest class" + classInfo);
   }
}

public class Test() {
   public static void main(String[ ] args) {
      myTest m1 = new myTest();

      System.out.println(m1);
      System.out.println(m1.toString());
   }
}

Note: we can put lots more information in the returning String if we wish

Overriding equals()

There are cases when you want to compare two objects that contain the same attributes and because the equals() method in the Object class uses only the = = operator for comparison, the two objects are only considered equal if the references refer to the same object. This method for comparing has a problem when using Hashtables, lets say we have a number of car objects with the same attributes and you put these into a Hashtable, when you search the Hashtable to get a number of objects with the same attributes, you will only retrieve the one which matches the reference you have, this is because the equals() method only uses the = = operator and hence only one object will match within the Hashtable

Not overriding equals()

public class stringTest {
   public static void main(String[] args) {

      Moof one = new Moof(8);
      Moof two = new Moof(8);

      if ( one.equals(two) ) {
         System.out.println("One equals two");
      } else {
         System.out.println("One does not equal Two");   // Will be outputted, they are not equal
      }
   }
}

class Moof {
   private int moofValue;

   // Constructor
   Moof(int val) {
      moofValue = val;
   }

   // Method
   public int getMoofValue() {
      return moofValue;
   }
}

Note: even though objects one and two are the same (same attributes) they are not equal because we use the Object class equals which uses the == operator to compare and as objects one and two do not point to the same reference they are not considered equal

overriding equals() public class stringTest {
   public static void main(String[] args) {

   Moof one = new Moof(8);
   Moof two = new Moof(8);

   if ( one.equals(two) ) {
      System.out.println("One equals two");    // using the overridden equals objects are equal
   } else {
      System.out.println("One does not equal Two");
   }
}

class Moof {
   private int moofValue;

   // Constructor
   Moof(int val) {
      moofValue = val;
   }

   // Method
   public int getMoofValue() {
      return moofValue;
   }

   // equals override, so that we can match objects which have the same attributes
   public boolean equals(Object o) {
      if ((o instanceof Moof) && (((Moof)o).getMoofValue() == this.moofValue)) {
         return true;
      } else {
         return false;
      }
   }
}

Note: now we override equals so that the objects are now equal

The equals() method has a contract which has a number of rules that must be followed (obtained from the Java docs)

equals() and hashCode() are bound together by a joint contract, that states two equal objects should have the same hashcode as well, so to be very safe if you override the equals() method then you should override the hashCode() method.

Overriding hashCode()

The hashcode value of an object is used in collection classes, it's basically a object ID number, however it is not unique. The hashcode is used to store the object in collections such as HashMap and HashSet, this code is also used to locate the object as well.

hash code example String s1 = "Hello";
String s2 = "hello";

System.out.println("The hashcode for " + s1 + " is " + s1.hashCode() );   // results in 69609650
System.out.println("The hashcode for " + s2 + " is " + s2.hashCode() );   // results in 99162322

How do hashcode's work, is that the collection will use a number of buckets that will contain hashcode's that point to data. For example lets say we have four people objects, each object is hashed by using an hashing algorithm and then placed into a bucket

  Key HashCode Algorithm HashCode
  Alex A(1) + L(12) + E(5) + x(24) = 42
  Bob B(2) + o(15) + b(2) = 19
  Dirk D(4) + i(9) + r(18) + k(11) = 42
  Fred F(6) + r(18) + e(5) + d(4) = 33

In real life the hash algorithm is a lot more advanced than this, as we can see Alex and Dirk will go into one bucket, Bob in another and Fred in another, so we have a bucket that contains two people objects which is acceptable. So now when you perform a search you no longer need to check each object, you simple hash the object you want to search for then go straight to that bucket and then only search through that bucket cutting down the number of objects to check against thus speeding up processing., in this case if you searched for Alex we would go straight to bucket 42 then search the two objects in there for Alex (thus only two searches instead of 4).

Now the problems arise if you do not pass the right name to search for (missing letter, upper case instead of lower case, thus the hashcode produced for you search would not match any bucket, now you see why if two objects are considered equal, their hashcode's must also be equal, Otherwise you'd never be able to find the object since the default hashcode method in class Object virtually always comes up with a unique number for each object. so if two objects are equal their hashcode's must be equal as well.

overriding hashcode class HasHash {
   public int x;
   HasHash(int xVal) {
      x = xVal;
   }

   // Override the equals method   
   public boolean equals(Object o) {
      HasHash h = (HasHash) o;      // don't try without instanceof test, use the code from the example above

      if ( h.x == this.x) {
         return true;
      } else {
         return false;
      }
   }

   // Override the hashCode method
   public int hashCode() {
      return (x * 17);             // does not need to complicated - see below
      // return 1234;              // could even just return the same value to make objects equal
   }
}

What we are trying to achieve is a even distribution of data across the buckets, thus speeding up searches. Again the hashCode contract states the following rules

In short this means

Condition Required Not required (but allowed)
x.equals(y) == true x.hashCode() == y.hashCode()  
x.hashCode() == y.hashCode()   x.equals(y) == true
x.equals(y) == false   no hashCode() requirement
x.hashCode() != y.hashCode() x.equals(y) == false  

Collections

There are a few basic operations you'll normally use with collections

There are six core interfaces: Collection, Set, List, SortedSet, Map and SortedMap

There are ten concrete classes

Map Implementations Set Implementations (Collections Class) List Implementations (Collections Class)
HashMap HashSet ArrayList
HashTable LinkedHashSet Vector
TreeMap TreeSet LinkedList
LinkedHashMap    

The collections come in three basic flavors each having subflavors of Sorted, Unsorted, Ordered and Unordered

Ordered means that you iterate through the collection in a specific order (not random), for an example you would iterate through an array starting at index position 0, this does not means that the object you retrieve are in any sorted order.

Sorted means you retrieve the objects that are sorted naturally, what i mean here is Strings objects would be alphabetically ordered, int would be numerically ordered.

List (a list cares about the index)
ArrayList

Think of this as a growable array, it is a ordered collection (uses a index) but it is not sorted, this interface implements a RandomAccess interface

Choose this over a LinkedList when you need fast iteration but aren't as likely to be doing a lot of insertion and deletion

Vector

Vector and Hashtable were the original collections. A Vector is the same as a ArrayList but a Vector methods are synchronized for thread safety. Vectors also implement a RandomAccess interface.

Use a ArrayList if you don't need synchronization at it decreases performance.

LinkedList

LinkedList is ordered by index position, like ArrayList but doubly-linked up.

LinkedLists are ideal for stacks and queues, its also a good candidate for fast insertion and deletion

Set (a set cares about the uniqueness)
HashSet

A HashSet is an unsorted, unordered Set. It uses the hashcode of the object being inserted, so the better the hashcode the better performance you will get.

Use this when you want a collection with no duplicates and you don't really care about the order.

LinkedHashSet

A LinkedHashSet is an ordered version of a HashSet that maintains a doubly-linked List across all elements.

use this collection when you care about the order

TreeSet It uses a Red-Black tree structure that guarantees that elements will be in ascending order, according to the natural order of the elements (String would be a,b,c,d,etc , int would be 1,2,3,4), there will also be no duplicates.
Map (a map cares about unique identifiers)
HashMap

The HashMap gives you an unsorted, unordered Map

Use when you need a Map but don't care about the order, fast updates (key/value pairs), allows one null key and many null values.

Hashtable

Hashtable is the synchronized version of a HashMap

If you don't need synchronization then use a HashMap

LinkedHashMap LinkedHashMap maintains insertion order or last accessed.
TreeMap A TreeMap is a sorted Map by the natural order of elements

A quick reference regarding collections

Class
Map
Set
List
Ordered
Sorted
HashMap
X
No
No
Hashtable
X
No
No
TreeMap
X
Sorted
By natural order or custom comparison rules
LinkedHashMap
X
By insertion order or last access order
No
HashSet
X
No
No
TreeSet
X
Sorted
By natural order or custom comparison rules
LinkedHashSet
X
By insertion order or last access order
No
ArrayList
X
By Index
No
Vector
X
By Index
No
LinkedList
X
By Index
No

For examples on how to create and use collections have a look at data structures.

Utilities Packages has detailed information on how to create and use Vectors, Stacks and Hashtable.

Garbage Collection

Java's garbage collector provides an automatic solution to memory management, in most cases it frees you from having to add memory management logic to your code, the only downside is that you have no control when it runs and when it doesn't.

The purpose of the garbage collector is to find and delete any objects that are no longer in use (Java calls them reachable) and free the memory so that it can be used again. The garbage collector is under the control of the JVM, only the JVM decides when it runs. You can only ask the JVM to run the garbage collector but it decides if it runs now of later, generally the JVM will run the garbage collector when it's low on memory.

Every Java program will at least run one thread (main()), but can have many threads running. A thread can be either alive or dead so an object is eligible for garbage collection when no live thread can access it. So when the JVM discovers an object that cannot be reached by any live thread it will consider that object eligible for deletion and it might even delete it at some point (it might never delete it).

So a reachable object is a object that is available to a live thread.

Here are some examples on how to make an object eligible for garbage collection

nulling a reference

public class GarbageTruck {
   public static void main(String [] args) {
      StringBuffer sb = new StringBuffer("Hello");

      // object sb is not eligible for garbage collection
      System.out.println(sb);

      // object is now available for garbage collection
      sb = null;
   }
}

Reassigning a Reference

public class GarbageTruck {
   public static void main(String [] args) {
      StringBuffer s1 = new StringBuffer("Hello");
      StringBuffer s2 = new StringBuffer("Goodbye");

      // object sb is not eligible for garbage collection
      System.out.println(s1);

      // Now the StringBuffer "Hello" is eligible for collection
      s1 = s2;
   }
}

Isolating a reference

public class Island {
   Island i;

   public static void main(String [] args) {
   
      Island i2 = new Island();
      Island i3 = new Island();
      Island i4 = new Island();

      i2.i = i3;
      i3.i = i4;
      i4.i = i2;

      i2 = null;
      i3 = null;
      i4 = null;

      // Once here i2, i3 and i4 are eligible for collection
   }
}

The first thing to remind you about is that you cannot force a garbage collection, you politely ask the JVM to collect the garbage but it may fall on deaf ears.

Asking to garbage collect import java.util.Date;

public class stringTest {
   public static void main(String[] args) {

      Runtime rt = Runtime.getRuntime();

      System.out.println("Total JVM memory: " + rt.totalMemory());
      System.out.println("Before JVM memory: " + rt.freeMemory());

      Date d = null;

      for (int i = 0; i < 10000000; i++) {
         d = new Date();
         d = null;
      }

      System.out.println("After JVM memory: " + rt.freeMemory());
      rt.gc();                                                    // politely ask to collect garbage
      // Runtime.getRuntime().gc();                               // other way to run the gc
      System.out.println("After GC memory: " + rt.freeMemory());
   }
}

Java provides you a mechanism to run some code just before the object is deleted, this code is located in the method finalize() (inherited from class Object) this sounds like a good idea but you cannot count on the garbage collector to ever delete an object (so the finalize method might not run). It is recommended that you do not override the finalize() method at all, if you do overide it then it should be protected so that subclasses have access to the method.