:: End of manually writing unit tests. Why not generate test cases automatically and smartly?

CUTE :: A Concolic Unit Testing Engine for C and Java

Open System Laboratory | Computer Science Department | University of Illinois at Urbana Champaign
Case Studies

1  Case Studies

We use four sets of case studies to illustrate the effectiveness of jCUTE in finding potential concurrency related bugs. The tool and the code for each case study along with directions to reproduce the bugs reported in this paper can be found at http://osl.cs.uiuc.edu/~ksen/cute/. The experiments were run on a 2.0 GHz Pentium M processor laptop with 1 GB RAM running Windows XP.

2  Java 1.4 Collection Library

We tested the thread-safe Collection framework implemented as part of the java.util package of the standard Java library provided by Sun Microsystems. A number of data structures provided by the package java.util are claimed as thread-safe in the Java API documentation. This implies that the library should provide the ability to safely manipulate multiple objects of these data structures simultaneously in multiple threads. No explicit locking of the objects should be required to safely manipulate the objects. More specifically, multiple invocation of methods on the objects of these data structures by multiple threads must be equivalent to a sequence of serial invocation of the same methods on the same objects by a single thread.
We chose this library as a case study primarily to evaluate the effectiveness of our jCUTE tool. As Sun Microsystems' Java is widely used, we did not expect to find potential bugs. Much to our surprise, we found several previously undocumented data races, deadlocks, uncaught exceptions, and an infinite loop in the library. Note that, although the number of potential bugs is high, these bugs are all caused by a couple of problematic design patterns used in the implementation.

Experimental Setup

The java.util provides a set of classes implementing thread-safe Collection data structures. A few of them are ArrayList, LinkedList, Vector, HashSet, LinkedHashSet, TreeSet, HashMap, TreeMap, etc. The Vector class is synchronized by implementation. For the other classes, one needs to call the static functions such as Collections.synchronizedList, Collections.synchronizedSet, etc., to get a synchronized or thread-safe object backed by a non-synchronized object of the class. To setup the testing process we wrote a multithreaded test driver for each such thread-safe class. The test driver starts by creating two empty objects of the class. The test driver also creates and starts a set of threads, where each thread executes a different method of either of the two objects concurrently. The invocation of the methods strictly follows the contract provided in the Java API documentation. We created two objects because some of the methods, such as containsAll, takes as an argument an object of the same type. For such methods, we call the method on one object and pass the other object as an argument. Note that more sophisticated test drivers can be written. A simplified skeleton of the test-driver that we used is given below:
 public class MTListTest extends Thread {
  List s1,s2;
  public MTListTest(List s1, List s2) {
    this.s1 = s1;  this.s2 = s2; 
  }

  public void run() {
    int c = Cute.input.Integer();
    Object o1 = (Object)Cute.input.Object("java.lang.Object");
    switch(c){
     case 0: s1.add(o1); break;
     case 1: s1.addAll(s2); break;
     case 2: s1.clear(); break;
     .
     .
    } 
  }

  public static void main(String[] args) {
    List s1 = Collections.synchronizedList(new LinkedList());
    List s2 = Collections.synchronizedList(new LinkedList());
    (new MTListTest(s1,s2)).start();
    (new MTListTest(s2,s1)).start();
    (new MTListTest(s1,s2)).start();
    (new MTListTest(s2,s1)).start();}
 }
 
The arguments to the different methods are provided as input to the program. If a class is thread-safe, then there should be no error if the test-driver is executed with any possible interleaving of the threads and any input. However, jCUTE discovered data races, deadlocks, uncaught exceptions, and an infinite loop in these classes. Note that in each case jCUTE found no such error if methods are invoked in a single thread. As such the bugs detected in the Java Collection library are concurrency related.

Results

The summary of the results is given in the Table 1. Here we briefly describe an infinite loop and a data race leading to an exception that jCUTE discovered in the synchronized LinkedList class and the synchronized ArrayList class, respectively.
We present a simple scenario under which the infinite loop happens. The test driver first creates two synchronized linked lists by calling
List l1 = Collections.synchronizedList(new LinkedList());
List l2 = Collections.synchronizedList(new LinkedList());
l1.add(null);
l2.add(null);

The test driver then concurrently allows a new thread to invoke l1.clear() and another new thread to invoke l2.containsAll(l1). jCUTE discovered an interleaving of the two threads that resulted in an infinite loop. However, the program never goes into infinite loop if the methods are invoked in any order by a single thread. jCUTE also provided a trace of the buggy execution. This helped us to detect the cause of the bug. The cause of the bug is as follows. The method containsAll holds the lock on l2 throughout its execution. However, it acquires the lock on l1 whenever it calls a method of l1. The method clear always holds the lock on l1. In the trace, we found that the first thread executes the statements
modCount++;
header.next = header.previous = header; 

of the method l1.clear() and then there is a context switch before the execution of the statement size=0; by the first thread. The other thread starts executing the method containsAll by initializing an iterator on l1 without holding a lock on l1. Since the field size of l1 is not set to 0, the iterator assumes that l1 still has one element. The iterator consumes the element and increments the field nextIndex to 1. Then a context switch occurs and the first thread sets size of l1 to 0 and completes its execution. Then the other thread starts looping over the iterator. In each iteration nextIndex is incremented. The iteration continues if the method hasNext of the iterator returns true. Unfortunately, the method hasNext performs the check nextIndex != size; rather than checking nextIndex < size;. Since size is 0 and nextIndex is greater than 0, hasNext always returns true and hence the loop never terminates. The bug can be avoided if containsAll holds lock on both l1 and l2 throughout its execution. It can also be avoided if containsAll uses the synchronized method toArray as in the Vector class, rather than using iterators. Moreover, the statement nextIndex != size; should be changed to nextIndex < size; in the method hasNext. Note that this infinite loop should not be confused with the infinite loop in the following wrongly coded sequential program commonly found in the literature.
List l = new LinkedList();  l.add(l); System.out.println(l);

We next present a simple scenario under which jCUTE found a data race leading to an uncaught exception in the class ArrayList. The test driver first creates two synchronized array lists by calling
List l1 = Collections.synchronizedList(new ArrayList());
List l2 = Collections.synchronizedList(new ArrayList());
l1.add(new Object());
l2.add(new Object());

The test driver then concurrently allows a new thread to invoke l1.add(new Object()) and another new thread to invoke l2.containsAll(l1). During testing, jCUTE discovered data races over the fields size and modCount of the class ArrayList. In a subsequent execution, jCUTE permuted the events involved in a data race and discovered an uncaught ConcurrentModificationException exception. However, the program never throws the ConcurrentModificationException exception if the methods are invoked in any order by a single thread. Note that the Java API documentation claims that there should be no such data race or uncaught ConcurrentModificationException exception when we use synchronized form of array list. jCUTE also provided a trace of the buggy execution. This helped us to detect the cause of the bug. It is worth mentioning that jCUTE not only detects actual races, but also flips to see if the data race can be fatal, i.e., that it can lead to uncaught exceptions.
NameRun time# of # of% Branch # of Functions# of Bugs Found
in secondsPathsThreadsCoverage Tested data races/deadlocks/infinite loops/exceptions
Vector 5519 20000 5 76.38 16 1/9/0/2
ArrayList 6811 20000 5 75 16 3/9/0/3
LinkedList 4401 11523 5 82.05 15 3/3/1/1
LinkedHashSet 7303 20000 5 67.39 20 3/9/0/2
TreeSet 7333 20000 5 54.93 26 4/9/0/2
HashSet 7449 20000 5 69.56 20 19/9/0/2
Table 1: Results for testing synchronized Collection classes of JDK 1.4

3  NASA's Java Pathfinder's Case Studies

In [6], several case studies have been carried out using Java PathFinder and Bandera. These case studies involve several small to medium-sized multithreaded Java programs; thus they provide a good suite to evaluate jCUTE. The programs include RemoteAgent, a Java version of a component of an embedded spacecraft-control application, Pipeline, a framework for implementing multithreaded staged calculations, RWVSN, Doug Lea's framework for reader writer synchronization, DEOS, a Java version of the scheduler from a real-time executive for avionics systems, BoundedBuffer, a Java implementation of multithreaded bounded buffer, NestedMonitor, a semaphore based implementation of bounded buffer, and ReplicatedWorkers, a parameterizable job scheduler. Details about these programs can be found in [6]. We also considered a distributed sorting implementation used in [4]. This implementation involves both concurrency and complex data inputs.
We used jCUTE to test these programs. Since most of these programs are designed to run in an infinite loop, we bounded our search to a finite depth. jCUTE discovered known concurrency related errors in RemoteAgent, DEOS, BoundedBuffer, NestedMonitor, and the distributed sorting implementation and seeded bugs in Pipeline, RWVSN, and ReplicatedWorkers. The summary of the results is given in the Table 2. In each case, we stopped at the first error. Note the although the running time of our experiments is many times smaller than that in [6,4], we are also using a much faster machine.
It is worth mentioning that we tested the un-abstracted version of these programs rather than requiring a programmer to manually provide abstract interpretations as in [6]. This is possible with jCUTE because jCUTE tries to explore distinct paths of a program rather than exploring distinct states. Obviously, this means that we cannot prove a program correct if the program has infinite length paths. Java PathFinder and Bandera can verify a program in such cases if the state space of the abstracted program is finite.
NameRun time# of # of % Branch Lines# of Bugs Found
in secondsPathsThreadsCoverage of Code data races/deadlocks/assertions/exceptions
BoundedBuffer 11.41 43 9 100.0 127 0/1/0/0
NestedMonitor 0.46 2 3 100.0 214 0/1/0/0
Pipeline 0.70 3 5 64.29 103 1/0/1/0
RemoteAgent 0.45 2 3 87.5 55 1/1/0/0
RWVSN 2.19 8 5 68.18 590 1/0/1/0
ReplicatedWorkers 0.34 1 5 25.93 954 0/0/1/0
DEOS 35.23 111 6 64.75 1443 0/0/1/0
Table 2: Java PathFinder's Case Studies (un-abstracted)

4  Needham-Schroeder Protocol

The Needham-Schroeder public-key authentication protocol [7] aims at providing mutual authentication through message exchanges between two parties: an initiator and a responder; details of the protocol can be found elsewhere [7]. Lowe reported an attack against the original protocol and also proposed a fix, called Lowe's fix [1].
We tested a concurrent implementation of the protocol using jCUTE. jCUTE found the attack in 406 iterations or about 95 seconds of search.
We compare these results with the ones reported previously [2,3] for the same protocol. The explicit-state C model-checker VeriSoft [2] analyzed a concurrent implementation of the protocol with finite input domain. Verisoft was unable to find the attack within 8 hours, evolutionary testing (with manual tuning) found the attack after 50 minutes (on a somewhat slower machine). DART [3] found the attack on a sequential implementation of the protocol with a somewhat stronger intruder model1 in 18 minutes. In comparison, jCUTE found the attack on a concurrent implementation of the protocol with a proper intruder model in only 95 seconds, which is an order of magnitude faster than the fastest previous approach. This performance difference is due to jCUTE's efficient algorithm that only explores distinct causal structures.

5  TMN Protocol

The Tatebayashi, Matsuzaki, and Newman (TMN) Protocol [8] is a protocol for distribution of a fresh symmetric key. In this protocol when an initiator wants to communicate with a responder, it uses a trusted server to obtain a secret symmetric session key. The details of the protocol can be found in [8].
In this protocol, an intruder can establish a parallel session through eavesdropping and obtain the secret key [5]. We tested a concurrent implementation of the protocol using jCUTE. jCUTE found the attack in 522 iterations or about 127 seconds of search.

References

[1]
G. Lowe. An Attack on the Needham-Schroeder Public-Key Authentication Protocol. Inf. Processing Letters, 1995.
[2]
P. Godefroid and S. Khurshid. Exploring Very Large State Spaces Using Genetic Algorithms. In Tools and Algorithms for the Construction and Analysis of Systems, 2002.
[3]
P. Godefroid, N. Klarlund, and K. Sen. DART: Directed automated random testing. In Proc. of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation (PLDI), 2005.
[4]
S. Khurshid, C. S. Pasareanu, and W. Visser. Generalized symbolic execution for model checking and testing. In Proc. 9th Int. Conf. on TACAS, pages 553-568, 2003.
[5]
G. Lowe and A. W. Roscoe. Using csp to detect errors in the TMN protocol. Software Engg., 23(10):659-669, 1997.
[6]
C. S. Pasareanu, M. B. Dwyer, and W. Visser. Finding feasible abstract counter-examples. International Journal on Software Tools for Technology Transfer (STTT'03), 5(1):34-48, 2003.
[7]
R. Needham and M. Schroeder. Using Encryption for Authentication in Large Networks of Computers. Communications of the ACM, 21(12):993-999, 1978.
[8]
M. Tatebayashi, N. Matsuzaki, and J. David B. Newman. Key distribution protocol for digital mobile communication systems. In Proceedings on Advances in cryptology (CRYPTO '89), pages 324-334. Springer-Verlag, 1989.

Footnotes:

1Note that a stronger intruder model makes it easier for the intruder to find an attack. This in turn makes the search space smaller resulting in faster testing time.


File translated from TEX by TTH, version 3.71.
On 28 Jan 2006, 15:11.
Copyright © 2006 University of Illinois at Urbana Champaign. All rights Reserved