CSci 3501 Fall 2008, UMM

CSci 3501 Algorithms and Computability - Lab 3.

September 16. Due Monday, September 22 at 11:59pm

What to submit and when:

All submissions are electronic: by e-mail to elenam at morris.umn.edu and CC to all lab partners. Please do not delete your e-mail from "Sent mail" or your mailbox until the end of the semester.
When working on the lab, please comment your work so that it is clear what contributions of each person are.
At the end of the lab each group should send me the results of their in-class work. Please indicate if this is your final submission.
If your submission at the end of the lab time was not final, please send me(CC to the lab partner(s)) a final copy before the due time. Please use the subject "3501 Lab N", where N is the lab number.

Lab assignment

Continue working in pairs.

Overview and goals

In this lab we will continue studying efficiency of quicksort (see http://en.wikipedia.org/wiki/Quicksort). The goal is to develop and study approaches to make the quicksort split data as evenly as possible. The approaches include a randomized pivot selection, a median-of-three pivot selection, and use of insertion sort when the array is nearly sorted. You will continue experimenting with quicksort on different types of data (completely random, ordered, partially ordered) and compare it to the pre-defined mergesort in the number of comparisons. The goal is to learn practical approaches to efficient algorithm implementation.
Note that other ways of speeding up quicksort may reduce the program's running time by cutting down on recursive calls or by providing more efficient memory usage. However, they do not reduce the number of element comparisons, and thus will not be included in this lab.

Tasks

Use your implementation of quicksort and the testing code from the previous lab. You will be using the same kind of data as in the lab last week, i.e. arrays of 10,000 elements filled in as follows:

Generated at random
Sorted in increasing order
10 sorted sequences of 1,000 elements each
100 sorted sequences of 100 elements each

Run all your tests (see below) 5 times on each of these sets.

You need to implement the modifications of quicksort listed below. Please write a new copy of quicksort for each of the three modifications. Then compare the results to the original quicksort and to the mergesort (use the same data for all three sorting algorithms). Make sure to test (for each modification!) that the resulting array is sorted. Record the results (the number of comparisons).

Randomized quicksort: choose the pivot at random at every step of the algorithm. For simplicity just exchange the randomly chosen pivot with the element in the position r, this way you don't have to rewrite your algorithm (see p. 154 for more details).
The median-of-three pivot selection: to select a pivot, pick three subarray elements at random, then choose their median as the pivot (see exercise 7.4-6 p. 159). Use compareTo for comparison of the three elements since their comparison contributes to the total cost. Note that this approach becomes less efficient as the array size decreases. Use a threshold value k to switch to the usual pivot choice when the portion of the array passed to quicksort is less than k. Try different values of k and choose an optimal one (approximately).
Switching to insertion sort at the end: when the array is nearly sorted, stop the quicksort without finishing the sorting, and then use insertion sort on the entire array to finish the process (see exercise 7.4-5 p. 159). As in the previous problem, choose a threshold subarray size to leave the array for the insertion sort.

What to submit

Please submit all your code with the results. Also, please write up detailed observations about each of the modifications: did it improve the efficiency? If yes, on which kind of data? How do the modifications compare to each other? Are any of them close to (or better than) mergesort?

CSci 3501 course web site.