The Sorting Competition is a multi-lab exercise on choosing the fastest sorting algorithm for a given type of data. By "fast" we mean the actual running time and not the Big-Theta approximation. You will get description of the data and a sample input file to practice, but the final test will be done on a different file.
A portion of your grade will depend on your place in the competition. Three groups that get the best time in the final run will get extra credit (the amount depending on the place).
The assignment will have two lab periods: one to get preliminary competition results and the second one for the final competition. In addition there will be a correctness analysis assignmment, a presentation, and a final write-up.
Make sure to make your Eclipse workspace read-protected to keep your "trade secrets" secret.
The idea for this data and sorting criterion has been suggested by Lucas Ellgren. >
Data for this program consists of real numbers between 0 (inclusive) and 1 (exclusive). One third of the data is uniformly distributed in the range.
Another third of the data forms several clusters of numbers such that the numbers in each cluster are all within .0001 of each other. Each cluster is approximately 3% of the total array. Clusters are uniformly distributed over the original range.
The final third of the data forms clusters of numbers within .00000001 of each other. Once again, each cluster is 3% of the total and are uniformly distributed in the range.
After the data is generated it is shuffled.
The goal is to sort the data in the non-decreasing order.
Below is a sample of (sorted) data, although a real-example data would be at least few thousands elements:
0.00611718
0.20771484
0.33271706
0.41008081
0.62512620
0.62512621
0.62512622
0.62512623
0.62512624
0.62512625
0.73087819
0.85592000
0.85592001
0.85592002
0.85592003
0.85592004
0.85592005
0.93986539
0.96370480
0.96775591
DecimalFormat formatter = new DecimalFormat("0.00000000");
....
System.out.println(formatter.format(value));
Please submit all your Java files to em (CC your group) by email. You must have authors' names in every file and there should be comments clarifying what you are doing, but no other documentation is required for this round.
Your files will be renamed (to anonymize the test runs) and ran on a different data file than the one you were using for practice. Timing results will be posted. You will only be given your own group number so that you know how you did, but don't know whom the other results belong to.
After the sorting programs trial run, continue working on your sorting program. For the final submission include, in addition to your Java files, a write-up that describes your sorting algorithm. Discuss its efficiency both in theory (the Big-O) and in practice: effects of programming choices, such as an array versus ArrayList, using specific Java methods, etc.
If you changed anything in your program after the initial run, please briefly indicate what you changed and why. You also might want to include things that you tried that didn't work out.
Results of round 2
Results of round 2 with 15 loops (JVM
warmup)
Every person will be assigned a group to review. Your goal is to verify that they followed all of the rules. If any of the rules were broken, you should also assess if this has given the group unfair advantage in the sorting time. If all the rules were followed, please briefly justify your finding, for instance: "the resulting array is sorted because the numbers were stored as integers and Java-provided quicksort was applied to sort the integers".
Please send your review to me and to all members of the team you are reviewing. If you don't know the other team members' e-mail addresses, please send your submission just to me, and I will forward it. Please make it clear in the subject that it needs to be forwarded.
Correctness analysis: for the program given to you please verify that all the rules have been followed, including the following:
Additionally you may analyze a sorting program from any group other than the one that the one you are assigned. This will not give you extra credit, unless you were the first to discover a problem that the assigned reviewing group missed. This part can be done individually or in groups.
If the group that was analysing your algorithm found any issues, you have an opportunity to respond to them (by e-mail, CC me) if you disagree with their findings. I will be happy to discuss this with you. Note that if you don't send a response before Thursday (the presentation time) then I assume that you agree with their findings.
Please upload your slides to the Wiki
You need to prepare a 3-5 minute presentation about your algorithm. Prepare 3-5 slides about your algorithm (PDF or OpenOffice so that you can display them in the lab). If you two group members are preparing the presenatation, both have to be presenting. Know who is saying what and practice your presentation at least once. Be prepared to answer questions.
If you are presenting on someone else's algorithm, please answer all of the questions for their algorithm.
Your presenation has to address:
Keep in mind that lists and tables work better for slides than long sentences.
Please take notes when listening to the presentation; you will need those for the final write-up.
The final write-up has two parts: the description of your algorithm (if you presented on another group's algorithm then of their algorithm), and a summary of what algorithms overall turned out to be the most effective for this problem and why.
Each of the two parts should be about a page long.
The algorithm description should follow the guidelines for the presentation, only be more detailed. Examples (of data storage, etc.) help.
The algorithm comparison should list all approaches that were used for the problem by all teams (e.g. "three groups used variations of radix sort") and their comparative effectiveness. Note that data represntation plays a significant part in efficiency. What have you learned through the competition?