CSci 3501 Algorithms and Computability - Lab 9.

Due Wednesday, Oct 27th at 11:59pm

What to submit and when:

Lab assignment

Work in pairs

Lab overview and goals

The goal of the lab is to get practice with: converting regular expressions to NFAs; Java regular expressions, final write-up on the sorting algorithm.

Lab tasks

Unless specified otherwise, the alphabet is the set of 0 and 1.

Convert a regular expression to an NFA (8 points)

Use the tutorial Regular Expressions and Converting to an NFA to convert the regular expressions given below to NFAs. Note that JFLAP uses + as a symbol for union and * for the star operation. Also make sure to set parentheses correctly in your expression.

JFLAP gives you hints on the steps and provides a button "(D)e-expressionify Transition" to break down regular expressions into smaller subexpressions. You need to add the empty transitions to complete each step.

As you are converting a regular expression to an NFA, write down the intermediate expressions as a note on the JFLAP screen: use the selection tool (the one that you use to mark initial and final states), right-click anywhere on the JFLAP screen, and choose "add note". For instance, if you are breaking down an expression 00+11 into 00 and 11, write 00+11 -> 00, 11.

Below are two regular expressions to convert:

  1. ((0+1)0)*
  2. (00)*+(10)*

Please export and submit your resulting expression.

Use Java regular expressions to define a pattern (21 points)

Study the Java regular expressions tutorial (ignore the Test Harness section, we use our own testing class). Note that instead of using the Pattern and Matcher classes directly, you will be using the class RegExTester that provides methods for searching, splitting, and replacing strings based on regular expressions. Study examples in main in that class.

Important: by default regular expressions will to find a matching sequence within the given input; they do not match the entire input. For instance, a regular expression "DFA" will find a match in an input "DFAs are cool".

Also important: Java strings use \ as a special character. Therefore you need \\ for predefined regex classes, e.g. "\\d+" matches one or more digit. See a link to more predefined classes in question 2 below.

Your tasks are as follows. Assume case-insensitive matches, unless specified otherwise.

  1. Find all occurrences of letter combinations "dfa" and "nfa" in a text.
  2. Find all integer numbers (i.e. sequences of digits with no other characters) in a text. Use a predefined class \d to match a single digit (see example here: http://java.sun.com/docs/books/tutorial/essential/regex/pre_char_classes.html ).
  3. Find all negative integer numbers, i.e. numbers preceded by a - sign
  4. Find all words that start with a letter t (hint: use boundary matches http://java.sun.com/docs/books/tutorial/essential/regex/bounds.html
  5. Find all words in a text that have letters t and/or i in the them (in any order)
  6. Use the doRegExSplit to split a string into items separated by |
  7. Use the doRegExReplace method to replace all occurrences of "today" by "yesterday".

Make sure to submit your code (with some method calls commented out) and a copy of your test data for each regular expression (as a separate file or in comments).

Correctness analysis and final write-up of the sorting algorithm (points TBA)

The final results (with names) are posted here.

Correctness analysis is due Monday by midnight to me and to the other team (to both team members). If you don't know the other team members' e-mail addresses, please send your submission just to me, and I will forward it. Please make it clear in the subject that it needs to be forwarded.

Correctness analysis: for the program given to you please verify that all the rules have been followed, including the following:

  1. The pre-processing phase is not performing tasks that should be done in sorting. I know this is informal, but all algorithms are different. Finding out the range of word lengths is OK, verifying the presence or the absence of certian characters is OK, doing any sort of counting is not OK. A non-sorted data structure (such as an array or ArrayList) must be used for storing the initial data.
  2. The loop contains a copying phase and a sorting phase. The copying phase is performing a copy of the original array/Arraylist. The list to sort is unsorted after the copying phase.
  3. The timing starts before the loop and ends after the loop. The loop runs the specified number of times.
  4. The resulting array at the end of each loop contains all the original elements and is sorted according to the sorting criteria (i.e. the "alphabetical" ordering is consistent with compareTo) no matter what text input is given to the program.

Additionlaly you may analyze a sorting program from any group other than the one that the one you are assigned. This will not give you extra credit, unless you were the first to discover a problem that the assigned reviewing group missed. This part can be done individually or in groups.

Final write-up (due Wednesday Oct 27th at midnight). You need to send me a write-up (as a separate file or as an e-mail message) and 2-3 slides (PDF or OpenOffice so that we can display them in the lab) that you will be presenting on Thursday in the lab. Both group members will be presenting. Know who is saying what and practice your presentation at least once. Please keep it short (3-4 minutes), be prepared to answer questions.

Guidelines for the final write-up:

  1. Decscribe your algorithm. For each sub-algorithm that you use (such as counting sort or insertion sort) make it clear what the sorting keys are. If you use radix sort, explain what sub-algorithm is used for sorting. If you use a sorted data structure, please clearly explain what it is and how the data is ordered.
  2. Specify what structure you are using for storage and how you copy it.
  3. For each of your algorithms explain what its running time is in terms of Big-theta of the worst case. If you are making a claim that your time on the given data is in a different Big-theta class than the worst case, please explain why. Also, estimate constants by the number of times the array gets traversed.
  4. Explain why the sorting is correct with respect to the given criteria.
  5. List all of the additional non-constant memory used and what it is used for.
  6. Address any correctness concerns that you get from the group that checked correctness of your algorithm. If you agree with their analysis then simply say so.
  7. Please list the place that your algorithm got in the competition.
  8. How well would you say your approach worked? Why?
  9. If you would like, write down what you could have changed to make the program run faster.

Your slides should be a summary of your final write-up. You don't need to explain the correctness, unless there are issues. You should give the summary of reasons for your time analysis. Make sure to list the time that your program got in the final test and its place. How well would you say your approach worked? Why?

Make sure to put your names and the course name on the slides. Keep in mind that lists and tables work better for slides than long sentences.

Please send the slides to me with your final write-up. I would like to post the slides on the course web page. If you prefer that your slides are not posted, please let me know.

Using JFLAP and naming your files

What to submit


CSci 3501 course web site.