COMP 2400 Unix Tools, Spring 2007

General Information


The prerequisites for this class is COMP 2370 or permission from the instructor. You should be able to write simple programs in either C/C++ or Java. You must have a good understanding of basic data structures such as lists, stacks, trees and hashtables.

This is a class on advanced programming in the UNIX environment using C, C++, Java and UNIX tools. There will be programming assignments designed to make you use key features of SVR4, expose you to the use of IDEs, debuggers, profilers, version control and static program analysis tools. Programming assignments involve writing code in C, C++, Java and various scripting languages. You are expected to attend the lectures, study the textbook and find and read additional material available on the Internet and study the documentation available for your GNU/Linux system.

Lecture Hours and Location

The lectures will be held Mondays and Wednesdays from 10am to noon in John Greene Hall 216.
In order to get a door code for the Computer Lab please visit the labcode webpage. The code is updated on weekly basis.

Office Hours

Christian Grothoff - JGH 214
Monday, Wednesday 1pm-4pm and by appointment.

Textbook and Syllabus

"UNIX Systems Programming" by David A. Curry is the main textbook for this course.
"Version Control with Subversion", on-line book (no need to buy a copy). You must know how to use subversion as a developer (not as an administrator) in order to submit your assignments. Use this book as a reference if you encounter problems. Basic knowledge of chapters 1-3 should be sufficient.

Specific topics that will be covered:

Assignments and Grading

Various programming assignments are offered for the course. Assignments can be done individually or in groups of up to six students. There are two categories of assignments based on the main programming language used for the assignment: C assignments and Java assignments. Each assignment has a certain number of total points associated with it. The number of points that the assignment counts for is this total number of points divided by the number of students working on the assignment together in a group. Each student can submit C and Java assignments worth up to 30 points (30 points for both languages). Within the C and Java categories, you are free to choose your projects and your group members.
You are free to suggest additional projects that fit your own goals. For your own suggestions, you must provide a one-page project description. Based on your project description, the instructor will either reject the proposal or assign it a total number of points.
Note that for all projects you should feel free to discuss the projects during the project phase in class and during office ours.
All projects will be graded using the following metrics:

Academic dishonesty includes, but is not limited to: plagiarism, cheating in exams, unauthorized collaboration and falsifying academic records. Violation of any of these may result in a grade penalty on assignments, an "F" in the course, dismissal from an academic unit, revocation of admission, suspension from the University as well as being roasted over a slow fire.
The Java assignments are due before the midterm, the C assignments are due before the final.

The different kinds of assignments are weighted as follows:

Programming assignments60 Pts
Midterm15 Pts
Final15 Pts
Class Participation10 Pts

There are theoretically a total of 100 Pts possible Grades will be given as follows:

A> 90 Pts
B> 75 Pts
C> 60 Pts
D≥ 45 Pts
F< 45 Pts


You must have access to a Debian GNU/Linux system for this class. If you do not have a machine where you can install this operating system, you can always use the department's HP lab which has all of the necessary software installed.

Submission of Assignments

Each student will get access to a subversion repository. Assignments must be committed to that repository by the respective deadline. Students are encouraged to use the repository for version control while still working on the assignment. Only the last version commited before the deadline will be used for grading.
In order to access your subversion repository, you must first request an account. For this, you first need to generate an encrypted password. On any GNU/Linux or UNIX machine (or even a Microsoft system with Apache) enter

    $ htpasswd -nb $USER PASSWORD
where PASSWORD is your desired password. You will not be able to change the password later. Send the output of the command to to request an account. Once your account has been created, you should do an initial check out:
    $ svn checkout$USER
    $ cd $USER
For group projects, you should e-mail the logins of the group members and a project name to the instructor. You will then be given the name of a directory to which all group members have access. You should then proceed to create some initial file for the first project and commit it:
    $ touch test
    $ svn add test
    $ svn commit -m "comment"
Make sure to commit the final version with all files (hint: svn status) before the deadline. It is also a good idea to do a seperate checkout and verify that the result works.

Java Projects

Hashtable with iterators (10)
Using the signatures from java.util.HashMap, implement your own hashtable with a constructor (no arguments) and the methods get, put, remove, size and iterator. The iterator method should take no arguments and return an implementation of java.util.Iterator. Make sure to implement comprehensive testcases, evaluate the performance, document the code (in particular the public interface) and produce a jar file with the implementation.
Persistent suffix-tree (40)
Implement a Suffix-tree. Your SuffixTree class should have a constructor taking a String as the argument and a method exactMatch(String a) which returns the first offset of the String a in the string used in the construction of the suffix tree if there is a match, otherwise -1. You should also implement a second method fuzzyMatch(String a, int fuzzy) which matches the Strings allowing up to fuzzy character mismatches (insertions, deletions or mutations). Make sure to implement comprehensive testcases, evaluate the performance (compare with String.indexOf), document the code (in particular the public interface) and produce a jar file with the implementation.

C Projects

A* algorithm (60)
Implement a library in pure C code that provides an implementation of the A* path finding algorithm using an external function providing the distance metric and an external implementation of a general graph. Your library should use callbacks to obtain adjacency information and distances.
Implement a graphical user interface for loading an image, having the user interactively select two points. Then use your A* library to find the shortest paths between the two points. Have every pixel correspond to a node in the graph. Connect each pixel to its eight neighboring pixels. Use the manhattan distance function for the heuristic distance metric. Use one plus the sum of the squares of the RGB color differences between the pixels for the actual distance.
Fast class-file parser library (60)
Write a parser for Java class files. Design a C or C++ API to make the results available to client applications. Demo your library by implementing a (possibly simplified) version of javap
Fast RTF-file parser library (60)
Write a parser for Rich Text Format (RTF) files. Design a C or C++ API to make the results available to client applications. Demo your library by implementing a program to extract the information from the "Information Group" section of the format as well as the (unformatted) document text.
Flexible GtkCellRenderer (60)
A GtkCellRenderer is an GTK type for rendering cells. Existing GtkCellRenderer implementations allow rendering text, images, progress bars and simple selection boxes. Implement a GtkCellRenderer that can render most GtkWidgets. The selection of the widget and its contents should depent on the data in the respective GtkTreeModel.
Chess game with AI using Glade (60)
Using the GTK+ library for the user interface, implement a simple chess game where players can play against the computer. Your AI should explore (valid) moves using a heuristic estimate on the value of a particular position. Your heuristic can be simplistic (for example, based on values of pieces, number of possible moves, checkmate).
Data recovery tool (60)
Implement a tool that scans a disk image for file system artifacts (inodes, directories, partition tables), re-assembles blocks that might be files in memory and then uses GNU libextractor to determine if the data is likely to correspond to an actual file. If so, the tool should give the user the option to recover the file (by writing its data to a different volume).
Your tool should support multiple file systems (such as ext2, fat32, NTFS, etc.) and use mmap for efficient IO operations. Note that you may not be able to mmap the entire disk image due to virtual memory constraints.


This is the list of topics for the course. The focus of the course is on the programming projects. As a result, expect a significant portion of the lecture to be spent on discussions of the projects and work on projects and small exercises.

Class 1: Introduction and eXtreme programming (03/26/2007)

Lecture 1 slides
Useful links

Class 2: UNIX and the UNIX shell (03/28/2007)

Material from the textbook
Useful links
An Introduction to the Unix Shell
UNIX commands
man, cat, less, more, diff, echo, head, tail, find, sort, uniq, grep, tr, sed, awk, wc, tee, time
UNIX commands homework
cd, pwd, mkdir, rmdir, cp, mv, rm, ln, touch, date, sleep, dd, df, du, ftp, ping, telnet, kill, ps, top, nice, last, uname, who

Class 3: Distributions and software installation (04/02/2007)

Useful links
Debian package management tools, Debian New Maintainer's Guide
UNIX commands homework
at, chmod, chown, chgrp, groups, passwd, basename, file, finger, mount, mknod, newgrp, nl, nohup, gzip, bzip2, tty, uname, vi, crontab

Class 4: Java basics (04/04/2007)

Useful links

Class 5: Debugging in Java (04/09/2007)

Useful links
junit, FindBugs

Class 6: Encapsulation, hiding and API design (04/11/2007)

Useful links
Information hiding, Access Protection Modifiers in Java, Access Protection Modifiers in C#, Access Protection in C++

Class 7: Profiling in Java (04/16/2007)

Useful links
Eclipse Test and Performance Tools Platform, Java Application Profiling using TPTP, JVM Tool Interface

Class 8: Code documentation and coding conventions (04/18/2007)

Lecture 8 slides
Useful links
GNU indent, GNU Coding Standards, Doxygen, JavaDoc

Class 9: Build and packaging systems for Java (04/23/2007)

Lecture 9 slides
Useful links
IzPack, JAR

Class 10: Midterm (04/25/2007)

Class 11: Build and packaging systems for C (04/30/2007)

Lecture 11 slides
Useful links
Learning the GNU development tools, Program Library HOWTO, Using LD, the GNU linker, GNU autoconf, automake and libtool

Class 12: SVR4 APIs - part I (05/02/2007)

Lecture 12 slides
Material from the textbook
Chapters 2, 3, 4, 5 and 6

Class 13: MVC, GTK and Glade (05/07/2007)

Lecture 13 slides
Useful links
MVC, Glade

Class 14: Debugging in C (05/09/2007)

Lecture 14 slides
Useful links
gdb mini intro

Class 15: SVR4 APIs - part II (05/14/2007)

Lecture 15 slides
Material from the textbook
Chapters 7, 8, 9, 10 and 11

Class 16: C development tools (05/16/2007)

Lecture 16 slides
Useful links
valgrind, Coverity Prevent (available in the Linux lab)

Class 17: Profiling in C (05/21/2007)

Lecture 17 slides
Useful links
GNU gprof

Class 18: Document formats for technical documentation (05/23/2007)

Lecture 18 slides
Useful links
GNU Texinfo, DocBook, ctan

Class 19: Review (05/30/2007)

Class 20: Final Exam (06/04/2007 at 9am!)


Grades can be requested at any time during office hours.

Christian Grothoff
Last modified: Sun Mar 16 17:34:57 MDT 2008