latest news

04.23.2008

Sample data available


04.23.2008

Contest open for entries


04.22.2008

Join the NEWSGROUP for contest updates and announcements.


03.28.2008

Contest Publicly Announced

further details...

additional links

other information

contest announcement

UMD



1st Annual UMD GPGPU Programming Contest

Motivation
Despite the plethora of great work being done recently in General Purpose GPU programming, there are still many common tasks for which there is no publicly available code to be found online. We feel that this inhibits the progress of Science by forcing people to reinvent the wheel far too often. Furthermore, the verification through reproducibility that is so critical to the Scientific process is nearly impossible without open source code.

In order to improve this situation, we are sponsoring a GPGPU programming contest, and will be releasing the entries, at the conclusion of the contest, under version 3 of the GPL. Note: As per the requests of some participants, we've decided to allow each participant to release their entry under GPLv3, GPLv2, or the Creative Commons License.

The particular task we are asking contestants to address is sparse matrix multiplication. We'll be evaluating entries on both vector/sparse matrix and sparse matrix/sparse matrix multiplications, using a variety of different inputs.

As the contest progresses, we'll be updating the LeaderBoard regularly, so contestants will have some idea of where they stand. Contestants are welcome to make as many entries as they want, so we encourage you to submit early and tweak your design. We will, however, ask you to register a user account so we can keep track of a version history of all of your submissions.

We realize that some of the specifications below are still a little vague. We promise that we'll have everything finalized when we open the contest for submissions in the coming weeks. Please check back here often for updates.
Contest Schedule
Event Date
Start Accepting Submissions Wed. April 23, 2008
Leader Board Opens Mon. April 28, 2008
Deadline for Entry Mon. Aug 21, 2008
Winners Announced Mon. Aug 28, 2008
Technical specifications
  Submission will be run on a Linux system with an 8800GTX card, 2x quad core CPU
Particpants can choose to write their entries in CUDA (v1.1) or GLSL (OpenGL 2.1). We will compile entries with gcc 4.1.
Submission guidelines
  Our submission system will be linked from this page. All submission should contain a tarred gzipped file containing all of the following:
  • All necessary driver code in any language that will compile with the default installation of gcc4.1. If you are interested in using another language, please contact us and we'll see what can be done. All shaders should be written in GLSL, and all CUDA kernels should compile under version 1.1 of the CUDA SDK.
  • A compile script, named "compile_submission.sh." This script should take no arguments, but may call a seperate Makefile.
  • A run script, named "run_submission.sh," which takes three arguments, the name of the left multiplicand, the name of the right multiplicand and the name of the output (product). For example, if the entry is to mulitply the matrix in file A.crs by the matrix in B.crs and write the product to the file AB.crs, then "run_submission.sh" will be invoked as:
    • ./run_submission.sh A.crs B.crs AB.crs
  • A text file, named "readme.txt", identifing the participant's name and any associated institutions. Please feel free to include any other information here that you think might be insightful.
Input format
  Input files will be ASCII text files in the Compressed Row Storage format. Inputs will be of various types: both square and non-square matrices, sizes which are both powers of 2 and not. Feel free to include subroutines optimized to different input types, but be aware that judging will take all types of input into account. Sample data is posted below.
Our input files will have four lines. The first has the number of rows and columns, the second has the non-zero values of the matrix, the third line is the column number of each entry, and the final line is the row value. All entries are separated by spaces, not commas, and we have indexed from one, not zero.
Output format
  Output files will also be ASCII text in the Compressed Row Storage format. Sample data is posted below.
Sample data
  Update: We have replaced the previous sample files with new ones because of an amibugity relating to all-zero rows in the matrices. Please see the explanation below.

Update 2: The sample data has been replaced again to reduce the precision of the matrix entries.

The following sample files are in Compressed Row Storage format, as mentioned above. Each of these archives has three examples, one each in "small," "medium" and "large." These are not necessarily the sizes we will use for judging. We will, for instance, be using some matrices much larger than the "large" shown here. Nor are they the only types of problems we will use. For example, there may be non-power of two square matrices.

Each matrix-matrix example data set consists of three files, "A.crs," "B.crs" and "AB.crs." These are the two multipliers and the product, respectively. The matrix-vector examples are named "A.crs," "x.crs" and "Ax.crs," and are also the two multipliers and product, respectively.

The page linked to above describing the CRS format does not specify how to handle rows of the matrix which have no non-zero elements. We have adopted the convention of repeating an entry in the row index vector in this case. That is, if you see the same value in the row index twice, the matrix has a row of all zeros followed by a row interpreted normally. This file has three small example matrices, and their corresponding .crs files which we believe will help to clarify the issue.

This page deals with Octave's handling of Compressed Column Storage (more or less the same as the CRS of the transpose of a matrix). There is a code snippet there that you may find useful when reading CRS files, but be advised that they are using CCS and 0-indexing unlike us.
Scoring
  Entries will be judged based on total, wall-clock run time of the program. Each entry will be tested on a variety of inputs (please see "Input," above). Furthermore, each test will be run 10 times, and the average of each run will be assigned for that test. The average of all tests will be used as the score for the entry. We will also have rankings available for both vector/matrix and matrix/matrix, although the final winner will be the submission with the lowest overall average run time. At the conclusion of the contest, a breakdown of the scores of each test will also be made public, as well as the corresponding inputs, so that the public can identify submissions which perform particularly well on certain types of input.
Prizes
  Nvidia has been kind enough to provide a couple of Quadro 5600 FX 's for prizes in the competition. We will determine shortly how these will be allocated to the 1st 2nd and 3rd place contestants.
Conditions
  As stated above, the purpose of holding this competition is to build a collection of GPGPU code that is publicly available. Therefore, at the conclusion of the contest, all submissions will be released under version 3 of the GPL license. Please keep this in mind when preparing your submissions, and do not include any proprietary or commercial code that may preclude a GPL license.
Contact Info
  Please contact us at either {rob [at] cs [dot] umd [dot] edu} or {jared [at] cs [dot] umd [dot] edu} if you have any questions, and we'll get back to you as soon as we can. Please include "GPGPU contest" in the subject to make sure we can respond promptly.
Disclaimer
  All decisions of the contest organizers are final, including scoring of entries. The organizers make no promises about any outcome, and are not responsible for any grievances that may arise as a result of entering this contest.