Research

NOTE (Aug 20, 2008): Exciting updates to this list of projects coming very soon!
 
Current projects.
Automated tools and techniques to tune programs for current and future large-scale machines.
OSKI: OSKI is a collection of low-level C primitives that provide automatically tuned sparse matrix operations, for use by solver libraries and applications. OSKI implements the SPARSITY framework, with extensions including a high-level library interface, new tuning heuristics, and new kernels. I am the lead developer.
Project page
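To make "sparse matrix operations" concrete, here is a minimal sketch of the kind of kernel such a library tunes: sparse matrix-vector multiply, first in plain compressed sparse row (CSR) format and then in a 2x2 blocked variant. It assumes that register blocking is one of the candidate optimizations. The type and function names (csr_matrix, bcsr22_matrix, csr_spmv, bcsr22_spmv) are hypothetical and are not part of OSKI's interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CSR container for illustration; not OSKI's data structure. */
typedef struct {
    size_t nrows;
    const size_t *rowptr;  /* length nrows + 1; row i spans [rowptr[i], rowptr[i+1]) */
    const size_t *colind;  /* column index of each stored value */
    const double *val;     /* stored nonzero values */
} csr_matrix;

/* Reference kernel: y <- y + A*x for A in CSR format. */
void csr_spmv(const csr_matrix *A, const double *x, double *y)
{
    assert(A != NULL && x != NULL && y != NULL);
    for (size_t i = 0; i < A->nrows; ++i) {
        double sum = y[i];
        for (size_t k = A->rowptr[i]; k < A->rowptr[i + 1]; ++k)
            sum += A->val[k] * x[A->colind[k]];
        y[i] = sum;
    }
}

/* Hypothetical 2x2 block CSR: every stored block is a dense 2x2 tile
 * (row-major), so explicit zeros may be stored to fill out a block. */
typedef struct {
    size_t nbrows;          /* number of block rows (2 * nbrows matrix rows) */
    const size_t *browptr;  /* length nbrows + 1 */
    const size_t *bcolind;  /* block-column index of each stored block */
    const double *bval;     /* 4 values per stored block */
} bcsr22_matrix;

/* Blocked kernel: y <- y + A*x, keeping two y entries and two x entries
 * in local variables while each 2x2 block is applied. */
void bcsr22_spmv(const bcsr22_matrix *A, const double *x, double *y)
{
    assert(A != NULL && x != NULL && y != NULL);
    for (size_t bi = 0; bi < A->nbrows; ++bi) {
        double y0 = y[2 * bi];
        double y1 = y[2 * bi + 1];
        for (size_t k = A->browptr[bi]; k < A->browptr[bi + 1]; ++k) {
            const double *b = A->bval + 4 * k;
            double x0 = x[2 * A->bcolind[k]];
            double x1 = x[2 * A->bcolind[k] + 1];
            y0 += b[0] * x0 + b[1] * x1;
            y1 += b[2] * x0 + b[3] * x1;
        }
        y[2 * bi] = y0;
        y[2 * bi + 1] = y1;
    }
}
```

Both kernels compute the same result. Which one runs faster depends on the matrix structure and the machine, and that is the choice an automatic tuner would make on the user's behalf.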
Analytical and statistical performance modeling. I am broadly interested in performance modeling. For sparse matrix kernels, I have developed techniques for computing machine-specific upper and lower bounds on performance. I have also applied statistical machine learning techniques to problems in automatic tuning. For details, see my publications.
ROSE is a tool for building customized source-to-source translators for C and C++. The lead developer of ROSE is Dan Quinlan at LLNL. I now also contribute to the core ROSE infrastructure. Moreover, I am applying it to optimize large-scale (1+ MLOC) applications and using it to build an empirical tuning compiler framework for automatically tuning general applications.
Project page

JitterBug is a tool to help elicit bugs in MPI applications that rely on non-deterministic communication (e.g., send- and receive-any operations). I implemented JitterBug using Martin Schulz's PNMPI framework, which extends the standard PMPI profiling interface to support multiple tool layers. Our PADTAD 2006 paper on this work received a best paper award.
Project summary | Paper (PDF) | Slides
Earlier projects.
(selected)
PHiPAC (Portable High-Performance ANSI C) was the first automatic tuning framework for dense matrix-matrix multiply, and was the early inspiration for my dissertation work. It consisted of a parameterized code generator for matrix multiply, whose output was stylized C, and a simple exhaustive search engine to find fast implementations on a given machine. Chapter 9 of my dissertation and this paper describe how I applied statistical machine learning techniques to PHiPAC's search process.
Project page | My publications
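The generate-and-search idea behind PHiPAC can be sketched in a few lines. The example below is a simplification under stated assumptions: instead of generating a separate C routine for each parameter setting, it passes a single tile size as a runtime parameter, and it times each candidate once with clock(). The function names matmul_tiled and search_best_tile are hypothetical and do not come from PHiPAC.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* C <- C + A*B for n-by-n row-major matrices, using square tiles of
 * size `tile`. The tile size stands in for the parameters that a code
 * generator would bake into each generated routine. */
void matmul_tiled(size_t n, size_t tile,
                  const double *A, const double *B, double *C)
{
    assert(tile > 0);
    for (size_t ii = 0; ii < n; ii += tile) {
        size_t imax = (ii + tile < n) ? ii + tile : n;
        for (size_t kk = 0; kk < n; kk += tile) {
            size_t kmax = (kk + tile < n) ? kk + tile : n;
            for (size_t jj = 0; jj < n; jj += tile) {
                size_t jmax = (jj + tile < n) ? jj + tile : n;
                for (size_t i = ii; i < imax; ++i) {
                    for (size_t k = kk; k < kmax; ++k) {
                        double a = A[i * n + k];
                        for (size_t j = jj; j < jmax; ++j)
                            C[i * n + j] += a * B[k * n + j];
                    }
                }
            }
        }
    }
}

/* Exhaustive search: run every candidate tile size once and return the
 * one with the smallest measured CPU time. C is zeroed before each run,
 * so on return it holds A*B as computed with the last candidate. */
size_t search_best_tile(size_t n, const size_t *candidates, size_t ncand,
                        const double *A, const double *B, double *C)
{
    assert(ncand > 0);
    size_t best = candidates[0];
    clock_t best_time = 0;
    for (size_t c = 0; c < ncand; ++c) {
        memset(C, 0, n * n * sizeof(double));
        clock_t start = clock();
        matmul_tiled(n, candidates[c], A, B, C);
        clock_t elapsed = clock() - start;
        if (c == 0 || elapsed < best_time) {
            best = candidates[c];
            best_time = elapsed;
        }
    }
    return best;
}
```

A realistic search would repeat each measurement and use matrices large enough to exceed the cache. The cost of trying every candidate grows quickly with the number of parameters, which is what makes statistical methods for pruning or stopping the search attractive.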
SWAMI (Shared Wisdom through the Amalgamation of Many Interpretations) is a framework for running collaborative filtering algorithms and evaluating their effectiveness. It uses the EachMovie dataset, provided by Compaq Research.
Paper (PDF) | BibTeX | Software
Microbenchmarking the Tera MTA. Jason Riedy and I microbenchmarked the Tera MTA installed at the San Diego Supercomputer Center. Although the work was substantial and the findings were very interesting, we published the results only in our class project report.
Report (PDF) | Slides (PDF)