| Research |
|
OSKI:
OSKI is a collection of low-level C primitives that provide
automatically tuned sparse matrix operations, for use by
solver libraries and applications. OSKI implements the SPARSITY
framework, with extensions including a high-level library
interface, new tuning heuristics, and new kernels. I am the
lead developer. Project page |
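
For readers unfamiliar with sparse kernels, the sketch below shows one common sparse matrix operation: a matrix-vector multiply over a matrix in compressed sparse row (CSR) format. It is an untuned reference version for illustration only. It is not OSKI's interface or code, and the function name `csr_spmv` and the sample matrix are assumptions made for this example.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A*x for a matrix A stored in compressed sparse row (CSR) form.

    row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i;
    col_idx and values hold each nonzero's column and value.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Hypothetical 3x3 example matrix:
# A = [[4, 0, 1],
#      [0, 3, 0],
#      [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]
y = csr_spmv(row_ptr, col_idx, values, x)
```

The indirect access `x[col_idx[k]]` is what makes the performance of this kernel depend on both the matrix structure and the machine, which is the motivation for tuning it automatically.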
|
Analytical and statistical performance
modeling. I am broadly interested in performance
modeling. For sparse matrix kernels, I have developed
techniques for computing machine-specific upper and
lower bounds on performance. I have also applied statistical
machine learning techniques to problems in automatic
tuning. For details, see my publications. |
|
ROSE
is a tool for building customized
source-to-source translators for C and C++. The lead
developer for ROSE is Dan Quinlan at LLNL. I now also
contribute to the core ROSE infrastructure. Moreover, I am
applying it to optimize large-scale (1+ MLOC) applications,
and using it to build an empirical tuning compiler framework
for automatically tuning general applications. Project page |
|
JitterBug
is a tool to help elicit bugs in MPI applications that rely
on non-deterministic communication (e.g., send- and
receive-any operations). I implemented JitterBug using
Martin Schulz's PNMPI
framework, which extends the standard PMPI profiling
interface to support multiple tool layers. Our PADTAD
2006 paper on this work received a best paper award. Project summary | Paper (PDF) | Slides |
|
PHiPAC
(Portable High-Performance ANSI C) was the first automatic
tuning framework for dense matrix-matrix multiply, and was
the early inspiration for my dissertation work. It consisted
of a parameterized code generator for matrix-multiply, whose
output was stylized C, and a simple exhaustive search
engine to find fast implementations on a given
machine. Chapter 9 of my
dissertation, as well as this
paper, describes how I applied statistical machine
learning techniques to PHiPAC's search process. Project page | My publications |
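
The idea of an exhaustive search over implementation parameters can be sketched in a few lines. The example below is a simplified illustration, not PHiPAC itself: it tunes a single parameter (a square block size) for a blocked matrix multiply written directly in Python, whereas PHiPAC generated C code from a larger parameter space. The names `blocked_matmul` and `exhaustive_search`, the problem size, and the candidate list are all assumptions made for this example.

```python
import time


def blocked_matmul(a, b, n, block):
    """Multiply two n-by-n matrices (lists of lists) using square blocks of size `block`."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                for i in range(ii, min(ii + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + block, n)):
                        a_ik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + block, n)):
                            row_c[j] += a_ik * row_b[j]
    return c


def exhaustive_search(n, candidates, trials=3):
    """Time every candidate block size and return (best_block, timings)."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for block in candidates:
        best = float("inf")
        for _ in range(trials):
            start = time.perf_counter()
            blocked_matmul(a, b, n, block)
            # Keep the fastest of several trials to reduce timing noise.
            best = min(best, time.perf_counter() - start)
        timings[block] = best
    # The winner is whichever variant ran fastest on this machine.
    best_block = min(timings, key=timings.get)
    return best_block, timings


best_block, timings = exhaustive_search(n=32, candidates=[2, 4, 8, 16, 32])
```

Because the winner is chosen by measurement, the selected block size may differ from machine to machine and even from run to run. Exhaustive search also grows expensive as parameters are added, which is one reason to consider statistical techniques for guiding or stopping the search early.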
|
SWAMI
(Shared Wisdom through the Amalgamation of Many
Interpretations) is a framework for running collaborative
filtering algorithms and evaluating the effectiveness of
those algorithms. It uses the EachMovie dataset, provided by
Compaq Research. Paper (PDF) | BibTeX | Software |
|
Microbenchmarking the Tera MTA. Jason Riedy
and I microbenchmarked the Tera MTA installed at the San
Diego Supercomputer Center. The work was substantial and
produced interesting findings, but we published the results
only in our class project report. Report (PDF) | Slides (PDF) |