The mission of the Information Retrieval and Data Science (I.R.D.S.) Group is to research and develop new methodology and open source software to analyze, ingest, process, and manage Big Data and to turn it into information. We contribute to the world’s largest and most frequently downloaded open source software projects, and we apply tried-and-true techniques including content detection and analysis, crawling, deduplication, similarity, named entity recognition, construction of inverted indices, query analysis, search, relevancy and ranking, interactive query analysis, and management of large data sets. We have expertise in data collection and work with NASA, DARPA, DHS, and NIH across a number of domains, including Earth Science, Planetary Science, Astronomy, defense, and private industry.



He is a Principal Data Scientist and the Chief Architect in the Instrument and Data Systems section at the Jet Propulsion Laboratory (JPL) in Pasadena, California, and an Adjunct Associate Professor in the Computer Science Department within USC's Viterbi School of Engineering.
At JPL, he developed the third generation of the Apache Object Oriented Data Technology (OODT) data processing and information integration system. OODT is open source data-grid middleware used across many scientific domains, such as planetary science, cancer research (go figure), and computer modeling, simulation, and visualization. For more detail on OODT, see his ICSE 2006 paper, which appeared in the Software Engineering Challenges and Achievements track, and his 2009 IEEE Space Mission Challenges for Information Technology (SMC-IT) paper, which describes the refactoring and re-architecting of the data processing framework.


