Statistical Data Mining
Harnessing the power of the grid
for extreme-scale predictive analytics
Pinpoint the real value in your data
By harnessing the extreme-scale computational power of the Frontier® Grid Platform, Parabon Crush revolutionizes statistical data mining, enabling you to answer deep and valuable questions about your data — fast!
Enhanced with a search algorithm developed for NASA, Crush employs both exhaustive and evolutionary regression analysis to discover optimal statistical models among the vast set of models that are possible over high-dimensional datasets. It is ideal for a variety of statistically challenging problems, from genome-wide association studies to stock price forecasting. By systematically exhausting the space of possible models, or by sampling it smartly and deeply when exhaustive techniques are prohibitively time-consuming, Crush finds solutions that are demonstrably superior to those produced by traditional heuristic approaches.
Using models generated by Crush, analysts can predict outcomes for a given set of new data. For example, by studying the demographic and genomic characteristics of a sample population, cancer researchers have used Crush to uncover hidden relationships that correlate with a test subject's metabolic response to a particular chemotherapy treatment. Armed with these models, doctors can now predict the likelihood of a given patient's response to this treatment based on his or her age, race, weight and DNA profile.
With Crush, the result is the discovery of statistical models that traditional approaches would be unlikely to find: the optimal model when the space can be searched exhaustively, and a deeply sampled best candidate when it cannot.
Why is the Power of the Grid Necessary?
Unlike traditional statistical modeling tools, which use simple heuristics to rapidly produce answers that are often suboptimal, Crush can systematically exhaust the entire space of possible combinations in its search for the best model. And thanks to the power of the Frontier Grid Platform, Crush can perform many thousands of calculations simultaneously, reducing the time an analysis takes from months or years to hours or days.
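To make the idea of exhausting the model space concrete, here is a minimal single-machine sketch of exhaustive best-subset regression. It is not Crush's implementation or API: the function names (`solve`, `rss`, `best_subset`), the synthetic data, and the scoring rule (residual sum of squares plus a fixed penalty per variable) are all assumptions made for illustration. Each candidate subset is scored independently of the others, which is what makes this kind of search natural to spread across many machines.

```python
import random
from itertools import combinations

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Assumes the system is non-singular (no collinear columns)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def rss(rows, y, cols):
    """Residual sum of squares of a least-squares fit of y on an
    intercept plus the chosen columns (via the normal equations)."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, d))) ** 2
               for d, t in zip(design, y))

def best_subset(rows, y, penalty=1.0):
    """Score every non-empty subset of columns and return (score, columns)
    for the lowest score. The per-variable penalty is an assumed,
    illustrative way to discourage needlessly large models."""
    p = len(rows[0])
    best = None
    for size in range(1, p + 1):
        for cols in combinations(range(p), size):
            score = rss(rows, y, cols) + penalty * size
            if best is None or score < best[0]:
                best = (score, cols)
    return best

# Synthetic data (hypothetical): y depends only on columns 0 and 2.
rng = random.Random(0)
rows = [[rng.uniform(-5, 5) for _ in range(4)] for _ in range(30)]
y = [1.0 + 2.0 * r[0] - 3.0 * r[2] for r in rows]

score, cols = best_subset(rows, y)   # examines all 2**4 - 1 = 15 subsets
print(cols, round(score, 3))
```

With four candidate variables the loop visits only 15 models; the same loop over dozens of variables is what takes months or years on one computer.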
Opportunistic Evolution Search Algorithm
As the number of independent variables (columns of input) grows, the number of possible models (combinations of variables) grows exponentially: n candidate variables yield 2^n - 1 non-empty combinations. Eventually, the search space reaches a point where it is impractical to evaluate every possible model. For these cases, Crush employs a novel algorithm, called Opportunistic Evolution (OE), which searches arbitrarily large model spaces deeply and efficiently. This optimization technique is part of Parabon's Origin™ Evolutionary Software Development Kit, a suite of genetic algorithms specifically designed to take maximum advantage of grid-scale capacity.
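The sketch below illustrates both points: how quickly the model count grows, and how an evolutionary search samples a space too large to exhaust. It is a generic elitist genetic algorithm, not Opportunistic Evolution itself, whose details are not described here. The names `model_count`, `evolve` and `toy_score`, the parameter defaults, and the toy scoring function (which simply counts how far a variable mask is from a hidden target subset) are hypothetical; in practice the score would come from fitting a regression model on the selected variables.

```python
import random

def model_count(p):
    """Number of non-empty variable subsets over p candidate variables."""
    return 2 ** p - 1

def evolve(score, p, pop_size=40, generations=60, seed=0):
    """Minimise score(mask) over 0/1 masks of length p.

    Returns (best_mask, history), where history[g] is the best score in
    the population at the start of generation g. Because the best
    quarter of each generation is carried over unchanged, history never
    increases. Requires p >= 2 and pop_size >= 8.
    """
    rng = random.Random(seed)
    rate = 1.0 / p                       # expected one bit flip per child
    pop = [tuple(rng.randint(0, 1) for _ in range(p)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=score)
        history.append(score(pop[0]))
        elite = pop[: pop_size // 4]     # survivors, kept as-is
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)  # two distinct parents
            cut = rng.randrange(1, p)    # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability `rate`.
            child = tuple(bit ^ (rng.random() < rate) for bit in child)
            children.append(child)
        pop = elite + children
    return min(pop, key=score), history

# Toy stand-in for a model-fitting score (hypothetical): the number of
# positions where the mask disagrees with a hidden "true" variable set.
TARGET = {3, 17, 42}

def toy_score(mask):
    return sum(1 for i, bit in enumerate(mask) if bit != (i in TARGET))

print(model_count(10), model_count(60))  # growth of the model space
best, history = evolve(toy_score, 60)
print(history[0], toy_score(best))       # best score: first generation vs. final
```

Each generation's candidates can be scored independently, so a search of this shape can be run in parallel when many processors are available.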
Easy to Use
- Crush can be run right from within Microsoft® Excel® or, for larger datasets, from the command line.
- Progress can be monitored from any browser via the Frontier Dashboard.
- Jobs that would take years to complete on a single computer can be completed in hours or minutes, depending upon the grid capacity used.
Crush in Action: It's Easy, Fast and Powerful
Crush has been used in a wide variety of domains, from genetic analysis and psychometric modeling of consumer preferences to predicting contract overruns and determining the causal factors in the spread of West Nile virus. Its statistical models can be used in practically any domain to: