System that replaces human intuition with algorithms outperforms 615 of 906 human teams
Big-data analysis consists of searching for buried patterns that have some kind of predictive power. But choosing which “features” of the data to analyze usually requires some human intuition. In a database containing, say, the beginning and end dates of various sales promotions and weekly profits, the crucial data may not be the dates themselves but the spans between them, or not the total profits but the averages across those spans.
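To make the sales-promotion example concrete, here is a minimal Python sketch of that kind of hand-built feature derivation. It is not the researchers' code: the records, field names, and the `derive_features` helper are all hypothetical, and it assumes weekly profits are keyed by the date on which each week begins.

```python
from datetime import date

# Hypothetical promotion records: (name, start date, end date).
promotions = [
    ("spring_sale", date(2015, 3, 2), date(2015, 3, 22)),
    ("summer_sale", date(2015, 6, 1), date(2015, 6, 14)),
]

# Hypothetical weekly profits, keyed by the date each week begins.
weekly_profit = {
    date(2015, 3, 2): 1200.0,
    date(2015, 3, 9): 1500.0,
    date(2015, 3, 16): 900.0,
    date(2015, 6, 1): 2000.0,
    date(2015, 6, 8): 1000.0,
}


def derive_features(promotions, weekly_profit):
    """Turn raw dates and profits into per-promotion features."""
    rows = []
    for name, start, end in promotions:
        # Feature 1: the span between the dates, not the dates themselves.
        span_days = (end - start).days + 1  # inclusive of both endpoints
        # Feature 2: the average profit across the span, not the total.
        in_span = [p for week, p in weekly_profit.items() if start <= week <= end]
        avg_profit = sum(in_span) / len(in_span) if in_span else None
        rows.append(
            {"promotion": name, "span_days": span_days, "avg_weekly_profit": avg_profit}
        )
    return rows


features = derive_features(promotions, weekly_profit)
for row in features:
    print(row)
```

Each output row replaces two raw dates and a run of weekly totals with a duration and an average, which is the sort of derived variable an analyst would normally have to think of by hand.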
MIT researchers aim to take the human element out of big-data analysis, with a new system that not only searches for patterns but designs the feature set, too. To test the first prototype of their system, they enrolled it in three data science competitions, in which it competed against human teams to find predictive patterns in unfamiliar data sets. Of the 906 teams participating in the three competitions, the researchers’ “Data Science Machine” finished ahead of 615.
In two of the three competitions, the predictions made by the Data Science Machine were 94 percent and 96 percent as accurate as the winning submissions. In the third, the figure was a more modest 87 percent. But where the teams of humans typically labored over their prediction algorithms for months, the Data Science Machine took somewhere between two and 12 hours to produce each of its entries.
“We view the Data Science Machine as a natural complement to human intelligence,” says Max Kanter, whose MIT master’s thesis in computer science is the basis of the Data Science Machine. “There’s so much data out there to be analyzed. And right now it’s just sitting there not doing anything. So maybe we can come up with a solution that will at least get us started on it, at least get us moving.”
Between the lines
Kanter and his thesis advisor, Kalyan Veeramachaneni, a research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), describe the Data Science Machine in a paper that Kanter will present next week at the IEEE International Conference on Data Science and Advanced Analytics.
Veeramachaneni co-leads the Anyscale Learning for All group at CSAIL, which applies machine-learning techniques to practical problems in big-data analysis, such as determining the power-generation capacity of wind-farm sites or predicting which students are at risk for dropping out of online courses.
“What we observed from our experience solving a number of data science problems for industry is that one of the very critical steps is called feature engineering,” Veeramachaneni says. “The first thing you have to do is identify what variables to extract from the database or compose, and for that, you have to come up with a lot of ideas.”
In predicting dropout, for instance, two crucial indicators proved to be how long before a deadline a student begins working on a problem set and how much time the student spends on the course website relative to his or her classmates. MIT’s online-learning platform MITx doesn’t record either of those statistics, but it does collect data from which they can be inferred.
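As a rough illustration of how such indicators might be inferred from raw activity data, the Python sketch below computes both from a made-up event log. The log format, the event names, and the `dropout_features` helper are assumptions for the sake of the example; they do not reflect MITx's actual data or the researchers' implementation.

```python
from datetime import datetime
from statistics import mean

# Hypothetical problem-set deadline.
deadline = datetime(2015, 10, 16, 12, 0)

# Hypothetical activity log: (student, timestamp, event type, seconds on site).
events = [
    ("alice", datetime(2015, 10, 13, 12, 0), "video", 1800),
    ("alice", datetime(2015, 10, 14, 12, 0), "problem_set", 1800),
    ("alice", datetime(2015, 10, 15, 12, 0), "problem_set", 1800),
    ("bob", datetime(2015, 10, 16, 9, 0), "problem_set", 1800),
]


def dropout_features(events, deadline):
    """Infer two per-student indicators that the raw log does not store directly."""
    first_attempt = {}    # earliest problem-set event per student
    seconds_on_site = {}  # total time on the site per student
    for student, timestamp, kind, seconds in events:
        seconds_on_site[student] = seconds_on_site.get(student, 0) + seconds
        if kind == "problem_set":
            if student not in first_attempt or timestamp < first_attempt[student]:
                first_attempt[student] = timestamp

    # Assumes at least one student logged nonzero time on the site.
    class_mean = mean(seconds_on_site.values())

    features = {}
    for student, total in seconds_on_site.items():
        first = first_attempt.get(student)
        features[student] = {
            # How long before the deadline the student began working.
            "hours_before_deadline": (
                (deadline - first).total_seconds() / 3600 if first else None
            ),
            # Time on the site relative to the class average (1.0 = average).
            "relative_time_on_site": total / class_mean,
        }
    return features


student_features = dropout_features(events, deadline)
for student, values in student_features.items():
    print(student, values)
```

In this toy log, the student who starts two days early and spends more time than average on the site scores higher on both measures than the student who starts three hours before the deadline.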
Read more: Automating big-data analysis