System that replaces human intuition with algorithms outperforms 615 of 906 human teams
Big-data analysis consists of searching for buried patterns that have some kind of predictive power. But choosing which “features” of the data to analyze usually requires some human intuition. In a database containing, say, the beginning and end dates of various sales promotions and weekly profits, the crucial data may not be the dates themselves but the spans between them, or not the total profits but the averages across those spans.
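The sales-promotion example can be made concrete with a minimal sketch. It assumes a hypothetical table in which each row holds a promotion's start date, end date, and total profit; the field names, the sample values, and the choice to count both endpoint days are illustrative assumptions, not details from the researchers' system.

```python
from datetime import date

# Hypothetical records: (promotion start, promotion end, total profit over the promotion).
promotions = [
    (date(2015, 3, 2), date(2015, 3, 15), 4200.0),
    (date(2015, 6, 1), date(2015, 6, 7), 2100.0),
]

def derive_features(start, end, total_profit):
    """Turn raw dates and a total into a span length and a per-day average."""
    # Counting both the start and end day is an assumption made for this sketch.
    span_days = (end - start).days + 1
    return {"span_days": span_days, "avg_daily_profit": total_profit / span_days}

features = [derive_features(*row) for row in promotions]
```

Here the two promotions differ in length and in total profit, yet the derived per-day average is the same for both, which is the kind of pattern the raw columns would hide.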
MIT researchers aim to take the human element out of big-data analysis, with a new system that not only searches for patterns but designs the feature set, too. To test the first prototype of their system, they enrolled it in three data science competitions, in which it competed against human teams to find predictive patterns in unfamiliar data sets. Of the 906 teams participating in the three competitions, the researchers’ “Data Science Machine” finished ahead of 615.
In two of the three competitions, the predictions made by the Data Science Machine were 94 percent and 96 percent as accurate as the winning submissions. In the third, the figure was a more modest 87 percent. But where the teams of humans typically labored over their prediction algorithms for months, the Data Science Machine took somewhere between two and 12 hours to produce each of its entries.
“We view the Data Science Machine as a natural complement to human intelligence,” says Max Kanter, whose MIT master’s thesis in computer science is the basis of the Data Science Machine. “There’s so much data out there to be analyzed. And right now it’s just sitting there not doing anything. So maybe we can come up with a solution that will at least get us started on it, at least get us moving.”
Between the lines
Kanter and his thesis advisor, Kalyan Veeramachaneni, a research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), describe the Data Science Machine in a paper that Kanter will present next week at the IEEE International Conference on Data Science and Advanced Analytics.
Veeramachaneni co-leads the Anyscale Learning for All group at CSAIL, which applies machine-learning techniques to practical problems in big-data analysis, such as determining the power-generation capacity of wind-farm sites or predicting which students are at risk for dropping out of online courses.
“What we observed from our experience solving a number of data science problems for industry is that one of the very critical steps is called feature engineering,” Veeramachaneni says. “The first thing you have to do is identify what variables to extract from the database or compose, and for that, you have to come up with a lot of ideas.”
In predicting dropout, for instance, two crucial indicators proved to be how long before a deadline a student begins working on a problem set and how much time the student spends on the course website relative to his or her classmates. MIT’s online-learning platform MITx doesn’t record either of those statistics, but it does collect data from which they can be inferred.
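As a rough illustration of how such indicators might be inferred from data that is collected, the sketch below assumes a hypothetical activity log keyed by student, with a timestamp and a session length in minutes for each visit. The log layout, the names, and the sample values are assumptions made for this example; they do not describe MITx's actual records.

```python
from datetime import datetime
from statistics import mean

# Hypothetical problem-set deadline.
deadline = datetime(2015, 10, 20, 23, 59)

# Hypothetical raw log: student id -> list of (session start, minutes on site).
events = {
    "s1": [(datetime(2015, 10, 18, 23, 59), 30), (datetime(2015, 10, 19, 12, 0), 90)],
    "s2": [(datetime(2015, 10, 20, 21, 59), 40)],
}

def infer_indicators(events, deadline):
    """Derive the two dropout indicators from raw session records."""
    total_minutes = {s: sum(m for _, m in log) for s, log in events.items()}
    class_avg = mean(total_minutes.values())
    indicators = {}
    for student, log in events.items():
        first_session = min(t for t, _ in log)
        indicators[student] = {
            # How long before the deadline the student first started working.
            "hours_before_deadline": (deadline - first_session).total_seconds() / 3600,
            # Time on site as a multiple of the class average.
            "relative_time_on_site": total_minutes[student] / class_avg,
        }
    return indicators

indicators = infer_indicators(events, deadline)
```

Neither value is stored in the log itself; both are computed from timestamps and session lengths, which is the step that normally depends on a person having the idea first.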
Read more: Automating big-data analysis