What Data Science Tools Have you used in the past 12 months? KDnuggets Poll Results are out!

The results of the 18th annual KDnuggets Software Poll were recently published. The poll asks “What Predictive Analytics, Data Mining, Data Science Software/Tools have you used in the past 12 months?” It attracted 2,900 voters, although it sometimes draws controversy because of excessive voting by some vendors. See all the data at KDnuggets.

Some of the most relevant findings are:

  • Python has now overtaken R as a Data Science tool, barely but noticeably (53% vs. 52% usage; Python grew 15% while R grew only 6%)
  • There are 2 newcomers in the top 10 list: TensorFlow and Anaconda
  • Use of Excel for Analytics purposes decreased by 16%
  • In terms of programming languages, Python, R and SQL run the show with usage of all 3 growing
  • Big Data Tools was simplified to only 4 categories: Hadoop Open Source, Hadoop Commercial, SQL on Hadoop Tools and Spark.  The highest growth tool is SQL on Hadoop and usage of Hadoop Open Source is decreasing

TOP SOFTWARE TOOLS

We have 2 newcomers this year: Anaconda and TensorFlow

Top 2 tools:

  • Use: Python (53%) and R (52%)
  • Growth: TensorFlow (197%), Anaconda (37%)

TOP LANGUAGES

Top 2 languages:

  • Use: Python (53%) and R (52%)
  • Growth: Python (15%), R (6%)
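
To see what these growth figures imply, here is a minimal sketch that backs out last year's usage shares. It assumes the growth percentages are relative changes in the share of voters (the summary above does not spell this out), and the helper name `prior_share` is only illustrative.

```python
def prior_share(current_share: float, relative_growth: float) -> float:
    """Back out last year's share from this year's share and its relative growth."""
    return current_share / (1.0 + relative_growth)


# Usage and growth figures quoted in the poll summary above.
# Assumption: growth is a relative change in share, not percentage points.
python_prev = prior_share(53.0, 0.15)
r_prev = prior_share(52.0, 0.06)

print(f"Implied Python share last year: {python_prev:.1f}%")
print(f"Implied R share last year: {r_prev:.1f}%")
```

Under that assumption, R would have been ahead of Python a year earlier, which is consistent with Python "overtaking" R this year.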

TOP BIG DATA TOOLS

The tools on the survey have been simplified to 4: Hadoop Open Source, Hadoop Commercial, Spark and SQL on Hadoop Tools.

Top 2 Big data tools:

  • Use: Spark (23%) and Hadoop Open Source (15%)
  • Growth: SQL on Hadoop Tools (41%), Spark (5%)

It is important to note the 32% decrease in usage of Hadoop Open Source. I am not sure whether this is a real decrease or a side effect of a change in the survey, which split the Hadoop category in 2: Open Source and Commercial. Part of this “decrease” could simply reflect the fact that there are now 2 categories instead of 1.
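
A rough back-of-the-envelope check makes the point concrete. The sketch below assumes the 32% figure is a relative change in the share of voters; the poll summary above does not give a Hadoop Commercial share, so none is used here.

```python
current_open_source = 15.0  # % of voters using Hadoop Open Source (from the poll summary)
relative_change = -0.32     # assumption: the 32% decrease is relative, not percentage points

# Share the previous year would need to have been for a 32% relative drop to land at 15%.
implied_previous = current_open_source / (1.0 + relative_change)

# Percentage points "lost" between the two surveys.
gap_points = implied_previous - current_open_source

print(f"Implied previous share: {implied_previous:.1f}%")
print(f"Gap to explain: {gap_points:.1f} percentage points")
```

If voters who now select only the Commercial category account for a gap of roughly that size, the apparent drop would be an artifact of the category split rather than a real decline.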

DEEP LEARNING TOOLS

Top Deep Learning Tools:

  • Use: TensorFlow (20%) and Keras (9.5%)
  • Growth: Microsoft CNTK (278%), MXNet (200%)

What is Analytics – The 3 Things Series

Before I embark on explaining what Analytics is, as well as the different types of Analytics, let’s talk for a second about Why Analytics. The field of Analytics was born with the goal of using data, and the analysis of that data, to improve performance in key business domains. That basically means being able to make better decisions and to execute the right actions based on data insight.

So what is Analytics?

The field of Analytics involves all that is necessary to drive better decision making and add value, such as Data platforms (On Premise, On Private or Public Cloud), Access to Data (Structured and Unstructured), and Tools for Quantitative Analysis and Data Visualization. In other words, Analytics is all about turning data into insight which in the world of business means turning data into competitive advantage.

Analytics Types

There are three main types of Analytics:

1- Descriptive Analytics help you understand “What happened?”. The goal of descriptive analytics, as the name implies, is to describe or summarize raw data and turn it into something that makes sense to the human eye, typically by presenting the data in tables, reports, visualizations or charts. They are very useful for understanding past behaviors and how the past might influence future outcomes. Basic statistics like averages, sums, percent change, or proportions fit into descriptive analytics. This is the simplest form of analytics, but it is extremely useful and a necessary stepping stone to more sophisticated and valuable analytics.
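
As a minimal sketch of the basic statistics mentioned above, the snippet below summarizes a small series of monthly sales. The numbers are hypothetical and only there for illustration.

```python
from statistics import mean

# Hypothetical monthly units sold, January through June (illustrative data only).
monthly_sales = [120, 135, 128, 150, 162, 171]

total = sum(monthly_sales)                   # sum
average = mean(monthly_sales)                # average
pct_change = (monthly_sales[-1] - monthly_sales[0]) / monthly_sales[0] * 100  # percent change
june_proportion = monthly_sales[-1] / total  # proportion of the total sold in June

print(f"Total: {total}")
print(f"Average per month: {average:.1f}")
print(f"Change from first to last month: {pct_change:.1f}%")
print(f"June share of total: {june_proportion:.1%}")
```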

2- Predictive Analytics help you understand “What could happen?”. There are two main goals: finding relationships or patterns, and predicting what could potentially happen. They help you try to understand the future while providing actionable insights based on data. Predictive Analytics don’t provide predictions that are 100% accurate; they provide estimates of the likelihood of a future outcome. They can be used throughout an organization to forecast sales or inventories, to detect fraud, to understand customer behavior, or in any scenario where relationships among data and “predicting the future” will help make better decisions. We are all familiar with our credit scores, right? That is an example of predictive analytics, where historical data on how well you manage your credit is used to predict a score that can then serve as a proxy for how much of a credit risk you might be.
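
One of the simplest ways to illustrate the idea is a straight-line trend fitted to past data and extended one step ahead. The sketch below uses ordinary least squares on hypothetical monthly sales; real predictive models are usually richer, and the result is an estimate, not a certainty.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: month number and units sold (illustrative data only).
months = [1, 2, 3, 4, 5, 6]
sales = [120, 135, 128, 150, 162, 171]

slope, intercept = fit_line(months, sales)

# Estimate for the next month, month 7, by extending the fitted trend.
forecast = intercept + slope * 7

print(f"Estimated trend: {slope:.2f} units per month")
print(f"Forecast for month 7: {forecast:.0f} units")
```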

3- Prescriptive Analytics help you determine “What should we do?”. Prescriptive Analytics are all about providing specific guidance about what to do. They try to quantify the effect of future decisions by looking at the possible outcomes of each scenario before the actual decisions are made. They use a combination of business rules, algorithms, and modelling procedures to provide possible outcomes. They are typically used in Supply Chain Management, Price Optimization, and Workforce Planning, among others. As the name implies, they are very useful for “prescribing” a direction after examining multiple possible scenarios.
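
A toy price optimization shows the pattern: score each candidate scenario with a model, then recommend the best one. Everything in the sketch below is hypothetical, including the linear demand model, the cost figure and the candidate prices.

```python
def expected_profit(price, unit_cost=6.0, base_demand=1000.0, sensitivity=60.0):
    """Profit under a hypothetical linear demand model: demand falls as price rises."""
    demand = max(base_demand - sensitivity * price, 0.0)
    return (price - unit_cost) * demand


# Scenarios to compare before any decision is made (illustrative values only).
candidate_prices = [8.0, 10.0, 12.0, 14.0]

# Evaluate the outcome of every scenario, then "prescribe" the best one.
outcomes = {price: expected_profit(price) for price in candidate_prices}
best_price = max(outcomes, key=outcomes.get)

for price, profit in outcomes.items():
    print(f"Price {price:.2f} -> expected profit {profit:.2f}")
print(f"Recommended price: {best_price:.2f}")
```

In practice the model would be estimated from data (the predictive step), and business rules such as price floors or contractual limits would restrict the list of candidate scenarios.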

In summary, Analytics help you turn your data into insight for better decision making and there are 3 main types of Analytics that you use depending on your goal. Descriptive analytics to understand what has happened in the past, Predictive analytics to understand relationships among data and provide predictions about what may happen in the future, and Prescriptive analytics to provide specific recommendations about what to do in specific scenarios.

Enterprise Business Intelligence Platforms – At a Glance

Industry analysis reports are always packed with great information, and lots of it. Data visualization, however, makes it easier to understand the conclusions reached by Forrester in its report, The Forrester Wave™: Enterprise Business Intelligence Platforms, Q1 2015. The report covers 11 vendors in this industry.

Table 1 shows the summary scorecard, and details are presented below the table.

Table 1 – Enterprise Business Intelligence Scorecard

The top level comparisons involve 3 categories: Current Offering, Strategy, and Market Presence.

The Forrester Wave™: Enterprise Business Intelligence Platforms, Q1 2015

Vendors with above-average scores

Current Offering: IBM, Information Builders, Microsoft, OpenText, SAP, and SAS

Strategy: IBM, Microsoft, Oracle, SAP, and SAS

Market Presence: IBM, Microsoft, Oracle, SAP and SAS

We can also look at the scores that went into every one of these categories and see how vendors compare at a more granular level.

Current Offering Details

Forrester Evaluation Business Intelligence Current Offerings

Vendors with above-average scores for Current Offering

Architecture: Information Builders, MicroStrategy, OpenText, SAP, and SAS

Development Environment: Microsoft, OpenText, and SAS

Functional Capabilities: IBM, Information Builders, Microsoft, SAP, and SAS

Operational Capabilities: IBM, Information Builders, MicroStrategy, and OpenText

Strategy Details

Forrester Enterprise Business Intelligence Strategy Comparisons

Vendors with above-average scores for Strategy

Commitment: IBM, Microsoft, Oracle, SAP, SAS

Pricing: IBM, Information Builders, Microsoft, OpenText, Qlik, and TIBCO

Transparency: Tableau

Product Direction: IBM, Information Builders, Microsoft, Oracle, SAP, and SAS

Market Presence Details

Forrester Enterprise Business Intelligence - market presence evaluation

Vendors with above-average scores for Market Presence

Company Financials: IBM, Microsoft, Oracle, Qlik, SAP, SAS

Global Presence Base: IBM, Microsoft, MicroStrategy, Oracle, SAP, and SAS

Partnership Ecosystem: IBM, Microsoft, Oracle, SAP, and SAS

Functional Applications: IBM, Oracle, SAP, and SAS

What Data Science Tool Have you used in the past 12 months?

The results of the 16th annual KDnuggets Software Poll were recently published. The poll asks “What Predictive Analytics, Data Mining, Data Science Software/Tools have you used in the past 12 months?” It attracted 2,800 voters, although it sometimes draws controversy because of excessive voting by some vendors.

93 different tools were included in the poll. To determine which tools to include, the organizers start with the companies in the Gartner Magic Quadrant™ for Advanced Analytics and the Forrester Wave™ for Big Data Predictive Analytics, then add companies/tools from last year's poll and relevant new ones in the market.

R is the most frequently used tool in this community of data scientists, but other tools are growing rapidly (Spark, KNIME, Python).

Top 10 data mining tools

 

In terms of programming languages, Python is the clear leader.

And Hadoop is the clear leader in the Big Data Tools space.

big data tools