What is Analytics – The 3 Things Series


Before I embark on explaining what Analytics is, as well as the different types of Analytics, let’s just talk for a second about why Analytics. The field of Analytics was born with the goal of using data, and the analysis of that data, to improve performance in key business domains. In essence, it is about gaining the ability to make better decisions and to execute the right actions based on data insight.

So what is Analytics?

The field of Analytics involves everything necessary to drive better decision making and add value: data platforms (on premise, or on private or public cloud), access to data (structured and unstructured), and tools for quantitative analysis and data visualization. In other words, Analytics is all about turning data into insight, which in the world of business means turning data into competitive advantage.

Analytics Types

There are three main types of Analytics:

1- Descriptive Analytics help you understand “What happened?”. The goal of descriptive analytics, as the name implies, is to describe or summarize raw data and turn it into something that makes sense to the human eye, typically by presenting the data in tables, reports, visualizations, or charts. They are very useful for understanding past behavior and how the past might influence future outcomes. Basic statistics like averages, sums, percent changes, or proportions fall under descriptive analytics. This is the simplest form of analytics, but it is nevertheless extremely useful and a necessary stepping stone toward more sophisticated and valuable analytics.
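
To make this concrete, here is a minimal sketch in Python (using pandas) of the kind of summary a descriptive analysis produces. The dataset and figures are made up purely for illustration.

```python
import pandas as pd

# Hypothetical monthly sales data
sales = pd.DataFrame({
    "month":   ["Jan", "Feb", "Mar", "Apr"],
    "revenue": [120_000, 135_000, 128_000, 150_000],
    "units":   [1_500, 1_620, 1_540, 1_800],
})

# Descriptive analytics: summarize what happened
print(sales[["revenue", "units"]].describe())   # averages, spread, min/max
print("Total revenue:", sales["revenue"].sum())
print("Month-over-month % change:")
print(sales["revenue"].pct_change().round(3))
```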

2- Predictive Analytics help you understand “What could happen?”. There are two main goals: finding relationships or patterns in the data, and predicting what could potentially happen. Predictive analytics help you reason about the future while providing actionable insights based on data. They don’t provide predictions that are 100% accurate; they provide estimates of the likelihood of a future outcome. They can be used throughout an organization to forecast sales or inventories, to detect fraud, to understand customer behavior, or in any scenario where finding relationships among data and “predicting the future” will help make better decisions. We are all familiar with our credit scores, right? That is an example of predictive analytics: historical data on how well you manage your credit is used to produce a score that can then be used as a proxy for how much of a credit risk you might be.
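
Staying with the credit example, here is a hedged sketch using scikit-learn. The features, the tiny dataset, and the choice of logistic regression are all assumptions made for illustration; this is not how credit bureaus actually compute scores.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical history: [number of late payments, credit utilization %]
X = np.array([[0, 20], [1, 35], [5, 90], [0, 10], [3, 75], [2, 60]])
y = np.array([0, 0, 1, 0, 1, 1])   # 1 = defaulted in the past, 0 = did not

model = LogisticRegression().fit(X, y)

# Estimate the likelihood of default for a new applicant
new_applicant = np.array([[1, 40]])
risk = model.predict_proba(new_applicant)[0, 1]
print(f"Estimated default risk: {risk:.1%}")   # an estimate, never a certainty
```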

3- Prescriptive Analytics help you determine “What should we do?”. Prescriptive Analytics are all about providing specific guidance on what to do. They attempt to quantify the effect of future decisions by looking at the possible outcomes of each scenario before the actual decisions are made, using a combination of business rules, algorithms, and modelling procedures. They are typically used in supply chain management, price optimization, and workforce planning, among other areas. As the name implies, they are very useful for “prescribing” a direction after examining multiple possible scenarios.
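
As a flavor of the prescriptive side, here is a minimal optimization sketch with SciPy: given assumed profit margins and capacity constraints, it recommends how many units of two products to stock. Every number in it is hypothetical.

```python
from scipy.optimize import linprog

# Maximize profit 40*x1 + 30*x2 (linprog minimizes, so negate the objective)
profit = [-40, -30]

# Constraints (hypothetical): warehouse space and purchasing budget
A_ub = [[2, 1],      # space used per unit of each product
        [50, 30]]    # cost per unit of each product
b_ub = [100,         # total space available
        2400]        # total budget available

result = linprog(profit, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
units_p1, units_p2 = result.x
print(f"Recommended stock: {units_p1:.0f} units of P1 and {units_p2:.0f} units of P2")
```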

In summary, Analytics help you turn your data into insight for better decision making, and there are 3 main types of Analytics that you use depending on your goal: descriptive analytics to understand what has happened in the past, predictive analytics to understand relationships among data and provide predictions about what may happen in the future, and prescriptive analytics to provide specific recommendations about what to do in specific scenarios.

United’s CEO Fails to Understand the Power Shift brought on by Social Media


By now everyone has seen the despicable way a United Airlines passenger was treated, forcibly removed from a plane to free up his seat for another passenger. I personally find the whole situation appalling, but that is not what I want to discuss now. I want to talk about the way United’s CEO handled the situation, and how it clearly demonstrates a failure to understand how social media has shifted power from the few to the many.

Imagine if this situation had happened 20 years ago, before social media was ingrained in our lives. Even if people on the plane had been able to take pictures or record video, what options did they have for sharing them? Only a handful of people would have heard the story. The CEO and executives, in turn, would have had a lot of power to control the message, and probably would have gotten away with this without losing several million dollars in market valuation to the outrage of people around the world.

But what actually happened was that within minutes of the event, people around the world were seeing video and pictures of this physical assault, and they were waiting for, indeed expecting, a statement from United’s leadership. So after the first failure at managing the overbooking situation came the second: the CEO’s response.

  1. He apologizes for “re-accommodating” a passenger, assuming that would be the end of the story.
  2. He proceeds to blame the passenger, accusing him of being belligerent and saying it is important to find out why “the passenger acted the way he did”. He also doubles down by congratulating his employees for a job well done. Tone deaf much?
  3. After losing almost $1B in valuation at one point during the day, he finally comes out with the statement he should have issued from the beginning: he is sorry, this shouldn’t have happened, and they will take measures to ensure it doesn’t happen again.

Lesson to learn from this event? Corporations – and their leaders – cannot get away with a lot of the things they could have gotten away with in the past, and we have social media platforms to thank for that. Ideally, business leaders would care about their clients and their business; but even if, as a leader, you truly don’t care and your first reaction is to say “Not our fault. We did everything by the book”, know that such a statement will most likely only amplify the existing outrage.

Yes, you may think it is possible to get away with it, seeing how some politicians get away with so much gaslighting these days, but they have something you don’t: followers willing to be gaslighted because they are blinded by their passion for a political party. More likely than not, your clients and others don’t have that level of passion for your business.

So next time something like this happens… stay away from the temptation to blame the victim and address the situation the right way, which includes some combination of:

  • We are sorry
  • The buck stops here
  • We are investigating
  • We are taking steps to ensure this doesn’t happen again

Don’t forget: social media has shifted the power from the few to the many.

What is the Business Value of Big Data? – The Three Things Series

*** The 3 Things Series aims to simplify – sometimes even oversimplify – technology concepts so that you learn 3 things about a topic ***. Opinions are my own.

Organizations typically embark on Big Data projects with 3 goals in mind: cost reductions, improved decision making, and the ability to create new products and services.


1- Cost Reduction 

As the quantity and complexity of data in organizations increase, so does the cost of storing and processing that data. Decisions are then made about how much data to keep available for analysis and how much “historic” data to move to tape or other less expensive media. The problem with this strategy is that by limiting the data that can be analyzed, the insight that can be derived from that data is also limited.

In recent years, technology developments, especially in open source, have made cost reduction a reality through the use of inexpensive technology such as Hadoop clusters (Hadoop is a unified storage and processing environment that allows data, and the processing of that data, to be distributed across multiple computers). Hadoop clusters give organizations the ability to keep more data available for analysis at a lower cost, and to easily add complex data types (images, sound, etc.) to the pool of data to be analyzed.
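
To give a feel for the programming model behind that distribution, here is the classic word-count example written as Hadoop Streaming-style mapper and reducer scripts in Python. This is only a sketch of the idea (each script reads from stdin and writes to stdout so Hadoop can fan the work out across the cluster); cluster setup and job submission are omitted.

```python
# mapper.py - emits (word, 1) for every word it reads from stdin
import sys

for line in sys.stdin:
    for word in line.strip().split():
        print(f"{word}\t1")
```

```python
# reducer.py - sums the counts for each word (Hadoop delivers input sorted by key)
import sys

current_word, current_count = None, 0
for line in sys.stdin:
    word, count = line.rstrip("\n").split("\t")
    if word == current_word:
        current_count += int(count)
    else:
        if current_word is not None:
            print(f"{current_word}\t{current_count}")
        current_word, current_count = word, int(count)

if current_word is not None:
    print(f"{current_word}\t{current_count}")
```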

2- Improved Decision Making

Data analysis can be significantly improved by adding new data sources and new data types to traditional data. For example, a data-driven retailer may see significant benefits in its inventory planning process if a new data source like weather data is added to the model to better predict sales and inventory requirements: an enriched model may be able to predict shortages of winter clothing by incorporating temperature into the existing models. Additional benefits can be achieved if more complex data is analyzed. For example, this same retailer may better target its ads in social media if it evaluates not only its clients’ purchasing history, but also the actions they take on social media to interact with its brand and those of its competitors.
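
Here is a hedged sketch of what “enriching a model with weather data” can look like in practice. The dataset, the column names, and the simple linear model are all assumptions used only to illustrate the idea of adding an external data source as a feature.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical weekly history: internal sales data joined with an external weather feed
history = pd.DataFrame({
    "avg_temp_c": [12, 8, 3, -2, -5, 1],          # external weather data
    "promo_week": [0, 1, 0, 0, 1, 0],             # internal data
    "coats_sold": [180, 260, 410, 620, 700, 480],
})

X = history[["avg_temp_c", "promo_week"]]
y = history["coats_sold"]
model = LinearRegression().fit(X, y)

# Forecast demand for a cold week with a promotion running
next_week = pd.DataFrame({"avg_temp_c": [-4], "promo_week": [1]})
print("Forecasted coats sold:", int(model.predict(next_week)[0]))
```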

3-  Development of New Products and Services

The most strategic and innovative business benefits will probably come from the ability to use new data, or new sources of data, to create new products and services. Let’s think for a minute about the data our cars generate (we don’t necessarily see it, but more and more cars are equipped with sensors that collect a lot of data about our driving). Using this data, insurance companies can offer policies that are dynamically priced based on an individual’s driving history (which is good news for you only if you are a safe driver, of course). Integrating weather data can also bring tremendous savings to an insurance company: some insurers have achieved significant savings per claim by letting their clients know that a storm is coming and recommending they not leave their cars exposed to the elements (again, assuming that as a client you listen to your insurance company’s recommendations).

In summary, when thinking of the business value of Big Data, think of  three areas of value:

  • Cost reductions
  • Improved decision making
  • Ability to create new products and services

 

What is Big Data? (The 3 Things Series)

*** The 3 Things Series aims to simplify – sometimes even oversimplify – technology concepts so that you learn 3 things about a topic ***. Opinions are my own.

The technology industry is full of “buzzwords”, with Big Data being one of the most used in recent years. Organizations have always dealt with data and have stored that data in databases, but the chart below shows how Google searches have shifted over the years, comparing searches for “Databases” with searches for “Big Data”.

 

Google searches: “Databases” vs. “Big Data”

 

Big Data in general refers to the ability to gather, store, manage, manipulate, and – most importantly – get insights out of vast amounts of data. The typical question is, “How big does data need to be to be considered Big?” And the answer is… it depends. When it comes to size, one organization’s Big Data may be another organization’s small data.

There are 3 things to remember that define “Big Data”:

  • Volume. This refers to size. If you are capturing vast amounts of information, you probably have Big Data on your hands.
  • Velocity. Are you working with data at rest or data in motion? For example, if you are analyzing sales figures for the past year, that data is at rest (it is not changing constantly). If, on the other hand, you are analyzing tweets to understand how your clients are reacting to a product announcement, that is data in motion: it is continuously changing. It may not necessarily be big if you are looking at a single day’s data, but the fact that it is data in motion is relevant to the definition of Big Data (see the sketch after this list).
  • Variety. As the ability to capture, store, and analyze more data has increased, so has the interest in analyzing data that is more complex in nature. For example, an insurance company may want to analyze recordings of customer service calls to determine which characteristics of the conversation led to a policy sale, a retailer may want to analyze videos to determine how people navigate the store and how that impacts sales, or a hospital may want to analyze x-rays to find patterns and correlations between common symptoms in patients.
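
A minimal sketch of the data-in-motion idea mentioned above: instead of computing a statistic over a finished table, you keep a running aggregate that updates as each new record arrives. The simulated tweet feed and keywords are made up for illustration.

```python
from collections import Counter

def tweet_stream():
    # Stand-in for a live feed; in reality this would come from a streaming API
    yield from [
        "loving the new product launch",
        "the new product keeps crashing",
        "great launch event today",
    ]

mentions = Counter()
for tweet in tweet_stream():          # data in motion: process one record at a time
    for keyword in ("product", "launch", "crashing"):
        if keyword in tweet:
            mentions[keyword] += 1
    print(dict(mentions))             # running totals update with every new tweet
```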

So when it comes to the definition of Big Data, remember 3 things, or the 3 Vs:

  •  Volume (size)
  • Velocity (Frequency of data update during analysis)
  • Variety (complexity of data to analyze – images, videos, texts, log files, etc)

 

 

 

Enterprise Business Intelligence Platforms – At a Glance

Industry analyst reports are always packed with great information – lots of it. Data visualization, however, helps us better understand the conclusions reached by Forrester in its report The Forrester Wave™: Enterprise Business Intelligence Platforms, Q1 2015, which covers 11 vendors in this industry.

Table 1 shows the summary scorecard, and details are presented below the table.

Table 1 – Enterprise Business Intelligence Scorecard

The top level comparisons involve 3 categories: Current Offering, Strategy, and Market Presence.

The Forrester Wave™: Enterprise Business Intelligence Platforms, Q1 2015

Vendors with above-average scores

Current Offering: IBM, Information Builders, Microsoft, Opentext, SAP, SAS

Strategy: IBM, Microsoft, Oracle, SAP, and SAS

Market Presence: IBM, Microsoft, Oracle, SAP and SAS

We can also look at the scores that went into every one of these categories and see how vendors compare at a more granular level.

Current Offering Details

Forrester Evaluation Business Intelligence Current Offerings

Vendors with above-average scores for Current Offering

Architecture: Information Builders, Microstrategy, Opentext, SAP, SAS

Development Environment: Microsoft, Opentext, and SAS

Functional Capabilities: IBM, Information Builders, Microsoft, SAP and SAS

Operational Capabilities: IBM, Information Builders, Microstrategy, Opentext

Strategy Details

Forrester Enterprise Business Intelligence Strategy Comparisons

Vendors with above-average scores for Strategy

Commitment: IBM, Microsoft, Oracle, SAP, SAS

Pricing: IBM, Information Builders, Microsoft, Opentext, Qlik and TIBCO

Transparency: Tableau

Product Direction: IBM, Information Builders, Microsoft, Oracle, SAP, and SAS

Market Presence Details

Forrester Enterprise Business Intelligence - market presence evaluation

Vendors with above-average scores for Market Presence

Company Financials: IBM, Microsoft, Oracle, Qlik, SAP, SAS

Global Presence Base: IBM, Microsoft, Microstrategy, Oracle, SAP, and SAS

Partnership Ecosystem: IBM, Microsoft, Oracle, SAP, and SAS

Functional Applications: IBM, Oracle, SAP, and SAS

What Data Science Tools Have You Used in the Past 12 Months?

The results of the 16th annual KDnuggets Software Poll were recently published. The poll asks, “What Predictive Analytics, Data Mining, Data Science Software/Tools have you used in the past 12 months?” It attracted 2,800 voters, and it is worth mentioning that it sometimes attracts controversy due to excessive voting by some vendors.

93 different tools were included in the poll. To determine which tools to include, the organizers start with the companies in the Gartner Magic Quadrant™ for Advanced Analytics and the Forrester Wave™ for Big Data Predictive Analytics, then add companies and tools from last year’s poll as well as relevant new ones in the market.

R is the most frequently used tool in this community of data scientists, but other tools are growing rapidly (Spark, KNIME, Python).

Top 10 data mining tools

 

In terms of programming languages, Python is the clear leader.


 

And Hadoop is the clear leader in the Big Data Tools space.

big data tools

 

Women in Software Engineering – Data Visualizations

Tracy Chou, a software engineer at Pinterest, has become a leading voice for women in the tech industry by using data to call attention to how few of them are employed as engineers. She has published a spreadsheet (https://github.com/triketora/women-in-software-eng) that companies can use to make public the number of female engineers in their ranks. The goal: to quantify the scope of the problem as a first step toward making a stronger commitment to address it.

I used the data to create some visualizations.

Quantifying Silicon Valley’s Diversity Issue

Tableau’s New Features – Summary from Tableau Conference’s Keynote (2014)

Did you miss Tableau’s Conference this year? Here is a summary of the main new features coming up in future versions of Tableau products. The new features were presented by different speakers so they vary in level of detail. (Download available at the end of the post).

Future investment in Tableau will be focused on seven areas:

  1. Visual Analytics
  2. Performance
  3. Data Preparation
  4. Storytelling
  5. Enterprise
  6. Cloud
  7. Mobile


I- Visual Analytics.

1-Type-in Shelves.

In addition to dragging fields to the shelves, you will also be able to type field names directly into them. Autocomplete alternatives will be presented.

 

Type-in Shelves

2- Freeform Calculations

You will be able to type calculations directly in the Columns or Rows shelves without having to create a new field. Autocomplete alternatives will be presented.

Free Form Calculations

3- Drag and Drop Calculations

This new feature also includes the ability to drag and drop fields directly into the calculation on the shelf.

Drag and Drop Calculations

Once you are done creating your calculation on the shelf, you can take the newly created calculation and drag it into the Measures shelf so it becomes a permanent field in the data model.

New Field

4-New Calculations Editor

It looks much simpler than the old editor, but it is actually more powerful. One of the things it allows you to do is interact with the data while using the editor: it provides autocomplete, and as formulas are changed, the visualizations are updated.

New Calculations Editor

The editor also supports drag and drop, so fields can be dropped directly on the editor.

New Calculations Editor – Drag and Drop

5-Drag and Drop Analytics

There is a new Analytics Pane. It contains the objects used to summarize, model and forecast data.

New Analytics Pane

6- Instant Reference and Trend Lines

You can now drag and drop reference lines and trend lines, as well as forecasts, medians, quantiles, box plots, and more.

Drag and Drop Trend Lines

Drag and Drop Forecasts

Trend Lines and Forecast

7- Interactive table calculation editing

As powerful as table calculations are, they are sometimes confusing. Table across? Table down? The new editor highlights how the data is being calculated once an option has been chosen. In the image below, three bars are highlighted, indicating how the data points are being used in the calculation (Table Down).

Interactive Table Calculation Editing

8- Improved geographic search

You can type in any geographic data point and the view will focus on it. Typing “Greece” zooms the map in on Greece, even though Country is not part of the data; the geographic knowledge has been built into Tableau.

 

Improved Geographic Search

9- Radial and Lasso Selections

Radial Selection

You will be able to select marks around a specific point.

Radial Selection

Lasso Selections

You will be able to select exactly the marks you need (irregular shapes).

Lasso Selections

II. Performance

1- Multi-core query execution and vector operations support

The ability to leverage multi-core query execution increases the speed of the data engine. On demo systems, queries are running 3 to 4 times faster.

2- Parallel queries

To improve performance on connections to live databases, Tableau is working on sending multiple queries to these systems in parallel, which reduces the overall execution time of building a dashboard. This also allows the database itself to share computations between related queries.

Dashboard – Multiple Queries

To build a dashboard like the one above, execution may look like this (each bar is one query, and Tableau runs them one after the other):

Query Execution

 

By overlapping these queries and running them in parallel, the overall time to build the dashboard is reduced.
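
The concept can be illustrated outside of Tableau with a small Python sketch: several independent queries are issued concurrently, so the total wait is close to the slowest query rather than the sum of them all. The queries here are simulated with sleeps; this is only an illustration of the idea, not Tableau’s implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_query(name, seconds):
    # Stand-in for a query sent to a live database
    time.sleep(seconds)
    return f"{name} done in {seconds}s"

queries = [("sales by region", 2), ("top products", 1), ("monthly trend", 2)]

start = time.time()
with ThreadPoolExecutor() as pool:
    results = list(pool.map(lambda q: run_query(*q), queries))
print(results)
print(f"Dashboard ready in {time.time() - start:.1f}s (vs. ~5s sequentially)")
```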

Query Execution in Parallel

 

3- Continuous Tooltips

Currently, a tooltip is displayed while you are interacting with a data point; as you move the mouse, the tooltip disappears and reappears once the mouse is over a new point. In the future, tooltips will be shown continuously, updating to reflect the point you are interacting with without opening and closing.

 

Continuous Tooltips

4- Responsive Pan and Zoom

Currently, as you pan to focus on specific areas of your map, the image disappears and reappears. In the future, information will be displayed on the screen immediately as you pan, so you can keep your focus on the data.

Responsive Pan and Zoom

5- Persisted Query Cache

To scale performance across the whole organization, this feature will allow Tableau to share query results among multiple users so they won’t have to be recalculated for each user (i.e., if one user calculates the median of a large dataset and another user needs the same calculation, the result will be shared with that user). The next time someone opens the workbook, the calculation will be available instantly because it has already been computed. Persisted query caches will be shared among all processes on all nodes of the cluster, so that all users can benefit from the cache equally.
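
The mechanism described here (compute once, then serve everyone who issues the same query from the cache) can be sketched in a few lines. This is a toy, in-memory illustration only; the real feature persists the cache and shares it across server processes.

```python
import hashlib

query_cache = {}   # the real feature persists this and shares it cluster-wide

def cached_query(sql, execute):
    key = hashlib.sha256(sql.encode()).hexdigest()
    if key not in query_cache:        # the first user pays the computation cost...
        query_cache[key] = execute(sql)
    return query_cache[key]           # ...every later user gets the result instantly

# Two users asking for the same median: it is computed once, served twice
run = lambda sql: 42.0                # stand-in for actually hitting the data engine
print(cached_query("SELECT MEDIAN(amount) FROM orders", run))
print(cached_query("SELECT MEDIAN(amount) FROM orders", run))
```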

III. Data Preparation

1- Automatic Split of a data field

The visual connector is being improved so that it can automatically split a data field. Take a look at the table below, where the field “Loan Info” contains 3 pieces of information. Tableau will be able to detect this automatically and split the data into the 3 corresponding fields.

Automatic Split of a Data Field

After selecting the split option, Tableau will take the Loan Info field and split it into 3 fields that you can rename accordingly.
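
For readers who prepare data in code, the same idea looks like this in pandas. This is only an analogy to what the Tableau feature does automatically, and the “Loan Info” layout shown is hypothetical.

```python
import pandas as pd

loans = pd.DataFrame({"Loan Info": ["Auto-60mo-4.5%", "Home-360mo-3.9%"]})

# Split one delimited field into three separate columns
parts = loans["Loan Info"].str.split("-", expand=True)
parts.columns = ["Loan Type", "Term", "Rate"]
loans = loans.join(parts)
print(loans)
```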

Split Data

The option to split a data field can also be selected right on a viz:

Data Split on a Viz

2- Pre-processing of Excel data

Sometimes the data we get in Excel format is not ready for use in Tableau: we have to remove extra headers, empty lines and columns, and sometimes reshape it. In the future, Tableau will automatically preprocess the data so that this work doesn’t have to be done manually.

Excel Data

We will see something like this when connecting to an Excel file:


In this example, we don’t want each year to be in a separate column, so the data can be further preprocessed by selecting the “Unpivot” option to reshape it. The resulting reshaped data will be ready for Tableau.
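
In code, this “Unpivot” step corresponds to melting the year columns back into rows. Here is a pandas sketch of the same reshape, with made-up column names, shown only as an analogy to the Tableau option.

```python
import pandas as pd

wide = pd.DataFrame({
    "Country": ["Spain", "Chile"],
    "2012":    [100, 80],
    "2013":    [110, 95],
    "2014":    [125, 102],
})

# Unpivot: one row per (Country, Year) instead of one column per year
tidy = wide.melt(id_vars="Country", var_name="Year", value_name="Sales")
print(tidy)
```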


Another interesting example is the pre-processing of survey data:

Survey Data

After loading and un-pivoting, the data will be ready for analysis.

Preprocessed Survey Data

3- Web Data Connector.

There will now be an option to connect Tableau to internal web services, REST APIs, and JSON data. You will need to write a script in JavaScript and HTML, and then use the Web Data Connector option to point to the script. The data will then be loaded into Tableau.

Web Data Connector

 

A script example:

Script Example

IV. Storytelling

1- View Thumbnails

In workbooks with many worksheets and dashboards, it is sometimes difficult to find which charts to add to a story. To facilitate this process, thumbnails are being added.

Storytelling Thumbnails

2- Format Options

Control over fonts and colors will be given to the user. Stories won’t have to be just gray.

Format Options

3- Linear Navigator 

There will be an option to add a navigator appropriate for linear storytelling (small boxes with numbers encouraging people to step through the story).

Linear Navigator

4- Custom Navigator

You will be able to change the navigator size and position.

Custom Navigator

V. Enterprise

1- High Availability Management (Ease of Use)

High availability will be easier to manage, with faster failover detection and unlimited active data engine nodes.

2- Kerberos Support

Tableau will support Kerberos (a user authentication standard used across enterprises to provide single sign on authentication from the client all the way to the database).

3- Smartcard Support

If your organization uses smartcards, you will now be able to log in directly from Tableau Desktop.

4- APIs

APIs have been created to enable new extensibility, automation, and integration options for your Tableau deployments. Right now there are 3 APIs available: JavaScript, Data Extract, and REST. In the next versions, the most requested functions will be added to these APIs, such as publishing content and assigning permissions programmatically.
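
As a rough illustration of the REST API, here is what signing in to Tableau Server from Python might look like. The server URL is hypothetical, and the API version segment and payload shape are assumptions based on the REST API documentation of that era, so check the current reference before relying on them.

```python
import requests
import xml.etree.ElementTree as ET

SERVER = "https://tableau.example.com"          # hypothetical server address
SIGNIN_URL = f"{SERVER}/api/2.0/auth/signin"    # version segment is an assumption

payload = """
<tsRequest>
  <credentials name="my_user" password="my_password">
    <site contentUrl="" />
  </credentials>
</tsRequest>
"""

response = requests.post(SIGNIN_URL, data=payload,
                         headers={"Content-Type": "application/xml"})
response.raise_for_status()

# The token comes back in the <credentials> element; later calls send it
# in the X-Tableau-Auth header. The lookup below ignores the XML namespace.
root = ET.fromstring(response.content)
token = root.find(".//{*}credentials").attrib["token"]
print("Signed in, token starts with:", token[:8])
```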

5-Breadcrumbs

For each view on Tableau Server, information about where the view comes from (which workbook, which project) will be provided. This way you will be able to navigate the server more easily.

Breadcrumbs

6- Workbook Pages

Every workbook will have a page with a workbook description, and details on all of the workbook’s views, data sources, and permissions.

Views

Data Sources

Permissions

 

VI- Cloud

Tableau considers that 4 things are required to have a true cloud offering:

1- Ability to connect to cloud sources

Tableau already offers live cloud-to-cloud connections for data that lives in the cloud, so you can see data that is up to the minute (Google Analytics, Google BigQuery). More data sources will be added, as they are important sources of business data.

2- Ability to connect to business applications in the cloud (Salesforce – OAuth)

You can keep Salesforce data up to date via OAuth.

OAuth

3- Ability to connect to on premise data sources

Currently, the top two data sources used with Tableau Online are Excel and SQL Server. The challenge is how to keep on premise data fresh. A new capability called Tableau Data Sync is being added to help you publish on premise data to Tableau Online: your machine acts as a secure agent that pushes local data up to the cloud on a schedule you define.

4- Getting analytics to where you do your work

It is important to be able to push dashboards to wherever they are needed. The ability to embed interactive dashboards into Salesforce was recently added, and now an add-in has been released that works with Heroku and the Salesforce Canvas toolkit. All visualizations embedded in Salesforce will be updated automatically.

Embedded in Salesforce

VII- Mobile

Tableau Mobile will have a new native user interface. It has been fully rewritten to be faster and to quickly show information coming live from the server.

1- Calculations are being brought to mobile

You will be able to tap on a measure, select “Create Calculated Field”, and the calculation editor will open up. All of the new drag-and-drop capabilities will also be added to mobile, as products are now being designed to be web, mobile, and desktop ready at the same time.

Mobile Calculations

2- Favorites

Favorites will be displayed front and center.

Favorites

3- Ability to swipe through worksheets

4- Offline Access to Favorite Worksheets

You will be able to view your favorite dashboards even when you are offline.   You can select how often you want your worksheets to be fetched, and Tableau Mobile will fetch them when you are on the network.

Offline Access

5- New Visualization Mobile App (Elastic)

What do you do if you have just a tablet and you get an email with a table in it? You want to understand and see your data, but you don’t have Tableau or a server. Tableau is building a new app that will provide a new experience for exploring your data, as easy as using a pencil. It will be fun and fast, and it will stretch the definition of mobile analytics. The project is called Elastic, and you can sign up to receive updates at http://www.tableausoftware.com/be-elastic.

Open File with Elastic

Interact with the Data

 

 

I hope you find this information useful. You can download a file with this summary: Tableau’s New Features – Keynote 2014

If you want to listen to the whole keynote, it is posted at http://tcc14.tableauconference.com/keynote.

Homeless Coder releases app: Trees for Cars

Read Leo’s story, and support him by buying his app, Trees for Cars, for $0.99.

About 3 months ago, Patrick McConlogue decided to help a homeless man, Leo “Journeyman” Grand, by offering him either $100 in cash or three JavaScript books, a basic laptop, and personal tutoring so he could learn to code. Leo chose to code.

There were many skeptics, and even haters, who felt there was no point to the whole effort. Well… Leo just released a mobile app called “Trees for Cars.” It is available on iOS and Android for 99 cents, and he gets about 70 cents for every download. He is looking forward to using that money to afford a home, find a job, and/or pay for school.

Trees for Cars is a mobile carpooling app that connects drivers and riders. But even if you don’t carpool, wouldn’t it be great to spend less than you spend on a cup of coffee to support Leo and the effort to get more people to learn how to code?