Prologika Newsletter Fall 2014


aml

According to Gartner, one main data analytics trend this year and beyond will be predictive analytics. “Increasing competition, cost and regulatory pressures will motivate business leaders to adopt more prescriptive analytics, making business decisions smarter and more repeatable and reducing personnel costs”. Prologika has been helping our customers implement predictive analytics for revenue forecasting, outlier detection, and train data scientists in predictive analytics. This newsletter introduces you to Azure Machine Learning.


WHAT IS AZURE MACHINE LEARNING?

Microsoft unveiled the public preview of Azure Machine Learning (previously known as project Passau) in July 2014. Your humble correspondent has been participating in the private Preview Program to learn it firsthand and provide feedback. Microsoft Azure Machine Learning is a web-based service for cloud-based predictive analytics. You upload your data to the cloud, and then define datasets and workflows to create “experiments”. As you would recall, you can create organizational data mining models using the SQL Server Analysis Services and Excel data mining capabilities. Microsoft Azure Machine Learning to organizational DM models is what Power Pivot to Analysis Services is in a sense that it democratizes predictive analytics. It targets business users willing to create predictive models. The following figure shows a simple experiment to predict the customer’s likelihood to purchase a product.

bikebuyers

HOW DOES IT WORK?

The process of creating an experiment is simple:

  1. The user subscribes to the Azure Machine Learning service. For the pricing details of the public preview service, see this page (be aware that with all cloud-based offerings, everything is subject to change). Currently, the user is charged 38 cents per hour to use the service.
  2. The user defines the input data by uploading a file. Supported file formats include CSV, tab-delimited, plain text, Attribute Relationship File Format, Zip, and R Object or Workspace. In addition, Azure Machine Learning can connect to Azure cloud data sources, such as HDInsight or Azure SQL Server.
  3. The user creates an experiment which a fancy term for a workflow that defines the data transformations. For example, the user might want to filter the input dataset or use just a sample of the input data. In my case, I used the Split data transformation to divide the input dataset into two parts: one that is used for training the model, and the second one that is used for predictions.
  4. The user selects a Machine Learning algorithm(s) that will be used for predictions. In this case, I used the Two-Class Boosted Decision Tree algorithm. This is where users will need some guidance and training about which mining algorithm to use for the task at hand.
  5. The user trains the model at which point the model learns about the data and discovers patterns.
  6. The user uses a new dataset or enters data manually to predict (score) the model. For example, if the model was used to predict the customer probability to purchase a product from past sales history, a list of new customers can be used an input to find which customers are the most likely buyers. The Score Model task allows you to visualize the output and to publish it as a Web service that is automatically provisioned, load-balanced, and auto-scaled, so that you can implement business solutions that integrate with the Web service.

WHY SHOULD YOU CARE?

Implementing predictive solutions has been traditionally a difficult undertaking. Azure Machine Learning offers the following benefits to you BI initiatives:

  1. It simplifies predictive analytics by targeting business users.
  2. As a cloud-based offering, it doesn’t require any upfront investment in software and hardware.
  3. It promotes collaboration because an experiment can be shared among multiple people.
  4. Unlike SQL Server data mining, it supports workflows, similar to some high-end predictive products, such as SAS.
  5. It supports R. Hundreds of existing R modules can be directly used.
  6. It supports additional mining models that are not available in SQL Server Analysis Services and that came from Microsoft Research.
  7. It allows implementing automated business solutions that integrate with the Azure Machine Learning service.

 

As you’d agree, the BI landscape is fast-moving and it might be overwhelming. As a Microsoft Gold Partner and premier BI firm, you can trust us to help you plan and implement your data analytics projects.

Regards,

Teo Lachev

Teo Lachev
President and Owner
Prologika, LLC | Making Sense of Data
Microsoft Partner | Gold Business Intelligence

EVENTS & RESOURCES

Atlanta BI Group: Atlanta BI Group: DAX 101 on September 29th
Atlanta BI Group: Atlanta BI Group: Predictive Analytics From Theory to Practice on October 27th