Data, Data, Data: Thousands of Public Data Sources

We love data, big and small, and we are always on the lookout for interesting datasets. Over the last two years, the BigML team has compiled a long list of data sources that anyone can use. It’s a great list for browsing, importing into our platform, creating new models, and just exploring what can be done with different sets of data.

via Data, Data, Data: Thousands of Public Data Sources | The Official Blog of BigML.com.


8 Essential Best Practices in Windows Azure Blob Storage

Binary Large OBject (BLOB) storage is the usual way of storing file-based information in Azure. Blobs are charged according to outbound traffic, storage space and the operations performed on storage contents. This means that how you manage Blob Storage will affect both cost and availability.
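The three billing components named above can be combined into a rough monthly estimate. The sketch below is only an illustration: the `BlobRates` class, the `estimate_monthly_cost` function and every price in it are hypothetical placeholders, not real Azure rates, so substitute figures from the current price sheet before relying on the result.

```python
from dataclasses import dataclass


@dataclass
class BlobRates:
    """Placeholder prices (assumptions, NOT real Azure rates)."""
    per_gb_stored: float = 0.05       # per GB stored per month
    per_gb_egress: float = 0.10       # per GB of outbound traffic
    per_10k_operations: float = 0.01  # per 10,000 storage operations


def estimate_monthly_cost(stored_gb, egress_gb, operations, rates=None):
    """Return a per-component breakdown and the total for one month."""
    rates = rates or BlobRates()
    storage = stored_gb * rates.per_gb_stored
    egress = egress_gb * rates.per_gb_egress
    ops = operations / 10_000 * rates.per_10k_operations
    return {
        "storage": storage,
        "egress": egress,
        "operations": ops,
        "total": storage + egress + ops,
    }


if __name__ == "__main__":
    # Example: 100 GB stored, 20 GB downloaded, one million operations.
    for component, cost in estimate_monthly_cost(100, 20, 1_000_000).items():
        print(f"{component:>10}: {cost:.2f}")
```

Splitting the estimate by component makes it easier to see which habit (keeping stale data, serving large downloads, or issuing many small requests) dominates the bill.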

The Windows Azure platform has been growing rapidly, both in terms of functionality and number of active users. Key to this growth is Windows Azure Storage, which allows users to store several different types of data for a very low cost. This is not the only benefit, however: it also provides a means to auto-scale data to deliver seamless availability with minimal effort.

via 8 Essential Best Practices in Windows Azure Blob Storage.


How to use blob storage – Windows Azure feature guide

This guide will demonstrate how to perform common scenarios using the Windows Azure Blob storage service. The samples are written in C# and use the Windows Azure Storage Client Library for .NET (Version 2.0). The scenarios covered include uploading, listing, downloading, and deleting blobs. For more information on blobs, see the Next steps section.
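The guide's own samples are in C# with the .NET client library. As a rough illustration of the same four scenarios (upload, list, download, delete), here is a minimal sketch in Python that assumes the `azure-storage-blob` package (version 12) is installed. The helper names (`get_container`, `upload_text`, `list_blob_names`, `download_text`, `delete_blob`) are hypothetical wrappers introduced for this example and are not part of the guide.

```python
"""Sketch of the four blob scenarios, assuming the azure-storage-blob v12 SDK."""


def get_container(connection_string, container_name):
    # Imported lazily so the helpers below can be tried without the SDK installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    return service.get_container_client(container_name)


def upload_text(container, blob_name, text):
    # overwrite=True replaces an existing blob that has the same name.
    container.upload_blob(name=blob_name, data=text.encode("utf-8"), overwrite=True)


def list_blob_names(container):
    # list_blobs() yields blob property objects; keep only their names.
    return sorted(blob.name for blob in container.list_blobs())


def download_text(container, blob_name):
    # download_blob() returns a stream-like downloader; readall() gives bytes.
    return container.download_blob(blob_name).readall().decode("utf-8")


def delete_blob(container, blob_name):
    container.delete_blob(blob_name)
```

With a real storage account, the helpers would be called on the object returned by `get_container(...)`, passing your own connection string and the name of an existing container. Because they only rely on the four container methods shown, they can also be exercised against a small in-memory stand-in.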

via How to use blob storage – Windows Azure feature guide.


Introduction to R, a video series by Google

Google released a 21-part short video series that introduces R. Most of the videos are about two minutes, with none of them going over six, and each one is on a focused task or concept. So this could be a good way to start. Just open R, start a video, and follow along.

via Introduction to R, a video series by Google.


The Apache Hadoop UI – Tutorials and Examples for Hadoop, HBase, Hive, Impala, Oozie, Pig, Sqoop and Solr — Hadoop tutorial: how to access Hive in Pig with HCatalog in Hue

What is HCatalog?

Apache HCatalog is a project enabling non-Hive scripts to access Hive tables. You can then directly load tables with Pig or MapReduce without having to worry about re-defining the input schemas, caring about the data location or duplicating it.

Hue comes with an application for accessing the Hive metastore within your browser: Metastore Browser. Databases and tables can be navigated through and created or deleted with some wizards.

The wizards were demonstrated in the previous tutorial about how to Analyse Yelp data. Hue uses HiveServer2 for accessing the Hive Metastore instead of HCatalog. This is because HiveServer2 is the new secure and multi concurrent server for Hive and it already includes a fast Hive Metastore API.

HCatalog connectors are, however, useful for accessing Hive data from Pig. Here is a demo of accessing the Hive example tables from the Pig Editor.

via Hue – Hadoop User Experience – The Apache Hadoop UI – Tutorials and Examples for Hadoop, HBase, Hive, Impala, Oozie, Pig, Sqoop and Solr — Hadoop tutorial: how to access Hive in Pig with HCatalog in Hue.


Running a TPC-C workload on SQL Server

When you want to simulate a TPC-C based workload, you have to do two things:

  • Create the necessary database with the initial data
  • Run the TPC-C workload against the created database

Let’s have a more detailed look at both of these steps. Before you can create the actual database, you have to tell the tool which database system you are working with. Hammerora supports the following database systems:

  • Oracle
  • MySQL
  • Microsoft SQL Server

You can set your actual database through the menu option Benchmark/Benchmark Options.

via Running a TPC-C workload on SQL Server – SQLServerCentral.


Basketball analytics

Kirk Goldsberry talks about the rise of analytics usage in the NBA. With cameras above every court recording player movements, there’s …

from FlowingData http://ift.tt/1bLNMPd


Excel: Using Inquire

Quietly, Microsoft released a new feature with Excel 2013 Professional called Inquire. Inquire allows you to do all kinds of things at the Excel level, making it much easier to analyze workbooks for formulas and compare dependencies across the environment at both the workbook and worksheet level. In this post, I’d like to walk through the […]

from Intelligent SQL http://ift.tt/1hPxqNv


Everything You Need To Know About the World’s Top Billionaires

If you have ever wondered about billionaires, this infographic has all the information you need to be inspired while learning a lot.

from Lifehack http://ift.tt/1cWdiFf


15 Best Available Bourbons

There’s no denying that bourbon is having a moment. It’s become the basis for an obscene number of cocktails, and any bar worth its weight in complimentary pretzels is stocking the stuff, often exclusively. Why? The pride of Kentucky wins out over other whiskies because it’s a little sweeter, a little smoother, and a whole lot easier to mix. It’s also relatively affordable: very good bottles are available at very good prices. But thanks to its newfound popularity, some of the top-tier bottles (Pappy Van Winkle’s family reserve, George T. Stagg) are now shockingly expensive and, increasingly, hard to track down. Luckily, there’s still a wide variety of bourbons at accessible prices that are readily available in nearly every state. Which one to choose? Here’s a list to help you out.

via 15 Best Available Bourbons – Gear Patrol.