The phenomenal growth in demand for Big Data talent shows no sign of slowing. A recent survey of Fortune 500 companies by the consultancy New Vantage Partners found that 85% have either launched Big Data projects or are planning to do so, and that their spending on data analysis will jump by an average of 36% over the next several years. No wonder, then, that Harvard Business Review, in an article last October, called data scientist “the sexiest job of the 21st century.”
We are using the Internet wrong. Smartphones turn people into horrible listeners. And cat videos aren’t as riveting as we think they are.
These are just some of the revelations writer Paul Miller had during a year of self-imposed exile from the Internet.
Miller came back online May 1 after giving up the Internet for a year and documenting his experiences for tech site The Verge. After a nerve-wracking start (including finding 22,000 e-mails in his inbox), Miller is settling comfortably back into the Web’s black hole of information and nonstop chatter.
We talked to Miller about what he learned on the other side, what’s changed online in the past year, and how his dream of being a cyborg won’t involve Google Glass.
With the preview of Windows Azure Virtual Machines, we have two new special types of blobs stored in Windows Azure Storage: Windows Azure Virtual Machine Disks and Windows Azure Virtual Machine Images. And of course we also have the existing preview of Windows Azure Drives. In the rest of this post, we will refer to these as storage, disks, images, and drives. This post explores what drives, disks, and images are and how they interact with storage.
I get strange looks when I talk to developers about the difference between writing an application against a product and writing one against a service. An on-premises application is written against software that you purchase, install, and configure on hardware you privately own. A cloud application is written against a set of services that are available to you, and to the public, to exploit. So let’s explore how they differ.
We love data, big and small, and we are always on the lookout for interesting datasets. Over the last two years, the BigML team has compiled a long list of data sources that anyone can use. It’s a great list for browsing, importing into our platform, building new models, and exploring what can be done with different sets of data.
Binary Large OBject (BLOB) storage is the usual way of storing file-based information in Azure. Blobs are charged according to outbound traffic, storage space, and the operations performed on storage contents, so how you manage Blob Storage affects both cost and availability.
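To make that cost model concrete, here is a rough sketch in Python of the three billing dimensions named above (storage space, outbound traffic, operations). The rates are placeholders invented for illustration, not real Azure prices:

```python
# Rough monthly-cost sketch for blob storage billing.
# All rates below are invented placeholders, NOT real Azure prices.
def estimate_blob_cost(stored_gb, egress_gb, operations,
                       storage_rate=0.02,   # $/GB-month (assumed)
                       egress_rate=0.08,    # $/GB transferred out (assumed)
                       ops_rate=0.0004):    # $ per 1,000 operations (assumed)
    """Sum the three billing dimensions: storage, egress, operations."""
    return (stored_gb * storage_rate
            + egress_gb * egress_rate
            + operations / 1000 * ops_rate)

# e.g. 100 GB stored, 10 GB served out, 50,000 operations in a month:
print(round(estimate_blob_cost(100, 10, 50_000), 2))  # → 2.82
```

Whatever the actual rates, the shape of the formula is the point: a read-heavy workload is dominated by the egress and operation terms, while an archival workload is dominated by the storage term.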
The Windows Azure platform has been growing rapidly, both in functionality and in number of active users. Key to this growth is Windows Azure Storage, which lets users store several different types of data at very low cost. That is not the only benefit: it also scales data automatically, delivering seamless availability with minimal effort.
This guide will demonstrate how to perform common scenarios using the Windows Azure Blob storage service. The samples are written in C# and use the Windows Azure Storage Client Library for .NET (Version 2.0). The scenarios covered include uploading, listing, downloading, and deleting blobs. For more information on blobs, see the Next steps section.
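The guide’s samples are in C#; as a language-neutral sketch of the shape of those four scenarios, here is a toy in-memory stand-in written in Python. The class and method names are invented for illustration and are not the Azure Storage Client Library:

```python
# Toy in-memory stand-in for a blob container, illustrating the four
# scenarios the guide covers: upload, list, download, delete.
# This is NOT the Azure Storage Client Library; names are invented.
class ToyBlobContainer:
    def __init__(self):
        self._blobs = {}          # blob name -> bytes

    def upload_blob(self, name, data):
        self._blobs[name] = data  # overwrite semantics, like a put

    def list_blobs(self):
        return sorted(self._blobs)

    def download_blob(self, name):
        return self._blobs[name]

    def delete_blob(self, name):
        del self._blobs[name]

container = ToyBlobContainer()
container.upload_blob("photos/kitten.png", b"\x89PNG...")
print(container.list_blobs())        # → ['photos/kitten.png']
container.delete_blob("photos/kitten.png")
print(container.list_blobs())        # → []
```

The real client library follows the same workflow, with the container object obtained from an authenticated storage client rather than constructed directly.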
Google released a 21-part short video series that introduces R. Most of the videos run about two minutes, none of them over six, and each one focuses on a single task or concept. So this could be a good way to start: just open R, start a video, and follow along.
The Apache Hadoop UI – Tutorials and Examples for Hadoop, HBase, Hive, Impala, Oozie, Pig, Sqoop and Solr — Hadoop tutorial: how to access Hive in Pig with HCatalog in Hue
What is HCatalog?
Apache HCatalog is a project that enables non-Hive scripts to access Hive tables. You can then load tables directly with Pig or MapReduce without redefining input schemas, worrying about data locations, or duplicating the data.
Hue comes with an application for accessing the Hive metastore from your browser: the Metastore Browser. Databases and tables can be browsed, created, and deleted through wizards.
The wizards were demonstrated in the previous tutorial on how to analyse Yelp data. Hue uses HiveServer2, rather than HCatalog, for accessing the Hive metastore. This is because HiveServer2 is the new secure, concurrent server for Hive, and it already includes a fast Hive metastore API.
HCatalog connectors are still useful, however, for accessing Hive data from Pig. Here is a demo of accessing the Hive example tables from the Pig Editor.
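For a sense of what this looks like, a minimal Pig script using HCatLoader might read a Hive table with no schema declaration at all, since the schema comes from the metastore. This is a sketch: `sample_07` is Hue’s bundled example table, and the loader’s package path has varied across HCatalog versions.

```
-- Load a Hive table via HCatalog; the schema is pulled from the
-- metastore, so no AS (...) clause is needed.
sample = LOAD 'sample_07' USING org.apache.hcatalog.pig.HCatLoader();

-- Then use it like any other Pig relation.
high_salaries = FILTER sample BY salary > 100000;
DUMP high_salaries;
```

Without HCatalog, the same LOAD would need the table’s HDFS path and a full column list, both of which would silently break if the Hive table were moved or altered.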
When you want to simulate a TPC-C based workload, you have to do two things:
- Create the necessary database and load it with the initial data
- Run the TPC-C workload against that database
Let’s look at both of these steps in more detail. Before you can create the actual database, you have to tell the tool which database system you are working with. Hammerora supports the following database systems:
- Microsoft SQL Server
You can select your target database through the menu option Benchmark/Benchmark Options: