General Tech • 19 Oct 2012

Big Data: 4 things you need if you’re going to mine it effectively

By Justin Lovell

With all the talk of “Big” data from vendors and their sales forces, consultants excited by new opportunities, and business people grappling with whether or not their revenue will go up, it is vital to understand the “Big” picture. The best way to do this is to see where everything fits together architecturally.

Getting an understanding of the integrated architecture of Big Data is vital if an organisation is to work out how much of its current investment in its information environment, including hardware, software tools and people’s skill sets, can stay and how much needs to be replaced or upgraded.

The following areas make up the Integrated Architecture:

1. Data Sources
Essentially, it does not matter what source the data comes from. All that needs to be in place is an interface into the MapReduce framework so that the data can be processed and stored.

2. Hadoop Ecosystem
Data Import\Export: Tools such as Sqoop provide a framework that allows data to be transferred between RDBMS and HDFS solutions by generating MapReduce data transfer programs. Data transfer is performed in parallel and with fault tolerance. Most RDBMS vendors now provide native connectors, such as the Microsoft SQL Server 2012 Hadoop connector.

High Performance Parallel Data Processing: MapReduce is more complex than traditional T-SQL, but either way code must be written in order to process the data. The final outcome is a blob of mapped and reduced data.
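
As a rough illustration of the kind of code that has to be written, here is a minimal single-machine sketch of the map, shuffle and reduce steps in plain Python. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample lines are assumptions made purely for illustration, and a real job would run the same steps in parallel across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical input, standing in for files read from a data source.
    sample = ["big data big picture", "data warehouse"]
    print(word_count(sample))
```

The result is a set of key and total pairs, which is roughly what the "mapped and reduced" output described above amounts to.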

Querying Engines: Traditionally, Java was the language of choice for querying stored MapReduced data. Tools like Pig have a simpler syntax. Pig converts its scripts into MapReduce jobs that are sent to Hadoop to retrieve the data, and the scripts are considerably quicker to write than the equivalent Java.

File Storage: HDFS is a distributed file system designed to scale seamlessly by adding more hardware. Data is typically stored in delimited flat files. Loading data into HDFS is similar to copying a file on an operating system.

NoSQL Database: HBase. Data is grouped in a table whose rows can have totally different column structures. It serves as an indexed key-value store on top of the HDFS store.

Data Warehouse: Hive. Closer to a traditional RDBMS in that it provides JOIN operations for HBase tables. It maintains a metadata layer that supports data aggregation and ad hoc querying, using code that resembles T-SQL but is more limited.
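
To make the HBase idea of rows with totally different column structures more concrete, the sketch below models such a table as a plain Python dictionary indexed by row key. It is only an in-memory analogy and uses no HBase API; the row keys, column names, values and the `get_cell` and `put_cell` helpers are all hypothetical.

```python
# In-memory analogy of a sparse, key-indexed table (this is NOT the HBase API).
# Each row key maps to its own set of columns; rows need not share columns.
table = {
    "user:1001": {"profile:name": "A", "profile:city": "Cape Town"},
    "user:1002": {"profile:name": "B", "social:followers": 250},
}


def get_cell(table, row_key, column, default=None):
    """Look up one cell by row key and column; a missing column is normal."""
    return table.get(row_key, {}).get(column, default)


def put_cell(table, row_key, column, value):
    """Write one cell, creating the row if needed; no schema change required."""
    table.setdefault(row_key, {})[column] = value


if __name__ == "__main__":
    put_cell(table, "user:1003", "social:followers", 12)
    print(get_cell(table, "user:1002", "social:followers"))
    print(get_cell(table, "user:1001", "social:followers"))
```

A lookup for a column that a row does not have simply returns the default, which is the main practical difference from a fixed relational schema.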

3. Data Warehouse \ Business Intelligence
When examining the Integrated Architecture, it is clear that the idea that Big Data removes the need for a data warehouse is not entirely correct. Sure, introducing Big Data does not in itself mean that a data warehouse is required, but careful planning and integration will ensure that the outputs, the business insights, are available.

Social media data consumer Klout has an integrated architecture which allows the end business user to query and analyse data via a Microsoft SQL Server Analysis Services cube. The performance and value of this has even led Klout’s architects to describe the functionality available as “Query at the speed of thought”.

The aim is to make use of what each environment is best at providing. For example, why not use a Hadoop ecosystem to crunch and distil data into the relevant metrics required by the business, then model those outputs dimensionally, keep them historically available in a traditional data mart, and provide analytical tools with the functionality to empower the business?
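
The division of labour described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not a reference design: `distil` merely stands in for the Hadoop step, an in-memory SQLite database stands in for the data mart, and the table name `fact_daily_metric`, its columns and the sample events are all hypothetical.

```python
import sqlite3
from collections import Counter


def distil(events):
    """Stand-in for the Hadoop step: reduce raw events to counts per day and metric."""
    return Counter((e["day"], e["metric"]) for e in events)


def load_mart(conn, daily_counts):
    """Load the distilled metrics into a simple fact table keyed by day and metric."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_daily_metric ("
        "day TEXT, metric TEXT, total INTEGER, PRIMARY KEY (day, metric))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO fact_daily_metric VALUES (?, ?, ?)",
        [(day, metric, total) for (day, metric), total in daily_counts.items()],
    )
    conn.commit()


def query_metric(conn, metric):
    """Ad hoc query of the kind a BI tool would issue against the mart."""
    cursor = conn.execute(
        "SELECT day, total FROM fact_daily_metric WHERE metric = ? ORDER BY day",
        (metric,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    # Hypothetical raw events, standing in for data held in HDFS.
    events = [
        {"day": "2012-10-18", "metric": "mention"},
        {"day": "2012-10-18", "metric": "mention"},
        {"day": "2012-10-19", "metric": "mention"},
        {"day": "2012-10-19", "metric": "retweet"},
    ]
    conn = sqlite3.connect(":memory:")
    load_mart(conn, distil(events))
    print(query_metric(conn, "mention"))
```

The point of the split is that the heavy crunching happens once, upstream, while the business queries a small, historically complete table with ordinary SQL.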

4. The People
At the end of the day, the pool of skill sets in an organisation grows, but this does not mean that everyone will or can specialise in everything. It is best to stick to specialists, whether in-house or consultants. These include:

Source System Specialists
Hadoop Specialists
Data Warehouse Specialists
BI Developers \ Data Scientists
IT Infrastructure Specialists

And of course don’t forget the person\team who pays for all of this!

Hopefully, considering the two areas of Big Data Concepts and Integrated Architecture gives anyone a practical understanding of the relevant concepts and architectures, and of where they fit into the big picture. With this understanding we can move away from seeing just the big yellow elephant in front of us and realise that there are many other interesting animals in the zoo.
