Posts

Showing posts with the label EMC Greenplum

FreeStructure Unveiled: Unique Event Celebrates the People of Big Data

FreeStructure is a global movement focused on leveraging our growing volumes and understanding of data. Culminating in a three-day gathering in San Francisco, FreeStructure brings together data communities to explore data’s fullest potential to solve real problems faced by businesses, communities and the planet. To see how your personal and professional life is affected by data, or to join the conversation, visit http://www.FreeStructure.com/

Important Dates to Remember

Call for Speakers: ends on 2 April 2013 for the following tracks: Big Data Fundamentals, Business of Big Data, Policy and Privacy, Visualization and Illustration, Case Studies, Create Your Own Track, Geeks of a Feather

Event Date: July 16-18, 2013 at the Moscone Center in San Francisco

Real Big Data Defined

We live in a world of Big Data. This year alone, over a trillion gigabytes of new data will be created globally. Big Data presents a big challenge, but also exciting new opportunities for enterprises to rise above the competition. With this in mind, the teams at EMC and Greenplum launched Real Big Data (www.therealbigdata.com) as a place for the latest news and discussions in the world of Big Data. Over the last few months, the team has built a series of guides on Big Data, which you may download from the link below:

Big Data: A CIO’s Cut Out and Keep Guide - Every now and then something comes along that has the potential to change the face of business as we know it. Currently big data is being touted as that thing, and CIOs everywhere need to get a grip on what it is and how it can benefit their company.

Big Data: Riding the Wave | A Guide for IT ...

What is Greenplum HD ?

Greenplum HD is enterprise-ready Apache Hadoop from EMC that allows users to write distributed processing applications for large data sets across a cluster of commodity servers using a simple programming model. The framework automatically parallelizes MapReduce jobs to handle data at scale, so developers do not need to write scalable, parallel algorithms themselves. Greenplum HD is an open source Apache stack and includes the following components:

Hadoop Distributed File System (HDFS): File system that distributes files across the cluster.
MapReduce: Framework for writing scalable data applications.
Pig: Procedural language that abstracts lower-level MapReduce.
Hive: Data warehouse infrastructure built on top of Hadoop.
HBase: Database for random, real-time read/write access.
Mahout: Scalable machine learning and data mining library.
ZooKeeper: Hadoop centralized servi...
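
To give a feel for the "simple programming model" mentioned above, here is a minimal single-process sketch of the MapReduce idea in plain Python, using word counting as the example. It is not Greenplum HD or Hadoop code: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for illustration, and everything runs in memory on one machine. In a real cluster, the framework would distribute the map and reduce calls across servers.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Hadoop"]))
```

The developer writes only the map and reduce functions. Because each map call handles one line and each reduce call handles one key, the calls are independent of one another, which is what allows a framework to run them in parallel.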

EMC Greenplum Chorus is open source - OpenChorus Project

In the legacy analytics process, data scientists face challenges in accessing and sharing the right data. Greenplum Chorus helps foster a complete data science ecosystem with best-of-breed analytics applications. It is a social platform for collaborative data science that lets users increase productivity, decrease administrative burdens on IT infrastructures, and get better visibility and faster access to data through a single tool. EMC recently released the Greenplum Chorus source code under an Apache open source license through the OpenChorus Project. The OpenChorus Project will speed innovation and adoption of collaborative data science practices, helping organizations to drive greater business insight and economic value from Big Data.

Making OpenChorus accessible to Data Scientists

Greenplum and Kaggle joined forces to tackle the short supply and heavy demand for data scientists with an integration between the Kaggle data science c...

Data Scientist as a career option #datasci


Visualizations from The Human Face of Big Data

Data visualization is the study of the visual representation of data, meaning "information that has been abstracted in some schematic form, including attributes or variables for the units of information". The Human Face of Big Data is a global, crowdsourced media project focusing on humanity's new ability to collect, analyze and visualize vast amounts of data in real time.

Data visualizations help paint a picture of how Big Data affects and measures our lives. These revealing new interactive data visualizations were created for The Human Face of Big Data project Mission Control. A team of designers and data scientists from EMC Cloud Services, EMC Greenplum, and Tableau Software analyzed one billion unique global tweets from Twitter, along with other data, to amplify themes and stories from the project. A data set of approximately 170 billion unique data elements was drawn from the billion tweets and loaded into an EMC Greenplum Data Computing A...