Overview of content related to 'hadoop'

This page provides an overview of 1 article related to 'hadoop'. Filters may be applied to display a subset of the articles in this category.

Apache Hadoop is an open source software framework, licensed under the Apache v2 license, that supports data-intensive distributed applications. It enables applications to work with thousands of computationally independent computers and petabytes of data. Hadoop was derived from Google's MapReduce and Google File System (GFS) papers. Hadoop is a top-level Apache project, written in the Java programming language, that is built and used by a global community of contributors.

Yahoo! has been the largest contributor to the project, and uses Hadoop extensively across its businesses. On February 19, 2008, Yahoo! Inc. launched what it claimed was the world's largest Hadoop production application. The Yahoo! Search Webmap is a Hadoop application that runs on a Linux cluster with more than 10,000 cores and produces data that is now used in every Yahoo! Web search query. There are multiple Hadoop clusters at Yahoo!, and no HDFS filesystems or MapReduce jobs are split across multiple datacenters. Every Hadoop cluster node bootstraps the Linux image, including the Hadoop distribution. Work that the clusters perform is known to include the index calculations for the Yahoo! search engine. On June 10, 2009, Yahoo! made available the source code to the version of Hadoop it runs in production. Yahoo! contributes back all work it does on Hadoop to the open-source community. The company's developers also fix bugs and provide stability improvements internally, and release this patched source code so that other users may benefit from their effort.

In 2010 Facebook claimed that it had the largest Hadoop cluster in the world, with 21 PB of storage. On July 27, 2011, it announced that the data had grown to 30 PB. Besides Facebook and Yahoo!, many other organizations are using Hadoop to run large distributed computations. Some of the notable users include: (Excerpt from Wikipedia article: Hadoop)

Key statistics

Metadata related to 'hadoop' (as derived from all content tagged with this term):

  • Number of articles referring to 'hadoop': 3 (0.2% of published articles)
  • Total references to 'hadoop' across all Ariadne articles: 10
  • Average number of references to 'hadoop' per Ariadne article: 3.33
  • Earliest Ariadne article referring to 'hadoop': 2007-07
  • Trending factor of 'hadoop': 0 (see FAQs on monitoring of trends)

See our 'hadoop' overview for more data and comparisons with other tags. For visualisations of metadata related to timelines, bands of recency, top authors, and the overall distribution of authors using this term, see our 'hadoop' usage charts.

Top authors

Ariadne contributors most frequently referring to 'hadoop':

  1. Jackson Pope
  2. Philip Beresford
  3. Marieke Guy
  4. Manolis Mavrikis
  5. Patricia Charlton

Title: IIPC Web Archiving Toolset Performance Testing at The British Library

Article summary: Jackson Pope and Philip Beresford pick up the threads from an initial contribution in issue 50 and report on progress at The British Library in installing and performance testing the Web Curator Tool.

Date: July 2007, issue 52, feature article
