evlingkar

Posts

Pig UDF

- October 01, 2014

Pig provides extensive support for user defined functions (UDFs) as a way to specify custom processing. Pig UDFs can currently be implemented in five languages: Java, Python, JavaScript, Ruby and Groovy. The most extensive support is provided for Java functions. You can customize all parts of the processing, including data load/store, column transformation, and aggregation. Java functions are also more efficient because they are implemented in the same language as Pig and because additional interfaces are supported, such as the Algebraic Interface and the Accumulator Interface. Limited support is provided for Python, JavaScript, Ruby and Groovy functions. These functions are new, still-evolving additions to the system. Currently only the basic interface is supported; load/store functions are not supported. Furthermore, JavaScript, Ruby and Groovy are provided as experimental features because they have not gone through the same amount of testing as Java or Python. At runtime note t...
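As a rough illustration of the "basic interface" mentioned above for Python functions, here is a minimal sketch of a column-transforming UDF. The function name `to_upper` and the schema string are hypothetical, chosen only for this example. The fallback `outputSchema` decorator is an assumption added so the file also runs outside Pig; inside Pig the decorator is supplied by the runtime.

```python
# Minimal sketch of a Python UDF for Pig; names below are illustrative only.
try:
    # Available when the script runs as a CPython streaming UDF under Pig.
    from pig_util import outputSchema
except ImportError:
    # Assumption: a stand-in decorator so the file can be run and tested
    # without Pig installed. It only records the schema on the function.
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("word:chararray")
def to_upper(word):
    """Return the input string in upper case, passing nulls through."""
    if word is None:
        return None
    return word.upper()
```

In a Pig script, a file like this would typically be registered with a `REGISTER ... USING jython AS ...` statement and the function then called inside a `FOREACH ... GENERATE`; the exact file name and namespace depend on your setup.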


Hive Architecture

- June 30, 2014

Command line interface: The default and most common way of accessing Hive.
HiveServer: Runs Hive as a server exposing a Thrift service, enabling access from a range of clients written in different languages.
HWI: The Hive web interface.
Shell: The command line interface. It allows interactive queries, much like a MySQL shell connected to a database. It also supports web and JDBC clients.
The driver, compiler, and execution engine take the HiveQL scripts and run them in the Hadoop environment.
Driver: The component which receives the queries. It implements the notion of session handles and provides execute and fetch APIs modeled on JDBC/ODBC interfaces.
Compiler: The component that parses the query, does semantic analysis on the different query blocks and query expressions, and eventually generates an execution plan with the help of the table and partition metadata looked up from the metastore.
Execution engine: The component which executes the execution plan cr...
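The flow described above (the driver receives a query, the compiler consults the metastore to build a plan, the execution engine runs the plan) can be sketched as a toy model. Everything here is hypothetical: `METASTORE`, `compile_query`, `execute_plan`, and `Driver` are illustrative names, not Hive classes, and the "parser" only understands `SELECT <cols> FROM <table>`.

```python
# Toy model of the driver -> compiler -> execution engine flow.
# All names are hypothetical and are NOT part of Hive's real API.

METASTORE = {
    "page_views": {"columns": ["user_id", "url"], "partitions": ["dt"]},
}


def compile_query(query, metastore):
    """'Compiler': parse the query, check it against metastore metadata,
    and return a tiny execution plan as a list of (operator, argument)."""
    tokens = query.strip().rstrip(";").split()
    if (len(tokens) != 4 or tokens[0].upper() != "SELECT"
            or tokens[2].upper() != "FROM"):
        raise ValueError("sketch supports only 'SELECT <cols> FROM <table>'")
    table = tokens[3]
    if table not in metastore:
        raise ValueError("unknown table: " + table)
    known = metastore[table]["columns"]
    columns = list(known) if tokens[1] == "*" else tokens[1].split(",")
    for col in columns:
        if col not in known:
            raise ValueError("unknown column: " + col)
    return [("scan", table), ("project", columns)]


def execute_plan(plan, data):
    """'Execution engine': run each plan step over in-memory rows."""
    rows = []
    for op, arg in plan:
        if op == "scan":
            rows = list(data[arg])
        elif op == "project":
            rows = [{c: r[c] for c in arg} for r in rows]
    return rows


class Driver:
    """'Driver': receives queries and exposes execute/fetch calls."""

    def __init__(self, metastore, data):
        self.metastore = metastore
        self.data = data
        self._results = []

    def execute(self, query):
        plan = compile_query(query, self.metastore)
        self._results = execute_plan(plan, self.data)

    def fetch(self):
        return self._results
```

A short usage example: construct `Driver(METASTORE, {"page_views": [...]})`, call `execute("SELECT url FROM page_views")`, then read the rows back with `fetch()`. In real Hive the plan is executed on Hadoop and not over in-memory lists.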


Pig Overview

- February 28, 2014

Hive vs Pig

Feature                          Hive                Pig
Language                         SQL-like            PigLatin
Schemas/Types                    Yes (explicit)      Yes (implicit)
Partitions                       Yes                 No
Server                           Optional (Thrift)   No
User Defined Functions (UDF)     Yes (Java)          Yes (Java)
Custom Serializer/Deserializer   Yes                 Yes
DFS Direct Access                Yes (implicit)      Yes (explicit)
Join/Order/Sort                  Yes                 Yes
Shell                            Yes                 Yes
Streaming                        Yes                 Yes
Web Interface                    Yes                 No
JDBC/ODBC                        Yes (limited)       No

Apache Pig and Hive are two projects that layer on top of Hadoop, and provide a higher-level language for using Hadoop's MapReduce library. Apache Pig provides a scripting language for...


