Search This Blog

May 26, 2015

Hive: Insert OR Load data

1. Load data from another table
insert overwrite table batting
SELECT
regexp_extract(col_value,'^(?:([^,]*)\,?){1}',1) player_id,
regexp_extract(col_value,'^(?:([^,]*)\,?){2}',1) year,
regexp_extract(col_value,'^(?:([^,]*)\,?){9}',1) run
FROM temp_batting;
2. Load data from File
LOAD DATA INPATH '/user/hue/Batting.csv' OVERWRITE INTO TABLE temp_batting

May 25, 2015

HIVE: Create Table

CREATE TABLE TABLE_NAME (COL1 STRING, COL2 INT);

create table Batting
(PlayerId STRING, Year INT,Runs INT);

May 24, 2015

Hive: Select Statements

1. SELECT * FROM Table_Name;
2. SELECT * FROM Table_Name WHERE Column_Name =’’;
3. SELECT COUNT(*) FROM TABLE_Name;
4. SELECT COL_1, MAX(COL_2) FROM Table_Name
GROUP BY COL_1
5. SELECT a.* from Table_Name a
Join (select * from table_name ) b
On (a.col_1 = b.col_1)

May 23, 2015

Hive: Hive Commands

How to invoke Hive


hive


How to exit from Hive


quit;


List all databases present in Hive


show databases;


Execute commands in a database


use database_name;


List all tables present in a database


show tables


How to view table structure


describe table_name;


May 22, 2015

SQOOP: Jobs

How to get help of SQOOP JOB command


   sqoop help job


How to view saved SQOOP jobs


sqoop job --list


How to view content of saved SQOOP job


  sqoop job --show job_name


How to delete a saved SQOOP Job


 sqoop job --delete job_name

May 21, 2015

SQOOP: EXPORT data from HDFS to MYSQL

How to export data from HDFS (Hadoop) to MYSQL

The database and table must present. The HDFS file format is in tab delimiter in the following command.


sqoop export --connect jdbc:mysql://sandbox.hortonworks.com/custrm --username hive --password hive  --table customer --fields-terminated-by '\t' --lines-terminated-by '\n' --export-dir /user/root/customer/ -m 1

May 15, 2015

SQOOP: Import data from MYSQL database to HDFS and HIVE

To Get MYSQL Server hostname and port number
Use the following commands inside MySQL


use information_schema;
select * from GLOBAL_VARIABLES where variable_name like 'hostname';
select * from GLOBAL_VARIABLES where variable_name like 'port';


Display all the database names present in MYSQL Server

sqoop list-databases --connect  jdbc:mysql://localhost –username=hive –password=hive


IMPORT data from MYSQL TABLE to HIVE TABLE
sqoop import --connect jdbc:mysql://localhost/database_name --username=hive --password=hive --table table_name --hive-import

database_name  -- Replace this with MYSQL database name
table_name  -- Replace this with table name


IMPORT data from MYSQL TABLE with no primary key to HIVE TABLE

sqoop import --connect jdbc:mysql://localhost/database_name --username=hive --password=hive --table table_name --hive-import -m 1

database_name -- Replace with MYSQL Database
table_name  -- Replace with table name in MYSQL

-m 1 -- allocates one mapper.
–split-by column_name
you can use either -m option or --split-by option

To import data from MYSQL to an existing folder in HDFS

sqoop import --connect jdbc:mysql://localhost/Database_Name --username=hive --password=hive --table table_name  --append

database_name -- Replace with MYSQL Database
table_name  -- Replace with table name in MYSQL

--append option to write files in the existing folder.

IMPORT data using MYSQL Query to HDFS

sqoop import --connect jdbc:mysql://localhost/Database_Name --username=hive --password=hive -e’select * from employee where $CONDITIONS’  --target-dir /user/root/database/tablename
 
database_name -- Replace with MYSQL Database
--target-dir option path starts from the home folder.
 







May 10, 2015

Hadoop: Setting up single node Cluster on Linux desktop


Following are the helpful links for setting up Hadoop on Linux

Successfully setup the Hadoop cluster on LINUX desktop.

1. http://www.bogotobogo.com/Hadoop/BigData_hadoop_Install_on_ubuntu_single_node_cluster.php
    http://qursaan-howto.blogspot.com/2014/12/how-to-install-hadoop-as-single-node-on.html
2. Editing read only file in VI editor 
    http://www.geekyboy.com/archives/629
command to save the read only files
:w !sudo tee %
O

3. VI editor commands for editing configuration files
    http://www.cs.rit.edu/~cslab/vi.html
    http://www.cs.colostate.edu/helpdocs/vi.html