Sunday 11 June 2017

Importing Data from Relational Databases into Hadoop

Although Apache Hadoop was created to process immense amounts of unstructured data, particularly from the web, you will often find Hadoop sitting alongside a relational database. This stems from the prevalence of relational databases. You must also consider the fact that companies, individuals, and teams considering a migration to Hadoop need to port their data to Hadoop so that MapReduce jobs can use it. Although you can configure and manually execute the data migration, there are tools available to do this for you. One such tool is Sqoop, which was originally released by Cloudera but is now an Apache project. As the name signifies, it scoops data from a relational source into HDFS, and vice versa.

Before we see how Sqoop imports data into Hadoop, let's first look at how we could do it without a third-party tool. The obvious approach is to write a MapReduce job that pulls data from a structured data source, such as a MySQL database, with direct JDBC access and writes it into HDFS. The primary issue with this approach is configuring and handling database connection pooling, since far too many mappers will try to maintain a connection while pulling data.

Now let's see how we do this using Sqoop. The first step is to download, install, and configure Sqoop. Make sure you have Hadoop 1.4+ on your machine. Configuring Sqoop is pretty simple; see "Time for action – downloading and configuring Sqoop" in Hadoop Beginner's Guide for more details. After installing Sqoop, get the JDBC driver for your relational database (MySQL, for example) and copy it into Sqoop's lib directory. Everything you need is now set up. Next we will simply dump data from a MySQL table into structured files on HDFS.
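As a rough sketch of that setup, assuming the Sqoop tarball and the MySQL Connector/J jar have already been downloaded (the file names, versions, and paths below are illustrative only, not exact):

# unpack the Sqoop distribution and put it on the PATH (paths are examples)
$ tar -xzf sqoop-1.4.x.bin__hadoop-1.x.tar.gz -C /usr/local
$ export SQOOP_HOME=/usr/local/sqoop-1.4.x.bin__hadoop-1.x
$ export PATH=$PATH:$SQOOP_HOME/bin

# copy the MySQL JDBC driver into Sqoop's lib directory
$ cp mysql-connector-java-5.1.x-bin.jar $SQOOP_HOME/lib/

# confirm the installation works
$ sqoop version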


Create an "employees" table in MySQL and populate it with some data; this is the table we will import into HDFS.
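Here is one way to do that from the MySQL shell. The database name and user match the connection string used below, but the columns and sample rows are purely illustrative; any small table will do:

$ mysql -u hadoopuser -p hadooptest
mysql> CREATE TABLE employees (id INT NOT NULL PRIMARY KEY, name VARCHAR(64), department VARCHAR(64));
mysql> INSERT INTO employees VALUES (1, 'Alice', 'Engineering'), (2, 'Bob', 'Sales'), (3, 'Carol', 'Marketing');
mysql> quit

Note that Sqoop uses the table's primary key to split the import across mappers, so the id column matters here; for a table without a primary key you would need to add --split-by <column> or -m 1 to the import command.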
Now run Sqoop to import this data. With a single sqoop command, we pull the data from the MySQL database's "employees" table into Hadoop. The first option specifies the type of Sqoop operation (import in our case). Next, we list the JDBC URI for our MySQL database along with the table name. Sqoop does the rest: it places the data pulled from the table into multiple files (based upon the number of mappers it spun up) under our home directory in HDFS.


$ sqoop import --connect jdbc:mysql://localhost/hadooptest \
    --username hadoopuser --password password --table employees

$ hadoop fs -ls employees

Found 6 items
-rw-r--r--   3 hadoop supergroup          0 2013-04-24 04:10 /user/hadoop/employees/_SUCCESS
drwxr-xr-x   - hadoop supergroup          0 2012-04-24 04:10 /user/hadoop/employees/_logs
-rw-r--r--   3 … /user/hadoop/employees/part-m-00000
-rw-r--r--   3 … /user/hadoop/employees/part-m-00001
-rw-r--r--   3 … /user/hadoop/employees/part-m-00002
-rw-r--r--   3 … /user/hadoop/employees/part-m-00003
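To verify what actually landed in HDFS, you can look inside one of the part files; with Sqoop's default text output, each record is a line of comma-separated column values. The sketch below also shows the standard --target-dir and -m options in case you want to control the output directory and the number of mappers (the target path here is just an example):

$ hadoop fs -cat employees/part-m-00000 | head

# re-run the import with an explicit output directory and a single mapper
$ sqoop import --connect jdbc:mysql://localhost/hadooptest \
    --username hadoopuser --password password --table employees \
    --target-dir /user/hadoop/employees_single -m 1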

