Syntax: hadoop fsck /

COMMAND_OPTION                  Description
path                            Start checking from this path.
-delete                         Delete corrupted files.
-files                          Print out files being checked.
-files -blocks                  Print out the block report.
-files -blocks -locations       Print out locations for every block.
-files -blocks -racks           Print out network topology for data-node locations.
-includeSnapshots               Include snapshot data if the given path indicates a snapshottable directory or there are snapshottable directories under it.
-list-corruptfileblocks         Print out list of missing blocks and files they belong to.
-move                           Move corrupted files to /lost+found.
-openforwrite                   Print out files opened for write.

HDFS support...
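To show how the options above combine on one command line, here is a minimal Python sketch that assembles a `hadoop fsck` invocation. The helper `build_fsck_command` and its validation rules are assumptions made for illustration; they are not part of Hadoop. The rule that `-blocks` needs `-files`, and that `-locations` and `-racks` need `-blocks`, only mirrors how the table lists those options together.

```python
import shlex

# Options copied from the table above.
FSCK_OPTIONS = {
    "-delete", "-files", "-blocks", "-locations", "-racks",
    "-includeSnapshots", "-list-corruptfileblocks", "-move", "-openforwrite",
}

# Assumption: the table only shows these options in combination,
# so the sketch rejects them when the companion option is missing.
REQUIRES = {"-blocks": "-files", "-locations": "-blocks", "-racks": "-blocks"}


def build_fsck_command(path="/", *options):
    """Return the argument list for `hadoop fsck <path> <options...>`."""
    unknown = [opt for opt in options if opt not in FSCK_OPTIONS]
    if unknown:
        raise ValueError(f"unsupported fsck option(s): {unknown}")
    for opt, needed in REQUIRES.items():
        if opt in options and needed not in options:
            raise ValueError(f"{opt} must be combined with {needed}")
    return ["hadoop", "fsck", path, *options]


if __name__ == "__main__":
    # Build (but do not run) a block-location report for the whole namespace.
    print(shlex.join(build_fsck_command("/", "-files", "-blocks", "-locations")))
```

The resulting list can be passed to `subprocess.run` on a machine where the `hadoop` client is installed; the sketch itself only builds the command.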
Traditional application management systems, in which applications interact with relational databases through an RDBMS, are one source of Big Data. This data is stored on relational database servers in relational structures.

When the Big Data storage and analysis tools of the Hadoop ecosystem, such as MapReduce, Hive, HBase, Cassandra, and Pig, came into the picture, they needed a tool to interact with relational database servers in order to import and export the data residing in them. Sqoop fills this role in the Hadoop ecosystem, providing interaction between relational database servers and Hadoop’s HDFS.

Sqoop: “SQL to Hadoop and Hadoop to SQL”

Sqoop is a tool designed to transfer data between Hadoop and relational database servers. It is used to import data from relational databases such as MySQL and Oracle into Hadoop HDFS, and to export data from the Hadoop file system to relational databases. It is provided...
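The two directions of transfer described above can be sketched as a pair of Sqoop command lines. The Python helpers below are hypothetical and only assemble arguments; the host name, database, user, table, and HDFS paths are placeholders, not values from this post. The flags used (`--connect`, `--username`, `--table`, `--target-dir`, `--export-dir`, `--num-mappers`) are standard Sqoop 1 options.

```python
import shlex


def sqoop_import_args(jdbc_url, username, table, target_dir, num_mappers=1):
    """Arguments for copying one relational table into an HDFS directory."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(num_mappers),
    ]


def sqoop_export_args(jdbc_url, username, table, export_dir):
    """Arguments for copying files in an HDFS directory back into a table."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "--table", table,
        "--export-dir", export_dir,
    ]


if __name__ == "__main__":
    # Placeholder connection details, for illustration only.
    url = "jdbc:mysql://dbhost.example.com/sales"
    print(shlex.join(sqoop_import_args(url, "etl_user", "orders", "/user/etl/orders")))
    print(shlex.join(sqoop_export_args(url, "etl_user", "orders_summary", "/user/etl/summary")))
```

Import reads from the database and writes to `--target-dir` on HDFS; export reads from `--export-dir` on HDFS and writes to the named table, which is the "SQL to Hadoop and Hadoop to SQL" pairing.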
Differences between Hive and Pig:

1. Hive: Uses a declarative language, HiveQL (HQL), which is similar to SQL.
   Pig: Uses a procedural language, Pig Latin.
2. Hive: Used by data analysts.
   Pig: Used by researchers and programmers.
3. Hive: Works only on structured data.
   Pig: Works on structured, semi-structured, and unstructured data.
4. Hive: Operates on the server side of the cluster.
   Pig: Operates on the client side of the cluster.
5. Hive: Supports partitioning of data.
   Pig: Does not support partitioning.
6. Hive: Does not load data quickly, but executes quickly.
   Pig: Loads data quickly and effectively.
7. Hive: Keeps metadata in a separate metastore database.
   Pig: Does not have a separate metadata database; uses HDFS as its database.
8. Hive: First developed by Facebook.
   Pig: First developed by ...