Big Data, Machine Learning and AI: March 2016

Tuesday, March 22, 2016

Android Error and Solution

Error: Error:exception during working with external system:
Solution: Check the Gradle dependency classpath in the build.gradle file
(in my case there was a version mismatch)

Saturday, March 19, 2016

Adding logs in Android middleware

Changes in the .mk file:

LOCAL_LDLIBS := -llog -landroid -lEGL -lGLESv1_CM

Changes in the source file:

#include <android/log.h>
#define LOGI(...) ((void)__android_log_print(ANDROID_LOG_INFO, "webkit_test", __VA_ARGS__))

Adding logs:
LOGI("Test logs");

Finding Path in the BST

 #include <cstddef>
 #include <iostream>
 using namespace std;

 class node
 {
 public:
  int data;
  node* left;
  node* right;
  node(int in) : data(in), left(NULL), right(NULL) {}
 };

 // Returns the node where the paths to a and b split (their lowest
 // common ancestor). Assumes a <= b and that both keys are in the tree.
 node* findpathrootnode(int a, int b, node* root)
 {
  if (root == NULL)
   return NULL;
  if (root->data > b)   // both keys lie in the left subtree
   return findpathrootnode(a, b, root->left);
  if (root->data < a)   // both keys lie in the right subtree
   return findpathrootnode(a, b, root->right);
  return root;          // a <= root->data <= b
 }

 // Prints the path from a up to root (a first, root last).
 void findleftpath(int a, node* root)
 {
  if (root == NULL)
   return;
  if (root->data > a)
   findleftpath(a, root->left);
  else if (root->data < a)
   findleftpath(a, root->right);
  cout << root->data << endl;
 }

 // Prints the path from root down to b (root first, b last).
 void findrightpath(int b, node* root)
 {
  if (root == NULL)
   return;
  cout << root->data << endl;
  if (root->data > b)
   findrightpath(b, root->left);
  else if (root->data < b)
   findrightpath(b, root->right);
 }

 int main()
 {
  node* obj1 = new node(5);
  node* obj2 = new node(6);
  node* obj3 = new node(4);
  node* obj4 = new node(7);
  node* obj5 = new node(8);
  node* obj6 = new node(9);
  // make tree (the constructor already sets unused children to NULL)
  //       5
  //      / \
  //     4   8
  //        / \
  //       7   9
  //      /
  //     6
  obj1->left = obj3;
  obj1->right = obj5;
  obj4->left = obj2;
  obj5->left = obj4;
  obj5->right = obj6;
  // tree is done
  node* newroot = findpathrootnode(4, 7, obj1);
  if (newroot != NULL)
  {
   findleftpath(4, newroot);            // prints 4, 5
   if (newroot->data < 7)               // 7 lies below newroot
    findrightpath(7, newroot->right);   // prints 8, 7
  }
  return 0;
 }

Wednesday, March 2, 2016

Encryption / decryption of DB in Android (SQLCipher)

Encrypting database

The link below will help you encrypt your database

http://sqlcipher.net/sqlcipher-for-android/

Decryption of database

Once you have encrypted your database, you may need to decrypt it.

Follow the steps below to decrypt your database.

You need a Linux system to decrypt your database; on Windows, follow this link to compile SQLCipher first:

http://thebugfreeblog.blogspot.in/2012/08/compiling-sqlcipher-for-windows.html

Download sqlcipher_2.1.1.orig.tar.gz from the site below

https://launchpad.net/ubuntu/+source/sqlcipher/2.1.1-2

or run

$ wget https://launchpad.net/ubuntu/+archive/primary/+files/sqlcipher_2.1.1.orig.tar.gz

# untar using the command below

$ tar -zxvf sqlcipher_2.1.1.orig.tar.gz

$ sudo apt-get install build-essential

$ sudo apt-get install libssl-dev

# go to the extracted folder

$ cd sqlcipher-2.1.1

$ ./configure --enable-tempstore=yes CFLAGS="-DSQLITE_HAS_CODEC" LDFLAGS="-lcrypto"

$ make

$ ./sqlite3 encrypted.db

sqlite> PRAGMA key = 'password'; -- use the password that was set when the database was created

sqlite> ATTACH DATABASE 'plaintext.db' AS plaintext KEY ''; -- empty key will disable encryption

sqlite> SELECT sqlcipher_export('plaintext');

sqlite> DETACH DATABASE plaintext;

Sequence file in Hadoop

What is a sequence file in Hadoop?

· A file that stores key/value pairs in binary format

· Because the format is binary and supports compression, it consumes less disk space, fewer I/O operations, and less bandwidth

· It also resolves the small-files problem (the whole content of each small file becomes one value in the sequence file)
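The small-files idea in the last point can be tried out without Hadoop. The sketch below is only an illustration and is not the SequenceFile format: `SmallFilePacker`, `pack` and `unpack` are names I made up, and the record layout (name, length, bytes) is my own assumption. It packs several small "files" into one binary blob as key/value records and reads them back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: many small files become
// key/value records inside one binary container.
public class SmallFilePacker {

    // Writes one record per file: key (file name), value length, value bytes.
    public static byte[] pack(Map<String, byte[]> files) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                out.writeUTF(file.getKey());
                out.writeInt(file.getValue().length);
                out.write(file.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the records back in the order they were written.
    public static Map<String, byte[]> unpack(byte[] packed) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(packed))) {
            while (in.available() > 0) {
                String name = in.readUTF();
                byte[] content = new byte[in.readInt()];
                in.readFully(content);
                files.put(name, content);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    public static void main(String[] args) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        files.put("a.txt", "hello".getBytes(StandardCharsets.UTF_8));
        files.put("b.txt", "world".getBytes(StandardCharsets.UTF_8));
        byte[] packed = pack(files);
        for (Map.Entry<String, byte[]> e : unpack(packed).entrySet()) {
            System.out.println(e.getKey() + "\t" + new String(e.getValue(), StandardCharsets.UTF_8));
        }
    }
}
```

A real sequence file adds things this sketch leaves out, such as a header with the key/value class names, sync markers and compression.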

Now let us look at how to convert a large number of small files into a sequence file.

Below is the Java code for writing a sequence file (args[0] is the input folder, args[1] is the output file)

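 import java.io.File;
 import java.io.IOException;
 import java.nio.file.Files;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FSDataOutputStream;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.io.IOUtils;
 import org.apache.hadoop.io.SequenceFile;
 import org.apache.hadoop.io.SequenceFile.Metadata;
 import org.apache.hadoop.io.Text;
 import org.apache.hadoop.io.compress.DefaultCodec;

 public class SequenceFileWritter {
     // Usage: SequenceFileWritter <input folder> <output sequence file>
     public static void main(String[] args) throws IOException {
        File infolder = new File(args[0]);
        String uri = args[1];
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path(uri);
        Text key = new Text();
        Text value = new Text();
        FSDataOutputStream stm = null;
        SequenceFile.Writer writer = null;
        try {
            stm = fs.create(path);
            writer = SequenceFile.createWriter(conf, stm, key.getClass(), value.getClass(),
             SequenceFile.CompressionType.BLOCK, new DefaultCodec(), new Metadata());
            File[] listOfFiles = infolder.listFiles();
            System.out.printf("Folder is %s%n", infolder);
            if (null != listOfFiles) {
               System.out.printf("# of files %d%n", listOfFiles.length);
               for (int i = 0; i < listOfFiles.length; i++) {
                  if (listOfFiles[i].isFile()) {
                      // key = file name, value = whole content of the small file
                      // (Text assumes the files are UTF-8 text)
                      key.set(listOfFiles[i].getName());
                      value.set(Files.readAllBytes(listOfFiles[i].toPath()));
                      writer.append(key, value);
                      System.out.printf("[%s]\t%s\t%d bytes%n", writer.getLength(), key, value.getLength());
                  } else if (listOfFiles[i].isDirectory()) {
                      System.out.println("Directory " + listOfFiles[i].getName());
                  }
               }
            } else {
               System.out.println("list of files is null, check the input folder");
            }
        } finally {
            IOUtils.closeStream(writer);
            // closing the writer only flushes a stream that was passed in, so close it too
            IOUtils.closeStream(stm);
        }
     }
 }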

To read the sequence file (args[0] is the sequence file to read)

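 import java.io.IOException;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.io.IOUtils;
 import org.apache.hadoop.io.SequenceFile;
 import org.apache.hadoop.io.Writable;
 import org.apache.hadoop.util.ReflectionUtils;

 public class SequenceFileRead
 {
  public static void main(String[] args) throws IOException {
     String uri = args[0];
     Configuration conf = new Configuration();
     Path path = new Path(uri);
     SequenceFile.Reader reader = null;
     FileSystem fs = FileSystem.get(conf);
     try {
        reader = new SequenceFile.Reader(fs, path, conf);
        // the key and value classes are recorded in the file header
        Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
        Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
        while (reader.next(key, value)) {
           // mark the records that were read right after a sync marker
           String syncSeen = reader.syncSeen() ? "sync" : "";
           System.out.printf("[%s]\t%s\t%s\n", syncSeen, key, value);
        }
     } finally {
        IOUtils.closeStream(reader);
     }
  }
 }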

Hive partitions (Static and Dynamic)

Partition in Hive

What is Partition?

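A partition is a physical and logical separation of data in Hive: each partition of a table is stored in its own directory.

Why do we need partition?

To improve query performance: a query that filters on the partition column reads only the matching partitions.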

What are the different types of partition?

There are two types of partition in Hive:

Static partition

Dynamic partition

How to create a partitioned table

Below is the query to create a partitioned table

hive> create EXTERNAL table stu_name_par (id int, name String) partitioned by (age int) row format delimited fields terminated by ',' lines terminated by '\n' stored as textfile;

Static Partition

How to load data

The command below loads the data into one named partition (Note: stu_name.txt contains only 2 fields, id & name; the age value comes from the partition clause)

hive> load data local inpath '/root/TEST_DATA/Hive/Join_data/stu_name.txt' into table stu_name_par partition (age = 25);

Dynamic partition

Set the following properties to enable dynamic partitioning. (It is restricted by default to avoid accidentally creating a large number of partitions: many partitions mean many files, which leads to many I/O operations and is not recommended in a Hadoop environment.)

set hive.exec.dynamic.partition=true;

set hive.exec.dynamic.partition.mode=nonstrict;

set hive.exec.max.dynamic.partitions=1000;

set hive.exec.max.dynamic.partitions.pernode=1000;

You cannot load flat-file data directly into a dynamic partition table.

The data must first be in a table that has the partition column as a regular column.

The following commands create such a table and load it

hive> create EXTERNAL table stu_name_no_par (id int, name String, age int) row format delimited fields terminated by ',' lines terminated by '\n' stored as textfile;

hive> load data local inpath '/root/TEST_DATA/Hive/Join_data/age_grp_rand.txt' into table stu_name_no_par;

We have loaded different age group student data.

Now, create the table which has the partition

hive> create EXTERNAL table stu_name_dynamic_par (id int, name String) partitioned by (age int) row format delimited fields terminated by ',' lines terminated by '\n' stored as textfile;

(Note: stu_name_dynamic_par & stu_name_par have the same definition; I am creating another table for clarity)

Loading data into the dynamic partition table (the partition column must come last in the SELECT list)

hive> INSERT INTO TABLE stu_name_dynamic_par PARTITION (age) SELECT id,name,age FROM stu_name_no_par;

We can use a similar command for a static partition as well (Note: the age column is missing from the projection because its value is fixed in the PARTITION clause)

hive> INSERT overwrite TABLE stu_name_dynamic_par PARTITION (age = 26) SELECT id,name FROM stu_name_no_par where age = 26;
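What separates the two inserts is who picks the partition: the row's own age value (dynamic) or the value written in the PARTITION clause (static). The Java sketch below only models that idea in memory. It is not Hive code, and `PartitionSketch`, `Student`, `dynamicInsert` and `staticInsert` are names I made up for illustration. The map keys mimic the `age=<value>` directory names that Hive uses for partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of static vs dynamic partition inserts.
public class PartitionSketch {

    public static final class Student {
        final int id;
        final String name;
        final int age;

        public Student(int id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }
    }

    // Dynamic: the partition is derived from each row's age value.
    public static Map<String, List<String>> dynamicInsert(List<Student> rows) {
        Map<String, List<String>> partitions = new TreeMap<>();
        for (Student s : rows) {
            partitions.computeIfAbsent("age=" + s.age, k -> new ArrayList<>())
                      .add(s.id + "," + s.name);
        }
        return partitions;
    }

    // Static: the caller names one partition and only matching rows are stored in it.
    public static Map<String, List<String>> staticInsert(List<Student> rows, int age) {
        Map<String, List<String>> partitions = new TreeMap<>();
        List<String> target = partitions.computeIfAbsent("age=" + age, k -> new ArrayList<>());
        for (Student s : rows) {
            if (s.age == age) {
                target.add(s.id + "," + s.name);
            }
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<Student> rows = Arrays.asList(
            new Student(1, "asha", 25),
            new Student(2, "ravi", 26),
            new Student(3, "kiran", 25));
        System.out.println("dynamic: " + dynamicInsert(rows));
        System.out.println("static:  " + staticInsert(rows, 26));
    }
}
```

Note that the stored rows hold only id and name, just like the data files of the partitioned table: the age lives in the partition name, not in the file.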

How to check my partitions

hive> show partitions stu_name_dynamic_par;

hive> show partitions stu_name_par;

The commands above list the partitions in each table

Hadoop & its ecosystem Build/Runtime Error

Error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/io/SequenceFile$Writer$Option

Solution:
The Hadoop version on the client does not match the version on the cluster. Use the same version in both places.

The command below checks the version

$ hadoop version

Thread

Threads are one of the most important concepts in software development, so it is important to understand them.
What is a thread:

A thread is a separate path of execution within a program.
One thing I want to say: threads are EVIL, don't create one unless it is absolutely necessary.

Why (advantages) threads:
To execute code concurrently.

Why not (disadvantages):
1. Threads introduce context switches, which can slow down the application
2. Shared objects must be protected with synchronization
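The second disadvantage is easy to see with a shared counter. Below is a minimal sketch, assuming plain Java threads; `SharedCounter` and `runWorkers` are names I made up for illustration. Several threads increment one object, and `synchronized` keeps the increments from overwriting each other.

```java
// Hypothetical example: several threads update one shared object.
public class SharedCounter {
    // The counter object lives on the heap, so every thread sees the same count.
    private int count = 0;

    // synchronized makes the read-modify-write of count++ atomic.
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }

    // Starts `threads` workers that each call increment() `perThread` times.
    public static int runWorkers(int threads, int perThread) {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                // i is a local variable, so each thread has its own copy on its own stack
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until the worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 100000)); // prints 400000
    }
}
```

If you remove `synchronized` from `increment()`, the printed total can come out lower than 400000, because two threads may read the same old value of `count` and both write back the same new one.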

Yes, I hope you know why each thread has its own STACK and why all threads share the HEAP …