G Praveen Kumar


Homepage of
G. Praveen Kumar

22918 38th AVE SE

Bothell, WA - 98021

USA

Phone : +1 765 637 1367

Email : gpraveenkumar5[at]ymail[dot]com

Homepage : http://gpraveenkumar.com

Resume | CV

If we knew what it was we were doing,
it would not be called research, would it?
-- Albert Einstein

Biography

I have been an Applied Scientist at Microsoft since March 2016.

Previously, I was a PhD (ABD) student in the Department of Computer Science, Purdue University, where I worked with Professors Dr. Jennifer Neville and Dr. Luo Si on problems related to Social Networks and Machine Learning. My expertise is in the areas of Modelling, Classification, Clustering, Label Propagation, Data Science and the like. I aspire to become a Data Scientist. Prior to joining Purdue, I worked for a year as an Associate IT Consultant at ITC Infotech, Bangalore, India. Prior to that, I completed my Bachelor’s degree in Computer Science and Engineering with honours at the National Institute of Technology, Durgapur, India. I am actively seeking collaborators in the areas of my research interests.

Research Interests

Machine Learning

Large Language Models

Social Network Analysis

Data Mining

Information Retrieval

Big Data Analytics

Natural Language Processing

Semantic Web

Text Mining

Cryptography and Network Security

Research Experience

Research Assistant at CS Department
Jan 12 - Present

Improving classification accuracy by replicating the training data is a well-studied problem in the text domain. Techniques like Marginalized Denoising Autoencoders have improved classification performance by marginalizing, or taking the expectation, over the training data without actually replicating it. Similar approaches to problems like label prediction and link prediction in the network domain have not been explored. The initial results we obtained for label prediction after replicating data by flipping labels, dropping nodes, and dropping or rewiring edges are promising.
Project Guide: Dr. Jennifer Neville
CS Purdue University

Built a system that captures code as students write programs in the Alice programming language. Built a tutor out of the While module to give live feedback and decide student promotions based on the captured code. Developing a recommendation system that points out common programming fallacies so that students can avoid them, improving the programming experience.
Project Guide: Dr. Luo Si, Dr. Buster Dunsmore, Dr. Steve Cooper
CS Purdue University, CS Purdue University, CS Stanford University

Built a couple of modules in a Math Tutoring System for Students with Learning Disabilities. This intelligent tutoring system for math problems was built using Adobe Flash. I also made regular school visits to work with students in Grades 3, 4 and 5, observing how they interacted with the UI in order to incrementally improve the quality of the software.
Project Guide: Dr. Luo Si, Dr. Yan Ping Xin
CS Purdue University, College of Education Purdue University

Research Assistant at ITaP (IT at Purdue)
August 11 - Dec 11

Worked with the Scientific Solutions Group at RCAC (Rosen Center for Advanced Computing) on the project Useful to Usable (U2U): Transforming Climate Variability and Change Information for Cereal Crop Producers. Processed scientific data collected over 30 years, developed a Joomla component in PHP that lets users of the system generate plots of variables like temperature and rainfall for a given location, and integrated it with Drinet Hubzero. The URL of the website I built.

Research intern at Knowledge and Data Engineering, Germany
May 09 - July 09

Interned with the Knowledge and Data Engineering Research Group at the University of Kassel. During this period, I worked with Professor Dr. Gerd Stumme and members of the group on a research project, "Semantic Analysis in Query Log Data", and built a "Similarity Framework" in Java that integrates Perl scripts for computing similar tags in Folksonomy or similar data. The results of our work were published at ECML PKDD 2009.

Work Experience

Data Scientist intern at Apple
May 15 - Aug 15

Worked in the iAd team on User Segmentation and Behavioural Targeting. Ideated and prototyped a new product, Lookalike Segments. Used Latent Semantic Analysis (SVD) to find latent features/traits among users. Segmented users based on their click behaviour patterns, their relation to apps and latent features. Used Hive to preprocess data, and Python and R for data processing. Built a visualization tool using D3 to qualitatively analyze and compare User and Lookalike Segments. Also implemented the superior Probabilistic Latent Semantic Analysis technique.

Data Scientist intern at LinkedIn
May 14 - Aug 14

Worked with the Data Sciences team on Clustering Fields of Study (Majors). Constructed networks of Fields of Study (FoS) using features like member skills and inferred classmates. Detected clusters of FoS using Louvain's Modularity (a hierarchical community detection algorithm for graphs/networks) to improve and modify LinkedIn's existing FoS taxonomy. The clusters obtained were significantly better than those from traditional hierarchical agglomerative clustering. Used visualization tools like Gephi and D3 to analyze the graph, clusters and taxonomy. Used Apache Pig and Hadoop to preprocess data, Python to do all the data processing and HDFS to store the results.

Associate IT Consultant at ITC Infotech, India
July 10 - June 11

I worked as an Associate IT Consultant at ITC Infotech, India, with the Product Lifecycle Management team, on a project for Brown Shoes, a global footwear company. I customized software called FlexPLM, which runs on Windchill, according to the client’s requirements. Specifically, I built web pages incorporating the necessary logic using JSP and JavaScript.

Publications [DBLP] [Microsoft Academic Search]

thesis

G. Praveen Kumar, Anirban Sarkar, Ilhyun Lee, Haesun Lee and Narayan C. Debnath “A Novel Approach for Hierarchical Clustering in Non - Binary Search Space”, In the Proceedings of 8th International Conference on Industrial Informatics (INDIN ’10), pp. 693 - 697, Osaka, Japan, 13-16th July, 2010. [ PDF ]

G. Praveen Kumar and Anirban Sarkar “Weighted Association Rule Mining and Clustering in Non-Binary Search Space”, In the Proceedings of the 7th International Conference on Information Technology: New Generations (ITNG ’10), pp. 238 - 243, Las Vegas, Nevada, USA, 12–14 April, 2010. [ PDF ]

G. Praveen Kumar, Arjun Kumar Murmu, Biswas Parajuli, and Prasenjit Choudhury “MULET : A Multilanguage Encryption Technique”, In the Proceedings of the 7th International Conference on Information Technology: New Generations (ITNG ’10), pp. 779 - 782, Las Vegas, Nevada, USA, 12–14 April, 2010. [ PDF ]

G. Praveen Kumar, Biswas Parajuli, Arjun Kumar Murmu, Prasenjit Choudhury and Jaydeep Howlader “A Lossless MOD-ENCODER Towards a Secure Communication”, In the Proceedings of the International Conference on Recent Trends in Information, Telecommunication and Computing (ITC ’10), pp. 330 - 332, Cochin, Kerala, India, 12–13 March, 2010. [ PDF ]

Dominik Benz, Beate Krause, G. Praveen Kumar, Andreas Hotho, Gerd Stumme, “Characterizing Semantic Relatedness of Search Query Terms” In A. Nürnberger, M. Berthold (eds.): Proc. Workshop on Explorative analytics of Information Networks at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2009), pp. 119 - 135, Bled, Slovenia, 11th September, 2009. [ PDF ] [ PPT ]

G. Praveen Kumar, Anirban Sarkar and Narayan C. Debnath “A New Algorithm for Frequent Itemset Generation in Non-Binary Search Space”, In the Proceedings of 6th International Conference on Information Technology: New Generations (ITNG ’09), pp. 149 - 153, Las Vegas, USA, 27–29 April, 2009 (Nominated for Best Paper Award). [ PDF ]

Projects

Label Prediction in Social Networks using NLP
Mar 15 - Present

Trying to predict the gender, political preferences and religious views of users (nodes) on social networks like Facebook. Initially used only network features and techniques like Gibbs Sampling for prediction. Started looking at textual features to improve the prediction accuracy. For instance, by using the Facebook wall posts of users and their connections, their gender can be predicted with 92% accuracy.
Project Guide: Dr. Dan Goldwasser
Assistant Professor, Department of Statistics, Purdue University

Quantitative Analysis of words and categories in Multiclass Regression
Dec 13 - Present

Trying to apply a joint high dimensional Bayesian Variable and Covariance Selection model to the multiclass textual classification. The word features are the variables and hence, variable selection problem corresponds to finding words that are good predictors overall and for specific categories. The covariance selection gives information about dependencies between the multiple categories.
Project Guide: Dr. Anindya Bhadra
Assistant Professor, Department of Computer Science, Purdue University

Empirical Analysis of Personal Email Network
Nov 13

Constructed and analyzed three different types of ego networks obtained from Gmail, covering about seven and a half years of emails. Applied clustering and community detection algorithms to detect communities based on my email communications and compared them with communities detected from my Facebook friendship network. Interestingly, I could recover a good number of them. [Project Link] [Report]

TREC - Knowledge Base Acceleration Track
May 13 - Aug 13

Had to filter documents related to entities (140 Wikipedia and 20 Twitter) that are worthy of citation in their profiles. The challenges were twofold: one, the data was huge, around 6.5 TB of compressed data consisting of social data, news articles etc.; and two, the entities had very few training examples, on the order of 10. Built a model similar to a one-vs-all classifier; the F1 measure was close to 0.6.
Project Guide: Dr. Luo Si
Associate Professor, Department of Computer Science, Purdue University

Supervised LDA for Masquerader Detection
Feb 13 - Apr 13

Extended work from the PhD thesis of Malek Ben Salem, which builds user profiles based on search behaviour with a predefined taxonomy of applications and processes, to detect masquerader attacks and intrusions. Built a novel method that uses a variation of LDA to build the taxonomy automatically. Also showed that by using the latent classes obtained from the model as features, we could build classifiers that give the same performance as those that used all the features, essentially a huge reduction of the feature space.
Project Guide: Dr. Sergey Kirshner
Associate Professor, Department of Computer Science, Purdue University

Indiana Social Search
May 12 - Present

Built a website in PHP, Indiana Social Search, to crawl and classify news articles and tweets from Google News and Twitter, respectively, into predefined categories. This system is going to be integrated with the famous INDURE project. I am also working on extracting trends from the classified articles to make a trend cloud of the popular happenings in the state of Indiana, and on extracting meaningful summaries for the crawled news articles. The LINK to the website.
Project Guide: Dr. Luo Si
Associate Professor, Department of Computer Science, Purdue University

Sampling and Analysis of Social Network Activity Graphs
Sep 11 - Dec 11

Mining social networks yields valuable information about user activity and interaction. Constructed social network activity graphs of senders and receivers from the Purdue email data. Sampled data over two-day windows and computed various graph properties, like the average degree and density, for these windows and for the aggregate graph. Compared and contrasted email user activity with that of friendship networks like Facebook.
Project Guide: Dr. Jennifer Neville and Dr. Ramana Rao Kompella
Assistant Professors, Department of Computer Science, Purdue University
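
The windowed graph properties mentioned in this project can be illustrated with a minimal sketch. It is not the original analysis code: the function names `graph_stats` and `windowed_stats`, the `(day_index, sender, receiver)` message format, and the choice to treat the email graph as directed and to ignore self-addressed messages are all assumptions made for illustration.

```python
from collections import defaultdict


def graph_stats(edges):
    """Return (node_count, average_out_degree, density) for a directed graph.

    `edges` is an iterable of (sender, receiver) pairs. Repeated messages
    between the same pair collapse into one edge and self-loops are dropped
    (both are assumptions of this sketch).
    """
    unique_edges = {(s, r) for s, r in edges if s != r}
    nodes = {node for edge in unique_edges for node in edge}
    n, m = len(nodes), len(unique_edges)
    if n < 2:
        return n, 0.0, 0.0
    # In a directed graph the average out-degree is m / n and the density is
    # the fraction of the n * (n - 1) possible edges that are present.
    return n, m / n, m / (n * (n - 1))


def windowed_stats(messages, window_days=2):
    """Group (day_index, sender, receiver) messages into fixed windows
    and compute graph_stats for each window."""
    windows = defaultdict(list)
    for day, sender, receiver in messages:
        windows[day // window_days].append((sender, receiver))
    return {w: graph_stats(es) for w, es in sorted(windows.items())}


if __name__ == "__main__":
    # Hypothetical toy data: four messages spread over three days.
    toy = [(0, "a", "b"), (1, "b", "a"), (1, "a", "c"), (2, "c", "d")]
    print(windowed_stats(toy))
    print(graph_stats((s, r) for _, s, r in toy))  # aggregate graph
```

Comparing the per-window values against the aggregate graph shows how much of the overall structure is visible in any single two-day sample.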

Data Mining in Non-Binary Data Sets
Jul 08 - Apr 10

A binary dataset representation records whether an item is present in the search space, but gives no information about the strength of its presence, which can be more effective in drawing association rules close to real-life situations. Hence, we developed an algorithm for mining frequent itemsets and association rules from a non-binary search space. As an extension, we generated weighted association rules. Further, we developed clustering algorithms for non-binary search spaces.
Project Guide: Dr. Anirban Sarkar and Dr. Narayan C. Debnath
Assistant Professor, Department of Computer Application, NIT Durgapur and Professor, Department of CS, Winona State University, USA

Data Mining in Mobile Networking
Jul 09 – Apr 10

Data from mobile networks was analysed for predicting user movement, customer recommendation, business forecasting and analysis. Predicting user movement is a major concern in mobile communication, both for better handoff mechanisms and for ensuring quality of service. Grouped user profiles based on cell matrices and hierarchically clustered them. Also grouped frequent cells together based on user movement. Built a framework in Java for performing the necessary computations.
Project Guide: Mr. Parag Kumar Guhathakurtha
Assistant Professor, Department of Computer Science and Engineering, NIT Durgapur

Compression and Encryption for Secure Communication
Jul 09 - Nov 09

Ever-increasing internet traffic constantly raises the need for enhanced communication security. So, we developed an algorithm that performs encryption and lossless compression at the same time in order to increase bandwidth utilization and to secure data transmission. We essentially converted the message into a bi-tuple using mapping techniques and encoded only one element of the tuple.
Project Guide: Mr. Prasenjit Choudhury and Mr. Jaydeep Howlader
Assistant Professor, Department of Computer Application and Assistant Professor, Department of Information Technology, NIT Durgapur

Semantic Analysis in Query Log Data
May 09 - July 09

Mining semantic information from search engine query logs bears great potential for both the optimization of search engines and bootstrapping Semantic Web applications. Further, the formalization of log data into Logsonomies retains semantic information. Therefore, we analysed and semantically characterized query term relatedness by grounding it in WordNet and compared it to prior results for Folksonomies.
Project Guide: Dr. Gerd Stumme and Dr. Andreas Hotho
Professor and Senior Researcher, Department of EE/CS, University of Kassel, Germany

MULET : A Multilanguage Encryption Technique
Mar 09 - Oct 09

The use of a multilingual approach in cryptography was not prevalent. So we focused on encryption of plain text over a range of languages supported by Unicode. We used mapping techniques to make the algorithm fast, efficient and easier to implement. Further, the replacement strategy used ensures better security. We believe this will facilitate the localization of Cryptographic Software tools.
Project Guide: Mr. Prasenjit Choudhury
Assistant Professor, Department of Computer Application, NIT Durgapur

Document Clustering using Lexical Chains
Dec 09 – Jan 10

Lexical chains can be used to group documents together based on a common idea contained in the documents. The quality of clustering was improved by considering hypernyms, hyponyms etc. to build synsets and consequently lexical chains. We also addressed the situation where a document shares one set of lexical chains with one document, another set of lexical chains with another document, and so on. A hierarchy of clusters can best depict such a situation. Such hierarchies can be obtained from cliques, with documents considered as nodes of a graph.
Project Guide: Dr. B. Ravindran
Associate Professor, Department of CSE, Indian Institute of Technology, Madras

Formal verification of software using Spin Model Checker
Mar 08 – Feb 09

Was involved in research on a project for formal verification of software. The Spin Model Checker is used to verify the integrity of software. Extracted the state transition diagrams and used the Promela language to get the properties verified. This may be extended to verifying Web 2.0 applications and network protocols.
Project Guide: Mr. Prasenjit Choudhury
Assistant Professor, Department of Computer Application, NIT Durgapur

Projects
Label Prediction in Social Networks using NLP	Mar 15 - Present
Trying to predict the gender, political preferences and religious views of Users(Nodes) on Social Networks like Facebook. Initially used only network features and techniques like Gibbs Sampling for prediction. Started looking at textual Features to improve the prediction accuracy. For instance, by using Facebook wall posts of users and their connections, their gender can be predicted with 92% accuracy. Project Guide: Dr. Dan Goldwasser Assistant Professor, Department of Statistics, Purdue University
Quantitative Analysis of words and categories in Multiclass Regression	Dec 13 - Present
Trying to apply a joint high dimensional Bayesian Variable and Covariance Selection model to the multiclass textual classification. The word features are the variables and hence, variable selection problem corresponds to finding words that are good predictors overall and for specific categories. The covariance selection gives information about dependencies between the multiple categories. Project Guide: Dr. Anindya Bhadra Assistant Professor, Department of Computer Science, Purdue University
Empirical Analysis of Personal Email Network	Nov 13
Constructed and analyzed three different types of ego networks obtained from Gmail consisting of about \textit{seven and half years} of emails. Applied clustering and community detection algorithms to detect communities based of my email communications and compared them with communities detected from my facebook friendship network. Interestingly, I could recover a good number of them. [Project Link] [Report]
TREC - Knowledge Base Acceleration Track	May 13 -- Aug 13
Had to filter documents related to entities (140 Wikipedia and 20 Twitter) that are worthy of citation in their profiles. The challenges we \textit{two fold}, \textit{one} the data was huge around 6.5 TB of compressed data consisting of social data, news articles etc. and \textit{two}, the entities had very few training examples, in the order of 10. Built a model similar to one-vs-all classifier and F1 measure was close to 0.6. Project Guide: Dr. Luo Si Associate Professor, Department of Computer Science, Purdue University
Supervised LDA for Masquerader Detection	Feb 13 -- Apr 13
Extended a work of the PhD Thesis of Malek Ben Salem, that builds user-profiles based on search behaviour with a predefined taxonomy of applications and processes to detect masquerader attacks and intrusion detection. Built a novel method by using a variation of LDA to build the taxonomy automatically . Also showed that by using the latent classes obtained from the model as feature, we could build classifiers that give the same performance as those that used all the feature, essentially a huge feature space reduction. Project Guide: Dr. Seregy Kishner Associate Professor, Department of Computer Science, Purdue University
Indiana Social Search	May 12 - Present
Built a website in PhP Indiana Social Search, to I crawl and classify news articles and tweets from Google News and Twitter respectively, into predened categories. This system is going to be integrated with the famous INDURE project. I am also working on extracting trend from the articles classify to make a trend cloud of the popular happenings in the state of Indiana and extracting mean ingful summaries for the crawled news articles. The LINK to the website. Project Guide: Dr. Luo Si Associate Professor, Department of Computer Science, Purdue University
Sampling and Analysis of Social Network Activity Graphs	Sep 11 - Dec 11
Mining Information from social networks gives valuable information about user activity and interaction. Constructed social network activity graphs of senders and receivers from the Purdue email data. Sampled data over two day window spans and computed various graph properties like the average degree, density etc. for these windows and the aggregate graph. Compared and contrasted email user activity with those of friendship networks like facebook. Project Guide: Dr. Jennifer Neville and Dr. Ramana Rao Kompella Assistant Professors, Department of Computer Science, Purdue University
Data Mining in Non-Binary Data Sets	July 08 - April 2010
Binary dataset representation gives information about an item being present or not in the search space, but does not provide any information about the strength of its presence which can be more effective in drawing association rules close to real life situations. Hence, we developed an algorithm for mining frequent itemsets and association rules from non-binary search space. As an extension, we generated weighted association rules. Further, we developed clustering algorithms for non-binary search space. Project Guide: Dr. Anirban Sarkar and Dr. Narayan C. Debanath Asstistant Professor, Deptartment of Computer Application, NIT Durgapur and Professor, Deptartment of CS, Winona State University, USA
Data Mining in Mobile Networking	Jul 09 – Apr 10
Data from Mobile Networks was analysed for predicting user movement, customer recommendation, business forecast and analysis. Predicting the user movement is an issue of major concern in mobile communication for better handoff mechanism and ensuring quality of service. Grouped User Profile based on Cells matrices and hierarchically clustered them. Also grouped frequent cells together based on user movement. Built a framework in java for performing necessary computations. Project Guide: Mr. Parag Kumar Guhathakurtha Assistant Professor, Department of Computer Science and Engineering, NIT Durgapur
Compression and Encryption for Secure Communication	Jul 09 - Nov 09
The ever increasing internet traffic constantly urges the need for enhancing communication security. So, we developed an algorithm for performing encryption and lossless compression at the same time in order to increase bandwidth utilization and to secure data transmission. We essentially converted the message into a bi-tuple using mapping techniques and encoded only one elements of the tuple. Project Guide: Mr. Prasenjit Chowdhury and Mr. Jaydeep Howlader Asstistant Professor, Department of Computer Application and Asstistant Professor, Department of Information Technology, NIT Durgapur
Semantic Analysis in Query Log Data	May 09 - July 09
Mining for semantic information from search engine query logs bears great potential for both the optimization of search engines and bootstrapping Semantic Web applications. Further, the formalization of log data into Logsonomies retains semantics information. Therefore we analysed and semantically characterized query term relatedness by grounding it to WordNet and compared it to prior results of Folksonomies. Project Guide: Dr. Gerd Stumme and Dr. Andreas Hotho Professor and Senior Researcher, Department of EE/CS, University of Kassel, Germany
MULET : A Multilanguage Encryption Technique	Mar 09 - Oct 09
The use of a multilingual approach in cryptography was not prevalent. So we focused on encryption of plain text over a range of languages supported by Unicode. We used mapping techniques to make the algorithm fast, efficient and easier to implement. Further, the replacement strategy used ensures better security. We believe this will facilitate the localization of Cryptographic Software tools. Project Guide: Mr. Prasenjit Chowdhury Assistant Professor, Department of Computer Application, NIT Durgapur
Document Clustering using Lexical Chains	Dec 09 – Jan 10
Lexical chains can be used to group documents together based on a common idea contained in the documents. Quality of clustering was improved by considering hypernyms, hyponyms etc. to build synsets and consequently lexical chains. We also addressed a situation where a document has a set of lexical chains common with one document and another set of lexical chains common with another document and so on. A Hierarchy of clusters can best depict such situation. Cliques can obtain such hierarchies from documents considered as nodes of a graph. Project Guide: Dr. B. Ravindran Associate Professor, Department of CSE, Indian Institute of Technology, Madras
Formal Verification of Software using the Spin Model Checker	Mar 08 – Feb 09
Was involved in research on a project for formal verification of software, using the Spin Model Checker to verify the integrity of software. Extracted the state transition diagrams and used the Promela language to get the properties verified. This may be extended to the verification of Web 2.0 applications and network protocols. Project Guide: Mr. Prasenjit Chowdhury, Assistant Professor, Department of Computer Application, NIT Durgapur

Courses

Coursera

Machine Learning

Social Network Analysis

Social and Economic Networks: Models and Analysis

Mining Massive Datasets

Big Data in Education

Computing for Data Analysis

Statistics One

Computer Network Management
May 08 - June 08

An intensive hands-on training program in network management covering practical aspects such as Linux essentials, shell scripting, socket programming, and installation and maintenance of HTTP, FTP, DNS and NFS servers, held at Nettech INC. in association with the Goa Institute of Management, Goa.

Achievements

Received a scholarship from NIT Durgapur and NITDAA (NITD Alumni Association) as funding for my internship at the KDE group, University of Kassel, Germany.

Placed 1st in “Open Project”, the Project cum Paper presentation contest in “Mukti ‘10”, the Annual Technical Symposium on GNU/Linux and Free Software of NIT Durgapur, 5th - 7th February, 2010.

Adjudged 1st in “Concepts” by IEEE Student Branch, NIT Durgapur for the best project abstract proposed amongst 40 abstracts.

Stood 1st in “The Brand Game” for designing and marketing a Mutual Fund firm in “aarohan2k9”, a National Level Techno-Management Festival of NIT Durgapur held from 26th February to 1st March, 2009.

Awarded 2nd prize in “Konfigure”, the System Administration contest in “Mukti ‘09”, the Annual Technical Symposium on GNU/Linux and Free Software of NIT Durgapur, 2nd - 8th February, 2009.

Judged as 3rd best undergraduate performer by Sun Microsystems for the project “The Ultimate Exam Simulator” in Share 2008.

Secured a place in the Top 10 among 138 participants in the Network Management Training Program, organized by Goa Institute of Management, Goa and Nettech INC.

Certified ‘Good’ Core Java professional by NIIT.

Received a Certificate of Merit for scoring a good percentage of marks in Standards X and XII.

Technical Skills

Programming languages: C/C++, Java, Python, Perl, PL/SQL, Visual Basic, LaTeX.

Data analysis and visualization: R, Matlab, Gephi, Lemur and Indri toolkits, RapidMiner, D3.

Big Data technologies: Hive, Apache Pig, Hadoop.

Web Development: HTML/DHTML, PHP, JSP, Ajax.

Tools: Eclipse, NetBeans, SVN, GitHub, StarUML, Adobe Dreamweaver and Flash.

Hardware: Verilog.

Databases: HDFS, Titan, Oracle, MySQL, IBM DB2, MSSQL.

Logic: Prolog.

Platform Expertise: Linux, Unix, Windows and Macintosh.

Positions of Responsibility

Member of Computer Science Graduate Student Board, Purdue University.

Treasurer from September 2012 to August 2013

CS Senator to Purdue Graduate Student Government from March 2012 to September 2012

PhD Representative to the Department from September 2011 to March 2012

Reviewer for Journal of Engineering and Computer Innovations (JECI) and Information Technology Research Journal (ITRJ)

Sponsorship Head of Maths 'N' Tech Club, NIT Durgapur from April 2009 to April 2010.

Executive Co-ordinator of Maths 'N' Tech Club, NIT Durgapur from September 2007 to April 2009.

Event Sub-head of Aarohan2K9, a National Level Techno-Management Fest of NIT Durgapur.

Junior Fest Coordinator of Aarohan2K8, a National Level Techno-Management Fest of NIT Durgapur.