International Science Index

5
10007534
FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule
Abstract:
Instance selection (IS) techniques reduce data size in order to improve the performance of data mining methods. Recently, to process very large data sets, several proposed methods divide the training set into disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitations of these methods and give our viewpoint on how to divide and conquer in the IS procedure. Then, based on the fast condensed nearest neighbor (FCNN) rule, we propose an instance selection method for large data sets built on the MapReduce framework. Besides maintaining prediction accuracy and reduction rate, it has two desirable properties: first, it reduces the workload on the aggregation node; second, and most important, it produces the same result as the sequential version, which other parallel methods cannot achieve. We evaluate the performance of FCNN-MR on one small data set and two large data sets. The experimental results show that it is effective and practical.
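The partition-based strategy that the abstract criticizes can be sketched as follows. This is only an illustration under stated assumptions: it uses Hart's classic condensed nearest neighbor rule as a stand-in, not the FCNN rule or the FCNN-MR method itself, and the function names and the round-robin split are hypothetical.

```python
import math


def nearest_label(point, subset):
    """Return the label of the instance in `subset` closest to `point` (1-NN)."""
    best = min(subset, key=lambda inst: math.dist(point, inst[0]))
    return best[1]


def condense(training):
    """Hart's condensed nearest neighbor rule (a stand-in, not FCNN).

    Keeps adding misclassified instances until the selected subset
    classifies every training instance correctly with 1-NN.
    """
    subset = [training[0]]
    changed = True
    while changed:
        changed = False
        for point, label in training:
            if nearest_label(point, subset) != label:
                subset.append((point, label))
                changed = True
    return subset


def condense_partitioned(training, parts):
    """Naive divide and conquer: condense each disjoint chunk on its own,
    then take the union. The round-robin split is an assumption."""
    selected = []
    for i in range(parts):
        chunk = training[i::parts]
        if chunk:
            selected.extend(condense(chunk))
    return selected
```

Because each chunk is condensed without seeing the others, the union returned by `condense_partitioned` is not guaranteed to match what `condense` selects on the full training set, which is the limitation the abstract refers to.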
54
downloads
4
10005886
An Activity Based Trajectory Search Approach
Abstract:

With the enormous growth in mobile application use and the spread of positioning and location-aware technologies seen today, new procedures and methodologies for location-based strategies are required. Location recommendation is one of the most in-demand location-aware applications, especially given the wide availability of location-aware social network applications, including Facebook check-ins, Foursquare, and others. In this paper, we present a new methodology for location recommendation. The proposed approach integrates conventional spatial attributes with other essential components, including shortest distance and user interests. We also present a new concept, the "activity trajectory", which represents a trajectory that fulfills the set of activities the user is interested in doing. The approach introduces an associated distance value to select the trajectory(ies) with minimum cost (distance), and a spatial area to prune unneeded trajectories. The proposed algorithm uses the concept of the activity trajectory to recommend the N most similar trajectory(ies) that match the user's required activity pattern with the least traveling distance. To improve the performance of the proposed approach, parallel processing is applied through a MapReduce-based approach. Experiments based on real data sets were set up and run to assess the proposed approach. The presented experiments indicate that the proposed approach outperforms other methods, giving better accuracy and run time.
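The selection step described in this abstract can be illustrated with a minimal sketch. It assumes that a trajectory is a list of (point, activities) stops, that cost is the summed Euclidean length of the path, and that a trajectory qualifies when its stops jointly offer every required activity. None of these details are given in the abstract, and all names below are hypothetical.

```python
import math


def path_length(trajectory):
    """Sum of straight-line distances between consecutive stops."""
    points = [point for point, _ in trajectory]
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def covers(trajectory, required):
    """True if the stops together offer every required activity."""
    offered = set()
    for _, activities in trajectory:
        offered |= set(activities)
    return set(required) <= offered


def top_n(trajectories, required, n):
    """Return up to n (distance, name) pairs, shortest first, restricted
    to trajectories that cover the required activities."""
    candidates = [
        (path_length(traj), name)
        for name, traj in trajectories.items()
        if covers(traj, required)
    ]
    return sorted(candidates)[:n]
```

In a MapReduce setting, the filtering and distance computation would be natural candidates for the map phase, with the final ranking done in the reduce phase. That split is a guess at the design, not something the abstract states.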

276
downloads
3
10004822
Adopting Flocks of Birds Approach to Predator for Anomalies Detection on Industrial Control Systems
Abstract:

Industrial Control Systems (ICS) such as Supervisory Control And Data Acquisition (SCADA) are found in many different critical infrastructures, from nuclear management to utilities, medical equipment, power, waste, and engine management on ships and planes. The role SCADA plays in critical infrastructure has resulted in a call to secure it. Many lives depend on it for daily activities, and the attack vectors are becoming more sophisticated. Hence, the security of ICS is vital, as a malfunction might result in huge risk. This paper describes how applying the Prey Predator (PP) approach seen in flocks of birds could enhance the detection of malicious activities on ICS. The PP approach explains how animals in groups or flocks detect predators by following some simple rules. They are not necessarily very intelligent animals, but their approach to solving complex problems such as detection, through cooperation, coordination, and communication, is worth emulating. This paper emulates the flocking behavior that birds use to detect predators. The PP approach adopts a six-nearest-bird approach to detecting any predator. Local and global bests are based on individual detection as well as group detection. The PP algorithm was designed following the MapReduce methodology, using a Split Detection Convergence (SDC) approach.

352
downloads
2
10001680
A System for Analyzing and Eliciting Public Grievances Using Cache Enabled Big Data
Abstract:
The main purpose of the system for analyzing and eliciting public grievances is to receive and process all sorts of complaints from the public and to respond to users. Because of the large number of complaints, the complaint data becomes big data, which is difficult to store and process. The proposed system uses HDFS to store the big data and MapReduce to process it. The concept of a cache is applied in the system to provide immediate response and timely action using big data analytics. Cache-enabled big data reduces the response time of the system. The unstructured data provided by users is handled efficiently by the MapReduce algorithm. Complaints are processed in the order of the hierarchy of authority. The drawbacks of the traditional database system used in the existing system are overcome in our system by using a cache-enabled Hadoop Distributed File System. MapReduce framework code has the potential to leak sensitive data through the computation process. We propose a system that adds noise to the output of the reduce phase to avoid signaling the presence of sensitive data. If a complaint is not processed in due time, it is automatically forwarded to the higher authority, which ensures that it is processed. A copy of the filed complaint is sent as a digitally signed PDF document to the user's email address, which serves as proof. The system report serves as essential data when making important decisions based on legislation.
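The idea of adding noise to the reduce output can be sketched as below. The abstract does not say which noise distribution or record layout is used, so the Laplace noise, the `(category, text)` complaint layout, and every function name here are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def map_phase(complaints):
    """Emit (category, 1) for each complaint record (category, text)."""
    for category, _text in complaints:
        yield category, 1


def reduce_phase(pairs):
    """Sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: the difference of two independent
    exponential samples with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_reduce(pairs, scale, rng):
    """Reduce as usual, then perturb each total before releasing it
    (assumed Laplace noise; the abstract names no distribution)."""
    totals = reduce_phase(pairs)
    return {key: total + laplace_noise(scale, rng) for key, total in totals.items()}
```

A call such as `noisy_reduce(map_phase(complaints), scale=1.0, rng=random.Random())` returns perturbed per-category counts. A larger `scale` hides individual records better at the cost of less accurate totals.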
881
downloads
1
9996564
Architecture of Large-Scale Systems
Abstract:

In this paper, various techniques related to large-scale systems are presented. First, large-scale systems are explained and their differences from traditional systems are described. Next, possible specifications and requirements for hardware and software are listed. Finally, examples of large-scale systems are presented.

2123
downloads