International Science Index

11
10007534
FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule
Abstract:
Instance selection (IS) technique is used to reduce the data size to improve the performance of data mining methods. Recently, to process very large data set, several proposed methods divide the training set into some disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitation of these methods and give our viewpoint about how to divide and conquer in IS procedure. Then, based on fast condensed nearest neighbor (FCNN) rule, we propose a large data sets instance selection method with MapReduce framework. Besides ensuring the prediction accuracy and reduction rate, it has two desirable properties: First, it reduces the work load in the aggregation node; Second and most important, it produces the same result with the sequential version, which other parallel methods cannot achieve. We evaluate the performance of FCNN-MR on one small data set and two large data sets. The experimental results show that it is effective and practical.
Paper Detail
54
downloads
10
9998188
Diagnosis of the Heart Rhythm Disorders by Using Hybrid Classifiers
Abstract:

In this study, it was tried to identify some heart rhythm disorders by electrocardiography (ECG) data that is taken from MIT-BIH arrhythmia database by subtracting the required features, presenting to artificial neural networks (ANN), artificial immune systems (AIS), artificial neural network based on artificial immune system (AIS-ANN) and particle swarm optimization based artificial neural network (PSO-NN) classifier systems. The main purpose of this study is to evaluate the performance of hybrid AIS-ANN and PSO-ANN classifiers with regard to the ANN and AIS. For this purpose, the normal sinus rhythm (NSR), atrial premature contraction (APC), sinus arrhythmia (SA), ventricular trigeminy (VTI), ventricular tachycardia (VTK) and atrial fibrillation (AF) data for each of the RR intervals were found. Then these data in the form of pairs (NSR-APC, NSR-SA, NSR-VTI, NSR-VTK and NSR-AF) is created by combining discrete wavelet transform which is applied to each of these two groups of data and two different data sets with 9 and 27 features were obtained from each of them after data reduction. Afterwards, the data randomly was firstly mixed within themselves, and then 4-fold cross validation method was applied to create the training and testing data. The training and testing accuracy rates and training time are compared with each other.

As a result, performances of the hybrid classification systems, AIS-ANN and PSO-ANN were seen to be close to the performance of the ANN system. Also, the results of the hybrid systems were much better than AIS, too. However, ANN had much shorter period of training time than other systems. In terms of training times, ANN was followed by PSO-ANN, AIS-ANN and AIS systems respectively. Also, the features that extracted from the data affected the classification results significantly.

Paper Detail
1257
downloads
9
9996980
The Automated Selective Acquisition System
Abstract:

To support design process for launching the product on time, reverse engineering (RE) process has been introduced for quickly generating 3D CAD model from its physical object. The accuracy of the 3D CAD model depends upon the data acquisition technique selected, contact or non-contact methods. In order to reduce times used for acquiring surface and eliminating noises, the automated selective acquisition system has been developed and presented in this research as the alternative channel for non-contact acquisition technique where the data is selectively and locally scanned contour by contour without performing data reduction process. The results present as the organized contour points which are directly used to generate 3D virtual model. The comparison between the proposed technique and another non-contact scanning technique has been presented and discussed.

Paper Detail
921
downloads
8
5894
An Efficient 3D Animation Data Reduction Using Frame Removal
Abstract:

Existing methods in which the animation data of all frames are stored and reproduced as with vertex animation cannot be used in mobile device environments because these methods use large amounts of the memory. So 3D animation data reduction methods aimed at solving this problem have been extensively studied thus far and we propose a new method as follows. First, we find and remove frames in which motion changes are small out of all animation frames and store only the animation data of remaining frames (involving large motion changes). When playing the animation, the removed frame areas are reconstructed using the interpolation of the remaining frames. Our key contribution is to calculate the accelerations of the joints of individual frames and the standard deviations of the accelerations using the information of joint locations in the relevant 3D model in order to find and delete frames in which motion changes are small. Our methods can reduce data sizes by approximately 50% or more while providing quality which is not much lower compared to original animations. Therefore, our method is expected to be usefully used in mobile device environments or other environments in which memory sizes are limited.

Paper Detail
998
downloads
7
5423
Acute Coronary Syndrome Prediction Using Data Mining Techniques- An Application
Abstract:

In this paper we use data mining techniques to investigate factors that contribute significantly to enhancing the risk of acute coronary syndrome. We assume that the dependent variable is diagnosis – with dichotomous values showing presence or  absence of disease. We have applied binary regression to the factors affecting the dependent variable. The data set has been taken from two different cardiac hospitals of Karachi, Pakistan. We have total sixteen variables out of which one is assumed dependent and other 15 are independent variables. For better performance of the regression model in predicting acute coronary syndrome, data reduction techniques like principle component analysis is applied. Based on results of data reduction, we have considered only 14 out of sixteen factors.

Paper Detail
1309
downloads
6
11769
Energy Map Construction using Adaptive Alpha Grey Prediction Model in WSNs
Abstract:
Wireless Sensor Networks can be used to monitor the physical phenomenon in such areas where human approach is nearly impossible. Hence the limited power supply is the major constraint of the WSNs due to the use of non-rechargeable batteries in sensor nodes. A lot of researches are going on to reduce the energy consumption of sensor nodes. Energy map can be used with clustering, data dissemination and routing techniques to reduce the power consumption of WSNs. Energy map can also be used to know which part of the network is going to fail in near future. In this paper, Energy map is constructed using the prediction based approach. Adaptive alpha GM(1,1) model is used as the prediction model. GM(1,1) is being used worldwide in many applications for predicting future values of time series using some past values due to its high computational efficiency and accuracy.
Paper Detail
1190
downloads
5
4200
Development of a Sliding-tearing Mode Fracture Mechanical Tool for Laminated Composite Materials
Abstract:

This work presents the mixed-mode II/III prestressed split-cantilever beam specimen for the fracture testing of composite materials. In accordance with the concept of prestressed composite beams one of the two fracture modes is provided by the prestressed state of the specimen, and the other one is increased up to fracture initiation by using a testing machine. The novel beam-like specimen is able to provide any combination of the mode-II and mode-III energy release rates. A simple closed-form solution is developed using beam theory as a data reduction scheme and for the calculation of the energy release rates in the new configuration. The applicability and the limitations of the novel fracture mechanical test are demonstrated using unidirectional glass/polyester composite specimens. If only crack propagation onset is involved then the mixed-mode beam specimen can be used to obtain the fracture criterion of transparent composite materials in the GII - GIII plane in a relatively simple way.

Paper Detail
1150
downloads
4
6083
Wavelet and K-L Seperability Based Feature Extraction Method for Functional Data Classification
Abstract:
This paper proposes a novel feature extraction method, based on Discrete Wavelet Transform (DWT) and K-L Seperability (KLS), for the classification of Functional Data (FD). This method combines the decorrelation and reduction property of DWT and the additive independence property of KLS, which is helpful to extraction classification features of FD. It is an advanced approach of the popular wavelet based shrinkage method for functional data reduction and classification. A theory analysis is given in the paper to prove the consistent convergence property, and a simulation study is also done to compare the proposed method with the former shrinkage ones. The experiment results show that this method has advantages in improving classification efficiency, precision and robustness.
Paper Detail
854
downloads
3
11050
Determining Cluster Boundaries Using Particle Swarm Optimization
Abstract:

Self-organizing map (SOM) is a well known data reduction technique used in data mining. Data visualization can reveal structure in data sets that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOMs, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of a generic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOMs. The application of our method to unlabeled call data for a mobile phone operator demonstrates its feasibility. PSO algorithm utilizes U-matrix of SOMs to determine cluster boundaries; the results of this novel automatic method correspond well to boundary detection through visual inspection of code vectors and k-means algorithm.

Paper Detail
1081
downloads
2
4653
A Monte Carlo Method to Data Stream Analysis
Abstract:
Data stream analysis is the process of computing various summaries and derived values from large amounts of data which are continuously generated at a rapid rate. The nature of a stream does not allow a revisit on each data element. Furthermore, data processing must be fast to produce timely analysis results. These requirements impose constraints on the design of the algorithms to balance correctness against timely responses. Several techniques have been proposed over the past few years to address these challenges. These techniques can be categorized as either dataoriented or task-oriented. The data-oriented approach analyzes a subset of data or a smaller transformed representation, whereas taskoriented scheme solves the problem directly via approximation techniques. We propose a hybrid approach to tackle the data stream analysis problem. The data stream has been both statistically transformed to a smaller size and computationally approximated its characteristics. We adopt a Monte Carlo method in the approximation step. The data reduction has been performed horizontally and vertically through our EMR sampling method. The proposed method is analyzed by a series of experiments. We apply our algorithm on clustering and classification tasks to evaluate the utility of our approach.
Paper Detail
806
downloads
1
13961
Joint Use of Factor Analysis (FA) and Data Envelopment Analysis (DEA) for Ranking of Data Envelopment Analysis
Abstract:
This article combines two techniques: data envelopment analysis (DEA) and Factor analysis (FA) to data reduction in decision making units (DMU). Data envelopment analysis (DEA), a popular linear programming technique is useful to rate comparatively operational efficiency of decision making units (DMU) based on their deterministic (not necessarily stochastic) input–output data and factor analysis techniques, have been proposed as data reduction and classification technique, which can be applied in data envelopment analysis (DEA) technique for reduction input – output data. Numerical results reveal that the new approach shows a good consistency in ranking with DEA.
Paper Detail
1345
downloads