Implementation and Experiments on Distributed Ensemble Learning System (DELS) With Several Partitioning Methods and Classifiers

dc.contributor.advisor: Chefranov, Alexander (Supervisor)
dc.contributor.author: Zamani, Azadeh
dc.date.accessioned: 2025-02-19T12:11:09Z
dc.date.available: 2025-02-19T12:11:09Z
dc.date.issued: 2021-02
dc.date.submitted: 2021-02
dc.department: Eastern Mediterranean University, Faculty of Engineering, Department of Computer Engineering. [en_US]
dc.description: Master of Science in Computer Engineering. Institute of Graduate Studies and Research. Thesis (M.S.) - Eastern Mediterranean University, Faculty of Engineering, Dept. of Computer Engineering, 2021. Supervisor: Assoc. Prof. Dr. Alexander Chefranov. [en_US]
dc.description.abstract: ABSTRACT: Machine learning on Big Data remains a challenge. Because large datasets are too big to fit in the memory of a single node, a distributed method is mandatory. Hence, the goal of this research is to find methods of distributing data that return results with high accuracy and better time performance. Using various learning processes to train multiple classifiers from distributed datasets increases the possibility of achieving higher accuracy, particularly on big datasets. This is because the combination of classifiers can represent an integration of different learning biases, which may compensate for each other's inefficiencies. This thesis presents the implementation of, and experiments on, a Distributed Ensemble Learning System (DELS) with several partitioning methods and classifiers on single-node and multi-node systems. The user chooses the input dataset, the number of partitions, and the classifier. Classification and regression tree (CART) and multilayer perceptron (MLP) are the selected classifiers, representing decision tree and neural network methods, respectively. We assume that the number of partitions is related to the number of disjoint bags used to divide the data and, consequently, to the number of parallel processors to which the data is sent. The data bagging algorithms are disjoint partitions (D), disjoint bags (DB), small bags (SB), and no-replication small bags (NRSB). The resulting stratified inputs serve as training samples and are trained on a single machine. Each part of the stratified input is distributed by MPI. This service is responsible for performing several tasks separately, each with its own resources. Each task consists of running the learning algorithm and extracting the learned model. The result is N trained models, which are combined using the majority vote method. The model with the higher prediction rank is selected in majority voting.
This final model is used to check the test data and obtain the test scoring result. The same test is then repeated on a multi-node system with a random input dataset. On a single node, SB (small bags) has the highest accuracy and D (disjoint partitions) the lowest. CART reaches an accuracy of 0.998, while MLP reaches 0.96. MLP requires 2 to 11 times more learning time than CART. In the multi-node system, the run time of CART is 5 to 11 times faster than that of MLP. The best test score reached was 0.955. As the number of disjoint partitions increases, scoring time increases: with 2 partitions the scoring time is 37 minutes, while with 12 partitions it is 210 minutes. In DELS, a better training time is obtained with LADEL and the MLP algorithm than with CART. Using MLP in the multi-node system, training takes 4.6 seconds on 2 nodes and decreases to 0.11 seconds on 12 nodes. With the CART algorithm in the multi-node system, the times are 207 and 7.01 seconds for 2 and 12 nodes, respectively. Keywords: distributed systems, parallel processing, ensemble learning, bagging, classification, decision tree, neural network, disjoint partition [en_US]
dc.identifier.citation: Zamani, Azadeh. (2021). Implementation and Experiments on Distributed Ensemble Learning System (DELS) With Several Partitioning Methods and Classifiers. Thesis (M.S.), Eastern Mediterranean University, Institute of Graduate Studies and Research, Dept. of Computer Engineering, Famagusta: North Cyprus. [en_US]
dc.identifier.uri: https://hdl.handle.net/11129/6211
dc.language.iso: en
dc.publisher: Eastern Mediterranean University (EMU) - Doğu Akdeniz Üniversitesi (DAÜ) [en_US]
dc.relation.publicationcategory: Tez
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Computer Engineering Department [en_US]
dc.subject: Information storage and retrieval systems - Computer science [en_US]
dc.subject: Web databases - Big Data - Machine Learning [en_US]
dc.subject: Data Mining [en_US]
dc.subject: Distributed systems, parallel processing, ensemble learning, bagging, classification, decision tree, neural network, disjoint partition [en_US]
dc.title: Implementation and Experiments on Distributed Ensemble Learning System (DELS) With Several Partitioning Methods and Classifiers [en_US]
dc.type: Master Thesis
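
The pipeline outlined in the abstract (stratified disjoint partitions, one model trained per partition, predictions combined by majority vote) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the function names and the toy dataset are hypothetical, a one-feature threshold rule stands in for CART and MLP, and the partitions are trained sequentially in one process rather than being distributed with MPI.

```python
import random
from collections import Counter, defaultdict


def stratified_disjoint_partitions(rows, n_parts, seed=0):
    """Split (feature, label) rows into n_parts non-overlapping partitions,
    dealing each class round-robin so every partition sees every class."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[1]].append(row)
    parts = [[] for _ in range(n_parts)]
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        for i, row in enumerate(group):
            parts[i % n_parts].append(row)
    return parts


def train_threshold_rule(partition):
    """Stand-in learner (NOT CART or MLP): threshold at the midpoint
    between the per-class means of the single feature."""
    values = {0: [], 1: []}
    for x, label in partition:
        values[label].append(x)
    mean0 = sum(values[0]) / len(values[0])
    mean1 = sum(values[1]) / len(values[1])
    threshold = (mean0 + mean1) / 2
    high_label = 1 if mean1 > mean0 else 0
    return lambda x: high_label if x > threshold else 1 - high_label


def majority_vote(models, x):
    """Return the label predicted by the largest number of models."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: class 0 lies in [0, 2.9], class 1 in [5, 7.9].
rows = [(i / 10, 0) for i in range(30)] + [(5 + i / 10, 1) for i in range(30)]
parts = stratified_disjoint_partitions(rows, n_parts=3)
models = [train_threshold_rule(part) for part in parts]  # one model per partition
print(majority_vote(models, 1.0), majority_vote(models, 7.0))  # prints: 0 1
```

In a multi-node setting, each partition would instead be sent to its own process, which trains and returns its model; the voting step stays the same.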

Files

Original bundle

Name: ZamaniAzadeh-Master.pdf
Size: 9.69 MB
Format: Adobe Portable Document Format
Description: Thesis, Master

License bundle

Name: license.txt
Size: 1.77 KB
Format: Item-specific license agreed upon to submission