Showing posts from June, 2019

On the performance of Matlab and Parallel Computing

MATLAB is one of the most powerful scientific computing tools, along with Python. Python is my favorite scientific programming language because it is open source, well documented, and has plenty of libraries, but I sometimes use MATLAB, especially when dealing with very large matrices: MATLAB is highly optimized for large-scale matrix operations and so performs better when processing them. From a parallel computing perspective, MATLAB tries to use all available CPU cores in parallel, whenever possible, to maximize performance and reduce computation time. Matrix operations are a good example, since they are well suited to running in parallel. However, this parallelism can be restricted by poor coding practices, especially the use of for or while loops, because those loops are generally executed serially with an increasi

Opensource or Public Datasets for Machine Learning Studies and Research

Machine learning (ML) techniques have been applied in many areas, from academia to industry, and have started to influence our daily lives, for example through social media applications and online shopping. Hence, many machine learning algorithms have been developed to improve the performance of these techniques. While learning the basics of machine learning or developing new algorithms, it is essential to have large, reliable datasets that include logical connections and labels between data members. Especially in academia, well-known and extensively examined datasets are necessary in order to investigate the performance of newly developed machine learning algorithms and compare them with existing ones. There is a large number of publicly available datasets that can be used with various machine learning techniques such as deep learning, classification, reinforcement learning, and clustering. I would like to present the datasets that I really like to use: 1. UC Irvine Machine