Recommender systems predict a user’s future choices/preferences and recommend products/items they might be interested in.

The two most common types are:

- Content-based recommender systems
- Collaborative filtering

Collaborative filtering gives recommendations based on knowledge of users' attitudes towards products. It works on the logic that if users have agreed on something in the past, they are likely to agree on it in the future as well.
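As a minimal sketch of that idea, user-based collaborative filtering can score how much two users' past ratings agree; items liked by the most similar users then become recommendation candidates. The rating matrix and user names below are hypothetical.

```python
import math

# Hypothetical user-item rating matrix.
ratings = {
    "alice": {"item1": 5, "item2": 4, "item3": 1},
    "bob":   {"item1": 5, "item2": 5, "item3": 2},
    "carol": {"item1": 1, "item2": 2, "item3": 5},
}

def similarity(u, v):
    # Cosine similarity over the items both users rated.
    common = set(ratings[u]) & set(ratings[v])
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    nu = math.sqrt(sum(ratings[u][i] ** 2 for i in common))
    nv = math.sqrt(sum(ratings[v][i] ** 2 for i in common))
    return dot / (nu * nv)

# Alice's past ratings agree with Bob's far more than with Carol's,
# so items Bob liked are better recommendation candidates for Alice.
print(similarity("alice", "bob"), similarity("alice", "carol"))
```

Here agreement is measured with cosine similarity; real systems also use Pearson correlation or learned embeddings.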

A content-based recommender system looks at the attributes of the items themselves and gives recommendations based on the similarity between them. It works on the logic of recommending products similar to what the…
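A minimal content-based sketch, assuming each item is described by a small attribute vector (the genre weights below are made up): items whose attribute vectors point in a similar direction to an item the user already liked are recommended.

```python
import math

def cosine_similarity(a, b):
    # Similarity between two item attribute vectors (1.0 = same direction).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical item profiles: [action, romance, sci-fi] weights.
item_watched = [0.9, 0.1, 0.8]
candidate_a = [0.8, 0.2, 0.7]   # similar attributes: good candidate
candidate_b = [0.1, 0.9, 0.0]   # dissimilar attributes: poor candidate

print(cosine_similarity(item_watched, candidate_a))
print(cosine_similarity(item_watched, candidate_b))
```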

Principal component analysis (PCA) is an unsupervised learning technique, also called general factor analysis. It is used to study the interrelations among a set of variables in order to figure out the underlying structure of those variables.

PCA produces several orthogonal lines that fit the data well. Orthogonal lines are lines perpendicular to each other in n-dimensional space: if a regression line is created, then a line perpendicular to it is an orthogonal line. Now the concept of components comes into the picture. Components…
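The orthogonality of the components can be seen directly in two dimensions, where the covariance matrix is 2×2 and its eigenvalues have a closed form. The sketch below (toy data, all names hypothetical) finds both component directions and checks that they are perpendicular.

```python
import math

# Toy 2-D data: points roughly along the line y = x.
data = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2),
        (3.1, 3.0), (2.3, 2.7), (2.0, 1.6), (1.0, 1.1),
        (1.5, 1.6), (1.1, 0.9)]

n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n

# Sample covariance matrix [[cxx, cxy], [cxy, cyy]].
cxx = sum((x - mean_x) ** 2 for x, _ in data) / (n - 1)
cyy = sum((y - mean_y) ** 2 for _, y in data) / (n - 1)
cxy = sum((x - mean_x) * (y - mean_y) for x, y in data) / (n - 1)

# Eigenvalues of a symmetric 2x2 matrix (closed form).
tr, det = cxx + cyy, cxx * cyy - cxy * cxy
l1 = (tr + math.sqrt(tr * tr - 4 * det)) / 2   # first (largest) component
l2 = (tr - math.sqrt(tr * tr - 4 * det)) / 2   # second component

# Eigenvectors give the directions of the two principal components.
v1 = (cxy, l1 - cxx)
v2 = (cxy, l2 - cxx)

# The components are orthogonal: their dot product is ~0.
print(v1[0] * v2[0] + v1[1] * v2[1])
```

The first component (eigenvalue `l1`) is the orthogonal line that captures the most variance; the second is perpendicular to it.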

Support vector machines (SVMs) form a supervised learning algorithm used for classification tasks as well as regression analysis. An SVM analyzes data and recognizes patterns. In this article, SVM will be demonstrated on a classification problem.

For a set of data points with two class labels, the SVM algorithm builds a model that assigns any new data point to one of the two classes, thereby making it a non-probabilistic binary linear classifier.

SVM represents all the data points in space in such a way that there is a clear wide gap between the examples of separate classes. …
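A full SVM solver is usually taken from a library such as scikit-learn; as a minimal illustration of the idea, the sketch below trains a linear classifier with stochastic sub-gradient descent on the hinge loss (a Pegasos-style scheme). The data and all function names are hypothetical.

```python
import random

def train_linear_svm(points, labels, lam=0.01, epochs=100):
    # Stochastic sub-gradient descent on the hinge loss. A constant 1 is
    # appended to each point so the bias is learned as part of w (and, as
    # a simplification, regularised along with it).
    data = [list(p) + [1.0] for p in points]
    w = [0.0] * len(data[0])
    rng = random.Random(0)
    idx = list(range(len(data)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(idx)
        for i in idx:
            t += 1
            eta = 1.0 / (lam * t)                 # decaying step size
            x, y = data[i], labels[i]             # labels must be +1 / -1
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            w = [(1 - eta * lam) * wj for wj in w]        # L2 shrinkage
            if margin < 1:                                # inside the margin:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w

def predict(w, point):
    score = sum(wj * xj for wj, xj in zip(w, list(point) + [1.0]))
    return 1 if score >= 0 else -1

# Two linearly separable clusters (hypothetical data).
X = [(2.0, 2.5), (3.0, 3.0), (2.5, 2.0), (3.5, 2.5),
     (-2.0, -2.5), (-3.0, -3.0), (-2.5, -2.0), (-1.5, -2.5)]
y = [1, 1, 1, 1, -1, -1, -1, -1]
w = train_linear_svm(X, y)
print([predict(w, p) for p in X])
```

The hinge update only fires for points inside the margin, which is why the final classifier is determined by the points nearest the gap (the support vectors).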

The K-Means clustering algorithm is an unsupervised learning algorithm, meaning it needs no target labels. The algorithm groups similar data points together into clusters.

The various applications of the clustering algorithm are:

- Market segmentation
- Grouping of customers based on features
- Clustering of similar documents

The algorithm follows the given steps:

- Choose a number of clusters “K”.
- Then each point in the data is randomly assigned to a cluster.
- Repeat the next steps until clusters stop changing:

a) For each cluster, calculate its centroid by taking the mean vector of the points in the cluster.

b) Assign each data point…
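The steps above can be sketched in a few lines of plain Python. The data is made up, and the initial assignment here is round-robin rather than random so the sketch is reproducible.

```python
def k_means(points, k, iters=100):
    # Arbitrary initial assignment of each point to one of the k clusters
    # (the article assigns points randomly; round-robin keeps this
    # sketch deterministic).
    assign = [i % k for i in range(len(points))]
    centroids = []
    for _ in range(iters):
        # a) centroid of each cluster = mean vector of its member points
        centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c] or [points[c]]
            centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
        # b) reassign every point to its nearest centroid
        new_assign = []
        for p in points:
            dists = [sum((pi - ci) ** 2 for pi, ci in zip(p, c)) for c in centroids]
            new_assign.append(dists.index(min(dists)))
        if new_assign == assign:      # clusters stopped changing
            break
        assign = new_assign
    return assign, centroids

# Two well-separated groups (hypothetical 2-D data).
pts = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (8.5, 9), (9, 8)]
labels, centers = k_means(pts, 2)
print(labels)
```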

K-nearest neighbors (KNN) is an algorithm used for classification tasks, and it works on a very simple principle.

The KNN algorithm is very basic: the training step simply stores all the data, and the prediction step calculates the distance from a new data point to every stored point, sorts the stored points in increasing order of distance, and predicts the majority label of the 'k' closest points.

- It is very simple and easy to understand and implement.
- It uses only two parameters: k and the distance metric.
- It can classify any number of classes.
- The training step is very…
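That whole description fits in a few lines; the sketch below uses Euclidean distance and hypothetical 2-D data.

```python
import math
from collections import Counter

def knn_predict(train, labels, query, k=3):
    # Distance from the query to every stored point, sorted ascending …
    dists = sorted((math.dist(p, query), lbl) for p, lbl in zip(train, labels))
    # … then the majority label among the k closest points.
    top = [lbl for _, lbl in dists[:k]]
    return Counter(top).most_common(1)[0][0]

# Hypothetical training data with two class labels.
train = [(1, 1), (1, 2), (2, 1), (7, 7), (8, 7), (7, 8)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(train, labels, (1.5, 1.5)))  # prints: a
print(knn_predict(train, labels, (7.5, 7.5)))  # prints: b
```

Note that "training" is just storing the data, which is why the training step is fast while prediction does all the work.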

In the 1800s, Francis Galton studied the relationship between parents and children by looking at the correlation between the heights of fathers and their sons. He found that a tall father tends to have a tall son, but his main discovery was that a son's height tends to be closer to the overall average height of all people than his father's height is.

So for example, if there is a father of height 7 feet, then there are chances that his son will be pretty tall too. But since being 7 feet is very rare and…

The capital asset pricing model (CAPM) is very widely used and is considered to be a very fundamental concept in investing. It determines the link between the risk and expected return of assets, in particular stocks.

The CAPM is defined by the following formula:

E(Ri) = Rf + βi (E(Rm) − Rf)

where E(Ri) is the expected return of the asset, Rf is the risk-free rate, βi is the asset's beta (its sensitivity to market movements) and E(Rm) is the expected return of the market.
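The standard CAPM formula, E(Ri) = Rf + βi (E(Rm) − Rf), is a one-liner in code; the stock's beta and the rates below are hypothetical numbers for illustration.

```python
def capm_expected_return(risk_free, beta, market_return):
    # E(Ri) = Rf + beta * (E(Rm) - Rf)
    return risk_free + beta * (market_return - risk_free)

# Hypothetical stock: beta 1.3, 2% risk-free rate, 8% expected market return.
print(capm_expected_return(0.02, 1.3, 0.08))
```

A beta above 1 amplifies the market's excess return, which is why this stock's expected return (9.8%) exceeds the market's 8%.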

Portfolio optimization is the process of choosing the best portfolio out of the set of all portfolios under consideration, according to some objective such as the Sharpe Ratio.

The naive way is to try a group of random allocations and figure out which one has the best Sharpe Ratio. This is known as Monte Carlo simulation: a weight is randomly assigned to each security in the portfolio, then the mean daily return and the standard deviation of daily returns are calculated, which gives the Sharpe Ratio for that randomly selected allocation.
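The Monte Carlo search described above can be sketched with the standard library alone. The daily-return figures are made up, and the Sharpe Ratio here is annualised assuming 252 trading days and a zero risk-free rate.

```python
import random
import statistics

def random_weights(n, rng):
    # Random allocation across n securities, normalised to sum to 1.
    raw = [rng.random() for _ in range(n)]
    s = sum(raw)
    return [r / s for r in raw]

def sharpe_ratio(weights, daily_returns):
    # Portfolio daily return = weighted sum of each security's daily return.
    port = [sum(w * r for w, r in zip(weights, day)) for day in daily_returns]
    mean = statistics.mean(port)
    std = statistics.stdev(port)
    return (mean / std) * (252 ** 0.5)   # annualised, risk-free rate assumed 0

# Hypothetical daily returns for 3 securities over 6 days.
daily = [
    (0.010, 0.002, -0.004),
    (0.004, 0.001, 0.006),
    (-0.002, 0.003, 0.002),
    (0.008, 0.000, -0.001),
    (0.003, 0.002, 0.004),
    (-0.001, 0.001, 0.003),
]

rng = random.Random(42)
best = max((random_weights(3, rng) for _ in range(5000)),
           key=lambda w: sharpe_ratio(w, daily))
print(best, sharpe_ratio(best, daily))
```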

To know more about Sharpe Ratio, check out my previous article:

But the naive way is time-consuming…

A portfolio is a collection of financial investments, such as cash, stocks, bonds, commodities and other cash equivalents.

Portfolio allocation is an investment strategy in which the risk and reward of the portfolio's assets are balanced according to the investor's goals, risk tolerance and investment horizon.

**→ Import packages**

Basic packages like pandas are imported, along with the Quandl package to fetch the data.

```python
import pandas as pd
import quandl
import matplotlib.pyplot as plt
%matplotlib inline
```

**→ Data**

The start and end dates are decided…

ARIMA, short for 'Auto-Regressive Integrated Moving Average', is applied to time-series data; it models past values (lags) and past forecast errors, which can then be used to forecast future values.

To know more about ARIMA models, check out the article below:

The steps involved are as follows:

- Analyzing the time series data by plotting or visualizing it.
- Converting the time series data into a stationary form.
- Plotting the ACF and PACF plots.
- Constructing the ARIMA model.
- Making predictions using the model created.
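In practice the model-construction step uses a library such as statsmodels; as a minimal illustration of just the "AR" piece of ARIMA, the sketch below fits an AR(1) model, x[t] = c + φ·x[t−1], by least squares on a synthetic noise-free series (so the fit recovers the true coefficients almost exactly).

```python
def fit_ar1(series):
    # Least-squares fit of x[t] = c + phi * x[t-1] (the "AR" part of ARIMA).
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    phi = (sum((a - mx) * (b - my) for a, b in zip(x, y))
           / sum((a - mx) ** 2 for a in x))
    c = my - phi * mx
    return c, phi

def forecast(series, c, phi, steps):
    out, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last      # each forecast feeds the next step
        out.append(last)
    return out

# Synthetic stationary series following x[t] = 2 + 0.6 * x[t-1];
# the fit should recover c = 2 and phi = 0.6.
series = [10.0]
for _ in range(20):
    series.append(2 + 0.6 * series[-1])

c, phi = fit_ar1(series)
print(round(c, 4), round(phi, 4))
```

Real series also carry noise and trend, which is where the differencing ("I") and moving-average ("MA") parts of ARIMA come in.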

**→ Import packages**

The basic packages like NumPy and pandas…

Self-driven woman who wishes to deliver creative and engaging ideas and solutions in the field of technology.