Projects

The projects below cover quantitative finance, machine learning, and data science/analytics. Each project carries tags for quick searching and filtering. The full list of my projects is available on my GitHub repositories page. Feel free to contact me about collaborations, inquiries, recommendations, or questions. Thanks.


PORTFOLIO CONSTRUCTION FOR A FIXED RETURNS INVESTMENT OBJECTIVE

The aim of this study is to design and evaluate a quantitative investment strategy that tracks a 15% p.a. return, which represents the minimum acceptable return for an investor. The strategy uses four different approaches to construct a portfolio of 11 assets: the top 10 stocks on the Nairobi Securities Exchange (NSE) by market capitalization, plus a Money Market Fund. The approaches are minimum tracking error optimization, constrained least squares, target downside deviation, and target factor exposure optimization. The performance of the strategy is assessed using anchored walk-forward optimization. The results show that the strategy can achieve a reasonable degree of tracking accuracy and risk-adjusted returns.
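
To illustrate the constrained least squares approach, here is a minimal sketch that finds fully invested weights whose portfolio return stays as close as possible to a fixed per-period target. The returns are synthetic, the function name `track_target_weights` is hypothetical, and short positions are allowed for simplicity; none of these details come from the project itself.

```python
import numpy as np

def track_target_weights(returns, target):
    """Minimise ||returns @ w - target||^2 subject to sum(w) == 1.

    Solved through the KKT linear system of the equality-constrained
    least squares problem. Short positions are allowed (an assumption).
    """
    n_periods, n_assets = returns.shape
    y = np.full(n_periods, target)
    gram = returns.T @ returns
    ones = np.ones(n_assets)
    # KKT system: [[2G, 1], [1', 0]] @ [w, lam] = [2 R'y, 1]
    kkt = np.zeros((n_assets + 1, n_assets + 1))
    kkt[:n_assets, :n_assets] = 2.0 * gram
    kkt[:n_assets, -1] = ones
    kkt[-1, :n_assets] = ones
    rhs = np.concatenate([2.0 * returns.T @ y, [1.0]])
    solution = np.linalg.solve(kkt, rhs)
    return solution[:n_assets]

rng = np.random.default_rng(0)
# Synthetic monthly returns for 11 hypothetical assets (not real NSE data).
sim_returns = rng.normal(0.01, 0.05, size=(120, 11))
monthly_target = 1.15 ** (1 / 12) - 1  # 15% p.a. expressed per month
weights = track_target_weights(sim_returns, monthly_target)
tracking_error = np.std(sim_returns @ weights - monthly_target)
```

A long-only version would add bound constraints on the weights and would need a numerical optimizer rather than a single linear solve.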

Tags: Quant Finance Portfolio optimization Index Tracking


A COMPARISON OF COVARIANCE MATRIX ESTIMATION METHODS FOR TANGENCY PORTFOLIOS

This research article investigates the problem of covariance matrix estimation for tangency portfolios using data from the NSE. Specifically, we compare several methods of covariance matrix estimation and evaluate their performance in constructing tangency portfolios. The results of our study provide insights into the most effective methods for estimating the covariance matrix in the context of the NSE, which can be valuable for portfolio managers and investors seeking to optimize their investments in the East African region.
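
As a rough illustration of how the covariance estimate feeds into a tangency portfolio, the sketch below compares a sample covariance matrix with a version shrunk towards its diagonal. The numbers are made up, the fixed shrinkage intensity is an assumption, and linear shrinkage is only one possible estimator; the methods actually compared in the article may differ.

```python
import numpy as np

def tangency_weights(mean_returns, cov, risk_free=0.0):
    """Unconstrained tangency weights, proportional to inv(cov) @ excess."""
    excess = mean_returns - risk_free
    raw = np.linalg.solve(cov, excess)
    # Normalising by the sum assumes the raw weights add up to a positive number.
    return raw / raw.sum()

def shrink_to_diagonal(sample_cov, intensity):
    """Linear shrinkage of a covariance matrix towards its own diagonal."""
    target = np.diag(np.diag(sample_cov))
    return (1.0 - intensity) * sample_cov + intensity * target

# Hypothetical annualised inputs for three assets (illustrative only).
demo_means = np.array([0.08, 0.10, 0.12])
demo_cov = np.array([
    [0.04, 0.01, 0.00],
    [0.01, 0.09, 0.02],
    [0.00, 0.02, 0.16],
])
w_sample = tangency_weights(demo_means, demo_cov, risk_free=0.03)
w_shrunk = tangency_weights(
    demo_means, shrink_to_diagonal(demo_cov, intensity=0.5), risk_free=0.03
)
```

Comparing `w_sample` with `w_shrunk` shows how sensitive the weights are to the off-diagonal covariance entries.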

Tags: Quant Finance Portfolio optimization Robust statistics


AN EXPOSURE-TO-CRYPTO STRATEGY

This is a backtest analysis of a strategy that explores systematic exposure to the crypto asset class, subject to minimum acceptable return and downside risk constraints, using the hypothesis-driven development approach. The strategy achieves its objective by exploiting the high correlation within the crypto market. The models used include cointegration, regression, and the Ornstein-Uhlenbeck process.
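
For readers unfamiliar with the Ornstein-Uhlenbeck process, here is a minimal sketch of how its mean-reversion speed and long-run mean can be estimated from a regression of each observation on the previous one. The simulated path, the parameter values, and the function names `simulate_ou` and `fit_ou` are assumptions for illustration, not the project's code or data.

```python
import math
import random

def simulate_ou(theta, mu, sigma, x0, dt, n_steps, seed=0):
    """Simulate an OU path using its exact discretisation."""
    rng = random.Random(seed)
    decay = math.exp(-theta * dt)
    noise_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * theta))
    path = [x0]
    for _ in range(n_steps):
        path.append(mu + (path[-1] - mu) * decay + noise_sd * rng.gauss(0.0, 1.0))
    return path

def fit_ou(series, dt):
    """Estimate (theta, mu) from the OLS fit x[t+1] = a + b * x[t]."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    if not 0.0 < b < 1.0:
        raise ValueError("slope outside (0, 1): series does not look mean reverting")
    theta = -math.log(b) / dt   # speed of mean reversion
    mu = a / (1.0 - b)          # long-run mean
    return theta, mu

# Hypothetical daily spread with a true theta of 20 per year.
path = simulate_ou(theta=20.0, mu=0.0, sigma=0.3, x0=0.0, dt=1 / 252, n_steps=2000, seed=42)
theta_hat, mu_hat = fit_ou(path, dt=1 / 252)
half_life_years = math.log(2.0) / theta_hat
```

The half-life gives a rough sense of how long a deviation from the mean takes to decay by half.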

Tags: Quant Finance Time series Regression


LEARNING REPRESENTATIONS IN HIGH-DIMENSIONAL DATA

This study explores models suitable for high-dimensional data, where challenges such as multicollinearity and overfitting are inherent. The project uses a high-dimensional finance dataset, where the task is to predict quarterly returns and their direction. The models considered include Ridge regression, LASSO regression, PCA, Kernel PCA, and ICA.
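
To show why regularization helps when features outnumber observations, here is a minimal sketch of closed-form Ridge regression on synthetic data. The data, the penalty values, and the function name `ridge_fit` are assumptions; the sketch fits no intercept and does no standardization, which a real analysis would normally include.

```python
import numpy as np

def ridge_fit(features, target, penalty):
    """Closed-form Ridge coefficients: solve (X'X + penalty * I) beta = X'y."""
    n_features = features.shape[1]
    gram = features.T @ features
    return np.linalg.solve(gram + penalty * np.eye(n_features), features.T @ target)

rng = np.random.default_rng(1)
# 40 observations and 100 features: X'X alone is singular, so plain OLS fails.
X = rng.normal(size=(40, 100))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=40)

beta_light = ridge_fit(X, y, penalty=0.1)
beta_heavy = ridge_fit(X, y, penalty=100.0)
# A larger penalty shrinks the coefficient vector towards zero.
shrinkage_ratio = np.linalg.norm(beta_heavy) / np.linalg.norm(beta_light)
```

In practice the penalty would be chosen by cross-validation rather than fixed by hand.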

Tags: Machine Learning Feature engineering Dimensionality reduction Regularization


MODELLING LIABILITY CLAIM SEVERITY USING THE GAMMA FAMILY OF DISTRIBUTIONS

This study models claim severity for a portfolio of liability insurance policies in Kenya over the period 1980 to 2022. The loss models fitted to the data are selected from the Gamma family of distributions due to their positive skewness and heavy-tailedness. The models explored are the exponential, Gamma, Weibull, Pareto, and Burr distributions.
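
As a small illustration of fitting a severity model, the sketch below estimates Gamma parameters by the method of moments on simulated claims. The method of moments is chosen here for brevity and may not be the estimation method used in the study; the claim amounts are synthetic and `fit_gamma_moments` is a hypothetical name.

```python
import random
import statistics

def fit_gamma_moments(claims):
    """Method-of-moments estimates (shape, scale) for a Gamma distribution."""
    mean = statistics.fmean(claims)
    variance = statistics.pvariance(claims)
    shape = mean ** 2 / variance   # Gamma mean^2 / variance equals the shape
    scale = variance / mean        # Gamma variance / mean equals the scale
    return shape, scale

rng = random.Random(7)
# Synthetic claim amounts, NOT the Kenyan liability data from the study.
simulated_claims = [rng.gammavariate(2.0, 1500.0) for _ in range(5000)]
shape_hat, scale_hat = fit_gamma_moments(simulated_claims)
```

An estimated shape close to 1 would suggest that the simpler exponential distribution, which is the Gamma with shape 1, may be adequate.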

Tags: Statistical modelling Loss models


AN ETF TRACKING ERROR STRATEGY

This study constructs an active trading strategy that takes advantage of the tracking error of listed Exchange Traded Funds relative to the baskets those funds track. Regression modelling is applied to quantify the pricing discrepancy. A strategy is then constructed that enables a fund manager to enter long and short positions, with the aim of earning a risk-free trading profit. The ETF investigated is the ABSA NEW GOLD ETF, listed on the NSE, which tracks gold bullion.

Tags: Quant Finance Time series Regression


STATISTICAL ARBITRAGE IN THE NAIROBI SECURITIES EXCHANGE

This study explores the presence of arbitrage opportunities in the Nairobi Securities Exchange and develops active trading strategies to profit from the pricing inefficiencies. The models used include regression models, autoregressive models, and the Ornstein-Uhlenbeck model for mean-reverting processes.

Tags: Quant Finance Time series Regression


SOCIAL MEDIA INSURANCE: INSURING REPUTATION IN A DIGITAL AGE

The aim of this study was to design an insurance product covering social media risks such as online impersonation, online abuse, and web application attacks. It comes at a time when the number of internet users in Kenya is growing at a geometric rate. The product targeted social media influencers, politicians, celebrities, and members of the general public exposed to social media risks in Kenya. The study used the Bayesian modelling framework and actuarial pricing principles.

Tags: Insurance pricing Product development Bayesian analysis


ANALYSIS OF POPULATION SIZE AND RELATED FACTORS - KENYA

This is an exploratory data analysis project that investigates the population size of Kenya since 1960, along with several demographic variables such as fertility, mortality, and migration. Population projection models, including mathematical models, time series models, and regression models, are fitted to the data to project the population size one year ahead.

Tags: EDA Demography Time series Population projection


IMPROVING PSV INSURANCE IN KENYA

This project was presented at the annual product development competition of the Kenyatta University Actuarial Students' Association (ASSK) on financial inclusivity in insurance, where it was awarded best project. The aim of the project was to improve PSV (public transport) insurance using deep learning. This was accomplished by using surveillance feeds from CCTV cameras on major Kenyan highways and on roads in Nairobi to detect over-speeding, overloading, and dangerous driving among PSV drivers.

Tags: Product development Deep learning Computer vision


THE EFFECTS OF SOCIAL MEDIA TAX ON THE QUALITY OF TWEETS IN UGANDA

This study analyzed the effects of the social media tax imposed on Ugandan internet users in July 2018 on the quality of tweets in Uganda, using the Kenyan Twitter population as a benchmark. I undertook the project during my Global Summer Institute with Equitech Futures.

Tags: NLP Text processing Twitter datasets


CONSTRUCTING A BAYESIAN NEAREST NEIGHBOUR MODEL

The aim of this project was to extend the standard k-nearest neighbours (kNN) algorithm so that an analyst can encode prior beliefs about the classes present in the data, which is done by incorporating Bayes' rule into the standard kNN model.
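
One textbook way to combine kNN with Bayes' rule is to treat the share of each class among the k nearest neighbours as a class-conditional density estimate and multiply it by a user-supplied prior. The sketch below follows that idea; it is an assumption about how such a model could look, not necessarily the construction used in this project, and the function name and toy data are hypothetical.

```python
import math
from collections import Counter

def bayes_knn_posterior(train_points, train_labels, query, k, prior):
    """Posterior over classes: prior[c] * (neighbours of class c) / (size of class c)."""
    class_sizes = Counter(train_labels)
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    neighbour_counts = Counter(label for _, label in by_distance[:k])
    unnormalised = {
        c: prior[c] * neighbour_counts[c] / class_sizes[c] for c in class_sizes
    }
    total = sum(unnormalised.values())
    if total == 0.0:
        raise ValueError("prior gives zero mass to every class among the neighbours")
    return {c: value / total for c, value in unnormalised.items()}

# Toy one-dimensional data: four points of class "a", two of class "b".
points = [(0.0,), (0.1,), (0.2,), (0.3,), (1.0,), (1.1,)]
labels = ["a", "a", "a", "a", "b", "b"]
query = (0.7,)

# With the empirical class frequencies as prior, this reduces to the plain kNN vote.
empirical = bayes_knn_posterior(points, labels, query, k=3, prior={"a": 4 / 6, "b": 2 / 6})
# A strong prior belief in class "a" can overturn the neighbour vote.
informed = bayes_knn_posterior(points, labels, query, k=3, prior={"a": 0.9, "b": 0.1})
```

Here two of the three nearest neighbours belong to class "b", so the empirical prior favours "b", while the informed prior shifts the posterior towards "a".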

Tags: Machine learning k-NN Bayesian reasoning Classification


SURVEYING EXPERT SYSTEMS: RULES AND KNOWLEDGE BASE

This project investigated how expert systems work. I accomplished this by building a popular board game we played in our childhood. The game agent was explicitly given the game rules (skills) and the optimal paths to take in order to attack the opponent or defend its positions. Using these rules, the agent played against itself to construct a knowledge base (experience).

Tags: Product development Deep learning Computer vision


DUPLICATE FINDER

The aim of this project was to construct a duplicate finder algorithm for textual data (a dataset of usernames) with no primary key. The solution lay in building a nearest neighbour algorithm that uses string distances to find duplicates.
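
The idea can be sketched as follows: for each username, find its nearest neighbour under a string distance and flag the pair when the distance is small enough. Levenshtein distance, the length normalisation, the 0.2 threshold, and the sample usernames are all assumptions for illustration; the project may have used different distances and settings.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]

def normalised_distance(a, b):
    """Edit distance scaled to [0, 1] by the longer string's length."""
    return levenshtein(a, b) / max(len(a), len(b), 1)

def find_duplicates(names, threshold=0.2):
    """Flag each name's nearest neighbour as a duplicate if it is close enough."""
    pairs = set()
    for i, name in enumerate(names):
        others = [other for j, other in enumerate(names) if j != i]
        if not others:
            continue
        nearest = min(others, key=lambda other: normalised_distance(name, other))
        if normalised_distance(name, nearest) <= threshold:
            pairs.add(tuple(sorted((name, nearest))))
    return pairs

# Hypothetical usernames, not the project's dataset.
usernames = ["john_doe", "jon_doe", "alice", "alicee", "bob"]
duplicates = find_duplicates(usernames)
```

This brute-force search compares every pair of names, which is fine for small datasets but would need blocking or indexing to scale.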

Tags: Distance metrics k-NN Text processing