DataQoil
Coiling up our data knowledge.
A data science project can be a challenging but rewarding process. By following the steps in this blog, you can work through the project in an organized and effective way, and ultimately arrive at a solution to your problem. Previously, we wrote blogs on many machine learning algorithms (Classification, Prediction) as well as many other topics to help you sharpen your knowledge of how machines work....
How to do Data Science Project - DataQoil Data science is a field that involves using scientific methods, processes, algorithms and systems to extract knowledge and insights from data
WordCloud in Python can be done in different ways, but one of the most popular and easiest is the wordcloud package. We can install it as follows: !pip install wordcloud...
Text Analysis with WordCloud in Python - DataQoil Let's create a WordCloud in Python to do text analysis. A word cloud simply plots words sized according to how often they occur.
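Since a word cloud is driven by word frequencies, the counting step can be sketched with the standard library alone. This is only an illustration: the sample text and the tiny stop-word list are made up, and the wordcloud package handles tokenizing and drawing on its own.

```python
import re
from collections import Counter

# Hypothetical sample text; any string works here.
text = "data science is fun and data is everywhere in data science"

# Lowercase the text and split it into alphabetic tokens.
words = re.findall(r"[a-z']+", text.lower())

# Assumed minimal stop-word list, for illustration only.
stop_words = {"is", "and", "in"}
counts = Counter(w for w in words if w not in stop_words)

# The most frequent words are the ones a word cloud would draw largest.
top_words = counts.most_common(3)
print(top_words)
```

The resulting counts are the kind of input a word cloud uses to decide how large each word should appear.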
WorldCup tweet sentiment analysis will be done based on tweets related to the World Cup. It is World Cup season, and social media is likely full of activity related to it. Most of us pick a country to support and post in favor of it or against other teams. I remember getting angry with friends who were on the opposite side during the World Cup....
WorldCup Tweet Sentiment Analysis in Python - DataQoil Let's do WorldCup Tweet Sentiment Analysis because it's the time of the world cup and people are talking about it all around social media.
Cryptography algorithms in Python are quite easy to code, and we will cover only a few here. Cryptography algorithms have been around for centuries, and there are still many inscriptions in various places around the world that we do not understand. Here in this blog, we will cover very basic cryptography algorithms in Python. But if you are interested in learning how to do encryption/decryption on images as well, I have the following two blogs:...
Simple Cryptography Algorithms in Python - DataQoil Have you tried to code cryptography algorithms in Python? Here are some simple algorithms, coded from scratch. It is fun!
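The excerpt does not say which algorithms the post covers, so the sketch below uses the Caesar cipher purely as an assumed example of a very basic cipher. The function name `caesar_shift` is hypothetical.

```python
import string


def caesar_shift(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` positions; leave other characters unchanged."""
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            result.append(ch)
    return "".join(result)


message = "Hello, World!"
encrypted = caesar_shift(message, 3)     # encrypt with a shift of 3
decrypted = caesar_shift(encrypted, -3)  # decrypt by shifting back
print(encrypted)  # Khoor, Zruog!
print(decrypted)  # Hello, World!
```

Decryption is the same operation with the opposite shift, which is also why this cipher is easy to break and suits only learning purposes.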
Drawing simple geometrical shapes in Python from scratch, have you tried it? In this series of tasks, I am going to tackle some interesting image processing concepts from scratch using Python and then compare them with the popular OpenCV framework. Last time I did convolution operations from scratch, RGB to grayscale conversion, etc. Now is the time to draw circles, rectangles and ellipses and get a flashback to childhood....
Drawing Simple Geometrical Shapes on Python Using NumPy and Visualize it with Matplotlib - DataQoil Drawing Simple Geometrical Shapes on Python, have you tried that? It's fun, challenging and mind-blowing at the same time.
Want to merge different images into a single one? How to make a simple image merger tool in Python?
https://dataqoil.com/2022/08/21/building-image-merger-web-tool-in-python/
Building Image Merger Web Tool in Python - DataQoil Image Merger is needed when we have to merge two images into one. It can be built within an hour using Streamlit, OpenCV and Python.
Some sites limit the size of uploaded images, so we have to resize them. But many image reducer sites have drawbacks of their own, such as too many ads. So how do we make our own Image Size Reducer tool?
https://dataqoil.com/2022/08/14/building-image-size-reducer-tool-in-python/
Building Image Size Reducer Tool In Python - DataQoil We have all wanted to reduce image size, but how do we make our own Image Size Reducer tool using Python, OpenCV and Streamlit? Let's do that.
Fractals are complex shapes that are mostly formed in a repetitive or recursive manner. One of the simplest is the middle-third Cantor set, and one of the most popular is the Mandelbrot set.
https://dataqoil.com/2022/08/07/making-fractal-shapes-with-python/
Making Fractal Shapes with Python - DataQoil Fractal shapes are patterns that repeat continuously and are mostly built by applying a simple rule recursively. Let's create some.
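As a small illustration of the recursive idea, the middle-third Cantor set can be sketched as follows. The function name and the text rendering are assumptions made for this example; the post itself may draw the shapes differently.

```python
def cantor_segments(start: float, end: float, depth: int) -> list:
    """Return the intervals left after removing middle thirds `depth` times."""
    if depth == 0:
        return [(start, end)]
    third = (end - start) / 3
    # Keep the left and right thirds, drop the middle one, and recurse.
    left = cantor_segments(start, start + third, depth - 1)
    right = cantor_segments(end - third, end, depth - 1)
    return left + right


WIDTH = 27  # 3**3 characters, so three levels of thirds line up exactly
for depth in range(4):
    row = [" "] * WIDTH
    for a, b in cantor_segments(0.0, 1.0, depth):
        for i in range(round(a * WIDTH), round(b * WIDTH)):
            row[i] = "#"
    print("".join(row))
```

Each recursion level doubles the number of segments and shrinks each one to a third of its parent, so the total remaining length is (2/3) raised to the depth.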
What is the DBSCAN algorithm in clustering? Let's find out.
https://dataqoil.com/2022/08/05/dbscan-clustering-algorithm/
DBSCAN Clustering Algorithm - DataQoil Let's explore how DBSCAN clustering methods function and how they differ from conventional clustering algorithms.
How to make our own people search tool using Python?
https://dataqoil.com/2022/07/24/making-people-search-tool-in-2022-using-beautifulsoup/
Making People Search Tool in 2022 Using BeautifulSoup - DataQoil Let's make our own people finder using Google Search, GitHub Search along with BeautifulSoup and urllib3 in Python.
How to deploy a Streamlit app with a custom server using Apache2?
https://wp.me/pdCti4-hF
Deploying Streamlit App with Custom Domain and Apache2 - DataQoil Let's deploy a Streamlit app with a custom sub-domain/domain using Apache2 Server and a reverse proxy in Ubuntu.
Kruskal Wallis H Test What is Kruskal Wallis H Test? The Kruskal Wallis H test is a nonparametric test, which means it does not assume that the parent population from which the sample was taken is normally distributed. It is also known as the nonparametric version of one-way ANOVA. ANOVA stands for analysis of variance....
https://dataqoil.com/2022/07/01/kruskal-wallis-h-test-in-news-data/
Kruskal Wallis H Test in News Data - DataQoil What can be the application of the Kruskal Wallis H test in NLP? A simple one is testing whether word counts are distributed identically across groups.
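To make the test concrete, here is a minimal sketch of the H statistic computed from pooled ranks, H = 12 / (N(N+1)) * sum(R_i^2 / n_i) - 3(N+1). The word counts are invented for illustration, and the sketch deliberately skips the tie correction that statistical libraries usually apply.

```python
def kruskal_wallis_h(*groups):
    """Compute the Kruskal-Wallis H statistic (no tie correction)."""
    # Pool all observations, remembering which group each one came from.
    pooled = sorted((value, g) for g, group in enumerate(groups) for value in group)
    n_total = len(pooled)

    # Assign average ranks so that tied values share the same rank.
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        i = j

    total = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Hypothetical word counts of news articles in three categories.
politics = [120, 135, 150]
sports = [160, 175, 190]
business = [200, 215, 230]
h_stat = kruskal_wallis_h(politics, sports, business)
print(h_stat)
```

A larger H means the rank sums differ more between groups. To reach a decision, H is compared against a chi-squared distribution with (number of groups - 1) degrees of freedom.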
Making Data Dashboard with Apache Superset Hello and welcome back everyone. In this blog, we will explore how we can create awesome data dashboards using Apache Superset with little to no code at all. But there are a few things one should do before making a first dashboard: we need to have Superset installed and have some data too. Installing Apache Superset…...
https://dataqoil.com/2022/06/26/getting-started-with-apache-superset-for-data-dashboards/
Getting Started with Apache Superset for Data Dashboards - DataQoil Apache Superset provides a wide range of fast, interactive charts to start with and use in dashboards. Let's try it on a COVID-19 dataset.
Multilayer Perceptron (MLP) In this blog we are going to share how a non-linear problem like XOR can be solved using a multilayer perceptron. We have already written a blog about how to apply a multilayer perceptron to the majority function; please have a look here. We are all familiar with the fact that single layer perceptrons (SLP) are commonly used to classify problems that are linearly separable....
https://dataqoil.com/2022/06/24/multilayer-percepron-using-xor-function-from/
Multilayer Perceptron Using XOR function from - DataQoil Applying our knowledge of multilayer perceptrons to the XOR problem will allow us to see how they differ from single layer perceptrons.
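To see why a hidden layer helps with XOR, here is a small sketch of a two-layer network whose weights are picked by hand rather than learned. The weights, the step activation and the function names are assumptions for illustration, not the training procedure from the post.

```python
def step(x: float) -> int:
    """Threshold activation: 1 if the input is positive, otherwise 0."""
    return 1 if x > 0 else 0


def xor_mlp(a: int, b: int) -> int:
    # Hidden layer: one unit behaves like OR, the other like AND.
    h_or = step(a + b - 0.5)
    h_and = step(a + b - 1.5)
    # Output unit fires when OR is on but AND is off, which is exactly XOR.
    return step(h_or - h_and - 0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_mlp(a, b))
```

No single threshold unit can separate the XOR outputs with one line, but combining two linearly separable functions (OR and AND) in a hidden layer makes the problem solvable.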
Introduction Over the past two stories of data and its journey to reveal its insights, we have explored several areas. To point out a few: We did EDA based on descriptive and inferential statistics to find strong evidence, relationships and facts about the data. We used some valuable insights from the EDA and tried to classify the possible environment that the properties reflect....
https://dataqoil.com/2022/06/19/taking-data-apps-into-webapp-using-streamlit-plotly-and-python/
Taking Data Apps into WebApp: Using Streamlit, Plotly, and Python - DataQoil Let's use Streamlit to make our data science app accessible as a web app. Why not choose algorithms and plot results within it?
Multilayer Perceptron (MLP) We all know that single layer perceptrons are commonly used to classify problems that are linearly separable. If we choose a single layer perceptron for a non-linearly separable problem, the results may not be successful. As a result, we must look for an alternative solution to a non-linear problem, and one such solution is the multilayer perceptron....
https://dataqoil.com/2022/06/17/multilayer-perceptron-using-majority-function-from-scratch/
Multilayer Perceptron Using Majority Function From Scratch - DataQoil Let's take a deep dive into deep learning and develop your programming skills by building a multilayer perceptron from the ground up.
Beyond and Within EDA Introduction This blog is the continuation of the previous blog post A General Way of Doing EDA. Please read that before reading this blog. Once we have knowledge of the data, such as its properties and features, we can move ahead by using that knowledge to make some sort of inference. This is often called Modeling. Sometimes Feature Engineering is also done within EDA and beyond it....
https://dataqoil.com/2022/06/12/beyond-and-within-eda-taking-eda-into-modelling/
Beyond and Within EDA: Taking EDA into Modelling - DataQoil Once we have insights about the data and know something of its nature, we can make some kind of inference and turn that into modeling. Let's do that here.
Artificial Neural Network The term "neural network" (NN) refers to an artificial neural network (ANN). It is a computational paradigm that is inspired by the way the human brain or nervous system performs computation. Perception, pattern recognition, motor control, and other computations are all performed by the brain, which is a highly complex, non-linear, and parallel computation machine. The basic structural unit of the brain is the neuron or nerve cell....
https://dataqoil.com/2022/06/10/single-layer-perceptron-from-scratch/
Single Layer Perceptron From Scratch - DataQoil Let's build a simple perceptron neural network from scratch in Python and take the first steps toward deep learning.
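A single perceptron can be sketched in a few lines. The example below assumes the classic perceptron learning rule and trains on the AND function, which is linearly separable. The function names, learning rate and epoch count are illustrative choices and are not taken from the post.

```python
def predict(weights, bias, x1, x2):
    """Fire (return 1) when the weighted sum plus bias is positive."""
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0


def train_perceptron(samples, epochs=25, lr=0.1):
    """Train a two-input perceptron with the perceptron learning rule."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            error = target - predict(weights, bias, x1, x2)
            # Nudge the weights and bias only when the prediction is wrong.
            weights[0] += lr * error * x1
            weights[1] += lr * error * x2
            bias += lr * error
    return weights, bias


# Truth table of the AND function as ((inputs), target) pairs.
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_data)
for (x1, x2), target in and_data:
    print(x1, x2, predict(weights, bias, x1, x2))
```

Because AND is linearly separable, the rule settles on weights that classify all four inputs correctly. The same loop would never settle on XOR.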
Tweets Scraping using Tweepy Hello and welcome back everyone, in this blog we are going to explore how we can scrape tweets using Twitter's API and Tweepy. The API calls are handled by Tweepy and we only need to give it Keys. Getting API Keys First we need to have a Twitter Developer Account and only with it, we can get keys to scrape tweets....
https://dataqoil.com/2022/06/05/scraping-tweets-with-tweepy/
Scraping Tweets with Tweepy - DataQoil Let's use Twitter Developer API and Tweepy to scrape tweets using Keywords and save them as CSV files as well as Pandas DataFrame.
Monte Carlo Simulations What are Monte Carlo Simulations? One of the main motivations to switch from spreadsheet-type tools (such as Microsoft Excel) to a program like R is for simulation modeling. R allows us to repeat the same (potentially complex and detailed) calculations with different random values over and over again. Within the same software, we can then summarize and plot the results of…...
https://dataqoil.com/2022/06/04/monte-carlo-simulations-in-r/
Monte Carlo Simulations in R - DataQoil Let's use R to build a random sample and do simulation analysis to take our data analysis skills to the next level.
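The post works in R. For consistency with the other sketches here, the idea of repeating a calculation with different random values is shown in Python instead, using the well-known example of estimating pi. The function name, seed and sample size are arbitrary assumptions.

```python
import random


def estimate_pi(n_samples: int, seed: int = 42) -> float:
    """Estimate pi by sampling random points in the unit square."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        # Count points that land inside the quarter circle of radius 1.
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle covers pi/4 of the unit square.
    return 4.0 * inside / n_samples


pi_estimate = estimate_pi(100_000)
print(pi_estimate)
```

The estimate gets closer to pi as the number of samples grows, which is the general pattern behind Monte Carlo methods: many random repetitions, then a summary of the results.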
Introduction Hello everyone, welcome back to another new blog where we will explore different ideas and concepts one could apply while performing an EDA. In simple words, this blog is a simple walk-through of an average EDA process which might include (in top-down order): Data Loading: From various sources (remote, local) and various formats (Excel, CSV, SQL etc.)...
https://dataqoil.com/2022/05/29/a-general-way-to-perform-an-eda/
A General Way to Perform an EDA - DataQoil A general way of doing EDA by touching the Statistics and trying to answer the question of how and what via plots and charts.
What is Hypothesis Testing It is a type of inferential statistics that involves extrapolating results from a (random) sample to the entire population. It is used to make decisions based on statistical tests and models that compare the p-value with the significance level alpha, which is the accepted probability of a Type I error. Type I Error: When we reject a true null hypothesis, it is called…...
https://dataqoil.com/2022/05/26/different-hypothesis-testing-using-r/
Different Hypothesis Testing Using R - DataQoil Let's use R to perform several hypothesis tests and learn when to utilize which type of hypothesis testing.
Triggers in SQL Triggers in SQL are a way to invoke something in response to events on the table to which the trigger is attached. Examples of such events are Insert, Update and Delete. Triggers are of two types, Row Level and Statement Level. A row level trigger fires for each row, while a statement level trigger fires once per transaction or execution....
https://dataqoil.com/2022/05/22/mysql-triggers/
MySQL: Triggers - DataQoil Let's explore how we can define triggers in MySQL and get the most out of their functionality, along with their benefits and drawbacks.
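The post targets MySQL. To keep the example runnable without a database server, the sketch below uses Python's built-in sqlite3 module instead, so the syntax is SQLite's rather than MySQL's. The table and trigger names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL);
CREATE TABLE audit_log (account_id INTEGER, old_balance REAL, new_balance REAL);

-- Row-level trigger: fires once for every updated row.
CREATE TRIGGER log_balance_update
AFTER UPDATE OF balance ON accounts
FOR EACH ROW
BEGIN
    INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
END;
""")

conn.executemany(
    "INSERT INTO accounts (id, balance) VALUES (?, ?)",
    [(1, 100.0), (2, 50.0)],
)
# A single UPDATE statement that touches two rows.
conn.execute("UPDATE accounts SET balance = balance + 10")

audit_rows = conn.execute(
    "SELECT account_id, old_balance, new_balance FROM audit_log ORDER BY account_id"
).fetchall()
print(audit_rows)
```

One UPDATE statement changed two rows, and the audit table received two entries, which is the row-level behaviour described above.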
Polynomial Regression Curve fitting and curvilinear regression are other names for the same thing. It is used when a scatterplot shows a non-linear relationship. It's most typically employed with time series data, but it can be applied to a variety of other situations. Let's use the Nepal Covid data and fit polynomial models to Covid deaths using R. To do this, first import the Excel file in RStudio using…...
https://dataqoil.com/2022/05/19/polynomial-regression-model-in-r/
Polynomial Regression Model in R - DataQoil Let's utilize the Polynomial Regression model to see how well it performs. Investigate time series data, using a polynomial regression model.
Introduction Hello and what's up everyone. In this blog we will explore PyScript for running Python code inside our HTML files. It is quite easy to do so. How it works under the hood is not the focus here; what we can do with it is. For docs, please visit here. First Program Create an HTML file and, at the top, import packages inside the head section....
https://dataqoil.com/2022/05/15/pyscript-running-python-in-webpages/
Pyscript: Running Python in Webpages - DataQoil Now we can run Python code within our webpages using Pyscript. What is interesting is that we can use packages inside it too.
Naive Bayes for Nepali News Classification Hello everyone, welcome back to our blog about news classification and in this blog, we are going to explore Naive Bayes for news in our native language Nepali. I started this project nearly a year ago but I never finished it because I did not know anything about it and I knew only BeautifulSoup from Datacamp....
https://dataqoil.com/2022/05/12/nepali-news-classification-using-naive-bayes-and-decision-tress/
Nepali News Classification Using Naive Bayes and Decision Trees - DataQoil Let's classify Nepali news using Naive Bayes and Decision Tree, and see how well these two models fared on the task.
How to Use Alpaca API for Stock Data and make a Streaming app?
https://dataqoil.com/2022/05/01/python-for-stock-market-analysis-alpaca-api/
Python for Stock Market Analysis: Alpaca API - DataQoil In this blog, we will explore alpaca API for Stock analysis using its free subscription and also stream the data.
Logistic Regression from Scratch in Python. Here in this blog, we will experiment with logistic regression using MSE and Log Loss.
https://dataqoil.com/2022/04/10/logistic-regression-from-scratch/
Logistic Regression from Scratch in Python: Exploring MSE and Log Loss - DataQoil Let's do logistic regression from scratch in Python using two different cost functions: log loss and mean squared error.
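The two cost functions can be compared directly on a handful of predictions. This is only a sketch of the loss calculations, not a full training loop, and the labels and raw scores are invented for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Map a raw score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def mse(y_true, y_prob) -> float:
    """Mean squared error between labels and predicted probabilities."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def log_loss(y_true, y_prob) -> float:
    """Binary cross-entropy (log loss)."""
    return -sum(
        y * math.log(p) + (1 - y) * math.log(1 - p)
        for y, p in zip(y_true, y_prob)
    ) / len(y_true)


# Hypothetical labels and raw model scores; the third prediction is confidently wrong.
y_true = [1, 0, 1, 0]
scores = [2.0, -1.0, -3.0, 0.5]
probs = [sigmoid(s) for s in scores]

print("MSE:     ", mse(y_true, probs))
print("Log loss:", log_loss(y_true, probs))
```

A squared error on a probability can never exceed 1, while log loss grows without bound as a confident prediction turns out wrong, so the two costs penalize mistakes quite differently.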
What is a time series and how do we analyze it? This is the first blog about it, with many more to come. Stay tuned.
https://dataqoil.com/2022/04/10/python-for-stock-market-analysis-getting-started-into-timeseries-analysis/
Python for Stock Market Analysis: Getting Started into Timeseries Analysis - DataQoil How to start a time series analysis in Python for Stock Market Analysis? This blog is a beginner-friendly introduction to it.