Awards & Nominations
Creative Young Teen Minds has received the following awards and nominations. Way to go!

We have developed a county-wise Covid-19 risk calculator that generates risk warnings and vulnerability ratings for people based on the county they live in, using various publicly available datasets. Our project aims to address the lack of information available to the public and authorities at the local level. It gives insight into the factors that could lead to a surge in Covid-19 cases. With access to such tools and data insights, people can make better-informed decisions to protect themselves, and authorities can take the precautions needed to control possible future outbreaks.
Our model provides Covid-19 risk warnings to the public based on their geographical location, down to the county level, together with the factors most likely to be driving the estimated rise in cases. It provides specific information about risky activities and places in their locality so that people can take the necessary steps to protect themselves and their community.
Report (our project report; please open in a new tab if it does not open directly)
The model uses openly available data from the past week on vaccinations, Covid-19 cases and deaths, population density and mobility to discover patterns and correlations among these datasets and to find which activities are more likely to cause a surge in cases. It is trained on county-wise data from these datasets and uses the Random Forest algorithm together with the permutation importance method, both implemented with the Python sklearn package, to estimate how important each possible factor is in relation to the number of cases. Random Forest takes a bagging approach to training, and we obtained the best results with 50 estimator decision trees. We repeated the training process 10 times in order to obtain the best possible model.
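The approach described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, the column names are hypothetical, and a regressor is assumed because the target is a case count. Only the 50 trees and the use of sklearn's Random Forest and permutation importance come from the description.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the county-level table (the real datasets are not reproduced here).
rng = np.random.default_rng(0)
feature_names = ["vaccination_rate", "population_density", "mobility"]  # hypothetical names
X = rng.random((300, len(feature_names)))
# Toy target: cases driven mostly by mobility, then density, plus a little noise.
y = 3.0 * X[:, 2] + 1.0 * X[:, 1] + 0.1 * rng.standard_normal(300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# A random forest is an ensemble of bagged decision trees; 50 trees as stated above.
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Permutation importance: how much the held-out score drops when one column is shuffled.
result = permutation_importance(model, X_test, y_test, n_repeats=5, random_state=0)

# Clip small negative values (pure noise) and express the rest as percentages.
raw = np.clip(result.importances_mean, 0.0, None)
importance_pct = 100.0 * raw / raw.sum()

for name, pct in zip(feature_names, importance_pct):
    print(f"{name}: {pct:.1f}%")
```

Expressing the importances as percentages is one plausible way to obtain per-feature weights; how the project normalised them is not stated.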
Relationship between Covid-19 cases and other factors using permutation importance (please open in a new tab if it does not open directly)
Our Vulnerability Calculation Formula
For a particular county:
$$\text{Risk} = \sum_{n=1}^{N} \frac{i_n \, f_n}{m_n}$$
where
$i_n$ = feature importance percentage of feature $n$
$f_n$ = value of feature $n$ for that county
$m_n$ = maximum value of feature $n$
$N$ = total number of features
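As a small worked example of the formula above, the following sketch computes the risk for one county in plain Python. The function name and all numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def county_risk(importance_pct, feature_values, feature_maxima):
    """Risk = sum over features n of i_n * f_n / m_n."""
    if not (len(importance_pct) == len(feature_values) == len(feature_maxima)):
        raise ValueError("all inputs must have one entry per feature")
    risk = 0.0
    for i_n, f_n, m_n in zip(importance_pct, feature_values, feature_maxima):
        if m_n == 0:
            # A feature whose maximum is zero carries no information; skip it
            # (an assumption, to avoid dividing by zero).
            continue
        risk += i_n * f_n / m_n
    return risk


# Hypothetical county with three features.
importance_pct = [50.0, 30.0, 20.0]   # i_n: importance percentages
feature_values = [40.0, 500.0, 0.5]   # f_n: this county's values
feature_maxima = [80.0, 1000.0, 1.0]  # m_n: maximum value of each feature

risk = county_risk(importance_pct, feature_values, feature_maxima)
print(risk)  # 50*0.5 + 30*0.5 + 20*0.5 = 50.0
```

If the importance percentages sum to 100 and every feature value lies between 0 and its maximum, the result falls between 0 and 100.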
Our model has various benefits for people who have very little access to proper information.
Most available Covid-19 risk calculators use only medical data as the basis for calculating an individual's risk. Medical records are of great importance in determining how severely a person may be affected and what special measures they need to take given their medical history.
But this is not a disease from which one can stay safe merely by following the measures oneself, because it depends on the actions of the community as a whole. In several areas of the world, we have seen cases rise because of the negligence of just 10 percent of the population, with the consequences borne by people who had been taking precautions.
If the people who were following the precautions, and the authorities who were taking every possible measure, had had access to the overall picture of their locality that these datasets collectively provide, that bigger picture would have played a key role in this situation.
So, our project presents information from these datasets in an easily understandable way that enables proper decision making to prevent such things in the future.
We hope to provide accurate and precise information to people about Covid-19 risk in their locality so that better decisions can be taken to prevent the increase in Covid-19 cases, and the destruction they cause, at the local level. This will also increase awareness among people, and the combined efforts at the county level will result in better and more efficient mitigation of the pandemic at the national and global levels as well. Some things we would like to incorporate in our current model are:
The current model is suitable for a short time period, as it uses data from the past week to the past month.
But this pandemic has also forced us to look at other key long-term factors that we need to focus on to tackle such pandemics in the future, and that will also be critical for other humanitarian goals such as sustainable development and climate change.
We would like to build another indicator that uses the following five kinds of data, categorized by country (for bigger countries we would like to go down to the state, city and county level, depending on size and population) and demography:
Health is an important factor, especially in pandemics like this one. We know that places with good health infrastructure, and with most of their population covered by health insurance, will be better off. But it is also important to note that patients with certain conditions, such as several respiratory and heart diseases, are more vulnerable to Covid-19. So, we would like to do the following with the long-term health data:
We hope to carry this out down to the local level so that authorities and people can make good decisions if such a situation breaks out in the future. It will also help us learn which groups of people are more likely to survive, so that special attention can be paid to vulnerable groups rather than simply dividing resources equally. This will help governments allocate the right resources to the right places according to the vulnerability of different groups of people.
Better literacy and education rates result in greater awareness in society, and the greater the awareness, the lower the risk. For this we would like to do the following:
For example, a less educated shop owner in an underdeveloped area is less likely to take proper sanitation measures. Moreover, we would also like to investigate the strictness of the authorities and its effect on the public.
All these trends will help in launching better awareness campaigns and devising new strategies to increase awareness among the targeted audience in a more understandable way.
This will deal with the research, collection, study and analysis of data for the following purposes across all countries, down to the county level, according to demography:
This will help the concerned institutions and organisations determine a proper policy to tackle problems like this, provide more support to economically vulnerable sections in such situations, and develop a better approach towards a more sustainable economy.
Through these data we would like to find the relationship between climate change and environmental crises across different geographical regions and the outbreak of the pandemic. Climate change may be leading to genetic mutations in various creatures, which may increase the likelihood of such pandemics in the future:
Through this we would like to study the efficacy, ability and willingness of administrations at the national, state and local levels, to find which countries were better off in procuring vaccines and health infrastructure to provide proper facilities to their people, and why, so that similar measures can be adopted in other countries in the future.
We used the following tools and software in developing our project:
•Python
•MS Excel
•Google Sheets
•PyCharm
•Google Colab
•Sklearn
•Numpy
•Pandas
•Matplotlib
•Seaborn
•Pickle
•MS Word
•Google Docs
•Google Slides
•Google Drive
We used the mobility data from the EO Dashboard maintained by NASA, ESA and JAXA, which can be accessed from the link below. It played a crucial role in training our model.
https://eodashboard.org/?poi=GG-GG
We also intend to use air quality data in our long-term model to find correlations between AQI, climate change and the severity of the pandemic across different regions, but we were not able to include it in our current model due to lack of time.
Space Apps has provided us a golden opportunity to learn new skills and to use our skills and talents for the betterment of society by working on a problem that has caused large-scale destruction across the whole world.
Working on this project for the hackathon, to contribute a solution that mitigates the effects of the Covid-19 pandemic, has given us the opportunity to learn and grow. We have gained knowledge about many new tools and technologies, as well as about the studies and research going on for Covid-19. Some of our major learning outcomes were:
As we all know, Covid-19 has badly affected our lives both physically and mentally, and it has become a challenge for the whole world. Every day the news headlines showed the increasing number of cases and deaths, and fear gripped the hearts and minds of people. The pandemic disrupted people's daily routines. As students, we could not go to school and missed the golden moments we used to enjoy with our friends, and we were highly strained by continuous online classes. Our elders could not go to their workplaces, and everyone suffered mentally because of the lack of joy and the excessive time spent on screens, which also strained their health. Everything appeared lifeless. Yet despite the strict control measures taken by the authorities and the precautions taken by people, there was no improvement in the situation: there was no reduction in the number of Covid-19 patients or deaths.
No household was left unaffected by Covid-19, especially during the second wave. Several of our relatives and acquaintances also suffered from Covid-19 despite taking the necessary precautions and taking strict care of cleanliness and hygiene, simply because of a lack of proper information. It is very difficult and frustrating to remain isolated and trapped in one room, especially when you are unwell and need support from your family members. This led us to the idea for the project: to invent a tool to tackle the situation.
1. Data Collection: First, we gathered various kinds of datasets from several different sources at the county level for the US. Our project is based on the following datasets: population density data, Covid-19 cases data, mobility data, vaccination data, etc. We gathered this data in CSV format.
2. Data Filtration and Data Mining: We then opened, edited and filtered the gathered data in MS Excel, using various functions and operations such as VLOOKUP, OFFSET, AVERAGE, SORT and GROUP BY, to assemble the data in a proper order.
3. Data Integration and Data Warehousing: Then we integrated all the data into a single spreadsheet to make a final database for training our model and the algorithm. We stored the final integrated data as a spreadsheet on Google Drive.
4. Data Cleaning and Data Imputation: The final data was then opened in Google Colaboratory and cleaned using the Pandas library. Some values were missing and were imputed using the KNN Imputer of the sklearn library.
5. Machine Learning Modelling and Model Training: We used the Random Forest algorithm for our machine learning model and the permutation importance method to calculate feature importance. We used the train_test_split function to split the final dataset into train and test datasets, and fitted the model on the training data.
6. Model Validation and Model Deployment: We then tested the performance of the model on the test datasets using several different metrics and repeated this process 10 times to get the best possible model. We further plan to deploy it to the web as an interactive dashboard or API so that it can be easily used by the public.
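Steps 4 to 6 above could look roughly like the sketch below. It is an illustration under assumptions rather than the project's code: the data is synthetic, the column order is hypothetical, a regressor and the R^2 score are assumed (the actual metrics are not named), and the 10 repetitions are interpreted as 10 different random splits.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import KNNImputer
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the integrated county spreadsheet.
# Hypothetical columns: population_density, mobility, vaccination_rate.
rng = np.random.default_rng(42)
X_raw = rng.random((200, 3))
y = 2.0 * X_raw[:, 1] + X_raw[:, 0] + 0.05 * rng.standard_normal(200)

# Blank out roughly 5% of the mobility values to mimic gaps in the real data.
missing = rng.random(200) < 0.05
X_raw[missing, 1] = np.nan

# Step 4: fill each gap from the 5 most similar rows.
X = KNNImputer(n_neighbors=5).fit_transform(X_raw)

# Steps 5 and 6: split, train, score on held-out data; repeat 10 times, keep the best.
best_score, best_model = -np.inf, None
for seed in range(10):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=50, random_state=seed)
    model.fit(X_train, y_train)
    score = model.score(X_test, y_test)  # R^2 on the test split
    if score > best_score:
        best_score, best_model = score, model

print(f"best held-out R^2 over 10 runs: {best_score:.3f}")
```

Imputing before the split mirrors the order of the steps listed; fitting the imputer on the training portion only would avoid leaking test information into training.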
First of all, we would like to express our gratitude to NASA and the other organisers for giving us a golden opportunity to express and explore ourselves; because of them we got the chance to think outside the box and create something that could be an asset to society. Secondly, we would like to thank the 'Centers for Disease Control and Prevention', 'US Census', 'Google LLC' and 'The New York Times' for providing us with the datasets for our model. Finally, we would like to thank each and every person who supported us directly or indirectly in our journey.
https://eodashboard.org/?poi=GG-GG
https://eodashboard.org/?poi=W1-N1
https://github.com/nytimes/covid-19-data/blob/master/us-counties.csv
https://www.google.com/covid19/mobility/
https://www.epa.gov/sites/production/files/2016-04/ozone-county-population.xlsx
https://data.cdc.gov/Vaccinations/COVID-19-Vaccinations-in-the-United-States-County/8xkx-amqh
#machinelearning #datascience #covid #randomforest #teenagers #creativity #students #eodashboard #technicalworld #project
This project has been submitted for consideration during the Judging process.
COVID-19 continues to be a global problem even though vaccination efforts are underway to control its propagation. Your challenge is to use environmental data and other information (such as epidemiological, social, policy, and economic data) to build a smartphone application that provides individualized, geolocated, COVID-19 risk warnings to guide social awareness, response, and health security.
