Data Systemization for Integrative Competence Concepts Pedagogy Design: This course explores techniques, domains, process approaches, and implementation methodologies for combining data from inherently different sources into a single dataset, with the goal of ensuring parity of esteem among the resulting pathways in terms of requirements. Intelligent algorithms, delivered in an agile mode on cloud architecture, will enable the adoption of these techniques across industries and domains. We have chosen the two projects below to map our effort on the journey toward our purpose and goals through data science and machine learning.
Our Purpose: The purpose of this project is to inextricably link occupational, academic, and soft-skills requirements in order to establish a capabilities model with ten hierarchical frames, ranging from 5th grade to Ph.D. or President/CEO. This purpose statement has approximately 78 premised values derived from eighteen questions.
Prediction of jobs for the graduating class using multi-step Time Series Analysis:
This is a supervised machine learning project in which learners predict the availability of jobs in specific occupations in their local communities using multi-step time series analysis. It is a real-time activity that identifies which jobs will be available now and in which skill areas, helping schools develop a cheat sheet of required competencies (knowledge, skills, and understanding).
All jobs have pay variations, and we want to be in the top hiring pay bracket to boost the school's image in local communities. Pay depends on several criteria, such as the number of available vacancies in the local community (demand); the number of available people with the required competencies (supply); the number of internal resources capable of covering the role's functions until a trusted new hire comes on board; and several others. These factors all affect how quickly an employer wants you on board, the pay offered, and the like.
There are some interesting measures schools and students can use to take control of this exercise. One is tracking the number of work hours, peak performances, and how long individuals in the same job have been performing the same tasks. This helps efficiently determine how many people with such competencies local businesses will hire, based on the demand factors stated above.
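The reframing at the heart of this exercise, turning a time series into a supervised learning table of past values and future targets, can be sketched in pandas. The monthly posting counts below are invented for illustration:

```python
import pandas as pd

# Hypothetical monthly job-posting counts for one occupation (invented data).
postings = pd.Series([120, 135, 128, 150, 162, 158, 171, 180], name="postings")

def to_supervised(series, n_lags=3, n_ahead=2):
    """Reframe a time series as a supervised-learning table: each row holds
    n_lags past values as inputs and the next n_ahead values
    (y_t ... y_t+n_ahead-1) as multi-step targets."""
    df = pd.DataFrame({"y": series})
    for lag in range(1, n_lags + 1):
        df[f"lag_{lag}"] = df["y"].shift(lag)
    for step in range(1, n_ahead):
        df[f"y_t+{step}"] = df["y"].shift(-step)
    return df.dropna()  # drop edge rows with incomplete lag/lead windows

frames = to_supervised(postings)
print(frames)
```

Each resulting row can then be fed to any standard regressor, which is what makes the problem "supervised".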
Through this exercise learners will be able to:
- Convert a Time Series problem to a Supervised Learning problem
- Develop mastery of Multi-Step Time Series Forecast analysis
- Apply data pre-processing functions in time series analysis
- Perform Exploratory Data Analysis (EDA) on time series data
- Achieve feature engineering in time series by breaking time features into days of the week and weekends
- Understand the concepts of lead-lag and rolling mean
- Gain clarity on the Auto-Correlation Function (ACF) and Partial Auto-Correlation Function (PACF) in time series
- Learn approaches to solving multi-step time series problems
- Solve a time series problem with a regressor model
- Implement brick-and-mortar vs. online work hours prediction with ensemble models (Random Forest and XGBoost)
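The multi-step forecasting and ensemble-model objectives above can be sketched together. This is a minimal illustration using a synthetic weekly postings series in place of real labor-market data; a Random Forest handles the multi-output targets directly, whereas XGBoost would need one model per forecast step or a multi-output wrapper:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic weekly job-postings series: upward trend plus noise (invented data).
rng = np.random.default_rng(0)
series = 100 + np.arange(120) * 0.5 + rng.normal(0, 2, 120)

# Sliding window: 6 past weeks as features, next 3 weeks as targets.
n_lags, n_ahead = 6, 3
X, y = [], []
for i in range(n_lags, len(series) - n_ahead + 1):
    X.append(series[i - n_lags:i])
    y.append(series[i:i + n_ahead])
X, y = np.array(X), np.array(y)

# RandomForestRegressor supports multi-output regression natively.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:-10], y[:-10])  # hold out the last 10 windows for checking

# Forecast the next 3 weeks from the most recent 6 observations.
forecast = model.predict(series[-n_lags:].reshape(1, -1))
print(forecast)
```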
Using multiclass classification to determine human capabilities across occupations
Task performance activities require the use of multiclass classification machine learning techniques to analyze the fit between two datasets: one built from concepts of competencies and the other from a smartphone tracker of performances. Question classification of integrative competencies data, mapped into concepts, provides the content for determining the right combination of knowledge, skills, and understanding to perform a task. Daily task performance activities will be recorded using a smartphone with embedded inertial sensors; from the collected data we can build a strong dataset for real-time activity recognition and capture. The activities are captured using the smartphone's embedded accelerometer and gyroscope, and the experiments are video-recorded so the data can be labelled manually. The data is then randomly partitioned into two sets: 70% training data and 30% test data.
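The 70/30 partition and multiclass classification described above can be sketched as follows. The six-class "sensor" features here are simulated stand-ins for the real accelerometer/gyroscope readings:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Simulated activity-recognition data: 6 activity classes, 50 samples each,
# 12 sensor-style features (invented stand-in for inertial-sensor readings).
rng = np.random.default_rng(42)
n_classes, per_class, n_features = 6, 50, 12
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(per_class, n_features))
               for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), per_class)

# Random 70% / 30% partition, stratified so each activity appears in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))
print(f"test accuracy: {acc:.2f}")
```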
This exercise aims to provide learners mastery of:
- Data Science Life Cycle
- Univariate and Bivariate analysis
- Data visualizations using various charts.
- Cleaning and preparing the data for modelling.
- Standard Scaling and normalizing the dataset.
- Selecting the best model and making predictions
- How to perform PCA to reduce the number of features
- Understanding how to apply:
  - Logistic Regression & SVM
  - Random Forest Regressor, XGBoost, and KNN
  - Deep Neural Networks
- Deep knowledge of hyperparameter tuning for ANN and SVM
- How to plot the confusion matrix to visualize the results
- Developing a Flask API for the selected model
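Several of these skills (scaling, PCA for feature reduction, model fitting, and the confusion matrix) can be combined in a single scikit-learn pipeline. This sketch uses the bundled digits dataset as a stand-in for the activity data:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in dataset: 10-class digit images instead of activity recordings.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0)

# Scale features, reduce to 20 principal components, then fit an SVM.
pipe = make_pipeline(StandardScaler(), PCA(n_components=20), SVC())
pipe.fit(X_train, y_train)

# Rows are true classes, columns predicted classes; the diagonal holds
# the correct predictions.
cm = confusion_matrix(y_test, pipe.predict(X_test))
print(cm.shape)  # 10 classes -> 10x10 matrix
```

The same fitted pipeline object is what a Flask endpoint would load and call `predict` on when serving the selected model.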