⭐Improve Data
Once you have imported your data, it is time to understand and clean it. Arkangel AI performs an automatic analysis and suggests best practices for it.
Preparing your data is an iterative process. Even if you clean and prep your training data prior to uploading it to Arkangel AI, you can still improve its quality by assessing features during EDA (Exploratory Data Analysis).
For categorical variables with numerical labels, like 0, 1, 2, it's advisable to represent these categories with descriptive terms such as "bad," "medium," and "good," or even "category1," "category2," and "category3." This helps models to effectively recognize and treat the variable as categorical.
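The relabeling described above can be done before uploading the dataset. The following is a minimal sketch in plain Python, not part of Arkangel AI; the mapping `QUALITY_LABELS` and the helper `relabel` are hypothetical names chosen for illustration, and the labels mirror the example in the text.

```python
# Hypothetical mapping from numeric codes to descriptive labels.
QUALITY_LABELS = {0: "bad", 1: "medium", 2: "good"}


def relabel(values, mapping):
    """Replace numeric category codes with descriptive labels.

    Codes missing from the mapping fall back to "category<code>",
    so the column never keeps a bare number.
    """
    return [mapping.get(v, f"category{v}") for v in values]


codes = [0, 2, 1, 1, 3]
labels = relabel(codes, QUALITY_LABELS)
print(labels)
```

Because every value in the resulting column is a string, a tool that infers types from the data has no reason to read the feature as numeric.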
During EDA, Arkangel AI performs a Data Quality Assessment. The assessment reports the data quality issues that are relevant to the stage of model building you are in. EDA takes place in two stages.
EDA1 occurs after you upload your data. It assesses the All Features list and detects issues such as:
Outliers
Inliers
Excess zeros
Disguised missing values
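To get a feel for what some of these checks look for, here is a rough sketch in plain Python. It is not how Arkangel AI implements its assessment: the IQR rule for outliers, the `SENTINELS` set, and all function names are assumptions made for illustration. Inlier detection (values that are unusually frequent compared with their neighbors) is left out for brevity.

```python
from statistics import quantiles

# Hypothetical sentinel values that often stand in for "missing".
SENTINELS = {-999, 9999}


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def zero_fraction(values):
    """Share of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


def disguised_missing(values, sentinels=SENTINELS):
    """Return entries that match a known placeholder value."""
    return [v for v in values if v in sentinels]


ages = [34, 29, 41, 38, -999, 36, 0, 33, 240, 31]
print("outliers:", iqr_outliers(ages))
print("zero fraction:", zero_fraction(ages))
print("disguised missing:", disguised_missing(ages))
```

In this toy column, the placeholder -999 is flagged both as an outlier and as a disguised missing value, which is typical: a sentinel usually has to be recoded as missing before outlier checks become meaningful.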
As soon as you load your dataset, Arkangel AI performs EDA1. In this phase, it generates summary statistics based on a sample of your data.
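The idea of computing summary statistics on a sample rather than on the full dataset can be sketched as follows. This is an illustration only; the sample size, the seed, and the function name `summarize_sample` are assumptions, and the platform's actual sampling strategy is not documented here.

```python
import random
from statistics import mean, median, stdev


def summarize_sample(values, sample_size=100, seed=0):
    """Compute basic statistics on a random sample of a numeric column."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    if len(values) <= sample_size:
        sample = list(values)
    else:
        sample = rng.sample(values, sample_size)
    return {
        "count": len(sample),
        "min": min(sample),
        "max": max(sample),
        "mean": mean(sample),
        "median": median(sample),
        "stdev": stdev(sample),
    }


# Synthetic column with 10,000 values between 0 and 49.
column = [float(i % 50) for i in range(10_000)]
summary = summarize_sample(column)
print(summary)
```

Sampling keeps the analysis fast on large files, at the cost of statistics that are estimates rather than exact values for the whole dataset.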
Arkangel AI automatically calculates the significance of each feature and its correlation with the selected prediction target. To get these calculations, select a prediction target.
You might want to remove features that are unrelated to the target. To learn how to use this functionality, check our Correlation and Significance tutorial.
Once you have all your cleaning commands, press the yellow button and move on to the final step, Create AI Model.
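As a rough illustration of screening features by their relationship to the target, the sketch below computes the Pearson correlation of each numeric feature with the target and lists the weak ones. It covers correlation only, not significance, and it is not Arkangel AI's own calculation; the feature names, the data, and the 0.3 threshold are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


# Hypothetical features and target, for illustration only.
features = {
    "dose_mg": [10, 20, 30, 40, 50],
    "patient_id": [2, 5, 1, 4, 3],
    "constant": [7, 7, 7, 7, 7],
}
target = [1.1, 2.0, 2.9, 4.2, 5.0]

scores = {name: pearson(col, target) for name, col in features.items()}

# Assumed cutoff: treat |r| below 0.3 as "weakly related".
weak = [name for name, r in scores.items() if abs(r) < 0.3]
print(scores)
print("candidates for removal:", weak)
```

A low linear correlation does not prove a feature is useless, since the relationship may be non-linear, so a list like `weak` is best read as candidates to review rather than features to drop automatically.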