Tags give the ability to mark specific points in history as being important
-
2.1.11
a7a92d2c · Updated SHAP graphs to display feature importance by categorical level rather than by the entire categorical feature
-
2.1.10
fb89b493 · Added a "groupby" parameter to the missing values and data splitting functions, so operations can be performed within specific groups rather than across the entire dataframe. Minor improvements include support for LightGBM models in the evaluation framework and better error handling in the feature serving function
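To illustrate what group-wise handling means in practice, here is a minimal pandas sketch. It does not use this package's API: `fill_missing_by_group` and its `groupby` argument are hypothetical names chosen for illustration, and median filling is an assumed strategy. The point is only the difference between a statistic computed over the whole dataframe and one computed within each group.

```python
import pandas as pd


def fill_missing_by_group(df, column, groupby=None):
    """Hypothetical helper (not this package's function).

    Fills NaNs in `column` with the median, computed per group when
    `groupby` is given and over the whole dataframe otherwise.
    Returns a new dataframe; the input is left untouched.
    """
    out = df.copy()
    if groupby is None:
        # One median for the entire column
        out[column] = out[column].fillna(out[column].median())
    else:
        # transform("median") returns a Series aligned to the original rows,
        # holding each row's group median
        group_median = out.groupby(groupby)[column].transform("median")
        out[column] = out[column].fillna(group_median)
    return out


df = pd.DataFrame(
    {
        "region": ["east", "east", "east", "west", "west", "west"],
        "income": [10.0, 20.0, None, 100.0, 200.0, None],
    }
)

global_fill = fill_missing_by_group(df, "income")
grouped_fill = fill_missing_by_group(df, "income", groupby="region")
```

With the global fill, both missing incomes receive the same value; with the grouped fill, each region's missing value is drawn from that region's own rows, which is usually the safer choice when groups differ in scale.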
-
2.1.2
28e659ca
- Fixed model monitoring metrics column order
- Removed duplicate prediction_drift_status
-
2.1.0
b3b3dedd
- Added monitoring metrics functions to allow for model observability
- Deprecated `model_metrics`
- Updated testing pipeline to allow for local testing
- Tidied up README
-
2.0.0
ff00d70e

ModelEvaluator Class
* Comprehensive replacement for the older model_metrics function
* Supports a wide variety of classification and regression models, including multi-class problems
* Extensive visualizations (ROC, PR curves, lift charts, SHAP values)
* Detailed metrics and performance analytics in a single, cohesive interface
* Ability to add custom metrics, save plots, and export metrics to file

Apply Functions
* New functions to consistently transform new data using patterns from training:
  * apply_outliers() - Apply existing outlier limits
  * apply_missing_values() - Apply missing value handling
  * apply_dummy() - Apply existing dummy coding
* These enable production pipelines to use identical transformations to training
* Don't require building a separate .py file for scoring transformations - all transformations are handled directly in the configuration yaml file

ConfigGenerator Class
* Automatically creates scoring configuration files modularly
* Supports nested parameters and complex configuration structures
* Perfect for version controlling your model parameters and preprocessing steps

Memory Optimization
* New memory_optimization() function dramatically reduces DataFrame memory usage
* Significantly reduces time to train XGBoost models by taking advantage of sparse arrays
* Configurable precision modes to balance memory usage vs. numeric precision

Other New Functions Added
* generate_sql_trend_query() - Generate SQL for time-period analysis
* trend_analysis() - Analyze time-series data for patterns

Other Notable Improvements
* Consistent return patterns (functions now return both data AND metadata)
* Standardized function names and improved parameter handling
* More robust outlier detection with skew adjustment options
* Better missing value handling with more filling methods
* Enhanced dummy coding with better prefix handling
* Improved correlation/feature reduction with multiple correlation methods
* Enhanced split_data() with stratification options and better sampling

Breaking Changes
* Many of the calls prior to 2.0.0 will not work correctly without slight modifications. Consult the documentation for exact changes.
* inplace parameters removed from all functions to conform with pandas best practices
* missing_fill() and missing_check() combined into missing_values()
* dv_proxies() renamed to remove_outcome_proxies() for better clarity
* memory_usage() renamed to memory_optimization
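The idea behind the apply functions, learning a transformation on training data and replaying it unchanged on new data, can be sketched with plain pandas. This is not the package's implementation: the signatures of apply_outliers() and related functions are not shown in these notes, so `fit_outlier_limits` and `apply_limits_sketch` below are hypothetical stand-ins, and quantile-based clipping is an assumed outlier rule. The sketch also follows the two conventions listed above: functions return data together with metadata, and nothing is modified in place.

```python
import pandas as pd


def fit_outlier_limits(train, columns, lower_q=0.01, upper_q=0.99):
    """Hypothetical helper: clip training columns and return (data, limits).

    `limits` is the metadata that would be stored (for example in a
    scoring configuration file) and reused at prediction time.
    """
    limits = {}
    for col in columns:
        low = float(train[col].quantile(lower_q))
        high = float(train[col].quantile(upper_q))
        limits[col] = (low, high)
    return apply_limits_sketch(train, limits), limits


def apply_limits_sketch(df, limits):
    """Hypothetical helper: apply previously learned limits to any dataframe.

    Nothing is re-estimated here, so new data is transformed exactly
    as the training data was.
    """
    out = df.copy()  # no in-place modification
    for col, (low, high) in limits.items():
        out[col] = out[col].clip(lower=low, upper=high)
    return out


# Training data: limits are learned here
train = pd.DataFrame({"x": [float(i) for i in range(101)]})
train_clipped, limits = fit_outlier_limits(train, ["x"])

# New data at scoring time: the stored limits are applied as-is
new_data = pd.DataFrame({"x": [-50.0, 50.0, 500.0]})
scored = apply_limits_sketch(new_data, limits)
```

Because the limits are plain numbers, they serialize naturally to a YAML or JSON configuration, which is what allows scoring to run without a separate transformation script.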