diff --git a/reference/index.html b/reference/index.html
index 2b90055..afeb3e6 100644
--- a/reference/index.html
+++ b/reference/index.html
@@ -170,8 +170,8 @@
On this page
-
-Reference
+
+Reference
Core
diff --git a/search.json b/search.json
index e7850b4..e56ea87 100644
--- a/search.json
+++ b/search.json
@@ -1,73 +1,17 @@
[
{
- "objectID": "reference/steps-temporal.html",
- "href": "reference/steps-temporal.html",
- "title": "Temporal",
- "section": "",
- "text": "Feature extraction for temporal columns"
- },
- {
- "objectID": "reference/steps-temporal.html#parameters",
- "href": "reference/steps-temporal.html#parameters",
- "title": "Temporal",
- "section": "Parameters",
- "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of date columns to expand into new features.\nrequired\n\n\ncomponents\ntyping.Sequence[typing.Literal[‘day’, ‘week’, ‘month’, ‘year’, ‘dow’, ‘doy’]]\nA sequence of components to expand. Options include - day: the day of the month as a numeric value - week: the week of the year as a numeric value - month: the month of the year as a categorical value - year: the year as a numeric value - dow: the day of the week as a categorical value - doy: the day of the year as a numeric value Defaults to [\"dow\", \"month\", \"year\"].\n('dow', 'month', 'year')"
- },
- {
- "objectID": "reference/steps-temporal.html#examples",
- "href": "reference/steps-temporal.html#examples",
- "title": "Temporal",
- "section": "Examples",
- "text": "Examples\n>>> import ibisml as ml\nExpand date columns using the default components\n>>> step = ml.ExpandDate(ml.date())\nExpand specific columns using specific components\n>>> step = ml.ExpandDate([\"x\", \"y\"], [\"day\", \"year\"])"
- },
- {
- "objectID": "reference/steps-temporal.html#parameters-1",
- "href": "reference/steps-temporal.html#parameters-1",
- "title": "Temporal",
- "section": "Parameters",
- "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of time columns to expand into new features.\nrequired\n\n\ncomponents\ntyping.Sequence[typing.Literal[‘hour’, ‘minute’, ‘second’, ‘millisecond’]]\nA sequence of components to expand. Options include hour, minute, second, and millisecond. Defaults to [\"hour\", \"minute\", \"second\"].\n('hour', 'minute', 'second')"
- },
- {
- "objectID": "reference/steps-temporal.html#examples-1",
- "href": "reference/steps-temporal.html#examples-1",
- "title": "Temporal",
- "section": "Examples",
- "text": "Examples\n>>> import ibisml as ml\nExpand time columns using the default components\n>>> step = ml.ExpandTime(ml.time())\nExpand specific columns using specific components\n>>> step = ml.ExpandTime([\"x\", \"y\"], [\"hour\", \"minute\"])"
- },
- {
- "objectID": "reference/core.html",
- "href": "reference/core.html",
- "title": "Common",
- "section": "",
- "text": "Core APIs"
- },
- {
- "objectID": "reference/core.html#parameters",
- "href": "reference/core.html#parameters",
- "title": "Common",
- "section": "Parameters",
- "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\n*steps\nibisml.core.Step | ibisml.core.Transform\nOne or more preprocessing steps.\n()"
- },
- {
- "objectID": "reference/index.html",
- "href": "reference/index.html",
- "title": "Reference",
- "section": "",
- "text": "Common\nCore APIs\n\n\nSelectors\nSelect sets of columns by name, type, or other properties\n\n\n\n\n\n\nDefine steps in a recipe\n\n\n\nImputation\nImputation and handling of missing values\n\n\nEncoding\nEncoding of categorical and string columns\n\n\nStandardization\nStandardization and normalization of numeric columns\n\n\nTemporal\nFeature extraction for temporal columns\n\n\nOther\nOther common tabular operations"
- },
- {
- "objectID": "reference/index.html#core",
- "href": "reference/index.html#core",
- "title": "Reference",
+ "objectID": "index.html",
+ "href": "index.html",
+ "title": "ibisml",
"section": "",
- "text": "Common\nCore APIs\n\n\nSelectors\nSelect sets of columns by name, type, or other properties"
+ "text": "ibisml is a work-in-progress library for developing Machine Learning feature engineering pipelines using ibis. These pipelines can then be used to transform and feed data to other machine learning libraries like xgboost or scikit-learn.\nBy using ibis for preprocessing and feature engineering, feature engineering pipelines may be compiled to SQL and executed on a wide range of performant and scalable backends. No more need to rewrite code for production deployments, pipelines may be developed locally (against e.g. duckdb) and deployed to production (against e.g. spark) with only a single line of code change."
},
{
- "objectID": "reference/index.html#steps",
- "href": "reference/index.html#steps",
- "title": "Reference",
- "section": "",
- "text": "Define steps in a recipe\n\n\n\nImputation\nImputation and handling of missing values\n\n\nEncoding\nEncoding of categorical and string columns\n\n\nStandardization\nStandardization and normalization of numeric columns\n\n\nTemporal\nFeature extraction for temporal columns\n\n\nOther\nOther common tabular operations"
+ "objectID": "index.html#help-wanted",
+ "href": "index.html#help-wanted",
+ "title": "ibisml",
+ "section": "Help Wanted!",
+ "text": "Help Wanted!\nibisml is a work-in-progress. If you’re interested in getting involved (whether through feature requests, PRs, or just sharing opinions), we’d love to hear from you."
},
{
"objectID": "reference/selectors.html",
@@ -133,39 +77,67 @@
"text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\npredicate\ntyping.Callable[[ibis.ibis.Column], bool]\nA predicate function from Column to bool. Only columns where predicate returns True will be selected.\nrequired"
},
{
- "objectID": "index.html",
- "href": "index.html",
- "title": "ibisml",
+ "objectID": "reference/steps-imputation.html",
+ "href": "reference/steps-imputation.html",
+ "title": "Imputation",
"section": "",
- "text": "ibisml is a work-in-progress library for developing Machine Learning feature engineering pipelines using ibis. These pipelines can then be used to transform and feed data to other machine learning libraries like xgboost or scikit-learn.\nBy using ibis for preprocessing and feature engineering, feature engineering pipelines may be compiled to SQL and executed on a wide range of performant and scalable backends. No more need to rewrite code for production deployments, pipelines may be developed locally (against e.g. duckdb) and deployed to production (against e.g. spark) with only a single line of code change."
+ "text": "Imputation and handling of missing values"
},
{
- "objectID": "index.html#help-wanted",
- "href": "index.html#help-wanted",
- "title": "ibisml",
- "section": "Help Wanted!",
- "text": "Help Wanted!\nibisml is a work-in-progress. If you’re interested in getting involved (whether through feature requests, PRs, or just sharing opinions), we’d love to hear from you."
+ "objectID": "reference/steps-imputation.html#parameters",
+ "href": "reference/steps-imputation.html#parameters",
+ "title": "Imputation",
+ "section": "Parameters",
+ "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of columns to impute. All columns must be numeric.\nrequired"
},
{
- "objectID": "reference/steps-standardization.html",
- "href": "reference/steps-standardization.html",
- "title": "Standardization",
- "section": "",
- "text": "Standardization and normalization of numeric columns"
+ "objectID": "reference/steps-imputation.html#examples",
+ "href": "reference/steps-imputation.html#examples",
+ "title": "Imputation",
+ "section": "Examples",
+ "text": "Examples\n>>> import ibisml as ml\nReplace NULL values in all numeric columns with their respective means, computed from the training dataset.\n>>> step = ml.ImputeMean(ml.numeric())"
},
{
- "objectID": "reference/steps-standardization.html#parameters",
- "href": "reference/steps-standardization.html#parameters",
- "title": "Standardization",
+ "objectID": "reference/steps-imputation.html#parameters-1",
+ "href": "reference/steps-imputation.html#parameters-1",
+ "title": "Imputation",
"section": "Parameters",
- "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of columns to normalize. All columns must be numeric.\nrequired"
+ "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of columns to impute.\nrequired"
},
{
- "objectID": "reference/steps-standardization.html#examples",
- "href": "reference/steps-standardization.html#examples",
- "title": "Standardization",
+ "objectID": "reference/steps-imputation.html#examples-1",
+ "href": "reference/steps-imputation.html#examples-1",
+ "title": "Imputation",
"section": "Examples",
- "text": "Examples\n>>> import ibisml as ml\nNormalize all numeric columns.\n>>> step = ml.ScaleStandard(ml.numeric())\nNormalize a specific set of columns.\n>>> step = ml.ScaleStandard([\"x\", \"y\"])"
+ "text": "Examples\n>>> import ibisml as ml\nReplace NULL values in all numeric columns with their respective modes, computed from the training dataset.\n>>> step = ml.ImputeMode(ml.numeric())"
+ },
+ {
+ "objectID": "reference/steps-imputation.html#parameters-2",
+ "href": "reference/steps-imputation.html#parameters-2",
+ "title": "Imputation",
+ "section": "Parameters",
+ "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of columns to impute. All columns must be numeric.\nrequired"
+ },
+ {
+ "objectID": "reference/steps-imputation.html#examples-2",
+ "href": "reference/steps-imputation.html#examples-2",
+ "title": "Imputation",
+ "section": "Examples",
+ "text": "Examples\n>>> import ibisml as ml\nReplace NULL values in all numeric columns with their respective medians, computed from the training dataset.\n>>> step = ml.ImputeMedian(ml.numeric())"
+ },
+ {
+ "objectID": "reference/steps-imputation.html#parameters-3",
+ "href": "reference/steps-imputation.html#parameters-3",
+ "title": "Imputation",
+ "section": "Parameters",
+ "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of columns to fillna.\nrequired\n\n\nfill_value\ntyping.Any\nThe fill value to use. Must be castable to the dtype of all columns in inputs.\nrequired"
+ },
+ {
+ "objectID": "reference/steps-imputation.html#examples-3",
+ "href": "reference/steps-imputation.html#examples-3",
+ "title": "Imputation",
+ "section": "Examples",
+ "text": "Examples\n>>> import ibisml as ml\nFill all NULL values in numeric columns with 0.\n>>> step = ml.FillNA(ml.numeric(), 0)\nFill all NULL values in specific columns with 1.\n>>> step = ml.FillNA([\"x\", \"y\"], 1)"
},
{
"objectID": "reference/steps-other.html",
@@ -231,101 +203,129 @@
"text": "Examples\n>>> import ibisml as ml\n>>> from ibis import _\nDefine a new column c as a**2 + b**2\n>>> step = ml.Mutate(c=_.a**2 + _.b**2)"
},
{
- "objectID": "reference/steps-encoding.html",
- "href": "reference/steps-encoding.html",
- "title": "Encoding",
+ "objectID": "reference/steps-standardization.html",
+ "href": "reference/steps-standardization.html",
+ "title": "Standardization",
"section": "",
- "text": "Encoding of categorical and string columns"
+ "text": "Standardization and normalization of numeric columns"
},
{
- "objectID": "reference/steps-encoding.html#parameters",
- "href": "reference/steps-encoding.html#parameters",
- "title": "Encoding",
+ "objectID": "reference/steps-standardization.html#parameters",
+ "href": "reference/steps-standardization.html#parameters",
+ "title": "Standardization",
"section": "Parameters",
- "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of columns to one-hot encode.\nrequired\n\n\nmin_frequency\nint | float | None\nA minimum frequency of elements in the training set required to treat a column as a distinct category. May be either: - an integer, representing a minimum number of samples required. - a float in [0, 1], representing a minimum fraction of samples required. Defaults to None for no minimum frequency.\nNone\n\n\nmax_categories\nint | None\nA maximum number of categories to include. If set, only the most frequent max_categories categories are kept.\nNone"
+ "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of columns to normalize. All columns must be numeric.\nrequired"
},
{
- "objectID": "reference/steps-encoding.html#examples",
- "href": "reference/steps-encoding.html#examples",
- "title": "Encoding",
+ "objectID": "reference/steps-standardization.html#examples",
+ "href": "reference/steps-standardization.html#examples",
+ "title": "Standardization",
"section": "Examples",
- "text": "Examples\n>>> import ibisml as ml\nOne-hot encode all string columns\n>>> step = ml.OneHotEncode(ml.string())\nOne-hot encode a specific column, only including categories with at least 20 samples.\n>>> step = ml.OneHotEncode(\"x\", min_frequency=20)\nOne-hot encode a specific column, including at most 10 categories.\n>>> step = ml.OneHotEncode(\"x\", max_categories=10)"
+ "text": "Examples\n>>> import ibisml as ml\nNormalize all numeric columns.\n>>> step = ml.ScaleStandard(ml.numeric())\nNormalize a specific set of columns.\n>>> step = ml.ScaleStandard([\"x\", \"y\"])"
},
{
- "objectID": "reference/steps-encoding.html#parameters-1",
- "href": "reference/steps-encoding.html#parameters-1",
- "title": "Encoding",
- "section": "Parameters",
- "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of columns to categorical encode.\nrequired\n\n\nmin_frequency\nint | float | None\nA minimum frequency of elements in the training set required to treat a column as a distinct category. May be either: - an integer, representing a minimum number of samples required. - a float in [0, 1], representing a minimum fraction of samples required. Defaults to None for no minimum frequency.\nNone\n\n\nmax_categories\nint | None\nA maximum number of categories to include. If set, only the most frequent max_categories categories are kept.\nNone"
+ "objectID": "reference/index.html",
+ "href": "reference/index.html",
+ "title": "Reference",
+ "section": "",
+ "text": "Common\nCore APIs\n\n\nSelectors\nSelect sets of columns by name, type, or other properties\n\n\n\n\n\n\nDefine steps in a recipe\n\n\n\nImputation\nImputation and handling of missing values\n\n\nEncoding\nEncoding of categorical and string columns\n\n\nStandardization\nStandardization and normalization of numeric columns\n\n\nTemporal\nFeature extraction for temporal columns\n\n\nOther\nOther common tabular operations"
},
{
- "objectID": "reference/steps-encoding.html#examples-1",
- "href": "reference/steps-encoding.html#examples-1",
- "title": "Encoding",
- "section": "Examples",
- "text": "Examples\n>>> import ibisml as ml\nCategorical encode all string columns\n>>> step = ml.CategoricalEncode(ml.string())\nCategorical encode a specific column, only including categories with at least 20 samples.\n>>> step = ml.CategoricalEncode(\"x\", min_frequency=20)\nCategorical encode a specific column, including at most 10 categories.\n>>> step = ml.CategoricalEncode(\"x\", max_categories=10)"
+ "objectID": "reference/index.html#core",
+ "href": "reference/index.html#core",
+ "title": "Reference",
+ "section": "",
+ "text": "Common\nCore APIs\n\n\nSelectors\nSelect sets of columns by name, type, or other properties"
},
{
- "objectID": "reference/steps-imputation.html",
- "href": "reference/steps-imputation.html",
- "title": "Imputation",
+ "objectID": "reference/index.html#steps",
+ "href": "reference/index.html#steps",
+ "title": "Reference",
"section": "",
- "text": "Imputation and handling of missing values"
+ "text": "Define steps in a recipe\n\n\n\nImputation\nImputation and handling of missing values\n\n\nEncoding\nEncoding of categorical and string columns\n\n\nStandardization\nStandardization and normalization of numeric columns\n\n\nTemporal\nFeature extraction for temporal columns\n\n\nOther\nOther common tabular operations"
},
{
- "objectID": "reference/steps-imputation.html#parameters",
- "href": "reference/steps-imputation.html#parameters",
- "title": "Imputation",
+ "objectID": "reference/steps-temporal.html",
+ "href": "reference/steps-temporal.html",
+ "title": "Temporal",
+ "section": "",
+ "text": "Feature extraction for temporal columns"
+ },
+ {
+ "objectID": "reference/steps-temporal.html#parameters",
+ "href": "reference/steps-temporal.html#parameters",
+ "title": "Temporal",
"section": "Parameters",
- "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of columns to impute. All columns must be numeric.\nrequired"
+ "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of date columns to expand into new features.\nrequired\n\n\ncomponents\ntyping.Sequence[typing.Literal[‘day’, ‘week’, ‘month’, ‘year’, ‘dow’, ‘doy’]]\nA sequence of components to expand. Options include - day: the day of the month as a numeric value - week: the week of the year as a numeric value - month: the month of the year as a categorical value - year: the year as a numeric value - dow: the day of the week as a categorical value - doy: the day of the year as a numeric value Defaults to [\"dow\", \"month\", \"year\"].\n('dow', 'month', 'year')"
},
{
- "objectID": "reference/steps-imputation.html#examples",
- "href": "reference/steps-imputation.html#examples",
- "title": "Imputation",
+ "objectID": "reference/steps-temporal.html#examples",
+ "href": "reference/steps-temporal.html#examples",
+ "title": "Temporal",
"section": "Examples",
- "text": "Examples\n>>> import ibisml as ml\nReplace NULL values in all numeric columns with their respective means, computed from the training dataset.\n>>> step = ml.ImputeMean(ml.numeric())"
+ "text": "Examples\n>>> import ibisml as ml\nExpand date columns using the default components\n>>> step = ml.ExpandDate(ml.date())\nExpand specific columns using specific components\n>>> step = ml.ExpandDate([\"x\", \"y\"], [\"day\", \"year\"])"
},
{
- "objectID": "reference/steps-imputation.html#parameters-1",
- "href": "reference/steps-imputation.html#parameters-1",
- "title": "Imputation",
+ "objectID": "reference/steps-temporal.html#parameters-1",
+ "href": "reference/steps-temporal.html#parameters-1",
+ "title": "Temporal",
"section": "Parameters",
- "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of columns to impute.\nrequired"
+ "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of time columns to expand into new features.\nrequired\n\n\ncomponents\ntyping.Sequence[typing.Literal[‘hour’, ‘minute’, ‘second’, ‘millisecond’]]\nA sequence of components to expand. Options include hour, minute, second, and millisecond. Defaults to [\"hour\", \"minute\", \"second\"].\n('hour', 'minute', 'second')"
},
{
- "objectID": "reference/steps-imputation.html#examples-1",
- "href": "reference/steps-imputation.html#examples-1",
- "title": "Imputation",
+ "objectID": "reference/steps-temporal.html#examples-1",
+ "href": "reference/steps-temporal.html#examples-1",
+ "title": "Temporal",
"section": "Examples",
- "text": "Examples\n>>> import ibisml as ml\nReplace NULL values in all numeric columns with their respective modes, computed from the training dataset.\n>>> step = ml.ImputeMode(ml.numeric())"
+ "text": "Examples\n>>> import ibisml as ml\nExpand time columns using the default components\n>>> step = ml.ExpandTime(ml.time())\nExpand specific columns using specific components\n>>> step = ml.ExpandTime([\"x\", \"y\"], [\"hour\", \"minute\"])"
},
{
- "objectID": "reference/steps-imputation.html#parameters-2",
- "href": "reference/steps-imputation.html#parameters-2",
- "title": "Imputation",
+ "objectID": "reference/core.html",
+ "href": "reference/core.html",
+ "title": "Common",
+ "section": "",
+ "text": "Core APIs"
+ },
+ {
+ "objectID": "reference/core.html#parameters",
+ "href": "reference/core.html#parameters",
+ "title": "Common",
"section": "Parameters",
- "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of columns to impute. All columns must be numeric.\nrequired"
+ "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\n*steps\nibisml.core.Step | ibisml.core.Transform\nOne or more preprocessing steps.\n()"
},
{
- "objectID": "reference/steps-imputation.html#examples-2",
- "href": "reference/steps-imputation.html#examples-2",
- "title": "Imputation",
+ "objectID": "reference/steps-encoding.html",
+ "href": "reference/steps-encoding.html",
+ "title": "Encoding",
+ "section": "",
+ "text": "Encoding of categorical and string columns"
+ },
+ {
+ "objectID": "reference/steps-encoding.html#parameters",
+ "href": "reference/steps-encoding.html#parameters",
+ "title": "Encoding",
+ "section": "Parameters",
+ "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of columns to one-hot encode.\nrequired\n\n\nmin_frequency\nint | float | None\nA minimum frequency of elements in the training set required to treat a column as a distinct category. May be either: - an integer, representing a minimum number of samples required. - a float in [0, 1], representing a minimum fraction of samples required. Defaults to None for no minimum frequency.\nNone\n\n\nmax_categories\nint | None\nA maximum number of categories to include. If set, only the most frequent max_categories categories are kept.\nNone"
+ },
+ {
+ "objectID": "reference/steps-encoding.html#examples",
+ "href": "reference/steps-encoding.html#examples",
+ "title": "Encoding",
"section": "Examples",
- "text": "Examples\n>>> import ibisml as ml\nReplace NULL values in all numeric columns with their respective medians, computed from the training dataset.\n>>> step = ml.ImputeMedian(ml.numeric())"
+ "text": "Examples\n>>> import ibisml as ml\nOne-hot encode all string columns\n>>> step = ml.OneHotEncode(ml.string())\nOne-hot encode a specific column, only including categories with at least 20 samples.\n>>> step = ml.OneHotEncode(\"x\", min_frequency=20)\nOne-hot encode a specific column, including at most 10 categories.\n>>> step = ml.OneHotEncode(\"x\", max_categories=10)"
},
{
- "objectID": "reference/steps-imputation.html#parameters-3",
- "href": "reference/steps-imputation.html#parameters-3",
- "title": "Imputation",
+ "objectID": "reference/steps-encoding.html#parameters-1",
+ "href": "reference/steps-encoding.html#parameters-1",
+ "title": "Encoding",
"section": "Parameters",
- "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of columns to fillna.\nrequired\n\n\nfill_value\ntyping.Any\nThe fill value to use. Must be castable to the dtype of all columns in inputs.\nrequired"
+ "text": "Parameters\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ninputs\nibisml.select.SelectionType\nA selection of columns to categorical encode.\nrequired\n\n\nmin_frequency\nint | float | None\nA minimum frequency of elements in the training set required to treat a column as a distinct category. May be either: - an integer, representing a minimum number of samples required. - a float in [0, 1], representing a minimum fraction of samples required. Defaults to None for no minimum frequency.\nNone\n\n\nmax_categories\nint | None\nA maximum number of categories to include. If set, only the most frequent max_categories categories are kept.\nNone"
},
{
- "objectID": "reference/steps-imputation.html#examples-3",
- "href": "reference/steps-imputation.html#examples-3",
- "title": "Imputation",
+ "objectID": "reference/steps-encoding.html#examples-1",
+ "href": "reference/steps-encoding.html#examples-1",
+ "title": "Encoding",
"section": "Examples",
- "text": "Examples\n>>> import ibisml as ml\nFill all NULL values in numeric columns with 0.\n>>> step = ml.FillNA(ml.numeric(), 0)\nFill all NULL values in specific columns with 1.\n>>> step = ml.FillNA([\"x\", \"y\"], 1)"
+ "text": "Examples\n>>> import ibisml as ml\nCategorical encode all string columns\n>>> step = ml.CategoricalEncode(ml.string())\nCategorical encode a specific column, only including categories with at least 20 samples.\n>>> step = ml.CategoricalEncode(\"x\", min_frequency=20)\nCategorical encode a specific column, including at most 10 categories.\n>>> step = ml.CategoricalEncode(\"x\", max_categories=10)"
}
]
\ No newline at end of file