dirty_cat.datasets.fetch_employee_salaries
- dirty_cat.datasets.fetch_employee_salaries(load_dataframe=True, drop_linked=True, drop_irrelevant=True, directory=None)
Fetches the employee_salaries dataset (regression), available at https://openml.org/d/42125
- Description of the dataset:
Annual salary information, including gross pay and overtime pay, for all active, permanent employees of Montgomery County, MD, paid in calendar year 2016. This information will be published annually.
- Parameters:
- drop_linked: bool (default True)
Drops columns “2016_gross_pay_received” and “2016_overtime_pay”, which are closely linked to “current_annual_salary”, the target.
- drop_irrelevant: bool (default True)
Drops column “full_name”, which is usually irrelevant to the statistical analysis.
- Returns:
- DatasetAll
If load_dataframe=True
- DatasetInfoOnly
If load_dataframe=False
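The parameters and return types above can be sketched in a short usage example. This is a minimal sketch, not the library's own code: `dropped_columns` and `load_salaries` are hypothetical helpers introduced here for illustration, and the `X` and `y` attributes on the returned DatasetAll are an assumption about the installed dirty_cat version. The first call downloads the data from OpenML, so it needs network access.

```python
def dropped_columns(drop_linked=True, drop_irrelevant=True):
    """Hypothetical helper: list the columns the two flags remove,
    as stated in the parameter descriptions above."""
    columns = []
    if drop_linked:
        # Closely linked to the target, "current_annual_salary".
        columns += ["2016_gross_pay_received", "2016_overtime_pay"]
    if drop_irrelevant:
        # Usually irrelevant to the statistical analysis.
        columns.append("full_name")
    return columns


def load_salaries(drop_linked=True, drop_irrelevant=True):
    """Hypothetical wrapper: fetch the dataset and return (features, target)."""
    # Imported lazily so that dropped_columns() works without dirty_cat installed.
    from dirty_cat.datasets import fetch_employee_salaries

    # load_dataframe=True is documented to return a DatasetAll.
    dataset = fetch_employee_salaries(
        load_dataframe=True,
        drop_linked=drop_linked,
        drop_irrelevant=drop_irrelevant,
    )
    # Assumption: DatasetAll exposes the features as `X` and the target as `y`.
    return dataset.X, dataset.y
```

Calling `X, y = load_salaries()` should then give a feature table without the columns listed by `dropped_columns()`. Passing `drop_linked=False` keeps the two pay columns, which is rarely what a regression on "current_annual_salary" wants, since they are closely linked to the target.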
Examples using dirty_cat.datasets.fetch_employee_salaries

Dirty categories: machine learning with non normalized strings