Data hygiene is the process of cleaning up a data set. It removes duplicates, corrects misspellings and punctuation errors, enforces standard formatting rules, and fills in missing data, among other things. Good data hygiene matters for downstream analytics such as predictive modeling, because it helps ensure the model is built on the best-quality data available.
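As a rough illustration of the cleaning steps described above, here is a minimal Python sketch. The record fields (`name`, `email`, `city`), the `CITY_FIXES` lookup table, and the `"Unknown"` placeholder are all assumptions made up for this example; real data sets need rules chosen for their own fields.

```python
# Hypothetical lookup table of known misspellings (an assumption for this sketch).
CITY_FIXES = {"Londn": "London", "Pariss": "Paris"}


def clean_records(records, default_city="Unknown"):
    """Standardize formatting, fix known misspellings, fill gaps, drop duplicates."""
    cleaned = []
    seen = set()
    for rec in records:
        # Standard formatting: collapse extra whitespace, title-case names,
        # lowercase email addresses.
        name = " ".join((rec.get("name") or "").split()).title()
        email = (rec.get("email") or "").strip().lower()

        # Correct known misspellings, then fill missing values with a placeholder.
        city = (rec.get("city") or "").strip().title()
        city = CITY_FIXES.get(city, city) or default_city

        # Remove duplicates: two records match if name and email match
        # after normalization.
        key = (name, email)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "email": email, "city": city})
    return cleaned


raw = [
    {"name": "  ada  lovelace", "email": "ADA@Example.com ", "city": "londn"},
    {"name": "Ada Lovelace", "email": "ada@example.com", "city": None},
    {"name": "alan turing", "email": "alan@example.com", "city": ""},
]
result = clean_records(raw)
```

Normalizing before comparing is what lets the second record be recognized as a duplicate of the first, even though the raw strings differ.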
An ETL pipeline is a computing process in which data is extracted (E) from one or more sources, transformed (T), and loaded (L) into an output data container.
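The three stages can be sketched in Python using only the standard library. This is an illustrative example under stated assumptions, not a reference implementation: the CSV columns (`name`, `amount`), the `sales` table, and the function names are hypothetical, and an in-memory SQLite database stands in for the output data container.

```python
import csv
import io
import sqlite3


def extract(csv_text):
    """Extract: parse raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: tidy up names and convert amounts from text to floats."""
    return [(row["name"].strip().title(), float(row["amount"])) for row in rows]


def load(rows, conn):
    """Load: write the transformed rows into the output table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales (name, amount) VALUES (?, ?)", rows)
    conn.commit()


def run_pipeline(csv_text, conn):
    """Run the three stages in order: extract, then transform, then load."""
    load(transform(extract(csv_text)), conn)


# Usage: an in-memory SQLite database serves as the output container here.
conn = sqlite3.connect(":memory:")
run_pipeline("name,amount\n alice ,10.5\nBOB,3\n", conn)
stored = conn.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()
```

Keeping each stage in its own function means the source or the destination can be swapped without touching the transformation logic.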