Define data cleansing. What best practices do you follow during data cleansing?

devquora

Posted On: Feb 22, 2018

 

Data cleansing (also called data cleaning) is a crucial step in the analysis process in which data is inspected to find anomalies, remove repeated records, and correct or eliminate incorrect information. Its purpose is not to discard valid information from the database; it focuses on improving the quality of the data so that it can be used for further analysis.
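As a rough illustration of the inspection step, the sketch below counts common quality problems in a small data set without changing it. The sample records, the field names (`id`, `email`, `age`), the 0 to 120 age range, and the `profile` helper are all assumptions made up for this example, not part of any particular tool.

```python
from collections import Counter

# Hypothetical sample records; field names and values are illustrative only.
records = [
    {"id": 1, "email": "ana@example.com", "age": 34},
    {"id": 2, "email": "bob@example.com", "age": -5},   # anomaly: negative age
    {"id": 3, "email": "ana@example.com", "age": 34},   # repeats the email of id 1
    {"id": 4, "email": "not-an-email", "age": 41},      # incorrect: malformed email
    {"id": 5, "email": None, "age": 29},                # missing value
]

def profile(rows):
    """Count quality issues per category; the input is only read, never modified."""
    email_counts = Counter(r["email"] for r in rows if r["email"])
    return {
        "missing_email": sum(1 for r in rows if not r["email"]),
        "malformed_email": sum(
            1 for r in rows if r["email"] and "@" not in r["email"]
        ),
        # Assumed plausible range for this example: 0 to 120 inclusive.
        "out_of_range_age": sum(1 for r in rows if not 0 <= r["age"] <= 120),
        # Each email seen n times contributes n - 1 repeats.
        "repeated_email": sum(n - 1 for n in email_counts.values() if n > 1),
    }

report = profile(records)
print(report)
```

A report like this shows where the errors are concentrated before any record is changed.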

Some of the best practices for data cleansing include:

  • Develop a data quality plan: identify where most data quality errors occur, assess their root causes, and design the plan accordingly.
  • Follow a standard process for verifying critical data before the database is created.
  • Identify duplicates and validate the accuracy of the data up front to save time during analysis.
  • Track every cleaning operation performed on the data so that any operation can be repeated or undone as necessary.
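The duplicate-detection, validation, and tracking practices above can be sketched together in a few lines. This is a minimal example under stated assumptions: the `clean` function, the `email` key used to detect duplicates, the very loose "contains @" validity check, and the log entry names are all hypothetical choices for illustration.

```python
def clean(rows):
    """Return (cleaned_rows, log); the input list is left untouched."""
    log = []        # one (operation, record id) entry per change, for auditing
    seen = set()    # normalized emails already accepted
    cleaned = []
    for row in rows:
        email = (row.get("email") or "").strip().lower()
        if "@" not in email:
            # Validation: assumed rule, a usable email must contain "@".
            log.append(("rejected_invalid_email", row["id"]))
            continue
        if email in seen:
            # Duplicate detection on the normalized email.
            log.append(("dropped_duplicate", row["id"]))
            continue
        seen.add(email)
        if email != row["email"]:
            log.append(("normalized_email", row["id"]))
        cleaned.append({**row, "email": email})
    return cleaned, log

# Hypothetical raw input.
raw = [
    {"id": 1, "email": "Ana@Example.com "},
    {"id": 2, "email": "ana@example.com"},
    {"id": 3, "email": "bob-at-example.com"},
    {"id": 4, "email": "cy@example.com"},
]

cleaned, log = clean(raw)
for operation, record_id in log:
    print(operation, record_id)
```

Because `clean` returns a new list and a log rather than editing `raw` in place, the original data stays available and each operation can be reviewed, replayed, or reversed later.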
