Effective data cleaning is a crucial prerequisite for ensuring the accuracy, validity, completeness, and consistency of your data. Flawed or inconsistent data can lead to erroneous conclusions and compromise the integrity of your analyses. The quality of your results directly hinges on how well you cleanse your data.

This comprehensive two-day course is thoughtfully crafted to provide participants with the necessary skills and knowledge to proficiently utilize Microsoft Excel and Power BI as powerful tools for data cleaning. Through engaging hands-on activities with real-world datasets, attendees will be guided to:

  1. Identify and rectify data inconsistencies efficiently.
  2. Master the use of basic functions and formulas in Microsoft Excel to streamline the data cleaning process.
  3. Leverage the visualization capabilities of Microsoft Power BI to detect and address outliers.

By the end of this course, participants will have gained the expertise needed to prepare their datasets for robust analysis, thus enhancing the reliability and accuracy of their data-driven decision-making processes. Join us to acquire these invaluable data cleaning skills and elevate your data analysis proficiency to new heights.



Course ​​​​​​​Prerequisites: 

Required Software (must be installed prior to the first day of class).

Microsoft Excel

A general understanding of basic mathematics and associated principles.

Students will need to have access to a Windows laptop (personal or employer issued/assigned). The host agency is unable to provide students with a laptop for use within the course.

 

It is recommended that attendees be familiar with the use of Microsoft Excel and associated functionality of a Windows computer.

 


Duration: 16 Hours (2 days)

Method: In Person

Restrictions: Restricted to Criminal Justice Employees only who are in a Full Duty status (or Light Duty due to medical). 

Required Disclosures: None

Fee: Course attendance is free.


Course Objectives

After successfully completing the course, attendees will be able to know, discuss, use, and execute the following fundamentals:

  • Data cleaning fundamentals
  • Understand basic Excel functionality
  • Using filters
  • Creating a Unique Identifier
  • Utilize Text-to-Columns
  • Understand the basics of a relational database
  • Remove duplicates from a large data set
  • Use pivot tables for basic analysis
  • Utilize conditional formatting for data validation
  • Utilize basic and advanced functions in Excel
    • =VLOOKUP()
    • =CONCAT()
    • =IF()
    • =ISBLANK()
    • =TRIM()
    • =INT()
    • =YEARFRAC()

Course Outline

Module 1: Introduction to our data and cleaning the Incidents

  • Introduction
  • Class Structure
  • GIGO
  • Data cleaning best practices
    • Know your goal
    • Standardizing your data
    • Validating your data
    • Removing duplicate records
    • Know when too much data should be separated
  • Type 1 and Type 2 errors
  • Introduction to the data source
    • 2022 Maryland Homicide by Firearm Data
  • Define our goal
  • Expanding columns
  • Inserting a new column
  • Creating a Unique Identifier
  • Using the Data->Filter button
  • Formatting Date and Time
  • =VLOOKUP()
  • Text-to-columns
  • Finding outliers
  • Finding Lat and Long using various tools
  • Using =CONCAT()

Module 2: Cleaning the Victim and Suspect Data

  • Creating a relational database from our dataset
  • Removing Duplicates
  • Finding Age from DOB
  • Using =IF() 
  • When to leave cells blank, vs changing to unknown
  • =ISBLANK()
  • Using Pivot Tables to find inconsistencies in data
  • Using =TRIM()
  • =VLOOKUP() again
  • Conditional Formatting for Data Validation
  • Data Management ethics