Machine Learning & Neural Networks Blog

Crime Prediction and Prevention

This project analyzes and visualizes crime data in Bucharest using geospatial mapping and machine learning. It labels each crime by severity and uses the K-Nearest Neighbors (KNN) algorithm to predict severity from location. The approach combines data processing, machine learning, and interactive visualization to give insight into crime patterns.

The resulting map is intended as a tool for law enforcement agencies, city planners, and researchers. It lets them visualize crime data, understand spatial distributions, and see where the model predicts more severe crimes, which can support informed decision-making and proactive crime prevention.

Importing Libraries
The essential libraries for data manipulation, visualization, and machine learning are imported: 'pandas' for handling data frames, 'folium' for creating interactive maps, 'os' for directory operations, 'numpy' for numerical operations, and 'sklearn' for machine learning functionality.

        
 import pandas as pd
 import folium
 import os
 import numpy as np
 from sklearn.neighbors import KNeighborsClassifier
 from sklearn.model_selection import train_test_split
                            

Loading Data
Crime data is read from a CSV file into a pandas DataFrame. This dataset includes various attributes such as crime type, location coordinates (latitude and longitude), and descriptions.

        
 data = pd.read_csv("file_location")
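Before going further, it can help to confirm that the file contains the columns the later steps rely on. The sketch below is only an illustration: the inline CSV text and the `load_crime_data` helper are hypothetical stand-ins for the real file, and the column names are the ones used later in this post.

```python
import io

import pandas as pd

# Columns the rest of this walkthrough reads from the DataFrame.
REQUIRED_COLUMNS = ["CrimeType", "Latitude", "Longitude", "Description"]


def load_crime_data(source):
    """Read crime data and keep only rows with usable coordinates (hypothetical helper)."""
    data = pd.read_csv(source)
    missing = [col for col in REQUIRED_COLUMNS if col not in data.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    # Coerce coordinates to numbers; values that cannot be parsed become NaN.
    for col in ("Latitude", "Longitude"):
        data[col] = pd.to_numeric(data[col], errors="coerce")
    # Drop rows that cannot be placed on a map.
    return data.dropna(subset=["Latitude", "Longitude"]).reset_index(drop=True)


# Made-up rows standing in for the real CSV file.
sample_csv = io.StringIO(
    "CrimeType,Latitude,Longitude,Description\n"
    "Assault,44.43,26.10,Example A\n"
    "Theft,not_a_number,26.12,Example B\n"
    "Robbery,44.45,26.09,Example C\n"
)
clean = load_crime_data(sample_csv)
print(len(clean))
```

With a real file, the same helper would be called with the path passed to `pd.read_csv` in the step above.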
                            

Classifying Crimes
Crimes are classified into two categories based on severity: 'Most Dangerous' covers Assault, Robbery, and Burglary, and 'Less Dangerous' covers all other crimes.

        
 most_dangerous_crimes = ['Assault', 'Robbery', 'Burglary']  
 data['Severity'] = data['CrimeType'].apply(lambda x: 'Most Dangerous' if x in most_dangerous_crimes else 'Less Dangerous')
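The same labelling can also be written without a row-by-row lambda. The sketch below is one possible alternative, not the code used in this project: it uses pandas `isin` together with `numpy.where`, and the four-row DataFrame is made up purely for illustration.

```python
import numpy as np
import pandas as pd

most_dangerous_crimes = ["Assault", "Robbery", "Burglary"]

# Made-up rows; the real data comes from the CSV file.
data = pd.DataFrame({"CrimeType": ["Assault", "Theft", "Burglary", "Vandalism"]})

# Boolean mask: True where the crime type is in the 'Most Dangerous' list.
is_dangerous = data["CrimeType"].isin(most_dangerous_crimes)

# Pick one of the two labels for every row in a single vectorized step.
data["Severity"] = np.where(is_dangerous, "Most Dangerous", "Less Dangerous")
print(data)
```

On large datasets the vectorized form is usually faster than `apply`, while producing the same labels.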
                            

Feature Extraction and Splitting
The features for the machine learning model are the latitude and longitude coordinates, and the target variable is the crime severity. The data is split 80-20 into training and testing sets, so the model is trained on a substantial portion of the data. Note that the code here assigns the testing portion to '_' and discards it, so only the training set is used in this walkthrough.

        
 X = data[['Latitude', 'Longitude']]
 y = data['Severity']
 X_train, _, y_train, _ = train_test_split(X, y, test_size=0.2, random_state=42)  
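If the held-out 20% is to be used later, the split can keep it instead of discarding it. The following is a minimal sketch under stated assumptions: the coordinates are randomly generated around the Bucharest center used in this post, the 30/70 class balance is invented, and `stratify=y` is an optional addition that keeps the class proportions similar in both parts.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the crime dataset (not real data).
rng = np.random.default_rng(42)
n = 100
data = pd.DataFrame({
    "Latitude": 44.4268 + rng.normal(0, 0.03, n),
    "Longitude": 26.1025 + rng.normal(0, 0.03, n),
    "Severity": ["Most Dangerous"] * 30 + ["Less Dangerous"] * 70,
})

X = data[["Latitude", "Longitude"]]
y = data["Severity"]

# Keep the test portion and preserve the class balance in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(len(X_train), len(X_test))
print(y_test.value_counts().to_dict())
```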
                            

Training the KNN Model
A K-Nearest Neighbors (KNN) classifier is initialized with 5 neighbors. The model is trained on the training data (80% of the dataset).

        
 knn = KNeighborsClassifier(n_neighbors=5)
 knn.fit(X_train, y_train)
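A trained classifier is usually checked against data it has not seen. The sketch below shows one way this could look, assuming the test portion of the split is kept. The two point clusters are synthetic and deliberately well separated, so the score it prints says nothing about how the model behaves on the real Bucharest data.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Two synthetic clusters of (latitude, longitude) points, one per severity class.
rng = np.random.default_rng(0)
dangerous = rng.normal([44.45, 26.08], 0.01, size=(60, 2))
less_dangerous = rng.normal([44.40, 26.13], 0.01, size=(60, 2))
X = np.vstack([dangerous, less_dangerous])
y = np.array(["Most Dangerous"] * 60 + ["Less Dangerous"] * 60)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Same model settings as in this post: 5 neighbors.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

# Share of held-out points whose predicted label matches the true label.
accuracy = accuracy_score(y_test, knn.predict(X_test))
print(f"Held-out accuracy: {accuracy:.2f}")
```

Repeating this check for several values of `n_neighbors` is a common way to choose the number of neighbors.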
                            

Creating the Base Map
A base map centered on Bucharest is created using Folium. The coordinates for Bucharest are [44.4268, 26.1025].

        
 bucharest_coordinates = [44.4268, 26.1025]
 base_map = folium.Map(location=bucharest_coordinates, zoom_start=12)
                            

Adding Crime Points
Each crime's location is plotted on the map. Red markers indicate 'Most Dangerous' crimes and orange markers indicate 'Less Dangerous' crimes. The markers are added as CircleMarkers with popups containing the crime descriptions, which allows interactive exploration of the data.

        
 for idx, row in data.iterrows():
    color = 'red' if row['Severity'] == 'Most Dangerous' else 'orange'
    folium.CircleMarker(
        location=(row['Latitude'], row['Longitude']),
        radius=5,
        color=color,
        fill=True,
        fill_color=color,
        fill_opacity=0.6,
        popup=row['Description']
    ).add_to(base_map)
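The marker settings are repeated for every point, so they could be collected in one small function. The helper below, `marker_style`, is hypothetical and not part of the original project. It returns the keyword arguments used for the circle markers in this post, including the blue and dark blue colors that the prediction step uses.

```python
def marker_style(severity, predicted=False):
    """Return circle-marker keyword arguments for a severity label (hypothetical helper)."""
    if predicted:
        # Colors used for model predictions in this post.
        color = "blue" if severity == "Most Dangerous" else "darkblue"
    else:
        # Colors used for recorded crimes in this post.
        color = "red" if severity == "Most Dangerous" else "orange"
    return {
        "radius": 5,
        "color": color,
        "fill": True,
        "fill_color": color,
        "fill_opacity": 0.6,
    }


style = marker_style("Most Dangerous")
print(style["color"])

# Possible usage with the map built in this post (requires folium and base_map):
# folium.CircleMarker(location=(lat, lon), popup=text, **marker_style(severity)).add_to(base_map)
```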
                            

Making Predictions and Adding to the Map
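A random sample of 10 rows from the dataset is passed to the trained KNN model, and the predicted severities are added to the map. Because the sample is drawn from the full dataset, most of these points were likely also part of the training data, so the markers illustrate the workflow rather than test the model. Blue markers indicate predicted 'Most Dangerous' crimes and dark blue markers indicate predicted 'Less Dangerous' crimes.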

        
 prediction_data = data.sample(n=10)  
        
 prediction_data['PredictedSeverity'] = knn.predict(prediction_data[['Latitude', 'Longitude']])
        
 for idx, row in prediction_data.iterrows():
    color = 'blue' if row['PredictedSeverity'] == 'Most Dangerous' else 'darkblue'
    popup_message = 'Predicted Most Dangerous Crime' if color == 'blue' else 'Predicted Less Dangerous Crime'
    folium.CircleMarker(
        location=(row['Latitude'], row['Longitude']),
        radius=5,
        color=color,
        fill=True,
        fill_color=color,
        fill_opacity=0.6,
        popup=popup_message
    ).add_to(base_map)
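Predictions do not have to be limited to locations already in the dataset. As a sketch of one alternative (not the method used in this post), the model can be queried on a regular grid of coordinates. The training points and the grid bounds below are assumptions chosen only to make the example run.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Synthetic training points standing in for the real crime locations.
rng = np.random.default_rng(1)
X_train = np.vstack([
    rng.normal([44.45, 26.08], 0.01, size=(50, 2)),
    rng.normal([44.40, 26.13], 0.01, size=(50, 2)),
])
y_train = np.array(["Most Dangerous"] * 50 + ["Less Dangerous"] * 50)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

# Assumed bounding box around central Bucharest, sampled as a 10 x 10 grid.
lats = np.linspace(44.38, 44.47, 10)
lons = np.linspace(26.05, 26.15, 10)
lat_grid, lon_grid = np.meshgrid(lats, lons)
grid_points = np.column_stack([lat_grid.ravel(), lon_grid.ravel()])

# Predicted label for every grid cell.
grid_predictions = knn.predict(grid_points)

# Fraction of the 5 nearest neighbors labelled 'Most Dangerous' at each cell.
dangerous_index = list(knn.classes_).index("Most Dangerous")
most_dangerous_share = knn.predict_proba(grid_points)[:, dangerous_index]
print(grid_predictions[:5], most_dangerous_share[:5])
```

Each grid point could then be drawn with the same CircleMarker calls as above, colored by its predicted label.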
                            



Saving the Map
The output directory is specified, and if it does not exist, it is created using 'os.makedirs'. The final interactive map, which now includes both actual and predicted crime locations, is saved as an HTML file.

        
 output_dir = 'folder_location'
 os.makedirs(output_dir, exist_ok=True)
        
 output_file = os.path.join(output_dir, 'crime_hotspots_with_predictions.html')
        
 base_map.save(output_file)
        
 print(f"Map saved to {output_file}")
                            

Below is the full code with additional comments embedded.

        
 # Import necessary libraries
 import pandas as pd
 import folium
 import os
 import numpy as np
 from sklearn.neighbors import KNeighborsClassifier
 from sklearn.model_selection import train_test_split
        
 # Step 1: Read the data from CSV
 # Load the dataset containing crime data. Make sure to specify the correct file path.
 data = pd.read_csv("file_location")
        
 # Step 2: Classify the crimes
 # Define the criteria for classifying crimes as 'Most Dangerous' or 'Less Dangerous'.
 # Here, we assume that 'Assault', 'Robbery', and 'Burglary' are considered the most dangerous crimes.
 most_dangerous_crimes = ['Assault', 'Robbery', 'Burglary']  # Define your criteria
 data['Severity'] = data['CrimeType'].apply(lambda x: 'Most Dangerous' if x in most_dangerous_crimes else 'Less Dangerous')
        
 # Step 3: Train KNN model
 # Extract features (latitude and longitude) and target variable (severity).
 X = data[['Latitude', 'Longitude']]
 y = data['Severity']
        
 # Split the data into training and testing sets. We use 80% of the data for training.
 X_train, _, y_train, _ = train_test_split(X, y, test_size=0.2, random_state=42)  # Splitting data, you can adjust test_size
        
 # Initialize and train the K-Nearest Neighbors (KNN) model with 5 neighbors.
 knn = KNeighborsClassifier(n_neighbors=5)
 knn.fit(X_train, y_train)
        
 # Step 4: Create a base map centered on Bucharest
 # Define the coordinates for Bucharest and create a base map with a zoom level of 12.
 bucharest_coordinates = [44.4268, 26.1025]
 base_map = folium.Map(location=bucharest_coordinates, zoom_start=12)
        
 # Step 5: Add crime points to the map
 # Iterate through the dataset and add each crime's location to the map.
 # Use red color for 'Most Dangerous' crimes and orange for 'Less Dangerous' crimes.
 for idx, row in data.iterrows():
    color = 'red' if row['Severity'] == 'Most Dangerous' else 'orange'
    folium.CircleMarker(
        location=(row['Latitude'], row['Longitude']),
        radius=5,
        color=color,
        fill=True,
        fill_color=color,
        fill_opacity=0.6,
        popup=row['Description']
    ).add_to(base_map)
        
 # Step 6: Make predictions using KNN algorithm and add blue dots for predictions to the map
 # Generate a sample of the dataset for making predictions.
 prediction_data = data.sample(n=10)
        
 # Predict the severity of crimes using the trained KNN model.
 prediction_data['PredictedSeverity'] = knn.predict(prediction_data[['Latitude', 'Longitude']])
        
 # Add the predicted crime locations to the map. Use blue for 'Most Dangerous' and dark blue for 'Less Dangerous' predictions.
 for idx, row in prediction_data.iterrows():
    color = 'blue' if row['PredictedSeverity'] == 'Most Dangerous' else 'darkblue'
    popup_message = 'Predicted Most Dangerous Crime' if color == 'blue' else 'Predicted Less Dangerous Crime'
    folium.CircleMarker(
        location=(row['Latitude'], row['Longitude']),
        radius=5,
        color=color,
        fill=True,
        fill_color=color,
        fill_opacity=0.6,
        popup=popup_message
    ).add_to(base_map)
        
 # Step 7: Save the map to a different directory
 # Define the output directory and ensure it exists.
 output_dir = 'folder_location'
 os.makedirs(output_dir, exist_ok=True)
        
 # Define the full path to the output file
 output_file = os.path.join(output_dir, 'crime_hotspots_with_predictions.html')
        
 # Save the map to the specified location
 base_map.save(output_file)
        
 print(f"Map saved to {output_file}")
                            



Get the Jupyter Notebook and the dataset used in this project.
