Crime Data Analysis Using Jupyter

Authors

  • Dasari Prema Kumari, PG Scholar, Department of MCA, CDNR College, Bhimavaram, Andhra Pradesh.
  • V. Sarala, Assistant Professor, Master of Computer Applications, DNR College, Bhimavaram, Andhra Pradesh.

Abstract

This project analyzes and predicts crime trends across the states and union territories of India using machine learning techniques. The dataset comprises crime statistics categorized by state, district, and year. Data preprocessing includes handling missing values and removing duplicates to ensure data quality. Exploratory Data Analysis (EDA) is conducted through visualizations that highlight crime patterns, identify states with high and low crime rates, and show temporal trends in Indian Penal Code (IPC) crimes.

A Random Forest Regressor is trained to predict the total number of IPC crimes from state, district, and year as input features. Label encoding converts the categorical variables into a numeric format suitable for model training. The model's performance is evaluated using the R-squared metric, and predictions are visualized to compare actual and forecasted crime numbers.

Furthermore, a user interface component lets users enter a specific state, district, and year to receive a crime forecast along with a safety classification (e.g., "Safest City", "Medium Safe City", or "Not Safe City"). This application can serve as a decision-support tool for policymakers and law enforcement agencies to proactively address crime trends.
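The pipeline summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the data is synthetic, the column names (`STATE/UT`, `DISTRICT`, `YEAR`, `TOTAL IPC CRIMES`) are assumed, and the helpers `classify_safety` and `forecast`, together with the thresholds that separate the three safety labels, are hypothetical, since the abstract does not state them.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Synthetic stand-in for the crime table (assumed column names, made-up values).
rng = np.random.default_rng(0)
rows = []
for state, districts in {"STATE_A": ["D1", "D2"], "STATE_B": ["D3", "D4"]}.items():
    base = 500 if state == "STATE_A" else 3000
    for i, district in enumerate(districts):
        for year in range(2001, 2013):
            total = base + 200 * i + 20 * (year - 2001) + int(rng.integers(0, 50))
            rows.append({"STATE/UT": state, "DISTRICT": district,
                         "YEAR": year, "TOTAL IPC CRIMES": total})
df = pd.DataFrame(rows)

# Preprocessing: remove duplicate rows and rows with missing values.
df = df.drop_duplicates().dropna().reset_index(drop=True)

# Label encoding: map each categorical column to integer codes.
encoders = {}
for col in ["STATE/UT", "DISTRICT"]:
    encoders[col] = LabelEncoder()
    df[col + "_ENC"] = encoders[col].fit_transform(df[col])

X = df[["STATE/UT_ENC", "DISTRICT_ENC", "YEAR"]]
y = df["TOTAL IPC CRIMES"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Train the regressor and score it with R-squared on the held-out rows.
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
r2 = r2_score(y_test, model.predict(X_test))


def classify_safety(predicted, low=1000.0, high=2500.0):
    """Map a predicted crime count to a label (thresholds are hypothetical)."""
    if predicted < low:
        return "Safest City"
    if predicted < high:
        return "Medium Safe City"
    return "Not Safe City"


def forecast(state, district, year):
    """Encode one (state, district, year) query and return (prediction, label)."""
    row = pd.DataFrame([{
        "STATE/UT_ENC": encoders["STATE/UT"].transform([state])[0],
        "DISTRICT_ENC": encoders["DISTRICT"].transform([district])[0],
        "YEAR": year,
    }])
    predicted = float(model.predict(row)[0])
    return predicted, classify_safety(predicted)


print(f"R-squared on held-out rows: {r2:.3f}")
print(forecast("STATE_A", "D1", 2010))
```

Keeping the fitted `LabelEncoder` objects alongside the model matters here: a query must be encoded with the same mapping used during training, and a state or district that was never seen will raise an error in `transform`. If the model is persisted, for instance with `joblib.dump`, the encoders would need to be saved with it.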


References

1. National Crime Records Bureau (NCRB), India. Crime in India Reports.
   Available at: https://ncrb.gov.in/en/crime-india
   (Used for crime data collection and analysis framework)

2. Scikit-learn Documentation. Scikit-learn: Machine Learning in Python.
   Available at: https://scikit-learn.org/stable/
   (Used for Random Forest Regressor, LabelEncoder, model evaluation, and data preprocessing)

3. Pandas Documentation. Pandas: Python Data Analysis Library.
   Available at: https://pandas.pydata.org/
   (Used for data manipulation and analysis)

4. NumPy Documentation. NumPy: The fundamental package for scientific computing with Python.
   Available at: https://numpy.org/doc/
   (Used for numerical operations and data handling)

5. Matplotlib & Seaborn.
   Hunter, J.D. (2007). Matplotlib: A 2D graphics environment. Computing in Science & Engineering.
   Waskom, M.L. (2021). Seaborn: statistical data visualization. Journal of Open Source Software.
   (Used for data visualization and exploratory data analysis)

6. Joblib Library. Joblib: Tools for lightweight pipelining in Python.
   Available at: https://joblib.readthedocs.io/
   (Used for saving and loading the trained machine learning model)

7. Tkinter GUI Documentation. Tkinter: Python's standard GUI package.
   Available at: https://docs.python.org/3/library/tkinter.html
   (Used for basic GUI elements in the CLI input system)

8. Kaggle Crime Datasets (if applicable). Example: Crime in India (NCRB), public dataset on Kaggle.
   Available at: https://www.kaggle.com/
   (Alternative or supplemental dataset used for training or validation)

9. Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.
   (Reference for machine learning principles and model evaluation)

10. James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning. Springer.
    (Used to understand regression models and evaluation techniques)


Published

2025-05-01


Section

Articles

How to Cite

Crime data Analysis using juypter. (2025). International Journal of Multidisciplinary Engineering In Current Research, 10(5), 221-224. https://ijmec.com/index.php/multidisciplinary/article/view/644
