Implement a complete deep learning workflow capable of identifying plant diseases from raw leaf images and presenting the predictions through an interactive Streamlit-based web application. Since no dataset is currently available, the first step will involve searching for and selecting a suitable plant disease dataset from Kaggle. Candidate datasets may be compared on criteria such as dataset size, number of disease categories, labeling quality, and class balance. The final dataset should contain enough images per class to support reliable model training and evaluation.

After obtaining the dataset, the next phase will focus on data understanding and preparation in a Jupyter Notebook. This includes exploratory data analysis (EDA) to study the distribution of disease classes, visualize sample leaf images, and detect inconsistencies in the data. Preprocessing steps such as image resizing, normalization, and cleaning will be applied. Additionally, data augmentation techniques (e.g., rotations, flips, and brightness adjustments) will be used to increase variability and improve the model's ability to generalize.

The core of the system will be a Convolutional Neural Network (CNN) built with either TensorFlow/Keras or PyTorch. The model will be trained and tuned through experiments with different hyperparameters and configurations. Its performance will be evaluated using standard metrics (accuracy, precision, recall, and F1-score), along with a confusion matrix to examine the model's behavior on each individual disease class.

Once a satisfactory model is obtained, it will be saved in a deployable format such as HDF5 (.h5), PyTorch (.pt), or TensorFlow SavedModel. The trained model will then be integrated into a Streamlit application, where users can upload an image of a plant leaf and instantly receive the predicted disease type along with the model's confidence level.
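The evaluation step described above can be sketched as follows. This is a minimal illustration of how per-class precision, recall, and F1 are derived from a confusion matrix, assuming integer-encoded labels; the actual number of classes and their names will depend on the Kaggle dataset that is eventually selected.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Build a confusion matrix: rows = true class, columns = predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def per_class_metrics(cm):
    """Compute precision, recall, and F1 for each class from a confusion matrix."""
    tp = np.diag(cm).astype(float)                    # true positives per class
    precision = tp / np.maximum(cm.sum(axis=0), 1)    # guard against empty columns
    recall = tp / np.maximum(cm.sum(axis=1), 1)       # guard against empty rows
    f1 = np.where(precision + recall > 0,
                  2 * precision * recall / np.maximum(precision + recall, 1e-12),
                  0.0)
    return precision, recall, f1
```

In the notebook itself the same numbers could come from scikit-learn's `classification_report`; the hand-rolled version here just makes the arithmetic behind each reported metric explicit.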
To enhance the practical value of the system, additional features will be incorporated. The application will provide AI-driven recommendations, such as possible treatment methods or preventive actions for the detected disease. A database layer (Django with SQLite or PostgreSQL) will store user uploads, prediction results, and related metadata. Furthermore, REST API endpoints will be implemented so that other applications or services can interact with the model and obtain predictions programmatically.

Expected Deliverables

The final submission should include the following components:

- A Jupyter Notebook (.ipynb) containing the complete workflow: dataset loading, exploratory analysis, preprocessing, augmentation, model training, and evaluation.
- app.py, which implements the Streamlit web interface and can be executed using streamlit run app.py.
- A saved version of the trained model (HDF5 .h5, PyTorch .pt, or TensorFlow SavedModel format).
- A requirements.txt file listing all dependencies and their versions to ensure the project can be reproduced.
- A README file describing the installation process, instructions for running the notebook, and steps to launch the Streamlit application.

The project will be considered successful when the environment defined in requirements.txt allows the notebook to run from start to finish without modification and the Streamlit application can successfully classify a newly uploaded leaf image.
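The prediction payload shared by the Streamlit interface and the REST endpoints could be shaped as in the sketch below. The class names and recommendation strings are illustrative placeholders only (not real agronomic advice); in the finished system the labels come from the trained model and the recommendations from the AI-driven recommendation component.

```python
# Placeholder labels and advice for illustration; the real values depend on
# the chosen dataset and the recommendation component.
CLASS_NAMES = ["healthy", "early_blight", "late_blight"]
RECOMMENDATIONS = {
    "early_blight": "Remove affected leaves; consider a copper-based fungicide.",
    "late_blight": "Destroy infected plants; avoid overhead watering.",
}

def format_prediction(probabilities):
    """Turn a model's softmax output vector into the response payload that
    both the Streamlit app and the REST API endpoints would return."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    label = CLASS_NAMES[best]
    return {
        "disease": label,
        "confidence": round(float(probabilities[best]), 4),
        "recommendation": RECOMMENDATIONS.get(label, "No action needed."),
    }
```

Keeping this formatting logic in one shared function means the web interface and the programmatic API cannot drift apart in how they report a prediction.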