Machine Learning & Neural Networks Blog

Disease Prediction from Medical Data

In this project, we'll walk you through building a neural network model to predict the appropriate medication for patients based on their symptoms. We'll use a dataset containing patient symptoms and prescribed medications, preprocess the data, train a neural network, and make predictions based on new input symptoms.


Importing Necessary Libraries
First, we import the required libraries: pandas for data manipulation, TensorFlow for building and training the neural network, and scikit-learn for splitting the dataset and encoding the target variable.


 import pandas as pd
 import tensorflow as tf
 from sklearn.model_selection import train_test_split
 from sklearn.preprocessing import LabelEncoder
                        

Loading the Data
Next, we'll load the data from a CSV file.


 data = pd.read_csv('disease.csv')
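
Before going further, it can help to confirm that the file has the columns the rest of the code relies on. The sketch below is only an illustration: it reads a tiny in-memory CSV whose layout and values (such as `med_a`) are assumptions, not the contents of the real `disease.csv`, and the names `data_check` and `expected` are hypothetical.

```python
import io

import pandas as pd

# Hypothetical stand-in for disease.csv; the real file's contents are assumed, not known.
csv_text = """ear_pain,hearing_loss,medication_prescribed
True,False,med_a
False,True,med_b
"""

data_check = pd.read_csv(io.StringIO(csv_text))

# Columns the later steps would need (shortened here for the example).
expected = ["ear_pain", "hearing_loss", "medication_prescribed"]
missing = [col for col in expected if col not in data_check.columns]

# Count empty cells, since they would break the later conversion to integers.
n_missing_values = int(data_check.isna().sum().sum())

print("Missing columns:", missing)
print("Empty cells:", n_missing_values)
```

With the real file, the same two checks can be run directly on `data`.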
                        

Data Preprocessing
We need to convert the boolean columns to numeric format (0 or 1) for our neural network to process them correctly.


 symptoms = [
    'ear_pain', 'hearing_loss', 'ear_pressure', 'ear_discharge',
    'tympanic_membrane_perforation', 'middle_ear_effusion', 'tinnitus',
 ]
 data[symptoms] = data[symptoms].astype(int)
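
To see what the conversion does, here is a minimal sketch on a made-up two-row table. The column subset and the values are assumptions chosen for illustration only.

```python
import pandas as pd

# Hypothetical sample; the real dataset is assumed to store symptoms as booleans.
sample = pd.DataFrame({
    "ear_pain": [True, False],
    "tinnitus": [False, True],
    "medication_prescribed": ["med_a", "med_b"],
})
symptom_cols = ["ear_pain", "tinnitus"]

# astype(int) maps True to 1 and False to 0; other columns are left untouched.
sample[symptom_cols] = sample[symptom_cols].astype(int)

print(sample)
```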
                        

Defining Features and Target
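We separate the features (X) from the target variable (y). The target is the prescribed medication: LabelEncoder maps each medication name to an integer, and to_categorical turns those integers into one-hot vectors, the format expected by the categorical cross-entropy loss used later.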


 X = data[symptoms].values
        
 label_encoder = LabelEncoder()
 y_encoded = label_encoder.fit_transform(data['medication_prescribed'])
 y = tf.keras.utils.to_categorical(y_encoded)
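
The two encoding steps can be illustrated on a short list of labels. This is a sketch under assumptions: the medication names are invented, and `np.eye` is used here as a NumPy stand-in that produces the same one-hot layout as `to_categorical`, so the example runs without TensorFlow.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical medication names, not taken from the dataset.
labels = ["med_b", "med_a", "med_c", "med_a"]

encoder = LabelEncoder()
# Classes are sorted alphabetically, so med_a -> 0, med_b -> 1, med_c -> 2.
encoded = encoder.fit_transform(labels)

# One row per sample, one column per class, with a single 1 in each row.
one_hot = np.eye(len(encoder.classes_))[encoded]

# inverse_transform recovers the original names from the integer codes.
decoded = encoder.inverse_transform(encoded)

print(encoded)
print(one_hot)
```

The same `inverse_transform` call is what later turns a predicted class index back into a medication name.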
                        

Splitting the Data
To evaluate our model's performance, we split the data into training and testing sets.


 X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
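
If some medications are much rarer than others, a plain random split can leave a class underrepresented in the test set. One possible variation, not used in this project, is a stratified split via the `stratify` argument of `train_test_split`. The sketch below uses synthetic arrays (`X_demo`, `y_demo` are hypothetical); in the project's code the integer labels `y_encoded` would be the natural candidate for `stratify`.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 50 samples, 2 features, three imbalanced classes.
X_demo = np.arange(100).reshape(50, 2)
y_demo = np.array([0] * 25 + [1] * 15 + [2] * 10)

# stratify keeps the class proportions the same in both subsets.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

print("Train size:", len(X_tr), "Test size:", len(X_te))
print("Test class counts:", np.bincount(y_te))
```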
                        

Building the Neural Network Model
We define the neural network architecture with two hidden layers of 64 and 32 units, both using ReLU activations. The output layer has one unit per medication class and uses a softmax activation function for multi-class classification.


 model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(len(symptoms),)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(y.shape[1], activation='softmax')  # one output unit per medication class
 ])
        
 model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
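
To make the roles of softmax and categorical cross-entropy concrete, here is a small NumPy sketch. The raw scores (`logits`) and the true class are made-up values, and the helper `softmax` is written by hand for illustration rather than taken from Keras.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtracting the max avoids overflow in exp
    exps = np.exp(shifted)
    return exps / exps.sum()

# Hypothetical raw outputs of the last layer for three classes.
logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)

# Categorical cross-entropy for one sample: minus the log of the
# probability assigned to the true class (assumed to be class 0 here).
true_class = 0
loss = -np.log(probs[true_class])

print(probs)
print(loss)
```

The loss shrinks toward zero as the probability of the true class approaches 1, which is what training pushes the network toward.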
                        

Training the Model
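We train the model on the training data for 50 epochs with a batch size of 32. The test set is passed as validation data, so Keras reports loss and accuracy on it after every epoch.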


 model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_test, y_test))
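
The epoch and batch settings translate directly into a number of weight updates. The arithmetic below is a sketch that assumes a dataset of 1000 rows; the real size of the dataset is not stated here, so `n_samples` is purely hypothetical.

```python
import math

n_samples = 1000   # hypothetical dataset size
test_size = 0.2
batch_size = 32
epochs = 50

# train_test_split rounds the test set size up and gives the rest to training.
n_test = math.ceil(n_samples * test_size)
n_train = n_samples - n_test

# Each epoch processes the training set in batches; the last batch may be smaller.
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(n_train, steps_per_epoch, total_updates)
```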
                        

Evaluating the Model
After training, we evaluate the model's performance on the test data.


 loss, accuracy = model.evaluate(X_test, y_test)
 print(f'Accuracy on test data: {accuracy}')
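
A single accuracy number hides which medications get confused with each other. As a possible complement, the sketch below computes accuracy and a confusion matrix with scikit-learn. The arrays are invented for illustration; in the project, `y_test` and the output of `model.predict(X_test)` would take their place.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical one-hot labels and predicted probabilities for four samples.
y_true_onehot = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
y_prob = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.6, 0.1],
])

# Convert both to class indices by taking the position of the largest value.
y_true = y_true_onehot.argmax(axis=1)
y_pred = y_prob.argmax(axis=1)

acc = accuracy_score(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred)  # rows: true class, columns: predicted class

print(acc)
print(cm)
```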
                        

Predicting Medication
We define a function to predict the medication based on the input symptoms. The function takes a dictionary of symptoms as input and returns the predicted medication.


 def predict_medication(symptoms_input):
    input_data = []
    for symptom in symptoms:
        input_data.append(symptoms_input[symptom])
    # Wrap the row in a batch dimension: shape (1, number of symptoms)
    input_data = tf.constant([input_data], dtype=tf.float32)
    prediction = model.predict(input_data)
    predicted_class = tf.argmax(prediction[0]).numpy()
    predicted_medication = label_encoder.inverse_transform([predicted_class])[0]
    return predicted_medication
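
The function above raises a bare KeyError if the dictionary lacks a symptom. One way to make that failure easier to diagnose is a small validation helper. `build_feature_row` is a hypothetical name introduced for this sketch, not part of the project's code.

```python
def build_feature_row(symptoms_input, symptom_names):
    """Return a 2D list [[...]] of 0/1 values in the order of symptom_names."""
    missing = [name for name in symptom_names if name not in symptoms_input]
    if missing:
        raise KeyError(f"Missing symptoms: {missing}")
    # int(bool(...)) accepts True/False as well as 1/0.
    return [[int(bool(symptoms_input[name])) for name in symptom_names]]

names = ["ear_pain", "hearing_loss", "tinnitus"]
row = build_feature_row(
    {"ear_pain": 0, "hearing_loss": True, "tinnitus": 1}, names
)
print(row)
```

The feature order always follows `symptom_names`, so the dictionary keys can be given in any order.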
                        

Example Usage
Finally, we test the prediction function with an example input and print the predicted medication.


 user_input = {
    'ear_pain': 0,
    'hearing_loss': 1,
    'ear_pressure': 0,
    'ear_discharge': 1,
    'tympanic_membrane_perforation': 0,
    'middle_ear_effusion': 1,
    'tinnitus': 0
 }
                            
 predicted_medication = predict_medication(user_input)
 print(f'Predicted medication: {predicted_medication}')
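
The example prints only the most likely medication. To see how confident the model is, the full probability vector can be ranked. The helper `rank_classes` and the values below are assumptions for illustration; in the project, `prediction[0]` and `label_encoder.classes_` would supply the probabilities and the class names.

```python
import numpy as np

def rank_classes(probabilities, class_names):
    """Return (class_name, probability) pairs, most likely first."""
    order = np.argsort(probabilities)[::-1]
    return [(class_names[i], float(probabilities[i])) for i in order]

# Hypothetical softmax output and class names.
probs = np.array([0.2, 0.7, 0.1])
class_names = ["med_a", "med_b", "med_c"]

ranking = rank_classes(probs, class_names)
for name, p in ranking:
    print(f"{name}: {p:.2f}")
```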
                        

Below is the full code with additional comments embedded.


 import pandas as pd
 import tensorflow as tf
 from sklearn.model_selection import train_test_split
 from sklearn.preprocessing import LabelEncoder
                            
 # Load data from CSV file
 data = pd.read_csv('disease.csv')
                            
 # Convert boolean columns to numeric (0 or 1)
 symptoms = [
    'ear_pain', 'hearing_loss', 'ear_pressure', 'ear_discharge',
    'tympanic_membrane_perforation', 'middle_ear_effusion', 'tinnitus',
 ]
 data[symptoms] = data[symptoms].astype(int)
                            
 # Define features and target
 X = data[symptoms].values
 # Encode the target variable
 label_encoder = LabelEncoder()
 y_encoded = label_encoder.fit_transform(data['medication_prescribed'])
 y = tf.keras.utils.to_categorical(y_encoded)
                            
 # Split data into training and testing sets
 X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
                            
 # Build the neural network model
 model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(len(symptoms),)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(y.shape[1], activation='softmax')  # one output unit per medication class
 ])
                            
 # Compile the model
 model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
                            
 # Train the model
 model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_test, y_test))
                            
 # Evaluate the model
 loss, accuracy = model.evaluate(X_test, y_test)
 print(f'Accuracy on test data: {accuracy}')
                            
 # Function to predict medication based on symptoms input
 def predict_medication(symptoms_input):
    # Assuming symptoms_input is a dictionary with keys matching symptom names
    input_data = []
    for symptom in symptoms:
        input_data.append(symptoms_input[symptom])
    # Wrap the row in a batch dimension: shape (1, number of symptoms)
    input_data = tf.constant([input_data], dtype=tf.float32)
    prediction = model.predict(input_data)
    predicted_class = tf.argmax(prediction[0]).numpy()
    predicted_medication = label_encoder.inverse_transform([predicted_class])[0]
    return predicted_medication
                            
 # Example usage:
 user_input = {
    'ear_pain': 0,
    'hearing_loss': 1,
    'ear_pressure': 0,
    'ear_discharge': 1,
    'tympanic_membrane_perforation': 0,
    'middle_ear_effusion': 1,
    'tinnitus': 0
 }
                            
 predicted_medication = predict_medication(user_input)
 print(f'Predicted medication: {predicted_medication}')
                        


Get the Jupyter Notebook and the dataset used in this project.
