Disease Prediction from Medical Data
In this project, we'll walk you through building a neural network model to predict the appropriate medication for patients based on their symptoms. We'll use a dataset containing patient symptoms and prescribed medications, preprocess the data, train a neural network, and make predictions based on new input symptoms.
Importing Necessary Libraries
First, we need to import the required libraries. We'll use pandas for data manipulation,
TensorFlow for building and training the neural network, and scikit-learn for splitting the
dataset and encoding the target variable.
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
Loading the Data
Next, we'll load the data from a CSV file.
data = pd.read_csv('disease.csv')
Data Preprocessing
We need to convert the boolean columns to numeric format (0 or 1) for our neural network to
process them correctly.
symptoms = ['ear_pain', 'hearing_loss', 'ear_pressure', 'ear_discharge',
            'tympanic_membrane_perforation', 'middle_ear_effusion', 'tinnitus']
data[symptoms] = data[symptoms].astype(int)
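The layout of disease.csv is not shown here, so the following is only a sketch of the conversion step under an assumed layout: one True/False column per symptom plus a medication_prescribed column. The sample rows, the shortened column list, and the helper to_flag are made up for illustration. It uses only the standard library to make the 0/1 mapping explicit; with pandas, read_csv normally parses such columns as booleans, which is why astype(int) is enough in the code above.

```python
import csv
import io

# Hypothetical excerpt: the real disease.csv is not shown, so these rows and
# medication names are invented for illustration only.
SAMPLE_CSV = """ear_pain,hearing_loss,tinnitus,medication_prescribed
True,False,True,med_a
False,True,False,med_b
"""

# Shortened, assumed subset of the symptom columns.
SYMPTOM_COLUMNS = ["ear_pain", "hearing_loss", "tinnitus"]


def to_flag(value: str) -> int:
    """Map the strings 'True'/'False' (any case) to 1/0 and reject anything else."""
    normalized = value.strip().lower()
    if normalized == "true":
        return 1
    if normalized == "false":
        return 0
    raise ValueError(f"expected True or False, got {value!r}")


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
features = [[to_flag(row[col]) for col in SYMPTOM_COLUMNS] for row in rows]
print(features)  # [[1, 0, 1], [0, 1, 0]]
```

Rejecting unexpected values instead of silently coercing them makes malformed rows visible before they reach the model.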
Defining Features and Target
We separate the features (X) from the target variable (y). We also encode the target
variable, which represents the medication prescribed.
X = data[symptoms].values
label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(data['medication_prescribed'])
y = tf.keras.utils.to_categorical(y_encoded)
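To make the two encoding steps concrete, here is a plain-Python sketch of what they compute. The medication names are hypothetical placeholders, and the code imitates the behaviour of LabelEncoder and to_categorical instead of calling them: integer codes follow the sorted order of the unique labels, and each code then becomes a row with a single 1.

```python
# Hypothetical medication names; the real labels come from the
# 'medication_prescribed' column of the dataset.
labels = ["med_b", "med_a", "med_c", "med_a"]

# LabelEncoder assigns integer codes in the sorted order of the unique labels.
classes = sorted(set(labels))
class_to_index = {name: index for index, name in enumerate(classes)}
encoded = [class_to_index[name] for name in labels]

# One-hot encoding: one row per sample, with a 1 in the column of its class.
one_hot = [[1 if column == code else 0 for column in range(len(classes))]
           for code in encoded]

print(classes)  # ['med_a', 'med_b', 'med_c']
print(encoded)  # [1, 0, 2, 0]
print(one_hot)  # [[0, 1, 0], [1, 0, 0], [0, 0, 1], [1, 0, 0]]
```

The one-hot width equals the number of distinct medications, which is also the number of units the softmax output layer needs.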
Splitting the Data
To evaluate our model's performance, we split the data into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
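The idea behind test_size=0.2 and random_state=42 can be sketched with the standard library alone: shuffle the row indices with a fixed seed, then set aside 20% of them. The helper split_indices is hypothetical, and scikit-learn uses its own random number generator, so the specific rows it selects will differ from this sketch even with the same seed.

```python
import math
import random


def split_indices(n_samples: int, test_size: float, seed: int):
    """Shuffle row indices with a fixed seed and cut off a test portion."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # same seed -> same shuffle
    n_test = math.ceil(test_size * n_samples)  # size of the held-out portion
    return indices[n_test:], indices[:n_test]  # (train, test)


train_idx, test_idx = split_indices(n_samples=10, test_size=0.2, seed=42)
print(len(train_idx), len(test_idx))  # 8 2
```

Fixing the seed is what makes the split reproducible between runs, so accuracy figures from different experiments refer to the same held-out rows.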
Building the Neural Network Model
We define the neural network architecture with two hidden layers. The final layer uses a
softmax activation function for multi-class classification.
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(len(symptoms),)),
    tf.keras.layers.Dense(32, activation='relu'),
    # One output unit per medication class (3 in this dataset), taken from the
    # one-hot width so the layer always matches the encoded labels
    tf.keras.layers.Dense(y.shape[1], activation='softmax')  # softmax for multi-class classification
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
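As a rough illustration of what this architecture involves, the sketch below counts the trainable parameters of the three dense layers and works through softmax and categorical cross-entropy by hand. It is plain Python, not Keras code. The helper names and the example scores are made up, and the layer sizes assume seven symptoms and three medication classes, as in the model definition.

```python
import math


def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights plus one bias per unit in a fully connected layer."""
    return n_inputs * n_units + n_units


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift by the max for stability
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot_target, probabilities):
    """Negative log-probability that the model assigns to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probabilities) if t)


# Layer sizes: 7 symptoms -> 64 units -> 32 units -> 3 medication classes.
sizes = [7, 64, 32, 3]
total_params = sum(dense_params(a, b) for a, b in zip(sizes, sizes[1:]))
print(total_params)  # 2691

probs = softmax([2.0, 1.0, 0.1])                   # made-up output scores
loss = categorical_crossentropy([1, 0, 0], probs)  # true class is index 0
```

The loss shrinks toward zero as the probability assigned to the true class approaches 1, which is the quantity the optimizer drives down during training.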
Training the Model
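We train the model on the training data for 50 epochs with a batch size of 32. The test set is passed as validation_data, so Keras also reports loss and accuracy on it after each epoch.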
model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_test, y_test))
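To get a feel for what 50 epochs at a batch size of 32 means in weight updates, a quick calculation helps. The training-set size below is a hypothetical placeholder, since the real number of rows depends on disease.csv.

```python
import math

n_train = 800    # hypothetical training-set size; depends on the real dataset
batch_size = 32  # as passed to model.fit
epochs = 50      # as passed to model.fit

# Each epoch walks through the training data once, one batch per update;
# the last batch may be smaller when n_train is not a multiple of batch_size.
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)  # 25 1250
```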
Evaluating the Model
After training, we evaluate the model's performance on the test data.
loss, accuracy = model.evaluate(X_test, y_test)
print(f'Accuracy on test data: {accuracy}')
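For one-hot targets, the reported accuracy is the fraction of samples whose most probable predicted class matches the true class. The following plain-Python sketch shows that calculation on made-up targets and predictions. The helper names are illustrative and not part of Keras.

```python
def argmax(values):
    """Index of the largest value in a sequence."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(one_hot_targets, predicted_probabilities):
    """Share of samples where the top predicted class equals the true class."""
    correct = sum(argmax(target) == argmax(probs)
                  for target, probs in zip(one_hot_targets, predicted_probabilities))
    return correct / len(one_hot_targets)


# Made-up values: the third sample is misclassified (true class 2, predicted 0).
targets = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
predictions = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.5, 0.3, 0.2], [0.2, 0.5, 0.3]]
print(accuracy(targets, predictions))  # 0.75
```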
Predicting Medication
We define a function to predict the medication based on the input symptoms. The function
takes a dictionary of symptoms as input and returns the predicted medication.
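def predict_medication(symptoms_input):
    # Collect the symptom values in the same column order used for training
    input_data = [symptoms_input[symptom] for symptom in symptoms]
    # Build an explicit float tensor of shape (1, number of symptoms) so the call
    # does not depend on how model.predict interprets nested Python lists
    input_data = tf.constant([input_data], dtype=tf.float32)
    prediction = model.predict(input_data)
    # Pick the most probable class and map it back to the medication name
    predicted_class = tf.argmax(prediction[0]).numpy()
    predicted_medication = label_encoder.inverse_transform([predicted_class])[0]
    return predicted_medication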
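Because the function looks up each symptom by name, a missing key or a value other than 0 or 1 would either raise an unhelpful error or feed the model something it never saw in training. One possible way to guard against that is to validate the dictionary before predicting. The helper build_feature_row below is a hypothetical addition, not part of the original code.

```python
SYMPTOMS = ['ear_pain', 'hearing_loss', 'ear_pressure', 'ear_discharge',
            'tympanic_membrane_perforation', 'middle_ear_effusion', 'tinnitus']


def build_feature_row(symptoms_input, symptom_names=SYMPTOMS):
    """Return the 0/1 symptom values in training order, validating the input."""
    missing = [name for name in symptom_names if name not in symptoms_input]
    if missing:
        raise KeyError(f"missing symptoms: {missing}")
    row = []
    for name in symptom_names:
        value = symptoms_input[name]
        if value not in (0, 1):  # True and False also pass, since they equal 1 and 0
            raise ValueError(f"{name} must be 0 or 1, got {value!r}")
        row.append(int(value))
    return row


example = {'ear_pain': 0, 'hearing_loss': 1, 'ear_pressure': 0, 'ear_discharge': 1,
           'tympanic_membrane_perforation': 0, 'middle_ear_effusion': 1, 'tinnitus': 0}
print(build_feature_row(example))  # [0, 1, 0, 1, 0, 1, 0]
```

The returned list could then be wrapped in a batch of one and passed to the model in place of the manually assembled input.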
Example Usage
Finally, we test the prediction function with an example input and print the predicted
medication.
user_input = {
    'ear_pain': 0,
    'hearing_loss': 1,
    'ear_pressure': 0,
    'ear_discharge': 1,
    'tympanic_membrane_perforation': 0,
    'middle_ear_effusion': 1,
    'tinnitus': 0
}
predicted_medication = predict_medication(user_input)
print(f'Predicted medication: {predicted_medication}')
Below is the full code with additional comments embedded.
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Load data from CSV file
data = pd.read_csv('disease.csv')

# Convert boolean columns to numeric (0 or 1)
symptoms = ['ear_pain', 'hearing_loss', 'ear_pressure', 'ear_discharge',
            'tympanic_membrane_perforation', 'middle_ear_effusion', 'tinnitus']
data[symptoms] = data[symptoms].astype(int)

# Define features and target
X = data[symptoms].values

# Encode the target variable
label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(data['medication_prescribed'])
y = tf.keras.utils.to_categorical(y_encoded)

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Build the neural network model
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(len(symptoms),)),
    tf.keras.layers.Dense(32, activation='relu'),
    # One output unit per medication class (3 in this dataset)
    tf.keras.layers.Dense(y.shape[1], activation='softmax')  # softmax for multi-class classification
])

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_test, y_test))

# Evaluate the model
loss, accuracy = model.evaluate(X_test, y_test)
print(f'Accuracy on test data: {accuracy}')

# Function to predict medication based on symptoms input
def predict_medication(symptoms_input):
    # Assuming symptoms_input is a dictionary with keys matching symptom names
    input_data = [symptoms_input[symptom] for symptom in symptoms]
    # Shape it into a 2D float tensor (one row) for prediction
    input_data = tf.constant([input_data], dtype=tf.float32)
    prediction = model.predict(input_data)
    predicted_class = tf.argmax(prediction[0]).numpy()
    predicted_medication = label_encoder.inverse_transform([predicted_class])[0]
    return predicted_medication

# Example usage:
user_input = {
    'ear_pain': 0,
    'hearing_loss': 1,
    'ear_pressure': 0,
    'ear_discharge': 1,
    'tympanic_membrane_perforation': 0,
    'middle_ear_effusion': 1,
    'tinnitus': 0
}
predicted_medication = predict_medication(user_input)
print(f'Predicted medication: {predicted_medication}')