Quick Code Tutorial on how to import and display images for Neural Network Classification
Image Classification
The use of image classification in the medical field is a growing area of study and interest. From identifying tumors in MRIs to detecting cancer cells in blood, there are many applications for image classification. Building these AI systems will help with early detection, more accurate diagnosis, and easier access to higher quality medicine anywhere on the globe. In this post I will show the first few steps of prepping data for an image classifier.
Keras Image Data Generator
To read in the images I used the Keras ImageDataGenerator. In this post I am using a Kaggle dataset of brain tumor MRI images. The dataset comes pre-split into training and test sets. First I define the directory paths.
directory = 'mri_data'
train_directory = 'mri_data/Training'
test_directory = 'mri_data/Testing'
Next I will use the image data generator to rescale the images and separate the labels from the images.
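# assumes the Keras bundled with TensorFlow
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# batch_size is set large enough that a single call to next()
# returns every training image
data_train = ImageDataGenerator(rescale=1./255).flow_from_directory(
    train_directory,
    target_size=(224, 224),
    batch_size=2870,
    seed=123)
# separate images from labels
train_images, train_labels = next(data_train)
print('Found Classes: ', data_train.class_indices)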
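The class names that are typed out by hand later in this post can also be read from the generator. The sketch below assumes `class_indices` is a dict that maps each class name to its label column, which is what the print statement above displays. The dict literal is a stand-in with assumed values so the snippet runs without the dataset.

```python
# Stand-in for data_train.class_indices (assumed values; the real dict
# comes from the generator and depends on the folder names on disk).
class_indices = {
    'glioma_tumor': 0,
    'meningioma_tumor': 1,
    'no_tumor': 2,
    'pituitary_tumor': 3,
}

# Order the class names by their index so that position i in this list
# matches column i of the one-hot label array.
class_names = sorted(class_indices, key=class_indices.get)
print(class_names)
```

With the real generator, replacing the stand-in dict with `data_train.class_indices` should keep the names in step with the label columns even if the folders change.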
Simple image EDA
Now that we have read in our images, we can do some simple EDA to explore the image data. I like to check for class imbalance first. Because the labels are one-hot encoded, summing the label array over its rows gives the number of images in each class, which we can then plot.
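import matplotlib.pyplot as plt

class_names = ['glioma_tumor', 'meningioma_tumor', 'no_tumor', 'pituitary_tumor']
# summing the one-hot labels over the rows gives per-class counts
plt.bar(class_names, train_labels.sum(axis=0))
plt.xticks(rotation=45)
plt.title('Image Class Distribution')
plt.gcf().subplots_adjust(bottom=0.15)
# the ./images folder must already exist
plt.savefig('./images/class_balance.png')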
This plot shows the distribution of the classes in our dataset. The no_tumor class is clearly out of balance with the other three.
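The same check can be done numerically. This is a minimal sketch on made-up one-hot labels, not the real `train_labels`; the `labels` array and its class mix are assumptions chosen only to illustrate the calculation.

```python
import numpy as np

# Hypothetical one-hot labels standing in for train_labels:
# 4 classes, with class 2 deliberately given fewer samples.
labels = np.eye(4)[[0, 0, 0, 1, 1, 1, 2, 3, 3, 3]]

# Summing the one-hot rows column by column gives per-class counts.
counts = labels.sum(axis=0)

# Share of the dataset held by each class.
fractions = counts / counts.sum()

# Ratio between the largest and smallest class; 1.0 means balanced.
imbalance_ratio = counts.max() / counts.min()

print(counts)
print(fractions)
print(imbalance_ratio)
```

Swapping `labels` for `train_labels` should give the real counts behind the bar chart.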
Next I like to get a feel for what the images in the dataset look like.
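import numpy as np

def get_label(array):
    """
    Returns the string label for a one-hot encoded row
    """
    if array[0] == 1:
        return 'glioma_tumor'
    elif array[1] == 1:
        return 'meningioma_tumor'
    elif array[2] == 1:
        return 'no_tumor'
    elif array[3] == 1:
        return 'pituitary_tumor'

# a list comprehension avoids np.apply_along_axis, which sizes its string
# output from the first result and can truncate longer labels
label_names = [get_label(row) for row in train_labels]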
%matplotlib inline
import matplotlib.pyplot as plt

plt.figure(figsize=(12, 12))
for i in range(9):
    plt.subplot(3, 3, i + 1)
    plt.imshow(train_images[i])
    plt.gca().set_title(label_names[i])
plt.show()
This block of code gives us a three by three grid of nine images, each titled with its class. Seeing examples side by side helps you understand the differences between the classes and make better choices for classification further down the road.
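As an alternative to the if/elif lookup in get_label, the one-hot rows can be decoded with argmax. This is a sketch under assumptions: the `labels` array is made-up data standing in for `train_labels`, and the order of `class_names` is assumed to match the label columns.

```python
import numpy as np

# Assumed class order, matching the columns of the one-hot labels.
class_names = np.array(['glioma_tumor', 'meningioma_tumor',
                        'no_tumor', 'pituitary_tumor'])

# Hypothetical one-hot labels standing in for train_labels.
labels = np.array([[0, 0, 1, 0],
                   [1, 0, 0, 0],
                   [0, 0, 0, 1],
                   [0, 1, 0, 0]], dtype='float32')

# argmax finds the position of the 1 in each row; indexing the name
# array with those positions gives one string label per image.
label_names = class_names[labels.argmax(axis=1)]
print(label_names)
```

This version does not need a new branch for each class, so it keeps working if the dataset gains or loses a category, as long as `class_names` is updated to match.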