Machine Learning Modelling to Predict Lung Cancer Stages from CT Scan Image

Sabina Zaman1*, Mamun Ahmed2, Mohammad Asaduzzaman Khan2, Tamanna Tajrin2

Keywords: Lung cancer, CT scan images, Image processing, Machine learning algorithm

Abstract: Lung cancer is considered the most common cancer among men and the third most common among women. According to a World Health Organization report, about 1.76 million people died from lung cancer in 2018. Among the various computer-aided diagnosis systems, processing and analysing CT scan images to detect cancer from images of nodules has become popular. In this work, after several image processing steps, four significant features (Area, Eccentricity, Diameter, and Perimeter) are extracted. A custom database has been prepared from both online lung CT images and real-life data. Because the database is self-made, class labels were assigned according to standard rules for stage labelling, and the number of clusters was then verified using the K-valid algorithm. Various machine learning algorithms were implemented to classify cancer nodule stages, and their accuracy and other measures were compared to rank the classifiers and choose the best one for this task. The overall accuracy of each machine learning algorithm improved after the new image processing approaches were applied. Unlike other approaches, which predict a binary class and implement a single algorithm, this work predicts stages and compares the three most traditional machine learning algorithms.
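To illustrate the four shape features named in the abstract, the following is a minimal sketch of how they could be computed from a binary nodule mask. It is not the authors' implementation. The function name `shape_features`, the edge-count perimeter, the equal-area-circle definition of diameter, and the moment-based eccentricity are all assumptions made for illustration.

```python
import math


def shape_features(mask):
    """Compute four illustrative shape features of a binary mask.

    `mask` is a list of rows; truthy entries are nodule pixels.
    All definitions here are assumptions, not the paper's exact method.
    """
    pixels = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    area = len(pixels)
    if area == 0:
        raise ValueError("mask contains no foreground pixels")

    filled = set(pixels)
    # Crude perimeter: number of pixel edges that face background
    # (4-neighbourhood), in pixel units.
    perimeter = sum(
        (r + dr, c + dc) not in filled
        for r, c in pixels
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
    )

    # Diameter of the circle whose area equals the region's area.
    diameter = math.sqrt(4.0 * area / math.pi)

    # Second central moments of the pixel coordinates.
    mean_r = sum(r for r, _ in pixels) / area
    mean_c = sum(c for _, c in pixels) / area
    s_rr = sum((r - mean_r) ** 2 for r, _ in pixels) / area
    s_cc = sum((c - mean_c) ** 2 for _, c in pixels) / area
    s_rc = sum((r - mean_r) * (c - mean_c) for r, c in pixels) / area

    # Eigenvalues of the 2x2 covariance matrix give the squared axis scales.
    half_trace = (s_rr + s_cc) / 2.0
    root = math.sqrt(((s_rr - s_cc) / 2.0) ** 2 + s_rc ** 2)
    major, minor = half_trace + root, half_trace - root

    if major > 0:
        ratio = min(1.0, max(0.0, minor / major))  # guard against rounding
        eccentricity = math.sqrt(1.0 - ratio)
    else:
        eccentricity = 0.0  # single pixel: treat as perfectly round

    return {
        "area": area,
        "perimeter": perimeter,
        "diameter": diameter,
        "eccentricity": eccentricity,
    }
```

All values are in pixel units; converting them to physical units would require the scan's pixel spacing, which the abstract does not give. A feature vector of this kind is what a stage classifier would take as input.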


