Deep Learning based Approach for Facilitating Online Proctoring using Transfer Learning

Ashwinkumar J S, Harshavardhini Saravana Kumaran, Sivakarthikeyan U, Konjeti P B V Rajesh, Lavanya R*
Department of Electronics and Communication Engineering, Amrita School of Engineering, Coimbatore, Amrita Vishwa Vidyapeetham, India
ashwinkumarjs@gmail.com, hvs0799@gmail.com, sivakarthikeyan.3199@gmail.com, konjetirajesh1@gmail.com, r_lavanya@cb.amrita.edu*

Abstract— This paper aims at developing an algorithm that helps ensure the reliability of online examinations. The proposed algorithm provides an automated approach to online proctoring, alleviating the cumbersome nature of its manual counterpart. It uses transfer learning to realize deep learning, combining three models, namely YOLO (for fraudulent object and multi-person detection), GazeML (for abnormal gaze detection) and VGG16 (for face recognition), and fuses the results of the individual anomaly detection algorithms to ultimately predict whether the examinee has been engaging in malpractice so that necessary action can be taken. Existing algorithms rely on heavy processing for localization as well as tedious feature extraction to realize online proctoring. The proposed algorithm uses deep learning to overcome these drawbacks, so that no hand-crafted feature extraction is involved.

Keywords— Deep Learning, Neural Networks, Remote online proctoring, VGG-16, YOLO, GazeML, Decision fusion

I. INTRODUCTION

Online examinations have found a permanent place in today's educational scenario, not only with respect to remote education but also owing to the rapid growth in the prevalence and usage of Massive Open Online Courses (MOOCs). Online examinations allow students to take up courses and exams from any geographical location and time zone in the world as long as they have a good network connection, thereby strengthening one of the main motives of education, which is accessibility to all [8]. As online exams become a common mode of assessment, however, ensuring their reliability continues to remain a challenge. On account of the lack of the physical proctoring present in a traditional examination hall or centre, the examinee can engage in various malpractices such as:
- Use of material for copying and engaging in malpractice, such as books, smartphones, smartwatches, and papers.
- Presence of multiple people [15] apart from the examinee at the time of the exam to help in writing it.
- Presence of a proxy examinee in place of the original examinee.
- The examinee exhibiting abnormal gaze patterns, indicating that he/she is looking elsewhere for the answer.

In order to identify these malpractice indications from a remote location, remote proctoring methods are used. Remote proctoring exams have the following advantages [8]:
- They eliminate the need for the arrangement of an exam centre.
- They eliminate the mandatory requirement and availability of physical proctors during the course of the examination.
- Due to the automation process, the integrity and security of the exam is strengthened.
- They help candidates whose exam centres are far away and provide a good way of attending exams from the safety and comfort of their homes.

Automated algorithms for online proctoring can alleviate the cumbersome nature of its manual counterpart and ensure a seamless proctoring session with good efficiency and ease. In the proposed online proctoring algorithm, the objective is to identify the previously mentioned malpractice indications, if present, while the subject is writing the exam and accordingly determine whether the examinee is engaging in malpractice. Existing algorithms [1], [2], [11] for automated online proctoring involve heavy processing for localization and feature extraction, delaying the overall process. The proposed algorithm avoids this heavy processing and makes use of neural networks which do not require any handcrafted feature extraction or localization. Also, [1], [2], [3], [11] are very tedious as they involve training models from scratch and require a large amount of training data. The proposed method avoids these drawbacks of conventional machine learning by making use of transfer learning, by which an already trained model can be fine-tuned for the required application. Hand-crafted features involved in conventional machine learning algorithms are restricted in nature [1], [2], [3] and may not be useful in discovering meaningful patterns required for the decision-making process. Therefore, by using deep learning it is possible to realize a more apt and efficient solution [16]. Using transfer learning, a solid deep learning model can be built with comparatively little training data because the model is already pre-trained. The training time is also reduced, because it can take days or even weeks to train a deep neural network from scratch on a complex task.

In [1], a system is presented which makes use of multimedia analytics over features such as user authentication, text identification, detection of verbal cues, dynamic window recognition, eye tracking and smartphone detection, and combines the feature vectors to yield a threshold value by which fraudulent practices in examinations can be detected. However, the false acceptance rate of its face detection algorithm is high, and the false detection and alarm rates of its object detection algorithms are also high. This approach uses a separate machine learning model for each feature and requires a large dataset. By making use of deep learning, the intense feature extraction required for each model is avoided, and the use of neural networks makes the model immune to illumination changes and ensures robustness, whereas the system proposed in [1] was not very immune to illumination changes. Also, the use of transfer learning allows the system to make use of pre-trained models by fine-tuning them to cater to the requirements of the online proctoring application with a small dataset.

In [2], a fully automated, multimodal system is presented which makes use of hardware that is inexpensive and convenient to the user.
It contains three modules: system usage analysis, video analysis, and audio and active window analysis. The yaw angle-based algorithm in this method is able to detect the degree of head movement, but it cannot detect whether someone is merely moving their eyes to look at something without moving their head.

In [11], the proposed authentication framework gives an economical and highly accurate verification system. The technique is to reinforce the resistance of face authentication for online test proctoring against variations in pose and lighting without adding further image processing such as face alignment or histogram equalization. It makes use of Convolutional Neural Networks, Principal Component Analysis and Eigenface detection, and detects anomalies such as multiple face detection, the user not being present in front of the screen, and impostor detection. However, this system involves building a network from scratch, which is time-consuming and requires a very large database for training. The present work makes use of transfer learning, which does away with the need to build a network from scratch and to use a very large training dataset: transfer learning requires less training time and a smaller dataset is sufficient, as it reuses the information already learnt by a pre-trained model and applies it to the new task at hand.

The authors in [12] have proposed two types of authentication: (a) static, which is performed at the initial phase of the test, and (b) continuous, which is applied after the exam starts and continuously checks the student's identity. It involves the registration of the examinee; if registered already, it involves smart card insertion and verification followed by facial recognition and facial tracking till completion of the exam. It also permits instructors and administrators to see the examinee's screen live. The drawback of this system is that not all the modules explained in the paper are implementable from a remote off-campus location.

Fig. 1. Block diagram for the online proctoring algorithm, which makes use of transfer learning to realize deep learning and uses a decision fusion method to fuse the results of the three models to come up with an overall result on whether malpractice has been engaged in or not.

In the proposed method, transfer learning is used to realize deep learning in order to facilitate the process of remote proctoring. The use of pre-trained networks facilitated by transfer learning obviates the need to build deep learning models from scratch [11]. This alleviates the cumbersome training process and avoids overfitting, which is a common problem when a large dataset is not available. The specific gaps which the proposed algorithm addresses in contrast to the existing work are as follows:
- While most of the papers deal with either eyeball movement [3] or head movement [2], the proposed work aims to consider both these modules for improving the reliability of online proctoring.
- The proposed model aims to use emotion recognition as an added feature along with face recognition to improve the performance.
- The proposed work also aims to effectively differentiate between normal and abnormal background activity so as to decrease the false positive rate, which is a point not discussed in existing work.
II. METHODOLOGY

Fig. 1 depicts the schematic of the proposed algorithm, which is explained below. YOLO is a model used for real-time object detection. It is used to detect the presence of multiple persons in the frame, as well as the presence of devices such as mobile phones, smartphones, smartwatches, etc. The second model used in this project is VGG16; this model is used to recognize the face of the candidate. The third pre-trained model used in the project is GazeML; this model is used to detect the gaze of the candidate writing the exam, which helps in understanding whether the candidate is looking away or not. The outputs of these three models are then fused together using decision fusion [13]. Once the three outputs are fused, a score is obtained, and this score is analyzed to decide whether the student is indulging in malpractice or not.

Custom dataset for training and testing the models: A custom dataset has been prepared for transfer learning in our models (fine-tuning or fixed feature extraction). The custom dataset consists of images of 4 subjects, where each subject contributes 250 images, which are further divided as shown in Table I.

Web-cam input: The first step in the proposed algorithm is to take the camera input. The average web-camera in a desktop or PC has a frame rate of 20-40 fps (frames per second). The frame rate with respect to a neural network is an important metric which allows us to determine how smoothly the network can keep up with the input frames delivered by the webcam. From the web-camera's point of view, the higher the frames per second, the faster and smoother the video experience will be. Each of the models used in the project also has a different fps: YOLO, the object detection algorithm (30-45 fps); GazeML, the gaze detection algorithm (10-20 fps); and VGG-16, the face recognition algorithm (20 fps). Since the frame rates of the networks vary widely, a method called multi-threading is used in order to run these networks in parallel. Multi-threading is an important technique in parallel processing and computing. Since each algorithm's fps is roughly comparable to the web-cam fps, it is very much possible to execute a live web-cam proctoring algorithm.

Pre-processing: The individual frame size of a web-cam video is either VGA (640*480) or HD (1280*720), which are scaled multiples of 32*32, hence there is no need to resize images in the case of the YOLO algorithm. In the case of GazeML, VGA-sized frames are accepted by default, so HD web-cam input must be resized to 640*480; the eye-region crops fed to the network need to be 150*90, hence they are resized as well.
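The paper does not give code for the capture and multi-threading step described above; the following is a minimal sketch of one way to realize it, assuming OpenCV for web-cam capture and Python's threading module for the one-to-one threading model. The model wrappers (yolo_detect, gaze_estimate, vgg_recognize) are hypothetical placeholders standing in for the three networks.

```python
import threading
import cv2

def capture_frames(src, shared, stop):
    """Producer thread: keep only the most recent web-cam frame."""
    cap = cv2.VideoCapture(src)
    while not stop.is_set():
        ok, frame = cap.read()
        if ok:
            shared["frame"] = frame
    cap.release()

def run_model(name, predict, resize_to, shared, results, stop):
    """One consumer thread per model; each runs at its own achievable fps."""
    while not stop.is_set():
        frame = shared.get("frame")
        if frame is None:
            continue
        if resize_to is not None:
            frame = cv2.resize(frame, resize_to)   # e.g. 640x480 for GazeML
        results[name] = predict(frame)

# Usage sketch: one capture thread plus one thread per network.
# stop, shared, results = threading.Event(), {}, {}
# threading.Thread(target=capture_frames, args=(0, shared, stop), daemon=True).start()
# for name, fn, size in [("yolo", yolo_detect, None),
#                        ("gaze", gaze_estimate, (640, 480)),
#                        ("face", vgg_recognize, (224, 224))]:
#     threading.Thread(target=run_model,
#                      args=(name, fn, size, shared, results, stop),
#                      daemon=True).start()
```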
In the case of illumination variations, conventional machine learning algorithms for online proctoring require the use of locality sensitive histograms (LSH), histograms of oriented gradients (HOG) or minimum average correlation energy (MACE) filters to normalize these variations in each frame before processing. However, the neural networks used in this work [10] are convolutional neural networks (CNNs), which are highly immune to most illumination intensities owing to their ability to detect and process invariant visual patterns [10]. Hence they require no explicit pre-processing to normalize or eliminate such variations, which is an advantage of the proposed algorithm. The following are the pre-trained networks employed in the proposed work.

A. Pre-trained models: Three different pre-trained models are used which help in identifying whether a student who is writing the exam is indulging in any sort of malpractice.

1) YOLO v3 (You Only Look Once): YOLO is a network used to detect objects in real time. This network is used for the detection of objects such as mobile phones and smartphones, and for multi-person detection [15]. YOLO helps in the prevention of cheating by detecting the various objects that are in the frame. The advantage of using YOLO is that it scans an image only once for object detection, unlike other systems which go through the image more than once to detect the objects present. It is also extremely fast: since detection is framed as a regression problem, a complex pipeline is not needed. YOLOv3 is used here since it has more advantages than v1 and v2. YOLOv3 uses residual blocks; a residual block is a stack of layers such that the output of a layer is added to another layer deeper in the network. YOLOv3 is better than YOLOv2 and YOLOv1 at detecting small objects since it uses shortcut connections, which help in retaining more information. YOLOv3 also predicts a confidence score for each bounding box using logistic regression [7]; if a prior (anchor) box overlaps the ground-truth object more than any other prior box, its score is 1. YOLOv3 makes predictions at three different scales. YOLO uses bounding boxes to detect objects, and the network predicts four coordinates for each bounding box: t_x, t_y, t_w, t_h. If the cell is offset from the top-left corner of the image by (c_x, c_y) and the prior bounding box has width and height p_w and p_h respectively, then the predictions correspond to:

b_x = σ(t_x) + c_x   (1)
b_y = σ(t_y) + c_y   (2)
b_w = p_w · e^(t_w)   (3)
b_h = p_h · e^(t_h)   (4)

where (b_x, b_y), b_w and b_h are the centre coordinates, width and height of the predicted bounding box, respectively. Another improvement made in YOLOv3 is the new CNN feature extractor called Darknet-53 [7], a 53-layer CNN which uses 3x3 and 1x1 convolutional layers. In YOLOv3, the input image is subdivided into a 16*16 grid [7]. Anchors which overlap the ground-truth object by less than a threshold value are ignored. The evaluation metrics show that YOLOv3 has a higher precision rate than the Faster R-CNN model; for YOLOv3 the processing time for each image ranges between 0.056 ms and 0.060 ms, with an average of 0.057 ms. As noted earlier, since typical web-cam frames are VGA (640*480) or HD (1280*720), which are scaled multiples of 32*32, no resizing is generally needed for the YOLO algorithm.
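As an illustration of how Eqs. (1)-(4) are applied, the short sketch below decodes one raw prediction into a bounding box. It is an illustrative example, not the Darknet implementation; the names, shapes and the stride-based conversion back to pixel units are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_box(t, cell, prior, stride=32):
    """Decode one YOLOv3 prediction following Eqs. (1)-(4).
    t = (tx, ty, tw, th): raw network outputs for one anchor in one cell
    cell = (cx, cy): grid-cell offsets; prior = (pw, ph): anchor size in pixels."""
    tx, ty, tw, th = t
    cx, cy = cell
    pw, ph = prior
    bx = sigmoid(tx) + cx          # Eq. (1), in grid-cell units
    by = sigmoid(ty) + cy          # Eq. (2)
    bw = pw * np.exp(tw)           # Eq. (3), in pixels
    bh = ph * np.exp(th)           # Eq. (4)
    # Multiplying the centre by the stride maps grid units back to pixels.
    return bx * stride, by * stride, bw, bh

# Example: a 60x80 px anchor in the grid cell at position (3, 5)
print(decode_box((0.2, -0.1, 0.05, 0.1), (3, 5), (60, 80)))
```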
In YOLO, fine-tuning is done by replacing the classifier layer on top of the convolutional neural network, and the weights of the pre-trained network are further fine-tuned by continuing back-propagation. Since new classes of objects which are not present in the original YOLO classes have to be recognized, all the layers have been fine-tuned.

2) VGG16: The VGG16 model is used for face recognition, which helps in detecting a proxy in the exam. This network is used to detect the presence of impostors writing the exam and to verify that it is the registered examinee who is writing it. VGG16 consists of 16 layers [10]. The model accepts colour images of size 224 x 224 with 3 channels (red, green and blue) as input. It consists of 13 convolutional layers and 5 max-pooling layers. In a convolutional layer, a neuron is connected only to a local neighbourhood of input neurons rather than being fully connected, so the number of parameters to be learned is reduced considerably and the network becomes more robust with fewer parameters [10]. Max-pooling is an operation which selects the maximum element from the region of the feature map covered by the filter, so the output of a max-pooling layer is a feature map containing the most prominent features of the previous one. VGG16 also has 3 fully connected layers, with a softmax layer at the output; the softmax function is the activation function in the output layer and is used to predict class probabilities. Since the custom dataset is small and similar to the original dataset used for training VGG16, overfitting can occur on account of the small size of the custom dataset if fine-tuning is used. Therefore, fixed feature extraction is opted for, wherein the last fully connected layer is removed and the rest of the network acts as a fixed feature extractor. The VGG16 top layers have been removed, the remaining network was frozen, and the new top layer is fixed to the number of students to be recognized, which in this case is 4.
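A minimal Keras sketch of the fixed-feature-extraction setup just described is given below: the pre-trained VGG16 base is frozen and only a new top layer sized to the four registered students is trained. The intermediate dense size, optimizer and loss are assumptions, not values stated in the paper.

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# Load VGG16 without its original classifier and freeze it (fixed feature extraction)
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),   # assumed intermediate layer
    layers.Dense(4, activation="softmax"),  # one output per registered student
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```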
3) GazeML: This network is used to monitor and detect any abnormal gaze of the examinee while attending the exam. The 3D direction of gaze can be pictorially represented as an image, where a round eyeball and a circular iris are projected onto the image plane, giving an abstract representation of a circle and an ellipse [9]. The gaze direction is defined by the vector connecting the centre of the larger circle and the centre of the ellipse. Two approaches are used in image-based gaze estimation: feature-based and model-based. Intermediate supervision is used to boost the training of the system; it enables the network to quickly learn an approximation of the final output and then progressively refine the estimated features. The network is sequential and uses an hourglass architecture. The input eye image is processed to yield the direction of gaze in 3D, often represented as a 3-element unit vector v. The process of gaze estimation is split into two tasks: (a) reducing the input image to its simplest normalized form, called gazemaps, and (b) using the obtained gazemaps for gaze estimation. For an output image of dimensions m x n, an eyeball diameter 2r = 1.2n is assumed, and the iris centre coordinates (u_i, v_i) are calculated as:

u_i = m/2 − r′ sin φ cos θ   (5)
v_i = n/2 − r′ sin θ   (6)

where r′ = r cos(sin⁻¹ ½) and the gaze direction is g = (θ, φ) [9]. Images of size 150x90 are provided as the input; 64 feature maps of size 75x45 are refined throughout the network, and the 2-dimensional confidence maps are pixel-aligned to the input image. After that, 1x1 convolutions are applied to the output of the last hourglass module and the gazemap loss term is applied. In GazeML, to avoid overfitting, fixed feature extraction is opted for, where the last fully connected layer is removed and the network acts as a fixed feature extractor; as with YOLO, all the layers were used while fine-tuning.

B. Multi-threading: A thread is a flow of execution through the process code, with its own program counter that keeps track of which instruction to execute next, registers which hold its current working variables, and a stack which contains its execution history. Multithreading is the ability to allow multiple parts of a program to be executed at the same time; each such part is known as a thread. The main models for multithreading are the one-to-one model, the many-to-one model and the many-to-many model. The multithreading model used in this project is the one-to-one model.

1) One-to-one model: The one-to-one model maps each user thread to a kernel thread. This implies that numerous threads can run simultaneously on multiprocessors, and other threads can continue to run when one thread makes a blocking system call.

C. Fusion for decision making: Decision fusion [13] is a type of data fusion that joins the results of different classifiers into a common decision about the activity that happened. Based on the final score produced by the decision fusion process, it can then be determined whether malpractice has been identified or not.

1) Ensemble: An ensemble method is a technique which uses multiple independent similar or different models/weak learners to derive an output or make predictions [14]. The errors and predictions of any machine learning model are adversely influenced by bias, variance and noise; ensemble-based methods are used to reduce these drawbacks.

2) Fusion: Fusion is used to train spatio-temporal data through the deep learning model and then fuse the outputs of all the models. The different types of fusion include score-level fusion, feature-level fusion, decision-level fusion (which is of two types: hard and soft decision fusion), fuzzy fusion and classifier-based fusion. The different fusion strategies include majority voting, weighted voting, majority sum and weighted sum. The fusion strategy to be implemented is still under analysis, and an appropriate one will be adopted; one candidate is sketched below.
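Since the final fusion strategy is still under analysis, the following is only an illustrative sketch of one candidate, weighted soft voting over the three per-model anomaly scores. The weights and threshold are assumed values, not figures from the paper.

```python
def fuse_decisions(object_score, face_score, gaze_score,
                   weights=(0.4, 0.3, 0.3), threshold=0.5):
    """Each score lies in [0, 1]: the probability that the corresponding module
    (YOLO / VGG-16 / GazeML) flags an anomaly. Returns (fused score, decision)."""
    scores = (object_score, face_score, gaze_score)
    fused = sum(w * s for w, s in zip(weights, scores))
    return fused, fused >= threshold   # True => malpractice flagged

# Example: strong object-detection evidence, clean face match, mildly abnormal gaze
fused, flagged = fuse_decisions(0.9, 0.1, 0.4)
print(round(fused, 2), flagged)   # 0.51 True
```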
IV. RESULTS AND DISCUSSION

The dataset used consists of 1000 images of 4 subjects (250 per person), captured through the computer webcam. In particular, impersonation was introduced in 200 images, 600 images have been used for the identification of gadgets or other fraudulent objects, and 200 images have been used for abnormal gaze detection. For YOLO, the images were annotated using labelImg; the annotations were in .xml format, and all the individual .xml files were concatenated into a single file using Roboflow. For VGG16, the images were cropped to the face using OpenCV tools and resized to 150*90, then placed in directories named per subject, and the Keras image pre-processing library was used to label them automatically. For GazeML, OpenCV was used to crop the images to the eyes of the student with the help of a Python script. The dataset has been divided into four categories, namely object detection, multi-person detection, face recognition and abnormal gaze detection. Before passing these images to the models, they are pre-processed, i.e. resized according to the needs of the models used. In this project, 3 models have been used: YOLO, VGG16 and GazeML. The segregation of the dataset images per person for training the models is given in Table I.

TABLE I: Segregation of dataset images into the four categories per person for training the models
Module                              | Model used | Number of images per subject
Object and multi-person detection   | YOLO       | 150
Face recognition                    | VGG-16     | 50
Abnormal gaze detection             | GazeML     | 50

In the case of the VGG-16 model, a validation accuracy of 95% and a training accuracy of 94% have been obtained. The loss graphs for training and validation along with the accuracy graphs are shown in Fig. 2 and Fig. 3.

Fig. 2. Training and testing accuracy for the VGG-16 model for face recognition
Fig. 3. Training and validation loss for the VGG-16 model

In the case of the YOLO v3 model, a total validation loss of 8.17 has been obtained. The mAP (mean average precision) over all classes is 27.109%, with an FPS of 14.42. The corresponding loss curves are shown in Fig. 4.

Fig. 4. Confidence loss, GIoU loss, probability loss and total loss of the YOLO v3 model

V. CONCLUSION

In the proposed work, an exam proctoring algorithm using transfer learning to realize deep learning has been developed, which is capable of detecting whether a student is exhibiting any anomalies such as impersonation, usage of gadgets (or fraudulent objects for engaging in malpractice) and abnormal gaze. Currently, the extent of the work is limited to an individual-examinee situation, but since transfer learning has been used, the system is capable of easy deployment on a large scale compared to other conventional models. The future scope of this project includes the use of speech processing, which can improve the proctoring further.

VI. ACKNOWLEDGEMENT

The authors would like to thank Amrita Vishwa Vidyapeetham and the associated faculty, who helped realize the project and guided it towards its successful completion.

VII. REFERENCES

[1] Y. Atoum, L. Chen, A. X. Liu, S. D. H. Hsu and X. Liu, "Automated Online Exam Proctoring," IEEE Transactions on Multimedia, vol. 19, no. 7, pp. 1609-1624, July 2017, doi: 10.1109/TMM.2017.2656064.
[2] S. Prathish, A. N. S. and K. Bijlani, "An intelligent system for online exam monitoring," 2016 International Conference on Information Science (ICIS), Kochi, 2016, pp. 138-143, doi: 10.1109/INFOSCI.2016.7845315.
Peng, "Eye Gaze Tracking With a Web Camera in a Desktop Environment," in IEEE Transactions on Human- Machine Systems, vol. 45, no. 4, pp. 419-430, Aug. 2015, doi: 10.1109/THMS.2015.2400442. [4] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016, pp. 779-788, doi: 10.1109/CVPR.2016.91. [5] Park, Seonwook & Spurr, Adrian & Hilliges, Otmar. (2018). Deep Pictorial Gaze Estimation. 10.1007/978-3-030-01261- 8_44. [6] Y. Taigman, M. Yang, M. Ranzato and L. Wolf, "DeepFace: Closing the Gap to Human-Level Performance in Face Verification," 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, 2014, pp. 1701-1708, doi: 10.1109/CVPR.2014.220. [7] B. Benjdira, T. Khursheed, A. Koubaa, A. Ammar and K. Ouni, "Car Detection using Unmanned Aerial Vehicles: Comparison between Faster R-CNN and YOLOv3," 2019 1st International Conference on Unmanned Vehicle Systems-Oman (UVS), Muscat, Oman, 2019, pp. 1-6, doi: 10.1109/UVS.2019.8658300. [8] Alessio, Helaine & Malay, Nancy & Maurer, Karsten & Bailer, Albert & Rubin, Beth. (2017). Examining the Effect of Proctoring on Online Test Scores. Online Learning. 21.10.24059/olj.v21i1.885. [9] S. Park, A. Spurr, and O. Hilliges, “Deep Pictorial Gaze Estimation,” arXiv [cs.CV], 2018. [10] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv [cs.CV], 2014. [11] H. S. G. Asep and Y. Bandung, "A Design of Continuous User Verification for Online Exam Proctoring on M-Learning," 2019 International Conference on Electrical Engineering and Informatics (ICEEI), Bandung, Indonesia, 2019, pp. 284-289, doi: 10.1109/ICEEI47359.2019.8988786. [12] M. Ghizlane, B. Hicham and F. H. Reda, "A New Model of Automatic and Continuous Online Exam Monitoring," 2019 International Conference on Systems of Collaboration Big Data, Internet of Things & Security (SysCoBIoTS), Casablanca, Morocco, 2019, pp. 1-5, doi: 10.1109/SysCoBIoTS48768.2019.9028027. [13] R. R. Nair, S. H. Karumanchi and T. Singh, "Neuro-Fuzzy based Multimodal Medical Image Fusion," 2020 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT) , Bangalore, India, 2020, pp. 1-6, doi: 10.1109/CONECCT50063.2020.9198404. [14] D. Vijayan and R. Lavanya, “Ensemble of density-specific experts for mass characterization in mammograms,” Signal Image Video Process., 2021. [15] H. V. and K. R., "Real Time Pedestrian Detection Using Modified YOLO V2," 2020 5th International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 2020, pp. 855-859, doi: 10.1109/ICCES48766.2020.9138103. [16] M. Sushil, G. Suguna, R. Lavanya, and M. Nirmala Devi, “Performance comparison of pre-trained deep neural networks for automated glaucoma detection,” in Proceedings of the International Conference on ISMAC in Computational Vision and Bio-Engineering 2018 (ISMAC-CVB), Cham: Springer International Publishing, 2019, pp. 631–637. 312 Authorized licensed use limited to: University of Canberra. Downloaded on November 08,2022 at 11:24:15 UTC from IEEE Xplore. Restrictions apply.