GitHub - NerminImamovic/testOracleAutomation: Test Oracle Automation with Machine Learning: A Feasibility Study

Test Oracle Automation with Machine Learning: A Feasibility Study

Mälardalen University

School of Innovation Design and Engineering

Västerås, Sweden

Thesis for the Degree of Master of Science (60 credits) in Computer Science with Specialization in Software Engineering - 15.0 credits

Nermin Imamovic

nic17001@student.mdh.se

Examiner: Mobyen Uddin Ahmed, Mälardalen University, Västerås, Sweden

Supervisor: Adnan Causevic, Mälardalen University, Västerås, Sweden

Company Supervisor: Ola Sellin, Bombardier Transportation, Östra Ringvägen 2, 722 14 Västerås, Sweden

Abstract

A train is a complex system in which every subsystem plays an important role. If one subsystem does not work as intended, the correctness of the whole train becomes uncertain. To ensure that the system works properly, each subsystem should be tested individually and then integrated into the whole system. Each subsystem consists of modules with different functionalities that must be tested.

Testing different functionalities often requires different approaches. Some functionalities require domain knowledge from a human expert, such as the classification of signals in different use cases in Propulsion and Controls (PPC) at Bombardier Transportation. For this reason, we need to simulate the use of expert knowledge in a given domain. We investigate machine learning techniques for such cases, with the aim of creating a system that automatically classifies signals based on previous human knowledge.

This case study was conducted at Bombardier Transportation (BT), Västerås, in the Train Control Management System (TCMS) and Propulsion and Controls (PPC) departments, where data was collected, analyzed, and evaluated. We propose a machine-learning-based method for solving the oracle problem for a specific use case. We also describe the steps that can be used to solve the test oracle problem when signals are part of the verdict process.
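To make the idea of a learned oracle more concrete, here is a minimal sketch of feature-based signal classification. It is an illustration only and not the method from the thesis: the summary features, the k-nearest-neighbour rule, the function names, and the sample signals are all assumptions chosen for brevity.

```python
import math
import statistics
from collections import Counter


def extract_features(signal):
    """Summarize one signal (a list of samples) as a small feature vector."""
    return (
        statistics.fmean(signal),   # average level
        statistics.pstdev(signal),  # spread around the average
        min(signal),
        max(signal),
    )


def knn_verdict(train, signal, k=3):
    """Return the majority label among the k nearest labeled signals."""
    query = extract_features(signal)
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical recordings labeled by a human expert (invented data).
labeled = [
    ([0.0, 0.1, 0.0, 0.1, 0.0], "pass"),
    ([0.1, 0.0, 0.1, 0.0, 0.1], "pass"),
    ([0.0, 0.2, 0.1, 0.0, 0.1], "pass"),
    ([0.0, 2.5, -2.0, 3.0, -2.5], "fail"),
    ([1.0, -3.0, 2.0, -2.5, 3.0], "fail"),
    ([0.5, 3.5, -3.0, 2.0, -1.5], "fail"),
]
train = [(extract_features(s), label) for s, label in labeled]

print(knn_verdict(train, [0.1, 0.1, 0.0, 0.2, 0.0]))    # quiet signal
print(knn_verdict(train, [0.0, 3.0, -2.5, 2.5, -3.0]))  # oscillating signal
```

The expert's earlier verdicts act as training labels, so a new signal receives the verdict of the labeled signals it most resembles. Real signals would likely need richer features and a proper evaluation step such as cross-validation.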

Keywords: Test oracle automation, the oracle problem, machine learning, classification, feature engineering, signal classification, time-series analysis, time-series classification, multivariate time-series classification
