Sustainable and Scalable Setup for Teaching Big Data Computing

Linh B Ngo and Hoang Bui

Volume 14, Issue 1 (July 2023), pp. 46–52

https://doi.org/10.22369/issn.2153-4136/14/1/7

BibTeX
@article{jocse-14-1-7,
  author={Linh B Ngo and Hoang Bui},
  title={Sustainable and Scalable Setup for Teaching Big Data Computing},
  journal={The Journal of Computational Science Education},
  year=2023,
  month=jul,
  volume=14,
  issue=1,
  pages={46--52},
  doi={https://doi.org/10.22369/issn.2153-4136/14/1/7}
}

As more students seek careers in big data analytics and data science, big data education has become a focal point in the curricula of many colleges and universities. Teaching and learning big data in a classroom setting poses many challenges. One of the biggest is preparing big data infrastructure that gives students meaningful hands-on experience. Setting up the necessary distributed computing resources is a delicate task for instructors and system administrators because there is no one-size-fits-all solution. In this paper, we propose an approach that facilitates the creation of the computing environment on both personal computers and public cloud resources. This combined approach meets different needs and can be used in an educational setting to support different big data learning activities. We discuss and reflect on our experience using these systems in teaching undergraduate and graduate courses.