Browse JOCSE

Scientific Skills, Identity, and Career Aspiration Development from Early Research Experiences in Computer Science

Cecilia O. Alm and Reynold Bailey

Volume 13, Issue 1 (April 2022), pp. 2–16

The computer science research workforce is characterized by a lack of demographic diversity. To address this, we designed and evaluated an end-to-end mentored undergraduate research intervention to nurture diverse cohorts' skills for research and develop their vision of themselves as scientists. We hypothesized that this intervention would (a) grow scientific skills, (b) increase science identity, and (c) stimulate students to view scientific careers in computer science as future viable options. The evaluation of the hypotheses addressed the limitations in self-evaluation with a multicomponent evaluation framework, comprising five forms of evidence from faculty and students, engaging on team projects, with cohorts additionally participating in professional development programming. Results indicated that students gained in scientific skills and broadened their identity as scientists and, to some degree, strengthened their outlook on research careers. The introduced structured intervention and evaluation framework were part of a US National Science Foundation Research Experiences for Undergraduates (REU) computing-focused summer program at Rochester Institute of Technology and are applicable in other scientific disciplines and institutional settings.

Creating a Graphical Tool for Non-Programmers to Use to Make Heatmaps

Nicholas Alicea, Akenpaul Chani, Lam Le, Hayata Suenaga, David Toth, Selam Van Voorhis, and Jessica Wooten

Volume 13, Issue 1 (April 2022), pp. 17–20

Heatmaps are used to visualize data to enable people to quickly understand them. While there are libraries that enable programmers to create heatmaps with their data, scientists who do not typically write programs need a way to quickly create heatmaps to understand their data and use those figures in their publications. One of the authors is not a programmer but needed a way to generate heatmaps for their research. For a summer undergraduate research experience, we created a program with a graphical user interface to allow non-programmers, including that author, to create heatmaps to visualize their data with just a few mouse clicks. The program allows the user to easily customize their heatmaps and export them as PNG or PDF files to use in their publications.

Magic Castle — Enabling Scalable HPC Training through Scalable Supporting Infrastructures

Félix-Antoine Fortin and Alan Ó Cais

Volume 13, Issue 1 (April 2022), pp. 21–22

The potential HPC community grows ever wider as methodologies such as AI and big data analytics push the computational needs of more and more researchers into the HPC space. As a result, requirements for training are exploding as HPC adoption continues to gather pace. However, the number of topics that can be thoroughly addressed without providing access to actual HPC resources is very limited, even at the introductory level. In cases where access to production HPC resources is available, security concerns and the typical overhead of arranging for account provision and training reservations make the scalability of this approach challenging.

Best Practices for NERSC Training

Yun (Helen) He and Rebecca Hartman-Baker

Volume 13, Issue 1 (April 2022), pp. 23–26

The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory (LBNL) organizes approximately 20 training events per year for its 8,000 users from 800 projects, who have varying levels of High Performance Computing (HPC) knowledge and familiarity with NERSC's HPC resources. Due to the novel circumstances of the pandemic, NERSC began transforming our traditional smaller-scale, on-site training events to larger-scale, fully virtual sessions in March 2020. We treated this as an opportunity to try new approaches and improve our training best practices. This paper describes the key practices we have developed since the start of this transformation, including considerations for organizing events; collaboration with other HPC centers and the DOE ECP Program to increase reach and impact of events; targeted emails to users to increase attendance; efficient management of user accounts for computational resource access; strategies for preventing Zoombombing; streamlining the publication of professional-quality, closed-captioned videos on the NERSC YouTube channel for accessibility; effective communication channels for Q&A; tailoring training contents to NERSC user needs via close collaboration with vendors and presenters; standardized training procedures and publishing of training materials; and considerations for planning HPC training topics. Most of these practices will be continued after the pandemic as effective norms for training.

Building a Computational and Data Science Workforce

Katharine Cahill, Linda Akli, Tandabany Dinadayalane, Ana Gonzalez, Raphael D. Isokpehi, Asamoah Nkwanta, Rachel Vincent-Finley, Lorna Rivera, and Ahlam Tannouri

Volume 13, Issue 1 (April 2022), pp. 27–31

Under-representation of minorities and women in the STEM workforce, especially in computing, is a contributing factor to the Computational and Data Science (CDS) workforce shortage. In 2019, 12 percent of the workforce was African American, while only 7 percent of STEM workers were African American with a bachelor's degree or higher. The Hispanic share of the workforce increased to 18 percent by 2019; Hispanics with a bachelor's degree or higher are only 8 percent of the STEM workforce [1]. Although some strides have been made in integrating CDS competencies into the university curriculum, the pace of change has been slow, resulting in a critical shortage of sufficiently qualified students at both the baccalaureate and graduate levels. The NSF Working Group on Realizing the Potential of Data Science final report recommends "strengthening curriculum at EPSCoR and Minority Serving Institutions (MSI) so students are prepared and competitive for employment opportunities in industry and academia" [2]. However, resource constraints and large teaching loads can impede the ability of MSIs and smaller institutions to quickly respond and make the necessary curriculum changes. The Ohio Supercomputer Center (OSC), in collaboration with Bethune-Cookman University (B-CU), Clark Atlanta University (CAU), Morgan State University (Morgan), Southeastern Universities Research Association (SURA), Southern University and A&M College (SUBR), and the University of Puerto Rico at Mayagüez (UPRM), is piloting a Computational and Data Science Curriculum Exchange (C2Exchange) to address the challenges associated with sustained access to computational and data science courses in institutions with high percentage enrollment of students from populations currently under-represented in STEM disciplines. The goal of the C2Exchange pilot is to create a network for resource-constrained institutions to share CDS courses and increase their capacity to offer CDS minors and certificate programs. Over the past three years we have found that the exchange model facilitates the sharing of curriculum and expertise across institutions for immediate implementation of some courses and long-term capacity building for new Computational and Data Science programs and minors.

Tailored Computing Instruction for Economics Majors

Richard Lawrence, Zhenhua He, Wesley Brashear, Ridham Patoliya, Honggao Liu, and Dhruva K. Chakravorty

Volume 13, Issue 1 (April 2022), pp. 32–37

Responding to the growing need for discipline-specific computing curricula in academic programs, we offer a template to help bridge the gap between informal and formal curricular support. Here, we report on a twenty-contact-hour computing course developed for economics majors at Texas A&M University. The course is built around thematic laboratories that each include learning objectives, learning outcomes, assignments, and assessments and is geared toward students with a high-school level knowledge of mathematics and statistics. Offered in an informal format, the course leverages the wide applicability of the Python programming language and scaffolding offered by discipline-specific, hands-on activities to introduce a curriculum that covers introductory topics in programming while prioritizing approaches that are more relevant to the discipline. The design leverages technology to offer classes in an interactive, Web-based format for both in-person and remote learners, ensuring easy access and scalability to other institutions as needed. To ensure easier adoption among faculty and offer differentiated learning opportunities for students, lectures are modularized to 10-minute segments that are mapped to other concepts covered during the entire course. Class notes, lectures, and exercises are pre-staged and leverage aspects of flipped classroom methods. The course concludes with a group project and follow-on engagements with instructors. In future iterations, curriculum can be extended with a capstone in a Web-based asynchronous certification process.

Leveraging Northeast Cyberteam Successes to Build the CAREERS Cyberteam Program: Initial Lessons Learned

Andrew Sherman, John Goodhue, Julie Ma, Kaylea Nelson, Eric Brown, Christopher Carothers, Galen Collier, Adrian Del Maestro, Andrea Elledge, Wayne Figurelle, John Huffman, Gaurav Khanna, Neil McGlohon, Sia Najafi, Jeff Nucciarone, Anita Schwartz, Bruce Segee, Scott Valcourt, and Ralph Zottola

Volume 13, Issue 1 (April 2022), pp. 38–43

Given the pivotal role of data and cyberinfrastructure (CI) in teaching and scientific discovery, it is essential that researchers at small and mid-sized institutions be empowered to fully exploit them. While access to physical infrastructure is essential, it is equally important to have access to people known as Research Computing Facilitators (RCFs) who possess a mix of technical knowledge and interpersonal skills that enables faculty to make the best use of available computing resources. Meeting this need is a significant challenge for small and mid-sized institutions that do not have the critical mass to build teams of RCFs on site. Launched in 2017, the National Science Foundation (NSF) funded Northeast Cyberteam (NECT) built a program to address these challenges for researchers/educators at small and mid-sized institutions in four states — Maine, Massachusetts, New Hampshire, and Vermont — while simultaneously developing self-service tools that support management and execution of RCF engagements. These tools are housed in a Portal called Connect.cyberinfrastructure and have enabled adoption of program methods by the broader research computing community. Initiated in 2020, the NSF-funded Cyberteam to Advance Research and Education in Eastern Regional Schools (CAREERS) has leveraged the NECT methods and tools to jumpstart a program that supports researchers at small and mid-sized institutions in six states and lays the groundwork for an additional level of support via a distributed network of experts directly accessible by the researchers in the region. This paper discusses findings from the first four years of NECT and the first year of CAREERS.

Technology Laboratories: Facilitating Instruction for Cyberinfrastructure Infused Data Sciences

Zhenhua He, Jian Tao, Lisa M. Perez, and Dhruva K. Chakravorty

Volume 13, Issue 1 (April 2022), pp. 44–49

While artificial intelligence and machine learning (AI/ML) frameworks gain prominence in science and engineering, most researchers face significant challenges in adapting complex AI/ML workflows to campus and national cyberinfrastructure (CI) environments. Data from the Texas A&M High Performance Research Computing (HPRC) researcher training program indicate that researchers increasingly want to learn how to migrate and work with their pre-existing AI/ML frameworks on large-scale computing environments. Building on the continuing success of our work in developing innovative pedagogical approaches for CI training, we expand CI-infused pedagogical approaches to teach technology-based AI and data sciences. We revisit the pedagogical approaches used in the decades-old tradition of laboratories in the Physical Sciences that taught concepts via experiential learning. Here, we structure a series of exercises on interactive computing environments that give researchers immediate hands-on experience in AI/ML and data science technologies that they will use as they work on larger CI resources. These exercises, called "tech-labs," assume that participating researchers are familiar with AI/ML approaches and focus on hands-on exercises that teach researchers how to use these approaches on large-scale CI. The tech-labs offer four consecutive sessions, each introducing a learner to specific technologies offered in CI environments for AI/ML and data workflows. We report on our tech-lab offered for Python-based AI/ML approaches during which learners are introduced to Jupyter Notebooks followed by exercises using Pandas, Matplotlib, Scikit-learn, and Keras. The program includes a series of enhancements such as container support and easy launch of virtual environments in our Web-based computing interface. The approach is scalable to programs using a command line interface (CLI) as well. In all, the program offers a shift in focus from teaching AI/ML toward increasing adoption of AI/ML in large-scale CI.

Expanding Interactive Computing to Facilitate Informal Instruction in Research Computing

Richard Lawrence, Tri M. Pham, Phi T. Au, Xin Yang, Kyle Hsu, Stuti H. Trivedi, Lisa M. Perez, and Dhruva K. Chakravorty

Volume 13, Issue 1 (April 2022), pp. 50–54

Successful outreach that informs computational researchers about the benefits of switching to a different computing environment depends on the educator's ability to showcase practical research and development workflows in the new computing environment. Interactive, graphical computing environments are crucial to engage learners in computing education and offer researchers easier ways to adopt new technologies. Interactive, graphical computing allows learners to see the results of their work in real time, which provides the needed feedback for learning and enables chunking of complex tasks. Moreover, there is a natural synergy between computing education and computing research; researchers who are exposed to new computing skills within the context of an interactive and engaging environment are more likely to retain the new skills and adopt the new computing environment in their research and development workflows. Support for interactive, graphical workflows with modern computing tools in containerized computing environments has to be incorporated on high performance computing systems. To begin to address this deficiency, here we discuss our approach to teaching containerization technologies in the popular integrated development environment of the Jupyter Notebook. We report on our scheme for implementing containerized software environments for interactive, graphical computing within the Open OnDemand (OOD) framework for research computing workflows, providing an accessible on-ramp for researchers transitioning to containerized technologies. In addition, we introduce several quality-of-life improvements for researchers and educators that will encourage them to continue to use the platform.

Student Simulations of Local Wildfires in a Liberal Arts Geography Course

Ted Wetherbee and Elizabeth Jones

Volume 12, Issue 3 (December 2021), pp. 2–12

Wildfire simulations are developed for interactive use in online geography classes under the course titled Disasters. Development of local capability to design and offer computational activities in courses at a small, rural college is a long-term activity based on integrated scientific research and education efforts.

Expanding HLRS Academic HPC Simulation Training Programs to More Target Groups

Tibor Döpper, Bärbel Große-Wöhrmann, Doris Lindner, Darko Milakovic, Jutta Oexle, Michael M. Resch, Oliver Scheel, Sven Slotosch, and Leon Widmaier

Volume 12, Issue 3 (December 2021), pp. 13–26

For a long time, high-performance computers and simulations were of interest only at universities and research institutes. In recent years, however, their application and relevance in a wider field has grown; not only do industry and small and medium-sized businesses benefit from these technologies, but their social and political impacts are also increasing significantly. Therefore, there is an increasing need for experts in this field as well as better understanding of the importance of high-performance computing (HPC) and simulations among the general public. For this reason, the German National Supercomputing Center HLRS has broadened its academic training program to include courses for students and teachers as well as for professionals. Specifically, this expansion involves two projects: "Simulated Worlds," which offers a variety of educational programs for middle and high school students, and the "MoeWE" project with its "Supercomputing Academy" for professionals. These projects complement the center's academic educational focus by addressing the special needs of these new target groups who have otherwise not been able to benefit from HLRS' academic training program. In this paper, we present background concepts, programmatic offerings, and exemplary content of the two projects; discuss the experiences involved in their development and implementation; and provide insights that may be useful for improving education and training in this area.

Infusing Fundamental Competencies of Computational Science to the General Undergraduate Curriculum

Ana C. González-Ríos

Volume 12, Issue 3 (December 2021), pp. 27–34

The growing need for a workforce that can analyze, model, and interpret real-world data strongly points to the importance of imparting fundamental concepts of computational and data science to the current student generation regardless of their intended majors. This paper describes the experiences in developing and implementing a course in computation, modeling, and simulation. The main goal of the course was to infuse fundamental competencies of computational science to the undergraduate curriculum. The course also aimed at making students aware that modeling and simulation have become an essential part of the research and development process in the sciences, social sciences, and engineering. The course was targeted to students of all majors.

DeapSECURE Computational Training for Cybersecurity Students: Improvements, Mid-Stage Evaluation, and Lessons Learned

Wirawan Purwanto, Yuming He, Jewel Ossom, Qiao Zhang, Liuwan Zhu, Karina Arcaute, Masha Sosonkina, and Hongyi Wu

Volume 12, Issue 2 (February 2021), pp. 3–10

DeapSECURE is a non-degree computational training program that provides a solid high-performance computing (HPC) and big-data foundation for cybersecurity students. DeapSECURE consists of six modules covering a broad spectrum of topics such as HPC platforms, big-data analytics, machine learning, privacy-preserving methods, and parallel programming. In the second year of this program, to improve the learning experience, we implemented a number of changes, such as grouping modules into two broad categories, "big-data" and "HPC"; creating a single cybersecurity storyline across the modules; and introducing post-workshop (optional) "hackshops." Two major goals of these changes are, firstly, to effectively engage students to maintain high interest and attendance in such a non-degree program, and, secondly, to increase knowledge and skill acquisition. To assess the program, and in particular the changes made in the second year, we evaluated and compared the execution and outcomes of the training in Year 1 and Year 2. The assessment data shows that the implemented changes have partially achieved our goals, while simultaneously providing indications where we can further improve. The development of a fully on-line training mode is planned for the next year, along with a reproducibility pilot study to broaden the subject domain from cybersecurity to other areas, such as computations with sensitive data.

Exploring Remote Learning Methods for User Training in Research Computing

Dhruva K. Chakravorty, Lisa M. Perez, Honggao Liu, Braden Yosko, Keith Jackson, Dylan Rodriguez, Stuti H. Trivedi, Levi Jordan, and Shaina Le

Volume 12, Issue 2 (February 2021), pp. 11–17

The COVID-19 national health crisis forced a sudden and drastic move to online delivery of instruction across the nation. This almost instantaneous transition from a predominantly traditional "in-person" instruction model to a predominantly online model has forced programs to rethink instructional approaches. Before COVID-19 and mandatory social distancing, online training in research computing (RC) was typically limited to "live-streaming" informal in-person training sessions. These sessions were augmented with hands-on exercises on live notebooks for remote participants, with almost no assessment of student learning. Unlike select instances that focused on an international audience, local training curricula were designed with the in-person attendee in mind. Sustained training for RC became more important when several other avenues of research were diminished. Here we report on two educational approaches that were implemented in the informal program hosted by Texas A&M High Performance Research Computing (HPRC) in the Spring, Summer, and Fall semesters of 2020. These sessions were offered over Zoom, with the instructor assisted by moderators using the chat features. The first approach duplicated our traditional in-person sessions in an online setting. These sessions were taught by staff, and the focus was on offering a lot of information. A second approach focused on engaging learners via shorter pop-up courses in which participants chose the topic matter. This approach implemented a peer-learning environment, in which students taught and moderated the training sessions. These sessions were supplemented with YouTube videos and continued engagement over a community Slack workspace. An analysis of these approaches is presented.

Transitioning Education and Training to a Virtual World, Lessons Learned

S. Charlie Dey, Victor Eijkhout, Lars Koesterke, Je'aime Powell, Susan Lindsey, Rosalia Gomez, Brandi Kuritz, and Joshua Freeze

Volume 12, Issue 2 (February 2021), pp. 18–20

Interaction is the key to making education more engaging. Effective interaction is difficult enough to achieve in a live classroom, and it is extremely challenging in a virtual environment. To keep the degree of instruction and learning at the levels our students have come to expect, additional effort is required on other facets that motivate learning, whether that learning takes place in our academic courses, student internship programs, Summer Institute Series, or NSF/TACC's Frontera Fellowship Program. We focus our efforts on lecturing less and interacting more.

Bringing GPU Accelerated Computing and Deep Learning to the Classroom

Joseph Bungo and Daniel Wong

Volume 12, Issue 2 (February 2021), pp. 21–21

The call for accelerated computing and data science skills is soaring, and classrooms are on the front lines of feeding the demand. The NVIDIA Deep Learning Institute (DLI) offers hands-on training in AI, accelerated computing, and accelerated data science. Developers, data scientists, educators, researchers, and students can get practical experience powered by GPUs in the cloud. DLI Teaching Kits are complete course solutions that lower the barrier of incorporating AI and GPU computing in the classroom. The DLI University Ambassador Program enables qualified educators to teach DLI workshops, at no cost, across campuses and academic conferences to faculty, students, and researchers. DLI workshops offer student certification that demonstrates subject matter competency and supports career growth. Join NVIDIA's higher education leadership and leading adopters from academia to learn how to get involved in these programs.

XSEDE EMPOWER: Engaging Undergraduates in the Work of Advanced Digital Services and Resources

Aaron Weeden

Volume 12, Issue 2 (February 2021), pp. 22–24

To address the need for a diverse and capable workforce in advanced digital services and resources, the Shodor Education Foundation has been coordinating an undergraduate student program for the Extreme Science and Engineering Discovery Environment (XSEDE). The name of the program is EMPOWER (Expert Mentoring Producing Opportunities for Work, Education, and Research). The goal of the program is to engage a diverse group of undergraduate students in the work of XSEDE, matching them with faculty and staff mentors who have projects that make use of XSEDE services and resources or that otherwise prepare students to use these types of services and resources. Mentors have coordinated projects in computational science and engineering research in many fields of study as well as systems and user support. Students work for a semester, quarter, or summer at a time and can participate for up to a year supported by stipends from the program, at different levels depending on experience. The program has run for 11 iterations from summer 2017 through fall 2020. The 111 total student participants have been 28% female and 31% underrepresented minority, and they have been selected from a pool of 272 total student applicants who have been 31% female and 30% underrepresented minority. We are pleased that the selection process does not disadvantage women and minorities but would also like to see these proportions increase. At least one fourth of the students have presented their work in articles or at conferences, and several credit the program with moving them towards graduate study or otherwise advancing them in their careers.

Pawsey Training Goes Remote: Experiences and Best Practices

Ann Backhaus, Sarah Beecroft, Lachlan Campbell, Maciej Cytowski, Marco De La Pierre, Luke Edwards, Pascal Elahi, Alexis Espinosa Gayosso, and Yathu Sivarajah

Volume 12, Issue 2 (February 2021), pp. 25–30

The Pawsey Supercomputing Centre training has evolved over the past decade, but never as rapidly as during the COVID-19 pandemic. The imperative to quickly move all training online — to reach learners facing travel restrictions and physical distancing requirements — has expedited our shift online. We had planned to increase our online offerings, but not at this pace or to this extent. In this paper, we discuss the challenges we faced in making this transition, including how to creatively motivate and engage learners, build our virtual training delivery skills, and build communities across Australia. We share our experience in using different learning methods, tools, and techniques to address specific educational and training purposes. We share trials and successes we have had along the way. Our guiding premise is that there is no universal learning solution. Instead, we purposefully select various solutions and platforms for different groups of learners.

High-Performance Computing Course Development for Cultivating the Generalized System-level Comprehensive Capability

Juan Chen

Volume 12, Issue 2 (February 2021), pp. 31–32

Supercomputers are moving towards exascale computing, high-performance computer systems are becoming larger and larger, and the scale and complexity of high-performance computing (HPC) applications are also increasing rapidly, which places high demands on the cultivation of HPC majors and on HPC course development. HPC majors are required to be able to solve practical problems in a specific field of high-performance computing, which may be problems of system design or problems in a specific HPC application field. Regardless of the type of problem, its complexity and difficulty are often very high because HPC is interdisciplinary. The development of HPC courses to meet these talent cultivation needs must emphasize the cultivation of students' Generalized System-level Comprehensive Capabilities, so that students can master the key elements within the limited scope of course learning. System-level Comprehensive Capability refers to the ability to use knowledge of the computer system to solve practical problems. The ACM/IEEE Joint Computer Science Curricula 2013 (CS2013) also includes a System-level Perspective. System-level Comprehensive Capability is considered to be a crucial factor in improving students' system development ability and professional ability. This is especially important for students majoring in high-performance computing. Furthermore, because the HPC field is interdisciplinary and highly complex, System-level Comprehensive Capability is not enough for HPC majors, and students need to have Generalized System-level Comprehensive Capabilities. A knowledge system at the computer system level "vertically" (from bottom to top: parallel computer architecture, operating system/resource management system, compilation, library optimization, etc.) is no longer enough; multiple high-performance computing application areas should also be "horizontally" involved. Generalized System-level Comprehensive Capabilities, both vertical and horizontal, can meet the needs of different types of high-performance computing talents.

Employing Directed Internship and Apprenticeship for Fostering HPC Training and Education

Elizabeth Bautista and Nitin Sukhija

Volume 12, Issue 2 (February 2021), pp. 33–36

Positions within High Performance Computing are difficult to fill, especially that of Site Reliability Engineer within an operational area. At the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory (LBNL), the Operations team manages the HPC computational facility with a complex cooling ecosystem and also serves as the wide area network operations center. Therefore, this position requires skill sets in four specific areas: system administration, storage administration, facility management, and wide area networking. These skills are not taught in their entirety in any educational program; therefore, a new graduate will require extensive training before they can become proficient in all areas. The proximity to Silicon Valley adds another challenge in finding qualified candidates. NERSC has implemented a new approach patterned after the apprenticeship program in the trades. This program requires an intern or apprentice to fulfill milestones during their internship or apprenticeship timeframe, with constant evaluation, feedback, mentorship, and hands-on work that allow candidates to demonstrate their growing skill that will eventually lead to winning a career position.

Creating a Platform for Self-Service Learning and Collaboration in the Rapidly Changing Environment of Research Computing

Julie Ma, Torey Battelle, Katia Bulekova, Aaron Culich, John Goodhue, Jacob Pessin, Vanessa Sochat, Dana Brunson, Tom Cheatham, Sia Najafi, Chris Hill, Adrian Del Maestro, Bruce Segee, Ralph Zottola, Scott Valcourt, Zoe Braiterman, Raminder Singh, Robert Thoelen, and Jack Smith

Volume 12, Issue 2 (February 2021), pp. 37–40

PDF icon Download PDF

Ask.CI, the Q&A site for Research Computing, was launched at PEARC18 with the goal of aggregating answers to a broad spectrum of questions that are commonly asked by the research computing community. As researchers, facilitators, staff, students, and others ask and answer questions on Ask.CI, they create a shared knowledge base for the larger community. For smaller institutions, the knowledge base provided by Ask.CI provides a wealth of knowledge that was previously not readily available to scientists and educators in an easily searchable Q&A format. For larger institutions, this self-service model frees up time for facilitators and cyberinfrastructure engineers to focus on more advanced subject matter. Recognizing that answers evolve rapidly with new technology and discovery, Ask.CI has built in voting mechanisms that utilize crowdsourcing to ensure that information stays up to date. Establishing a Q&A site of this nature requires some tenacity. In partnership with the Campus Champions, Ask.CI has gained traction and continues to engage the broader community to establish the platform as a powerful tool for research computing. Since launch, Ask.CI has attracted over 250,000 page views (currently averaging nearly 5,000 per week), more than 400 contributors, hundreds of topics, and a broad audience that spans the US and parts of Europe and Asia. Ask.CI has shown steady growth in both contributions and audience since it was launched in 2018 and is still evolving. In the past year, we introduced Locales, which allow institutions to create subcategories on Ask.CI where they can experiment with posting institution-specific content and use of the site as a component of their user support strategy.

The Design of a Practical Flipped Classroom Model for Teaching Parallel Programming to Undergraduates

Dirk Colbry

Volume 12, Issue 2 (February 2021), pp. 41–45

PDF icon Download PDF

This paper presents a newly developed course for teaching parallel programming to undergraduates. This course uses a flipped classroom model and a "hands-on" approach to learning with multiple real-world examples from a wide range of science and engineering problems. The intention of this course is to prepare students from a variety of STEM backgrounds to be able to take on supportive roles in research labs while they are still undergraduates. To this end, students are taught common programming paradigms such as benchmarking, shared memory parallelization (OpenMP), accelerators (CUDA), and shared network parallelization (MPI). Students are also trained in practical skills including the Linux command line, workflow/file management, installing software, discovering and using shared module systems (Lmod), and effectively submitting and monitoring jobs using a scheduler (SLURM).

Creative Assessment Design on a Master of Science Degree in Professional Software Development

Cathryn Peoples

Volume 12, Issue 2 (February 2021), pp. 46–57

PDF icon Download PDF

A Master of Science (MSc) conversion degree is one which retrains students in a new subject area within a fast-tracked period of time. This type of programme opens new opportunities to students beyond those gained through their originally-chosen degree. Students entering a conversion degree do so, in a number of cases, to improve career options, which might mean moving from an initially-chosen path to gain skills in a field that they now consider to be more attractive. With a core goal of improving future employability prospects, specific requirements are therefore placed on the learning outcomes achieved from the course content and delivery. In this paper, the learning outcomes are focused on the transferable skills intended to be gained as a result of the assessment design, disseminated to a cohort of students on a Master of Science (MSc) degree in Professional Software Development at Ulster University, United Kingdom. The coursework submissions are explored to demonstrate how module learning has been applied, in a creative way, to facilitate the assessment requirements.

What Influences Students' Understanding of Scalability Issues in Parallel Computing?

Juan Chen, Brett A. Becker, Youwen Ouyang, and Li Shen

Volume 12, Issue 2 (February 2021), pp. 58–65

PDF icon Download PDF

Graduates with high performance computing (HPC) skills are more in demand than ever before, most recently fueled by the rise of artificial intelligence and big data technologies. However, students often find it challenging to grasp key HPC issues such as parallel scalability. The increased demand for processing large-scale scientific computing data makes mastering parallelism more essential, with scalability often being a crucial factor. This is even more challenging when non-computing majors require HPC skills. This paper presents the design of a parallel computing course offered to atmospheric science majors. It discusses how the design addressed challenges presented by non-computer science majors who lack a background in fundamental computer architecture, systems, and algorithms. The content of the course focuses on the concepts and methods of parallelization, testing, and the analysis of scalability. Because all students confront many (non-HPC) scalability issues in the real world, and there may be similarities between real-world scalability and parallel computing scalability, the course design explores this similarity in an effort to improve students' understanding of scalability issues in parallel computing. The authors present a set of assignments and projects that leverage the Tianhe-2A supercomputer, ranked #6 in the TOP500 list of supercomputers, for testing. Pre- and post-questionnaires used to explore the effectiveness of the class design show an 11.7% improvement in correct answers and a decrease of 36.8% in obvious, but wrong, answers. The authors also find that students are in favor of this approach.

Promoting HPC Best Practices with the POP Methodology

Fouzhan Hosseini and Craig Lucas

Volume 12, Issue 2 (February 2021), pp. 66–69

PDF icon Download PDF

The performance of HPC applications depends on a wide range of factors, including algorithms, programming models, library and language implementations, and hardware. To make the problem even more complicated, many applications inherit different layers of legacy code, written and optimized for a different era of computing technologies. Due to this complexity, the task of understanding performance bottlenecks of HPC applications and making improvements often ends up being a daunting trial-and-error process. Problematically, this process often starts without having a quantitative understanding of the actual behavior of the HPC code. The Performance Optimisation and Productivity (POP) Centre of Excellence, funded by the EU under the Horizon 2020 Research and Innovation Programme, attempts to establish a quantitative methodology for the assessment of parallel codes. This methodology is based on a set of hierarchical metrics, where the metrics at the bottom of the hierarchy represent common causes of poor performance. These metrics provide a standard, objective way to characterize different aspects of the performance of parallel codes and therefore provide the necessary foundation for establishing a more systematic approach for performance optimization of HPC applications. In consequence, the POP methodology facilitates training new HPC performance analysts. In this paper, we will illustrate these advantages by describing two real-world examples where we used the POP methodology to help HPC users understand performance bottlenecks of their code.
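As an illustration of what a hierarchical metric can look like, the sketch below computes three efficiency figures from per-process timings. The abstract does not list the metrics themselves, so the definitions used here (load balance as average over maximum useful computation time, communication efficiency as maximum useful computation time over runtime, and parallel efficiency as their product) are an assumption based on the commonly published POP definitions. The function name and the sample timings are hypothetical.

```python
from statistics import mean


def pop_parallel_efficiency(useful_compute_times, runtime):
    """Return (load_balance, communication_efficiency, parallel_efficiency).

    useful_compute_times: seconds each process spent in useful computation.
    runtime: wall-clock seconds of the measured parallel region.
    The formulas are assumed definitions, not taken from the abstract.
    """
    if not useful_compute_times:
        raise ValueError("need at least one process timing")
    max_time = max(useful_compute_times)
    if max_time <= 0 or runtime <= 0:
        raise ValueError("timings must be positive")

    avg_time = mean(useful_compute_times)
    # Load balance: how evenly useful work is spread across processes.
    load_balance = avg_time / max_time
    # Communication efficiency: share of the runtime that the busiest
    # process spends in useful computation.
    communication_efficiency = max_time / runtime
    # Parent metric in the hierarchy: product of the two children.
    parallel_efficiency = load_balance * communication_efficiency
    return load_balance, communication_efficiency, parallel_efficiency


if __name__ == "__main__":
    # Hypothetical timings for four processes, in seconds.
    lb, ce, pe = pop_parallel_efficiency([8.0, 6.0, 6.0, 4.0], runtime=10.0)
    print(f"load balance={lb:.2f}, comm. efficiency={ce:.2f}, parallel efficiency={pe:.2f}")
```

Reading the hierarchy from the top down, a low parent value points the analyst to whichever child factor is low, which is the sense in which the bottom-level metrics indicate common causes of poor performance.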

The Computer Science Education Collaborative: Promoting Computer Science Teacher Education Programs for Preservice and In-service Teachers

Regina Toolin, Lisa Dion, and Robert Erickson

Volume 12, Issue 1 (January 2021), pp. 2–7

PDF icon Download PDF

This article reports on the efforts of the Computer Science Education Collaborative from 2018 to 2020 to develop and implement a new computer science licensure program for preservice teachers seeking a license to teach computer science in grades 7–12 in Vermont. We present a brief review of the literature related to computer science teacher education and describe the process of developing the computer science education minor and major concentration at the University of Vermont. As a form of reflection, we discuss the program development process and lessons learned by the collaborative that might be informative to other institutes of higher education involved in CS teacher education program design and implementation. Finally, we describe next steps for developing in-service licensure programs for teachers seeking computer science professional development or licensure in grades 7–12.

Laboratory Glassware Identification: Supervised Machine Learning Example for Science Students

Arun K. Sharma

Volume 12, Issue 1 (January 2021), pp. 8–15

PDF icon Download PDF

This paper provides a supervised machine learning example to identify laboratory glassware. This project was implemented in an Introduction to Scientific Computing course for first-year students at our institution. The goal of the exercise was to present a typical machine learning task in the context of a chemistry laboratory to engage students with computing and its applications to scientific projects. This is an end-to-end data science experience with students creating the dataset, training a neural network, and analyzing the performance of the trained network. The students collected pictures of various glassware in a chemistry laboratory. Four pre-trained neural networks, Inception-V1, Inception-V3, ResNet-50, and ResNet-101, were trained to distinguish between the objects in the pictures. The Wolfram Language was used to train the neural networks and to test the performance of the classifier. The students received hands-on training in the Wolfram Language and an elementary introduction to image classification tasks in the machine learning domain. Students enjoyed the introduction to machine learning applications and the hands-on experience of building and testing an image classifier to identify laboratory equipment.

Transport Phenomena in High-speed Wall-bounded Flows Subject to Concave Surface Curvature

Guillermo Araya and Ernie Rivera

Volume 12, Issue 1 (January 2021), pp. 16–23

PDF icon Download PDF

Turbulent boundary layers that evolve along the flow direction are ubiquitous. Moreover, accounting for the effects of wall-curvature driven pressure gradient and flow compressibility adds significant complexity to the problem. Consequently, hypersonic spatially-developing turbulent boundary layers (SDTBL) over curved walls are of crucial importance in aerospace applications, such as unmanned high-speed vehicles, scramjets, and advanced space aircraft. More importantly, hypersonic capabilities would provide faster responsiveness and longer range coverage to U.S. Air Force systems. Thus, the acquired understanding of the physics behind high speed boundary layers over curved wall-bounded flows can lead to the development of more efficient control techniques for the fluid flow (e.g., wave drag reduction) and aerodynamic heating on hypersonic vehicle design. In this investigation, a series of numerical experiments is performed to evaluate the effects of strong concave curvature and supersonic/hypersonic speeds (Mach numbers of 2.86 and 5, respectively) on the thermal transport phenomena that take place inside the boundary layer. The flow solver is based on a RANS approach. Two different turbulence models are compared: the SST (Shear Stress Transport) model by Menter and the standard k-ω model by Wilcox. Furthermore, numerical results are validated by means of experimental data from the literature (Donovan et al., J. Fluid Mech., 259, 1-24, 1994) for the moderate concave curvature case and a Mach number of 2.86. The present study provides a first insight into the flow physics to guide a better design of 3D meshes and computational boxes, as part of a more ambitious project that involves Direct Numerical Simulation (DNS) of curved wall-bounded flows in the supersonic/hypersonic regime. The novel aspects of this RANS analysis of concave curved walls are (i) the study of compressibility effects on the time-averaged velocity and temperature and (ii) the analysis of the influence of different inflow boundary conditions.

Performance Evaluation of Monte Carlo Based Ray Tracer

Ayobami Ephraim Adewale

Volume 12, Issue 1 (January 2021), pp. 24–31

PDF icon Download PDF

The main objective of computer graphics is to effectively depict an image in a virtual scene in its realistic form within a reasonable amount of time. This paper discusses two different ray tracing techniques and evaluates the performance of serial and parallel implementations of ray tracing, which in its serial form is known to be computationally intensive and was costly on earlier computers. The parallel implementation was achieved using OpenMP with C++, and it reached a maximum speedup of ten times over the serial implementation. The experiment in this paper can be used to teach high-performance computing students the benefits of multi-threading in computationally intensive algorithms and the benefits of parallel programming.
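For readers new to the terminology, speedup is conventionally the serial run time divided by the parallel run time, and parallel efficiency is the speedup divided by the number of threads. The following is a minimal sketch of that arithmetic. The function names, the timings, and the thread count are hypothetical and are not taken from the paper.

```python
def speedup(serial_seconds, parallel_seconds):
    """Speedup of a parallel run relative to the serial run."""
    if serial_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("run times must be positive")
    return serial_seconds / parallel_seconds


def thread_efficiency(serial_seconds, parallel_seconds, num_threads):
    """Fraction of ideal (linear) speedup achieved with num_threads threads."""
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    return speedup(serial_seconds, parallel_seconds) / num_threads


if __name__ == "__main__":
    # Hypothetical timings: 120 s serial, 12 s on 16 threads.
    s = speedup(120.0, 12.0)
    e = thread_efficiency(120.0, 12.0, 16)
    print(f"speedup = {s:.1f}x, efficiency = {e:.3f}")
```

An efficiency below 1 is expected in practice, because thread management, synchronization, and any serial portion of the code limit how close a program can come to linear speedup.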

Training Neural Networks to Accurately Determine Energies of Structures Outside of the Training Set Using Agglomerative Clustering

Carlos A. Barragan and Michael N. Groves

Volume 12, Issue 1 (January 2021), pp. 32–38

PDF icon Download PDF

Machine learning has been used to process large volumes of data in an efficient and timely manner, including as an alternative molecular calculator to replace more expensive ab initio techniques. Neural networks (NN) are the most predictive for new cases that are similar to examples in their training sets; however, it is sometimes necessary for the NN to accurately evaluate structures not in its training set. In this project, we quantify how clustering a training set into groups with similar geometric motifs can be used to train a NN so that it can accurately determine the energies of structures not in the training set. This was accomplished by generating over 800 C8H7N structures, relaxing them using DFTB+, and grouping them using agglomerative clustering. Some of these groups were assigned to the training group and used to train a NN using the pre-existing Atomistic Machine-learning Package (AMP). The remaining groups were evaluated using the trained NN and compared to the DFTB+ energy. These two energies were plotted and fitted to a straight line, where higher R² values correspond to the NN more accurately predicting the energies of structures not in its training set. This process was repeated systematically with a different number of nodes and hidden layers. It was found that for limited NN architectures, the NN did a poor job predicting structures outside of its training set. This was improved by adding hidden layers and nodes as well as increasing the size of the training set.
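The comparison step described above, fitting predicted energies against reference energies and reading off R², can be sketched with an ordinary least-squares line. This is an illustrative sketch only: the choice of the DFTB+ energy as the x-axis, the function name, and the sample values are assumptions, and the paper's own fitting procedure may differ.

```python
def linear_fit_r2(reference, predicted):
    """Least-squares line predicted = slope * reference + intercept.

    Returns (slope, intercept, r_squared). Treating the reference
    (e.g. DFTB+) energy as x is an assumption made for this sketch.
    """
    n = len(reference)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")

    mean_x = sum(reference) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in reference)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(reference, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in predicted)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("R^2 is undefined when either variable is constant")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares of the fitted line.
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(reference, predicted))
    return slope, intercept, 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical energies in arbitrary units, not data from the paper.
    ref = [0.0, 1.0, 2.0, 3.0]
    nn = [0.0, 1.0, 1.0, 2.0]
    slope, intercept, r2 = linear_fit_r2(ref, nn)
    print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.2f}")
```

An R² close to 1 means the fitted line explains nearly all of the variation in the predicted energies. A slope near 1 and an intercept near 0 would additionally indicate that the predictions match the reference values rather than merely correlate with them.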

Molecular Simulations for Understanding the Stabilization of Fullerenes in Water

Kendra Noneman, Christopher Muhich, Kevin Ausman, Mike Henry, and Eric Jankowski

Volume 12, Issue 1 (January 2021), pp. 39–48

PDF icon Download PDF

Making materials out of buckminsterfullerene is challenging, because it requires first dispersing the molecules in a solvent, and then getting the molecules to assemble in the desired arrangements. In this computational work, we focus on the dispersion challenge: How can we conveniently solubilize buckminsterfullerene? Water is a desirable solvent because of its ubiquity and biocompatibility, but its polarity makes the dispersion of nonpolar fullerenes challenging. We perform molecular dynamics simulations of fullerenes in the presence of fullerene oxides in implicit water to elucidate the role of interactions (van der Waals and Coulombic) on the self-assembly and structure of these aqueous mixtures. Seven coarse-grained fullerene models are characterized over a range of temperatures and interaction strengths using HOOMD-Blue on high performance computing clusters. We find that dispersions of fullerenes stabilized by fullerene oxides are observable in models where the net attraction among fullerenes is about 1.5 times larger than the attractions between oxide molecules. We demonstrate that simplified models are sufficient for qualitatively modeling micellization of these fullerenes and provide an efficient starting point for investigating how structural details and phase behavior depend upon the inclusion of more detailed physics.

Performance Analysis of the Parallel CFD Code for Turbulent Mixing Simulations

Tulin Kaman, Alaina Edwards, and John McGarigal

Volume 12, Issue 1 (January 2021), pp. 49–58

PDF icon Download PDF

Understanding turbulence and mixing due to the hydrodynamic instabilities plays an important role in a wide range of science and engineering applications. Numerical simulations of three dimensional turbulent mixing help us to predict the dynamics of two fluids of different densities, one over the other. The focus of this work is to optimize and improve the computational performance of the numerical simulations for the compressible turbulent mixing on Blue Waters, the petascale supercomputer at the National Center for Supercomputing Applications. In this paper, we study the effect of the programming models on time to solution. The hybrid programming model, which is a combination of parallel programming models, becomes a dominant approach. The most preferable hybrid model is the one that involves the Message Passing Interface (MPI), such as MPI + Pthreads, MPI + OpenMP, MPI + MPI-3 shared memory programming, and others with accelerator support. Among all choices, we choose the hybrid programming model that is based on MPI + OpenMP. We extend the purely MPI parallelized code with OpenMP parallelism and develop the hybrid version of the code. This new hybrid implementation of the code is set up in a way that multiple MPI processes handle the interface propagation, whereas multiple OpenMP threads handle the high order weighted essentially non-oscillatory numerical scheme.

Using Molecular Visualization as a Tool for Culturally Competent and Culturally Relevant Teaching: A Guided-Inquiry Biochemistry Activity

Pumtiwitt McCarthy, Richard Williams, Cleo Hughes-Darden, Roni Ellington, Paminas Mayaka, Monica Jackson, and Asamoah Nkwanta

Volume 11, Issue 2 (April 2020), pp. 2–6

PDF icon Download PDF

The central dogma is a key foundational concept in biochemistry. The idea that DNA mutations cause change at the protein level can be abstract for students. To provide a real-world example of the effect of mutation on protein function, a molecular visualization module was developed and incorporated into two biochemistry courses. This inquiry-based activity explored the molecular basis and cultural relevance of sickle cell anemia. Hemoglobin structural changes from the disease were examined. Participants used free tools including NCBI, RCSB PDB, LALIGN and Swiss PDB DeepView protein visualization software from EXPASY. This module was an active, engaging exercise which exposed students to protein visualization and increased cultural awareness.

The State of Undergraduate Computational Science Programs

Steven I. Gordon and Katharine Cahill

Volume 11, Issue 2 (April 2020), pp. 7–11

PDF icon Download PDF

A number of efforts have been made to introduce computational science in the undergraduate curriculum. We describe a survey of the undergraduate computational science programs in the U.S. The programs face several challenges including student recruitment and limited faculty participation in the programs. We describe the current state of the programs, discuss the problems they face, and discuss potential short- and long-range strategies that might address those challenges.

Development of a Molecular Model for Understanding the Polymer-metal Interface in Solid State Pumps

Jaime D. Guevara, Matthew L. Jones, Peter Müllner, and Eric Jankowski

Volume 11, Issue 2 (April 2020), pp. 12–22

PDF icon Download PDF

Medical micropumps that utilize Magnetic Shape Memory (MSM) alloys are small, powerful alternatives to conventional pumps because of their unique pumping mechanism. This mechanism—the transfer of fluid through the emulation of peristaltic contractions—is enabled by the magneto-mechanical properties of a shape memory alloy and a sealant material. Because the adhesion between the sealant and the alloy determines the performance of the pump and because the nature of this interface is not well characterized, an understanding of sealant-alloy interactions represents a fundamental component of engineering better solid state micropumps in particular, and metal-polymer interfaces in general. In this work we develop computational modeling techniques for investigating how the properties of sealant materials determine their adhesive properties with alloys. Specifically, we develop a molecular model of the sealant material polydimethylsiloxane (PDMS) and characterize its behavior with a model Ni-Mn-Ga surface. We perform equilibrium molecular dynamics simulations of the PDMS/Ni-Mn-Ga interface to iteratively improve the reliability, numerical stability, and accuracy of our models and the associated data workflow. To this end, we develop the first model for simulating PDMS/Ni-Mn-Ga interfaces by combining the Optimized Potentials for Liquid Simulations (OPLS) [21] force field with the Universal Force Field [5], and show promise for informing the design of more reliable MSM micropumps. We also reflect on the experiences of Blue Waters Supercomputing intern Guevara (the first author) to identify key learning moments during the one-year internship that can help guide future molecular simulation training efforts.

Using Blue Waters to Assess Tornadic Outbreak Forecast Capability by Lead Time

Caroline MacDonald and Andrew Mercer

Volume 11, Issue 2 (April 2020), pp. 23–28

PDF icon Download PDF

Severe weather outbreaks come with many different hazards. Among the most commonly known and identifiable outbreaks are those involving tornadoes. There has been some prior research on these events with respect to lead time, but shifts in model uncertainty by lead time have yet to be quantified formally. As such, in this study we assess tornado outbreak model uncertainty by lead time by assessing ensemble model precision for outbreak forecasts. This assessment was completed by first identifying five major tornado outbreak events and simulating the events using the Weather Research and Forecasting (WRF) model at 24-, 48-, 72-, 96-, and 120-hour lead times. A 10-member stochastically perturbed initial condition ensemble was generated for each lead time to quantify uncertainty associated with initialization errors at the varied lead times. Severe weather diagnostic variables derived from ensemble output were used to quantify ensemble uncertainty by lead time. After comparing moment statistics of several convective indices, the Energy Helicity Index (EHI), Significant Tornado Parameter (STP), and Supercell Composite Parameter (SCP) did the best job of characterizing the tornadic outbreaks at all lead times. There was good consistency between each case utilizing these three indices at all five lead times, suggesting outbreak model forecasting confidence may be able to extend up to 5 days for major outbreak events. These results will be useful to operational forecasters in assessing the forecast ability of tornadic events.

Improvement of the Evolutionary Algorithm on the Atomic Simulation Environment Though Intuitive Starting Population Creation and Clustering

Nicholas Kellas and Michael N. Groves

Volume 11, Issue 2 (April 2020), pp. 29–35

PDF icon Download PDF

The Evolutionary algorithm (EA), on the Atomic Simulation Environment (ASE), provides a means to find the lowest energy conformation molecule of a given stoichiometry. In this study we examine the ways in which the initial population of molecules affects the success of the EA. We have added a set of rules to the way in which the molecules are created that leads to more chemically relevant structures using chemical intuition. We have also implemented a clustering program that selects molecules that differ from each other from a large pool of molecules to form the initial population. Testing EA runs with and without clustering and intuitive population creation gave the following success rates: no intuition and no clustering, 28±3%; no intuition with clustering, 31±4%; fixed intuition without clustering, 49±5%; fixed intuition with clustering, 49±4%; variable intuition without clustering, 47±4%; and variable intuition with clustering, 50±3%. A significant increase in success rate was found when implementing intuitive population creation, while clustering the initial population seems to help marginally as the population becomes more diverse.

Lessons Learned from the NASA-UVA Summer School and Internship Program

Katherine Holcomb, Jacalyn Huband, and Tsengdar Lee

Volume 11, Issue 1 (January 2020), pp. 3–7

PDF icon Download PDF

From 2013 to 2018 the University of Virginia operated a summer school and internship program in partnership with NASA. The goal was to improve the software skills of students in environmental and earth sciences and to introduce them to high-performance computing. In this paper, we describe the program and discuss its evolution in response to student needs and changes in the high-performance computing landscape. The future direction for the summer school and plans for the materials developed are also discussed.

Northeast Cyberteam Program - A Workforce Development Strategy for Research Computing

John Goodhue, Julie Ma, Adrian Del Maestro, Sia Najafi, Bruce Segee, Scott Valcourt, and Ralph Zottola

Volume 11, Issue 1 (January 2020), pp. 8–11

PDF icon Download PDF

Cyberinfrastructure is as important for research in the 21st century as test tubes and microscopes were in the 20th century. Familiarity with and effective use of cyberinfrastructure at small and mid-sized institutions is essential if their faculty and students are to remain competitive. The Northeast Cyberteam Program is a 3-year NSF-funded regional initiative to increase effective use of cyberinfrastructure by researchers and educators at small and mid-sized institutions in northern New England by making it easier to obtain support from Research Computing Facilitators. Research Computing Facilitators combine technical knowledge and strong interpersonal skills with a service mindset, and use their connections with cyberinfrastructure providers to ensure that researchers and educators have access to the best available resources. It is widely recognized that Research Computing Facilitators are critical to successful utilization of cyberinfrastructure, but in very short supply. The Northeast Cyberteam aims to build a pool of Research Computing Facilitators in the region and a process to share them across institutional boundaries. Concurrently, we are providing experiential learning opportunities for students interested in becoming Research Computing Facilitators, and developing a self-service learning toolkit to provide timely access to information when it is needed.

Incorporating Complexity in Computing Camps for High School Students - A Report on the Summer Computing Camp at Texas A&M University

Dhruva K. Chakravorty, Marinus "Maikel" Pennings, Honggao Liu, Xien Thomas, Dylan Rodriguez, and Lisa M. Perez

Volume 11, Issue 1 (January 2020), pp. 12–20

PDF icon Download PDF

Summer computing camps for high school students are rapidly becoming a staple at High Performance Computing (HPC) centers and Computer Science departments around the country. Developing complexity in education in these camps remains a challenge. Here, we present a report about the implementation of such a program. The Summer Computing Academy (SCA) is a weeklong cybertraining program offered to high school students by High Performance Research Computing (HPRC) at Texas A&M University (Texas A&M; TAMU). The Summer Computing Academy effectively uses cloud computing paradigms and artificial intelligence technologies, coupled with Raspberry Pi micro-controllers and sensors, to demonstrate "computational thinking". The program is steeped in well-reviewed pedagogy; the refinement of the educational methods based on constant assessment is a critical factor that has contributed to its success. The hands-on exercises included in the program have received rave reviews from parents and students alike. The camp program is financially self-sufficient and has successfully broadened participation of underrepresented groups in computing by including diverse groups of students. Modules from the SCA program may be implemented at other institutions with relative ease and promote cybertraining efforts nationwide.

Expanding user communities with HPC Carpentry

Alan Ó Cais and Peter Steinbach

Volume 11, Issue 1 (January 2020), pp. 21–25

PDF icon Download PDF

Adoption of HPC as a research tool and industrial resource is a priority in many countries. The use of data analytics and machine learning approaches in many areas also attracts non-traditional HPC user communities to the hardware capabilities provided by supercomputing facilities. As a result, HPC at all scales is experiencing rapid growth of the demand for training, with much of this at the introductory level. To address the growth in demand, we need both a scalable and sustainable training model as well as a method to ensure the consistency of the training being offered. Adopting the successful training model of The Carpentries for the HPC space provides a pathway to collaboratively created training content which can be delivered in a scalable way (serving everything from university or industrial HPC systems to national facilities). We describe the ongoing efforts of HPC Carpentry to create training material to address this need and form the collaborative network required to sustain it. We outline the history of the effort and the practices adopted from The Carpentries that enable it. The lessons being created as a result are under active development and being evaluated in practice at sites in Europe, the US and Canada.

Blue Waters Workforce Development: Delivering National Scale HPC Workforce Development

Jennifer Houchins, Scott Lathrop, Robert Panoff, and Aaron Weeden

Volume 11, Issue 1 (January 2020), pp. 26–28

PDF icon Download PDF

There are numerous reports documenting the critical need for high performance computing infrastructure to advance discovery in all fields of study. The Blue Waters project was funded by the National Science Foundation to address this need and provide leading edge petascale computing resources to advance research and scholarship. There are also numerous reports that identify the lack of an adequate workforce capable of utilizing and advancing petascale class computing infrastructure well into the future. From the outset, the Blue Waters project has responded to this critical need by conducting national scale workforce development activities to prepare a larger and more diverse workforce. This paper describes those activities as exemplars for adoption and replication by the community.

One Year HPC Certification Forum in Retrospective

Julian Martin Kunkel, Kai Himstedt, Weronika Filinger, Jean-Thomas Acquaviva, Anja Gerbes, and Lev Lafayette

Volume 11, Issue 1 (January 2020), pp. 29–35

PDF icon Download PDF

The ever-changing nature of HPC has always compelled the HPC community to focus a lot of effort on training new and existing practitioners. Historically, these efforts were tailored around a typical group of users possessing, due to their background, a certain set of programming skills. However, as HPC has become more diverse in terms of hardware, software and the user background, the traditional training approaches have become insufficient in addressing the training needs of our community. This increasingly complicated HPC landscape makes development and delivery of new training materials challenging. How should we develop training for users who often come from non-traditional HPC disciplines and are only interested in learning a particular set of skills? How can we satisfy their training needs if we don't really understand what these are? It's clear that HPC centres struggle to identify and overcome the gaps in users' knowledge, while users struggle to identify skills required to perform their tasks. With the HPC Certification Forum, we aim to clearly categorise, define, and examine competencies expected from proficient HPC practitioners. In this article, we report the status and progress this independent body has made during the first year of its existence. The drafted processes and prototypes are expected to mature into a holistic ecosystem beneficial for all stakeholders in HPC education.

Project-Based Research and Training in High-Performance Data Sciences, Data Analytics, and Machine Learning

Kwai Wong, Stanimire Tomov, and Jack Dongarra

Volume 11, Issue 1 (January 2020), pp. 36–44

PDF icon Download PDF

This paper describes a hands-on, project-based Research Experiences for Computational Science, Engineering, and Mathematics (RECSEM) program in high-performance data sciences, data analytics, and machine learning on emerging computer architectures. RECSEM is a Research Experiences for Undergraduates (REU) site program supported by the US National Science Foundation. This site program at the University of Tennessee (UTK) directs a group of ten undergraduate students to explore, as well as contribute to, the emergent interdisciplinary computational science models and state-of-the-art HPC techniques via a number of cohesive compute- and data-intensive applications in which numerical linear algebra is the fundamental building block. The RECSEM program complements the growing importance of computational sciences in many advanced degree programs and provides scientific understanding and discovery to undergraduates with an intellectual focus on research projects using HPC. It aims to deliver a real-world research experience to the students by partnering with teams of scientists who are at the forefront of scientific computing research at the Innovative Computing Laboratory (ICL) and the Joint Institute for Computational Sciences (JICS) at UTK and Oak Ridge National Laboratory (ORNL). The program also receives collaborative support from universities in Hong Kong and Changsha, China. The program focuses on scientific domains in engineering applications, image processing, machine learning, and numerical parallel solvers on supercomputers and emergent accelerator platforms, particularly their implementation on GPUs. The program also enjoys close affiliations with researchers at ORNL. Because of the diverse research topics and backgrounds in this project, in this paper we discuss the experiences and resolutions in managing and coordinating the program, delivering cohesive tutorial materials, and directing mentorship of individual projects, as well as lessons learned and improvements made over the course of the program, particularly from the perspectives of the mentors.

Computational Biology as a Compelling Pedagogical Tool in Computer Science Education

Vijayalakshmi Saravanan, Anpalagan Alagan, and Kshirasagar Naik

Volume 11, Issue 1 (January 2020), pp. 45–52

PDF icon Download PDF

High-performance computing (HPC), and parallel and distributed computing (PDC) are widely discussed topics in computer science (CS) and computer engineering (CE) education. In the past decade, high-performance computing has also contributed significantly to addressing complex problems in bio-engineering, healthcare and systems biology. Therefore, computational biology applications provide several compelling examples that can be potent pedagogical tools in teaching high-performance computing. In this paper, we introduce a novel course curriculum to teach high-performance, parallel and distributed computing to senior graduate students (PhD) in a hands-on setup through examples drawn from a wealth of areas in computational biology. We introduce the concepts of parallel programming, algorithms and architectures and implementations via carefully chosen examples from computational biology. We believe that this course curriculum will provide students an engaging and refreshing introduction to this well-established domain.

Training for OpenMP Compiler Development from Cloud

Anjia Wang, Alok Mishra, Chunhua Liao, Yonghong Yan, and Barbara Chapman

Volume 11, Issue 1 (January 2020), pp. 53–60

PDF icon Download PDF

OpenMP is one of the most popular programming models to exploit node-level parallelism of supercomputers. Many researchers are interested in developing OpenMP compilers or extending the existing standard with new capabilities. However, there is a lack of training resources for researchers who are involved in compiler and language development around OpenMP, making the learning curve in this area steep. In this paper, we introduce an ongoing effort: a free and open online learning platform aimed at training researchers to quickly develop OpenMP compilers. The platform is built on top of Play-With-Docker, a Docker playground for users to conduct experiments in an online terminal sandbox. It provides a live training website that is set up in the cloud, so anyone with internet access and a web browser will be able to take the training. It also enables developers with relevant skills to contribute new tutorials. The entire training system is open-source and can be deployed on a private server, workstation or even a laptop for personal use. We have created some initial tutorials to train users to extend the Clang/LLVM and ROSE compilers to support new OpenMP features. We welcome anyone to try out our system, give us feedback, contribute new training courses, or enhance the training platform to make it an effective learning resource for the HPC community.

Self-paced Learning in HPC Lab Courses

Christian Terboven, Julian Miller, Sandra Wienke, and Matthias S. Müller

Volume 11, Issue 1 (January 2020), pp. 61–67

PDF icon Download PDF

In a software lab, groups of students develop parallel code using modern tools, document the results and present their solutions. The learning objectives include the foundations of High-Performance Computing (HPC), such as the understanding of modern architectures, the development of parallel programming skills, and course-specific topics, like accelerator programming or cluster setup. In order to execute the labs successfully with limited personnel resources and still provide students with access to world-class HPC architectures, we developed a set of concepts to motivate students and to track their progress. This includes the learning status survey and the developer diary, which are presented in this work. We also report on our experiences with using innovative teaching concepts to incentivize students to optimize their codes, such as using competition among the groups. Our concepts enable us to track the effectiveness of our labs and to steer them for increasingly large and diverse groups of students. We conclude that software labs are effective in adding practical experiences to HPC education. Our approach to hand out open tasks and to leave creative freedom in implementing the solutions enables the students to self-pace their learning process and to vary their investment of effort during the semester. Our effort and progress tracking ensures that the extensive learning objectives are achieved and enables our research on HPC programming productivity.

Computational Mathematics, Science and Engineering (CMSE): Establishing an Academic Department Dedicated to Scientific Computation as a Discipline

Dirk Colbry, Michael Murillo, Adam Alessio, and Andrew Christlieb

Volume 11, Issue 1 (January 2020), pp. 68–72

PDF icon Download PDF

The Computational Mathematics, Science and Engineering (CMSE) department is one of the newest units at Michigan State University (MSU). Founded in 2015, CMSE recognizes computation as the "triple junction" of algorithm development and analysis, high performance computing, and applications to scientific and engineering modeling and data science (as illustrated in Figure 1). This approach is designed to engage with computation as a new integrated discipline, rather than a series of decentralized, isolated sub-specialties. In the four years since its inception, the department has grown and flourished; however, the pathway was sometimes arduous. This paper shares lessons learned during the department's development and the initiatives it has taken on to support computational research and education across the university. By sharing these lessons, we hope to encourage and support the establishment of similar departments at other universities and grow this integrated approach to scientific computation as a discipline.

The Supercomputer Institute: A Systems-Focused Approach to HPC Training and Education

J. Lowell Wofford and Cory Lueninghoener

Volume 11, Issue 1 (January 2020), pp. 73–80

PDF icon Download PDF

For the past thirteen years, Los Alamos National Laboratory HPC Division has hosted the Computer System, Cluster and Networking Summer Institute internship program (recently renamed "The Supercomputer Institute") to provide a basis in cluster computing for undergraduate and graduate students. The institute invites 12 students each year to participate in a 10-week internship program. This program has been a strong educational experience for many students over this time, and has been an important recruitment tool for HPC Division. In this paper, we describe the institute as a whole and dive into individual components that were changed this year to keep the program up to date. We also provide some qualitative and quantitative results that indicate that these changes have improved the program over recent years.

Creating a Relevant, Application-Based Curriculum for High Performance Computing in High School

Vincent C. Betro and Mary E. Loveless

Volume 11, Issue 1 (January 2020), pp. 81–87

PDF icon Download PDF

While strides have been made to improve science and math readiness at a college-preparatory level, some key fundamentals have been left unaddressed that can cause students to turn away from the STEM disciplines before they find their niche [10], [11], [12], [13]. Introducing collegiate-level research and project-based, group-centered learning at a high school level has a multi-faceted effect; in addition to elevated learning outcomes in science and math, students exhibit improved critical thinking and communication skills, leading to improved preparedness for subsequent academic endeavors [1]. The work presented here outlines the development of a STEM ecosystem where both the science department and math department have implemented an interdisciplinary approach to introduce a spectrum of laboratory and computing research skills. This takes the form of both "in situ," micro-curricular elements and stand-alone research and computer science classes which integrate the language-independent concepts of abstraction and object-oriented programming, distributed and high-performance computing, and high and low-level language control applications. This pipeline has been an effective tool that has allowed several driven and interested students to participate in collegiate-level and joint-collegiate projects involving virtual reality, robotics and systems controls, and modeling. The departments' willingness to cross-pollinate, hire faculty well-versed in research, and support students and faculty with the proper resources is critical to readying the next generation of computing leaders.

Introducing Novices to Scientific Parallel Computing

Stephen Lien Harrell, Betsy Hillery, and Xiao Zhu

Volume 11, Issue 1 (January 2020), pp. 88–92

PDF icon Download PDF

HPC and Scientific Computing are integral tools for sustaining the growth of scientific research. Additionally, educating future domain scientists and research-focused IT staff about the use of computation to support research is as important as capital expenditures on new resources. The aim of this paper is to describe the parallel computing portion of Purdue University's HPC seminar series, which is used as a tool to introduce students from many non-traditional disciplines to scientific, parallel and high-performance computing.

Evaluating the Effectiveness of an Online Learning Platform in Transitioning Users from a High Performance Computing to a Commercial Cloud Computing Environment

Dhruva Chakravorty and Minh Tri Pham

Volume 11, Issue 1 (January 2020), pp. 93–99

PDF icon Download PDF

Developments in large scale computing environments have led to the design of workflows that rely on containers and analytics platforms that are well supported by the commercial cloud. The National Science Foundation also envisions a future in science and engineering that includes commercial cloud service providers (CSPs) such as Amazon Web Services, Azure and Google Cloud. These twin forces have made researchers consider the commercial cloud as an alternative option to current high performance computing (HPC) environments. Training and knowledge on how to migrate workflows, cost control, data management, and system administration remain some of the commonly listed concerns with adoption of cloud computing. To address these concerns, CSPs have developed online and in-person training platforms. Scalability, ability to impart knowledge, evaluating knowledge gain, and accreditation are the core concepts that have driven this approach. Here, we present a review of our experience using Google's Qwiklabs online platform for remote and in-person training from the perspective of an HPC user. For this study, we completed over 50 online courses, earned five badges and attended a one-day session. We identify the strengths of the approach, identify avenues to refine them, and consider means to further community engagement. We further evaluate the readiness of these resources for a cloud-curious researcher who is familiar with HPC. Finally, we present recommendations on how the large scale computing community can leverage these opportunities to work with CSPs to assist researchers nationally and at their home institutions.

Teaching HPC Systems Administrators

Alex Younts and Stephen Lien Harrell

Volume 11, Issue 1 (January 2020), pp. 100–105

PDF icon Download PDF

The ability to grow and teach systems professionals relies on having the capacity to let students interact with supercomputers at levels not given to normal users. In this paper we describe the teaching methods and hardware platforms used by Purdue Research Computing to train undergraduates for HPC systems-facing roles. From Raspberry Pi clusters to the LittleFe project, previous work has focused on providing miniature hardware platforms and developing curriculums for teaching. Recently, we have developed and employed a method using virtual machines to reach a wider audience, created best practices, and removed barriers for approaching coursework. This paper outlines the system we have designed, expands on the benefits and drawbacks over hardware systems, and discusses the failures and successes we have had teaching HPC System Administrators.

Contributing HPC Skills to the HPC Certification Forum

Julian Kunkel, Kai Himstedt, Weronika Filinger, Jean-Thomas Acquaviva, Anja Gerbes, and Lev Lafayette

Volume 11, Issue 1 (January 2020), pp. 106–107

PDF icon Download PDF

The International HPC Certification Program was officially launched over a year ago at ISC'18 and has since made significant progress in categorising and defining the skills required to proficiently use a variety of HPC systems. The program has reached the stage where support and input from the HPC community are essential. For the certification to be recognised widely, it needs to capture skills required by the majority of HPC users, regardless of their level. This cannot be achieved without contributions from the community. This extended abstract briefly presents the current state of the developed Skill Tree and explains how contributors can extend it. In the talk, we focus on the contribution aspects.

Bridging the Educational Gap between Emerging and Established Scientific Computing Disciplines

Marcelo Ponce, Erik Spence, Ramses van Zon, and Daniel Gruner

Volume 10, Issue 1 (January 2019), pp. 4–11

PDF icon Download PDF

In this paper we describe our experience in developing curriculum courses aimed at graduate students in emerging computational fields, including biology and medical science. We focus primarily on computational data analysis and statistical analysis, while at the same time teaching students best practices in coding and software development. Our approach combines a theoretical background and practical applications of concepts. The outcomes and feedback we have obtained so far have revealed several issues: students in these particular areas lack instruction like this although they would benefit tremendously from it, and we have detected several weaknesses in the students' preparation, in particular in statistical foundations but also in analytical thinking skills. We present here the tools, techniques and methodology we employ while teaching and developing this type of course. We also show several outcomes from this initiative, including potential pathways for fruitful multidisciplinary collaborations.

Student-led Computational Inorganic Chemistry Research in a Classroom Setting

Erica D. Hummel and S. Chantal E. Stieber

Volume 10, Issue 1 (January 2019), pp. 12–15

PDF icon Download PDF

Advanced computational inorganic methods were introduced as course-based undergraduate research experiences (CUREs) through use of the National Science Foundation's Extreme Science and Engineering Discovery Environment (NSF XSEDE). The ORCA ab initio quantum chemistry program allowed students to conduct independent research projects following in-class lectures and tutorials. Students wrote publication-style papers and conducted peer review of classmates' papers to learn about the full scientific process.

Extending XSEDE Innovations to Campus Cyberinfrastructure - The XSEDE National Integration Toolkit

Eric Coulter, Jodie Sprouse, Resa Reynolds, and Richard Knepper

Volume 10, Issue 1 (January 2019), pp. 16–20

PDF icon Download PDF

XSEDE Service Providers (SPs) and resources have the benefit of years of testing and implementation, tuning and configuration, and the development of specific tools to help users and systems make the best use of these resources. Cyberinfrastructure professionals at the campus level are often charged with building computer resources which are compared to these national-level resources. While organizations and companies exist that guide cyberinfrastructure configuration choices down certain paths, there is no easy way to distribute the long-term knowledge of the XSEDE project to campus CI professionals. The XSEDE Cyberinfrastructure Resource Integration team has created a variety of toolkits to enable easy knowledge and best-practice transfer from XSEDE SPs to campus CI professionals. The XSEDE National Integration Toolkit (XNIT) provides the software used on most XSEDE systems in an effort to propagate the best practices and knowledge of XSEDE resources. XNIT includes basic tools and configuration that make it simpler for a campus cluster to have the same software set and many of the advantages an XSEDE SP resource affords. In this paper, we will detail the steps taken to build such a library of software and discuss the challenges involved in disseminating awareness of toolkits among cyberinfrastructure professionals. We will also describe our experiences in updating the XNIT to be compatible with the OpenHPC project, which forms the basis of many new HPC systems, and appears situated to become the de facto choice of management software provider for many HPC centers.

Student Outcomes in Parallelizing Recursive Matrix Multiply

Chris Fietkiewicz

Volume 10, Issue 1 (January 2019), pp. 21–23

PDF icon Download PDF

Students in a course on high performance computing were assigned the task of parallelizing an algorithm for recursive matrix multiplication. The objectives of the assignment were to: (1) design a basic approach for incorporating parallel programming into a recursive algorithm, and (2) optimize the speedup. Pseudocode was provided for recursive matrix multiplication, and students were required to first implement a serial version before implementing a parallel version. The parallel version had the following requirements: (1) use OpenMP to perform multithreading, and (2) use exactly 4 threads, where each thread computes one quadrant of the array product. Using a class size of 23 students, including undergraduate and graduate, approximately 70% of the students designed valid parallel solutions, and 13% achieved the optimal speedup of 4x. Common errors included recursively creating excessive threads, failing to parallelize all possible mathematical operations, and poor use of compiler directives for OpenMP.
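
The decomposition described in this abstract can be sketched roughly as follows. This is a Python analog using only the standard library, not the course's OpenMP assignment or its pseudocode, neither of which is reproduced in this listing. The function names (`matmul_serial`, `matmul_parallel`, `quadrant_product`) are hypothetical, matrices are assumed to be square with a power-of-two size, and a four-worker thread pool stands in for the four OpenMP threads. Workers are created once, at the top level only, and the recursion inside each worker stays serial, which avoids the "recursively creating excessive threads" error noted above. Because of CPython's global interpreter lock, the sketch illustrates the structure only and should not be expected to reach a 4x speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def add(X, Y):
    """Element-wise sum of two equally sized square matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(M):
    """Split an n x n matrix (n even) into quadrants M11, M12, M21, M22."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def join(C11, C12, C21, C22):
    """Reassemble four quadrants into one matrix."""
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom


def matmul_serial(A, B):
    """Serial recursive multiply (assumes n is a power of two)."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    return join(
        add(matmul_serial(A11, B11), matmul_serial(A12, B21)),
        add(matmul_serial(A11, B12), matmul_serial(A12, B22)),
        add(matmul_serial(A21, B11), matmul_serial(A22, B21)),
        add(matmul_serial(A21, B12), matmul_serial(A22, B22)),
    )


def quadrant_product(X1, Y1, X2, Y2):
    """Work for one worker: one quadrant of C, computed serially."""
    return add(matmul_serial(X1, Y1), matmul_serial(X2, Y2))


def matmul_parallel(A, B):
    """Top-level split only: exactly four workers, one quadrant each."""
    if len(A) == 1:
        return matmul_serial(A, B)
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    jobs = [
        (A11, B11, A12, B21),  # C11
        (A11, B12, A12, B22),  # C12
        (A21, B11, A22, B21),  # C21
        (A21, B12, A22, B22),  # C22
    ]
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(quadrant_product, *job) for job in jobs]
        C11, C12, C21, C22 = [f.result() for f in futures]
    return join(C11, C12, C21, C22)


if __name__ == "__main__":
    print(matmul_parallel([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Keeping the parallel split at the top level means each quadrant's two sub-products and their sum are all computed inside a worker, so no arithmetic is left on the main thread apart from reassembling the result.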

Scientific Computing, High-Performance Computing and Data Science in Higher Education

Marcelo Ponce, Erik Spence, Ramses van Zon, and Daniel Gruner

Volume 10, Issue 1 (January 2019), pp. 24–31

PDF icon Download PDF

We present an overview of current academic curricula for Scientific Computing, High-Performance Computing and Data Science. After a survey of current academic and non-academic programs across the globe, we focus on Canadian programs and specifically on the education program of the SciNet HPC Consortium, using its detailed enrollment and course statistics for the past six to seven years. Not only do these data display a steady and rapid increase in the demand for research-computing instruction, they also show a clear shift from traditional (high performance) computing to data-oriented methods. It is argued that this growing demand warrants specialized research computing degrees.

Initial impact of Evaluation in Blue Waters Community Engagement Program

Lizanne DeStefano and Jung Sun Sung

Volume 10, Issue 1 (January 2019), pp. 32–39

PDF icon Download PDF

The external evaluation activities in the first three years of the Blue Waters Community Engagement program for graduate fellows and undergraduate interns are described in this study. Evaluators conducted formative and summative evaluations to acquire data from the participants at various stages during this period. Details regarding the evaluation methodology, implementation, results, information feedback process, and the overall program impact based on these evaluation findings are outlined here. Participants in both groups were selected from a variety of different scientific backgrounds and their high performance computing expertise also varied at the outset of the program. Implementation challenges stemming from these issues were identified through the evaluation, and accommodations were made in the initial phases of the program. As a result, both the graduate fellowship and undergraduate internship programs were able to successfully overcome many of the identified problems by the end of the third year. The evaluation results also show the significant impact the program was able to make on the future careers of the participants.

Effectively Extending Computational Training Using Informal Means at Larger Institutions

Dhruva K. Chakravorty, Marinus "Maikel" Pennings, Honggao Liu, Zengyu "Sheldon" Wei, Dylan M. Rodriguez, Levi T. Jordan, Donald "Rick" McMullen, Noushin Ghaffari, and Shaina D. Le

Volume 10, Issue 1 (January 2019), pp. 40–47

PDF icon Download PDF

Short courses offered by High Performance Computing (HPC) centers provide an avenue for aspiring Cyberinfrastructure (CI) professionals to learn much-needed skills in research computing. Such courses are a staple at universities and HPC sites around the country. These short courses offer an informal curricular model of short, intensive, and applied micro-courses that address generalizable competencies in computing as opposed to content expertise. The material is taught at a level of sophistication below that of a minor, and the burden of application to domain content is on the learner. Since the Spring 2017 semester, Texas A&M University High Performance Research Computing (TAMU HPRC) has introduced a series of interventions in its short courses program that has led to a 300% growth in participation. Here, we present the strategies and best practices employed by TAMU HPRC in teaching short course modules. We present a longitudinal report that assesses the success of these strategies since the Spring semester of 2017. This data suggests that changes to student learning and a reimagination of the tiered instruction model widely adopted at institutions could be beneficial to student outcomes.

HPC Education and Training: an Australian Perspective

Maciej Cytowski, Luke Edwards, Mark Gray, Christopher Harris, Karina Nunez, and Aditi Subramanya

Volume 10, Issue 1 (January 2019), pp. 48–52

PDF icon Download PDF

The Pawsey Supercomputing Centre has been running a variety of education, training and outreach activities addressed to all Australian researchers for a number of years. Based on experience and user feedback we have developed a mix of on-site and online training, roadshows, user forums and hackathon-type events. We have also developed an open repository of materials covering different aspects of HPC systems usage, parallel programming techniques as well as cloud and data resources usage. In this paper, we will share our experience in using different learning methods and tools to address specific educational and training purposes. The overall goal is to emphasise that there is no universal learning solution; instead, various solutions and platforms need to be carefully selected for different groups of interest.

Trends in Demand, Growth, and Breadth in Scientific Computing Training Delivered by a High-Performance Computing Center

Ramses van Zon, Marcelo Ponce, Erik Spence, and Daniel Gruner

Volume 10, Issue 1 (January 2019), pp. 53–60

PDF icon Download PDF

We analyze the changes in the training and educational efforts of the SciNet HPC Consortium, a Canadian academic High Performance Computing center, in the areas of Scientific Computing and High-Performance Computing, over the last six years. Initially, SciNet offered isolated training events on how to use HPC systems and write parallel code, but the training program now consists of a broad range of workshops and courses that users can take toward certificates in scientific computing, data science, or high-performance computing. Using data on enrollment, attendance, and certificate numbers from SciNet's education website, used by almost 1800 users so far, we extract trends on the growth, demand, and breadth of SciNet's training program. Among the results are a steady overall growth, a sharp and steady increase in the demand for data science training, and a wider participation of 'non-traditional' computing disciplines, which has motivated an increasingly broad spectrum of training offerings. Of interest is also that many of the training initiatives have evolved into courses that can be taken as part of the graduate curriculum at the University of Toronto.

Evaluating Active Learning Approaches for Teaching Intermediate Programming at an Early Undergraduate Level

Dhruva K. Chakravorty, Marinus "Maikel" Pennings, Honggao Liu, Zengyu "Sheldon" Wei, Dylan M. Rodriguez, Levi T. Jordan, Donald "Rick" McMullen, Noushin Ghaffari, Shaina D. Le, Derek Rodriquez, Crystal Buchanan, and Nathan Gober

Volume 10, Issue 1 (January 2019), pp. 61–66

PDF icon Download PDF

There is a growing need to provide intermediate programming classes to STEM students early in their undergraduate careers. These efforts face significant challenges due to the varied computing skill-sets of learners, requirements of degree programs, and the absence of a common programming standard. Instructional scaffolding and active learning methods that use Python offer avenues to support students with varied learning needs. Here, we report on quantitative and qualitative outcomes from three distinct models of programming education: (i) connecting coding to hands-on "maker" activities; (ii) incremental learning of computational thinking elements through guided exercises that use Jupyter Notebooks; and (iii) problem-based learning with step-wise code fragments leading to algorithmic implementation. Performance in class activities, capstone projects, in-person interviews, and participant surveys informed us about the effectiveness of these approaches on student learning. We find that students with previous coding experience were able to rely on broader skills and grasp concepts faster than students who recently attended an introductory programming session. We find that, while makerspace activities were engaging and explained basic programming concepts, they lost their appeal in complex programming scenarios. Students grasped coding concepts fastest using the Jupyter notebooks, while the problem-based learning approach was best at having students understand the core problem and create inventive means to address it.

The Impact of MOOC Methodology on the Scalability, Accessibility and Development of HPC Education and Training

Julia Mullen, Weronika Filinger, Lauren Milechin, and David Henty

Volume 10, Issue 1 (January 2019), pp. 67–73

PDF icon Download PDF

This work explores the applicability of Massive Open Online Courses (MOOCs) for scaling High Performance Computing (HPC) training and education. Most HPC centers recognize the need to provide their users with HPC training; however, the current educational structure and accessibility prevent many scientists and engineers who need HPC knowledge and skills from becoming HPC practitioners. To provide more accessible and scalable learning paths toward HPC expertise, the authors explore MOOCs and their related technologies and teaching approaches. In this paper the authors outline how MOOC courses differ from face-to-face training, video-capturing of live events, webinars, and other established teaching methods with respect to pedagogical design, development issues and deployment concerns. The work proceeds to explore two MOOC case studies, including the design decisions, pedagogy and delivery. The MOOC development methods discussed are universal and easily replicated by educators and trainers in any field; however, HPC has specific technical needs and concerns not encountered in other online courses. Strategies for addressing these HPC concerns are discussed throughout the work.

Training Computational Scientists to Build and Package Open-Source Software

Prentice Bisbal

Volume 10, Issue 1 (January 2019), pp. 74–80

PDF icon Download PDF

High performance computing training and education typically emphasize the first principles of scientific programming, such as numerical algorithms and parallel programming techniques. However, many computational scientists need to know how to compile and link to applications built by others. Likewise, those who create the libraries and applications need to understand how to organize their code to make it as portable as possible and package it so that it is straightforward for others to use. These topics are not addressed by the current HPC education or training curriculum and users are typically left to develop their own approaches. This work will discuss observations made by the author over the last 20 years regarding the common problems encountered in the scientific community when developing their own codes and building codes written by other computational scientists. Recommendations will be provided for a training curriculum to address these shortcomings.

Using Virtual Reality to Enforce Principles of Cybersecurity

Jinsil Hwaryoung Seo, Michael Bruner, Austin Payne, Nathan Gober, Donald "Rick" McMullen, and Dhruva K. Chakravorty

Volume 10, Issue 1 (January 2019), pp. 81–87

PDF icon Download PDF

The Cyberinfrastructure Security Education for Professionals and Students (CiSE-ProS) virtual reality environment is an exploratory project that uses engaging approaches to evaluate the impact of learning environments produced by augmented reality (AR) and virtual reality (VR) technologies for teaching cybersecurity concepts. The program is steeped in well-reviewed pedagogy; the refinement of the educational methods based on constant assessment is a critical factor that has contributed to its success. In its current implementation, the program supports undergraduate student education. The overarching goal is to develop the CiSE-ProS VR program for implementation at institutions with low cyberinfrastructure adoption where students may not have access to a physical data center to learn about the physical aspects of cybersecurity.

Towards an HPC Certification Program

Julian Kunkel, Kai Himstedt, Nathanael Hübbe, Hinnerk Stüben, Sandra Schröer, Michael Kuhn, Matthias Riebisch, Stephan Olbrich, Thomas Ludwig, Weronika Filinger, Jean-Thomas Acquaviva, Anja Gerbes, and Lev Lafayette

Volume 10, Issue 1 (January 2019), pp. 88–89

The HPC community has always considered the training of new and existing HPC practitioners to be of high importance to its growth. This diversification of HPC practitioners challenges the traditional training approaches, which are not able to satisfy the specific needs of users, often coming from non-traditional HPC disciplines, and only interested in learning a particular set of competences. Challenges for HPC centres are to identify and overcome the gaps in users' knowledge, while users struggle to identify relevant skills. We have developed a first version of an HPC certification program that would clearly categorize, define, and examine competences. Making clear what skills are required of or recommended for a competent HPC user would benefit both the HPC service providers and practitioners. Moreover, it would allow centres to bundle together skills that are most beneficial for specific user roles and scientific domains. From the perspective of content providers, existing training material can be mapped to competences, allowing users to quickly identify and learn the skills they require. Finally, the certificates recognized by the whole HPC community simplify inter-comparison of independently offered courses and provide additional incentive for participation.

Potential Influence of Prior Experience in an Undergraduate-Graduate Level HPC Course

Chris Fietkiewicz

Volume 10, Issue 1 (January 2019), pp. 90–92

A course on high performance computing (HPC) at Case Western Reserve University included students with a range of technical and academic experience. We consider these experiential differences with regard to student performance and perceptions. The course relied heavily on C programming and multithreading, but one third of the students had no prior experience with these techniques. Academic experience also varied, as the class included 3rd and 4th year undergraduates, master's students, PhD students, and a non-degree student. Results indicate that student performance did not depend on technical experience. However, average overall performance was slightly higher for graduate students. Additionally, we report on students' perceptions of the course and the assigned work.

Deep Learning by Doing: The NVIDIA Deep Learning Institute and University Ambassador Program

Xi Chen, Gregory S. Gutmann, and Joe Bungo

Volume 10, Issue 1 (January 2019), pp. 93–99

Over the past two decades, High-Performance Computing (HPC) communities have developed many models for delivering education aiming to help students understand and harness the power of parallel and distributed computing. Most of these courses either lack a hands-on component or heavily focus on theoretical characterization behind complex algorithms. To bridge the gap between application and scientific theory, NVIDIA Deep Learning Institute (DLI) has designed an on-line education and training platform that helps students, developers, and engineers solve real-world problems in a wide range of domains using deep learning and accelerated computing. DLI's accelerated computing course content starts with the fundamentals of accelerating applications with CUDA and OpenACC in addition to other courses in training and deploying neural networks for deep learning. Advanced and domain-specific courses in deep learning are also available. The online platform enables students to use the latest AI frameworks, SDKs, and GPU-accelerated technologies on fully-configured GPU servers in the cloud so the focus is more on learning and less on environment setup. Students are offered project-based assessment and certification at the end of some courses. To support academics and university researchers teaching accelerated computing and deep learning, the DLI University Ambassador Program enables educators to teach free DLI courses to university students, faculty, and researchers.

Using CloudLab as a Scalable Platform for Teaching Cluster Computing

Linh B. Ngo and Jeff Denton

Volume 10, Issue 1 (January 2019), pp. 100–106

A significant challenge in teaching cluster computing, an advanced topic in the parallel and distributed computing body of knowledge, is to provide students with an adequate environment where they can become familiar with real-world infrastructures that embody the conceptual principles taught in lectures. In this paper, we describe our experience setting up such an environment by leveraging CloudLab, a national experimentation platform for advanced computing research. We explored two approaches in using CloudLab to teach advanced concepts in cluster computing: direct deployment of virtual machines (VMs) on bare-metal nodes and indirect deployment of VMs inside a CloudLab-based cloud.

Programmable Education Infrastructure: Cloud resources as HPC Education Environments

Eric Coulter, Richard Knepper, and Jeremy Fischer

Volume 10, Issue 1 (January 2019), pp. 107–107

Cloud computing is a growing area for educating students and performing meaningful scientific research. The challenge for many educators and researchers is knowing how to use some of the unique aspects of computing in the cloud. One key feature is true elastic computing: resources on demand. The elasticity and programmability of cloud resources make them an excellent tool for educators who require access to a wide range of computing environments. In the field of HPC education, such environments are an absolute necessity, and getting access to them can create a large burden on the educators above and beyond designing content. While cloud resources won't replace traditional HPC environments for large research projects, they are an excellent option for providing both user and administrator education on HPC environments. The highly configurable nature of cloud environments allows educators to tailor the educational resource to the needs of their attendees, and provide a wide range of hands-on experiences. In this demo, we'll show how the Jetstream cloud environment can be used to provide training for both new HPC administrators and users, by showing a ground-up build of a simple HPC system. While this approach uses the Jetstream cloud, it is generalizable across any cloud provider. We will show how this allows an educator to tackle everything from basic command-line concepts and scheduler use to advanced cluster-management concepts such as elasticity and management of scientific software.

The HPC Best Practices Webinar Series

Osni A. Marques, David E. Bernholdt, Elaine M. Raybourn, Ashley D. Barker, and Rebecca J. Hartman-Baker

Volume 10, Issue 1 (January 2019), pp. 108–110

In this contribution, we discuss our experiences organizing the Best Practices for HPC Software Developers (HPC-BP) webinar series, an effort for the dissemination of software development methodologies, tools and experiences to improve developer productivity and software sustainability. HPC-BP is an outreach component of the IDEAS Productivity Project [4] and has been designed to support the IDEAS mission to work with scientific software development teams to enhance their productivity and the sustainability of their codes. The series, which was launched in 2016, has just presented its 22nd webinar. We summarize and distill our experiences with these webinars, including what we consider to be "best practices" in the execution of both individual webinars and a long-running series like HPC-BP. We also discuss future opportunities and challenges in continuing the series.

Physics Conceptual Understanding in a Computational Science Course

Rivka Taub, Michal Armoni, and Mordechai (Moti) Ben-Ari

Volume 9, Issue 2 (December 2018), pp. 2–13

Students face many difficulties dealing with physics principles and concepts during physics problem solving. For example, they lack the understanding of the components of formulas, as well as of the physical relationships between the two sides of a formula. To overcome these difficulties some educators have suggested integrating simulations design into physics learning. They claim that the programming process necessarily fosters understanding of the physics underlying the simulations. We investigated physics learning in a high-school course on computational science. The course focused on the development of computational models of physics phenomena and programming corresponding simulations. The study described in this paper deals with the development of students' conceptual physics knowledge throughout the course. Employing a qualitative approach, we used concept maps to evaluate students' physics conceptual knowledge at the beginning and the end of the model development process, and at different stages in between. We found that the students gained physics knowledge that has been reported to be difficult for high-school and even undergraduate students. We use two case studies to demonstrate our method of analysis and its outcomes. We do that by presenting a detailed analysis of two projects in which computational models and simulations of physics phenomena were developed.

Automatic Feature Selection in Markov State Models Using Genetic Algorithm

Qihua Chen, Jiangyan Feng, Shriyaa Mittal, and Diwakar Shukla

Volume 9, Issue 2 (December 2018), pp. 14–22

Markov State Models (MSMs) are a powerful framework to reproduce the long-time conformational dynamics of biomolecules using a set of short Molecular Dynamics (MD) simulations. However, precise kinetics predictions of MSMs heavily rely on the features selected to describe the system. Despite the importance of feature selection for large systems, determining an optimal set of features remains a difficult unsolved problem. Here, we introduce an automatic approach to optimize feature selection based on genetic algorithms (GA), which adaptively evolves the most fitted solution according to natural selection laws. The power of the GA-based method is illustrated on long atomistic folding simulations of four proteins, varying in length from 28 to 80 residues. Due to the diversity of tested proteins, we expect that our method will be extensible to other proteins and drive MSM building to a more objective protocol.

Teaching and Learning Graph Algorithms Using Animation

Y. Daniel Liang

Volume 9, Issue 2 (December 2018), pp. 23–29

Graph algorithms have many applications. Many real-world problems can be solved using graph algorithms. Graph algorithms are commonly taught in data structures, algorithms, and discrete mathematics courses. We have created two animations to visually demonstrate the graph algorithms. The first animation is for depth-first search, breadth-first search, shortest paths, connected components, finding bipartite sets, and Hamiltonian path/cycle on unweighted graphs. The second animation is for minimum spanning trees, shortest paths, and travelling salesman problems on weighted graphs. The animations are developed using HTML, CSS, and JavaScript and are platform independent. They can be viewed from a browser on any device. The animations are useful tools for teaching and learning graph algorithms. This paper presents these animations.

Identification of Active Oligonucleotide Sequences Using Artificial Neural Network

Alex Luke, Sarah Fergione, Riley Wilson, Brady Gunn, and Stan Svojanovsky

Volume 9, Issue 2 (December 2018), pp. 30–36

In this project we designed an Artificial Neural Network (ANN) computational model to predict the activity of short oligonucleotide sequences (octamers) with an important biological role as exonic splicing enhancer (ESE) motifs recognized by human SR protein SC35. Since only active sequences were available from the literature as our initial data set, we generated an additional set of complementary sequences to the original set. We used a back-propagation neural network (BPNN) with MATLAB® Neural Network Toolbox™ on our research designated computer. In Stage I of our project we trained, validated and tested the BPNN prototype. We started with 20 samples in the training and 8 samples in the validation sets. The trained and validated BPNN prototype was then used to test the unique set of 10 octamer sequences with 5 active samples and their 5 complementary sequences. The test showed 2 classification errors, one false positive and the other false negative. We used the test data and moved into Stage II of the project. First, we analyzed the initial DNA numerical representation (DNR) and changed the scheme to achieve higher difference between the subsets of active and complementary sequences. We compared the BPNN results with different numbers of nodes in the second hidden layer to optimize model accuracy. To estimate future model performance we needed to test the classifier on newly collected data from another paper. This practical application included the testing of 41 published, non-repeating SC35 ESE motif octamers, together with 41 complementary sequences. The test showed high BPNN accuracy in the predictive power for both (active and inactive) categories. This study shows the potential for using a BPNN to screen SC35 ESE motif candidates.

Parsing Next Generation Sequencing Data in Parallel Environments for Downstream Genetic Variation Analysis

Mariana Vasquez, Jonathon Mohl, and Ming-Ying Leung

Volume 9, Issue 2 (December 2018), pp. 37–45

With the recent advances in next generation sequencing technology, analysis of prevalent DNA sequence variants from patients with a particular disease has become an important tool for understanding the associations between the disease and genetic mutations. A publicly accessible bioinformatics pipeline, called OncoMiner, was implemented in 2016 to help biomedical researchers analyze large genomic datasets from patients with cancer. However, the current version of OncoMiner can only accept input files with a highly specific format for sequence variant description. In order to handle data from a broader range of sequencing platforms, a data preprocessing tool is necessary. We have therefore implemented the OncoMiner Preprocessing (OP) program for parsing data files in the popular FastQ and BAM formats to generate an OncoMiner input file. OP involves using the open source Bowtie2 and SAMtools software, followed by a Python script we developed for genetic sequence variant identification. To preprocess very large datasets efficiently, the OP program has been parallelized on two local computers and the Blue Waters system at the National Center for Supercomputing Applications using a multiprocessing approach. Although reasonable parallelization efficiency has been obtained on the local computers, the OP program's speedup on Blue Waters has been limited, possibly due to I/O issues and individual node memory constraints. Despite these, Blue Waters has provided the necessary resources to process 35 datasets from patients with acute myeloid leukemia and demonstrated significant correlation of OP runtimes with the BAM input size and chromosome diversity.

Teaching Kinetics through Differential Equations Constructed with a Berkeley Madonna Flow Chart Model

Franklin M. Chen

Volume 9, Issue 1 (May 2018), pp. 2–12

The alias feature of the Berkeley Madonna platform allows this author to create a chemical kinetics project manual for students to create flow charts with rate equations consistent with their learning from physical chemistry textbooks. Used in this way, the platform becomes versatile and powerful, allowing students to explore any chemical kinetics problem from simple (e.g. 1st or 2nd order kinetics) to complex (e.g. stratospheric ozone depletion, the Lotka-Volterra mechanism) while bypassing the complicated syntax required by most powerful mathematical programs. This kinetics manual was successfully implemented at UW-Green Bay in the fall semester of 2017 with a student success rate greater than 80%.

Motivating Computational Science with Systems Modeling

Holly Hirst

Volume 9, Issue 1 (May 2018), pp. 13–18

This paper describes introducing rate of change and systems modeling paradigms and software as tools to increase appreciation for computational science. A similar approach was used with three different audiences: freshman liberal arts majors, junior math education majors, and college faculty teaching introductory science courses. The implementation used with each audience and their reactions to the material are discussed, along with some example problems that could be used in a variety of courses.

Building a MATLAB Graphical User Interface to Solve Ordinary Differential Equations as a Final Project for an Interdisciplinary Elective Course on Numerical Computing

Steve M. Ruggiero, Jianan Zhao, and Ashlee N. Ford Versypt

Volume 9, Issue 1 (May 2018), pp. 19–28

A final project assignment is described for an interdisciplinary applied numerical computing upper division and graduate elective in which students develop a GUI for defining and solving a system of ordinary differential equations (initial value problems) and the associated explicit algebraic equations such as values for parameters. The primary task is to develop a GUI for MATLAB using GUIDE that takes a user-specified number of differential equations and explicit algebraic equations as input, solves the system of ODEs using ode45, returns the solution vector, and plots the solution vector components vs. the independent variable. The code for the GUI must be verified by showing that it returns the same results and the same figures as a system of ODEs with a known solution. The purpose of the final project assignment is threefold: (1) to practice GUI design and construction in MATLAB, (2) to verify code implementation, and (3) to review content covered throughout the course. The manuscript first introduces the course and the context and motivation for the project. Then the project assignment is detailed. Two student project submissions are described. The verification case study is also provided.

Modeling the Effects of Star Formation with a Volumetric Feedback Model

Claire Kopenhafer and Brian W. O'Shea

Volume 9, Issue 1 (May 2018), pp. 29–38

We implemented two new models for star formation and supernova feedback into the astrophysical code Enzo. These models are designed to efficiently capture the bulk properties of galaxies and the influence of the circumgalactic medium (CGM). Unlike Enzo's existing models, these do not track stellar populations over time with computationally expensive particle objects. Instead, supernova explosions immediately follow stellar birth and their feedback is deposited in a volumetric manner. Our models were tested using simulations of Milky Way-like isolated galaxies, and we found that neither model was able to produce a realistic, metal-enriched CGM. Our work suggests that volumetric feedback models are not sufficient replacements for particle-based star formation and feedback models.

Using Inexpensive Microclusters and Accessible Materials for Cost-Effective Parallel and Distributed Computing Education

Joel C. Adams, Suzanne J. Matthews, Elizabeth Shoop, David Toth, and James Wolfer

Volume 8, Issue 3 (December 2017), pp. 2–10

With parallel and distributed computing (PDC) now in the core CS curriculum, CS educators are building new pedagogical tools to teach their students about this cutting-edge area of computing. In this paper, we present an innovative approach we call microclusters - personal, portable Beowulf clusters - that provide students with hands-on PDC learning experiences. We present several different microclusters, each built using a different combination of single board computers (SBCs) as its compute nodes, including various ODROID models, Nvidia's Jetson TK1, Adapteva's Parallella, and the Raspberry Pi. We explore different ways that CS educators are using these systems in their teaching, and describe specific courses in which CS educators have used microclusters. Finally, we present an overview of sources of free PDC pedagogical materials that can be used with microclusters.

Innovative Model, Tools, and Learning Environments to Promote Active Learning for Undergraduates in Computational Science & Engineering

Hong Liu, Michael Spector, Matthew Ikle, Andrei Ludu, and Jerry Klein

Volume 8, Issue 3 (December 2017), pp. 11–18

This paper presents an innovative hybrid learning model as well as the tools, resources, and learning environment to promote active learning for both face-to-face students and online students. Most small universities in the United States lack adequate resources and cost-justifiable enrollments to offer Computational Science and Engineering (CSE) courses. The goal of the project was to find an effective and affordable model for small universities to prepare underserved students with marketable analytical skills in CSE. As the primary outcome, the project created a cluster of collaborating institutions that combines students into common classes and uses cyberlearning tools to deliver and manage instruction. The instrumental tools for educational technologies included a Smart Podium, a digital projector, a teleconference system such as AdobeConnect, an auto-tracking camera, and high-quality audio in both local and remote classrooms. As an innovative active learning environment, an R&D process was used to provide a coherent framework for designing instruction and assessing learning. Course design centered on model-based learning, which proposes that students learn complex content by elaborating on their mental model, developing a conceptual model, refining a mathematical model, and conducting experiments to validate and revise their conceptual and mathematical models. A wave lab and an underwater robotics lab were used to facilitate the experimental components of hands-on research projects. Course delivery included interactive live online help sessions, immediate feedback to students, peer support, and teamwork, which were crucial for student success. Another key feature of instruction in the project was using emerging technologies such as HIMATT [8] to evaluate how students think through and model complex, ill-defined and ill-structured realistic problems.

Computational approaches to scattering by microspheres

Reed M. Hodges, Kelvin Rosado-Ayala, and Maxim Durach

Volume 8, Issue 3 (December 2017), pp. 19–24

Mie theory is used to model scattering by wavelength-sized microspheres. It has numerous applications for many different geometries of spheres. The calculations of the electromagnetic fields involve large sums over vector spherical harmonics. Thus, the simple task of calculating the fields, along with additional analytical tools such as cross sections and intensities, requires large summations that are conducive to high performance computing. In this paper, we derive Mie theory from first principles, and detail the process and results of programming Mie theory physics in Fortran 95. We describe the theoretical background specific to the microspheres in our system and the procedure of translating functions to Fortran. We then outline the process of optimizing the code and parallelizing various functions, comparing efficiencies and runtimes. The shorter runtimes of the Fortran functions are then compared to their corresponding functions in Wolfram Mathematica. Fortran has shorter runtimes than Mathematica by between one and four orders of magnitude for our code. Parallelization further reduces the runtimes of the Fortran code for large jobs. Finally, various plots and data related to scattering by dielectric spheres are presented.

Toward simulating Black Widow binaries with CASTRO

Platon I. Karpov, Maria Barrios Sazo, Michael Zingale, Weiqun Zhang, and Alan C. Calder

Volume 8, Issue 3 (December 2017), pp. 25–29

We present results and lessons learned from a 2015-2016 Blue Waters Student Internship. The project was to perform preliminary simulations of an astrophysics application, Black Widow binary systems, with the adaptive-mesh simulation code Castro. The process involved updating the code as needed to run on Blue Waters, constructing initial conditions, and performing scaling tests exploring Castro's hybrid message passing/threaded architecture.

Using Blue Waters to Assess Non-Tornadic Outbreak Forecast Capability by Lead Time

Taylor Prislovsky and Andrew Mercer

Volume 8, Issue 3 (December 2017), pp. 30–35

Derechos are a dangerous, primarily non-tornadic severe weather outbreak type responsible for a variety of atmospheric hazards. However, the exact predictability of these events by lead time is unknown, yet would likely be invaluable to forecasters responsible for predicting these events. As such, the predictability of non-tornadic outbreaks by lead time was assessed. Five derecho events spanning 1979 to 2012 were selected and simulated using the Weather Research and Forecasting (WRF) model at 24, 48, 72, 96, and 120 hours lead time. Nine stochastically perturbed initial conditions were generated for each case and each lead time, yielding an ensemble of derecho simulations. Moment statistics of the derecho composite parameter (DCP), a good proxy for derecho environments, were used to assess variability in forecast quality and precision by lead time. Overall, results showed that 24 and 48 hour simulations had similar variability characteristics, as did 96 and 120 hours. This suggests the existence of a change point or statistically notable drop-off in forecast performance at 72 hours lead time that should be more fully explored in future work. These results are useful for forecasters as they give a first guess as to forecast skill and precision prior to initiating their predictions at lead times of up to 5 days.

Blue Waters Supercomputing Applications in Climate Modeling with the WRF Model

Morgan Smith and Andrew Mercer

Volume 8, Issue 3 (December 2017), pp. 36–43

Long-term atmospheric forecasting remains a significant challenge in the field of operational meteorology. These long-term forecasts are typically completed through the use of climatological variability patterns in the geopotential height fields, known in the field of meteorology as teleconnections. Despite heavy reliance on teleconnections for long-term forecasts, the characterization of these patterns in operational weather models remains inadequate. The purpose of this study is to diagnose the ability of an operational forecast model to render well-known teleconnection patterns. The Weather Research and Forecasting (WRF) model, a commonly employed regional operational forecast model, was used in the simulation of the major 500 mb Northern Hemisphere midlatitude teleconnection patterns. These patterns were formulated using rotated principal component analysis on the 500 mb geopotential height fields. The resulting simulated teleconnection patterns were directly compared to observed teleconnection fields derived from the National Center for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) reanalysis 500 mb geopotential height database, a commonly utilized observational dataset in climate research. Results were quite poor, as the resulting teleconnection patterns only somewhat resembled those constructed on the observed dataset, suggesting a limited capability of the WRF in resolving the underlying variability structure of the hemispheric midlatitude atmosphere. Additionally, configuring the regional model to complete this simulation was met with a series of computational challenges, some of which were not successfully overcome. These results suggest future needs for improvement of the WRF model in reconstructing teleconnection fields and for use in climate modeling.

A model Scientific Computing course for freshman students at liberal arts Colleges

Arun K. Sharma

Volume 8, Issue 2 (August 2017), pp. 2–9

Computing is ubiquitous and perhaps the most common element of our shared experience. However, many students do not seem to recognize the serious applications and implications of computing to the sciences. Wagner College, like many liberal arts colleges, requires a semester of a Computer Science affiliated course to provide students with an exposure to "technological skills". Sadly, such courses typically do not delve into high-level computational skills or computational thinking and generally provide instruction in using Microsoft Office® products and rudimentary worldwide web concepts. These courses and approaches were probably valuable a decade ago when computing devices were not quite as prevalent. However, in today's world these courses appear outdated and do not provide relevant skills to the modern undergraduate student. We have created a course called "Introduction to Scientific Computing" to remedy this problem and to provide students with state-of-the-art technological tools. The course provides students with hands-on training on typical work-flows in scientific data analysis and data visualization. Students are trained in the symbolic computing platform, Wolfram Mathematica®, to apply functional programming to develop data analysis and problem solving skills. The course presents computational thinking examples in the framework of various scientific disciplines. This exposure helps students to understand the advantages of technical computing and its direct relevance to their educational goals. The students are also trained to perform molecular visualization using open source software packages to understand secondary and tertiary protein structures, construct molecular animations, and to analyze computer simulation data. These experiences stimulate students to apply these skills across multiple courses and their research endeavors. Student self-assessment data suggests that the course satisfies a unique niche in undergraduate education and enriches the training of future STEM graduates.

Authentic computer science undergraduate research experience through computational science and research ownership

Lior Shamir

Volume 8, Issue 2 (August 2017), pp. 10–16

Research experience has been identified as a high-impact intervention for increasing student engagement and retention in STEM. However, authentic undergraduate research leading to primary authorship peer-reviewed publications is a challenge due to the relatively short time the students work on their capstone projects, and the insufficient preparation of the students as researchers. The challenge is further magnified in the field of computer science, where the absence of "traditional" labs limits the opportunities of undergraduate students to participate in research. Here we present a novel approach to authentic computer science undergraduate research, based on interdisciplinary computational science and student ownership of their research projects. Instead of the traditional role of undergraduate research assistant, the students select their own research topic based on their personal interests, and with the assistance of a faculty member complete all stages of their research project. The uniqueness of the approach is its ability to lead to scientific discoveries and peer-reviewed publications such that the primary author is the student, while allowing the student to experience the entire research process, from defining the research question through analysis of the experimental results. In three years the model led to a dramatic increase in the number of undergraduate students who publish primary-author peer-reviewed scientific papers. The intervention increased the number of peer-reviewed student-authored publications from none to a very high rate of about one third of the students, in many cases publishing in the top outlets in their field.

Energy-Efficient Virtual Screening with ARM-CPU-Based Computers

Olivia Alford and David Toth

Volume 8, Issue 2 (August 2017), pp. 17–23

PDF icon Download PDF

We attempted to find a more sustainable solution for performing virtual screening with AutoDock Vina which uses less electricity than computers using typical x64 CPUs. We tested a cluster of ODROID-XU3 Lite computers with ARM CPUs and compared its performance to a server with x64 CPUs. In order to be a viable solution, our cluster needed to perform the screen without sacrificing speed or increasing hardware costs. The cluster completed the virtual screen in a little less time than our comparison server while using just over half the electricity that the server used. Additionally, the hardware for the cluster cost about 38% less than the server, making it a viable solution.

An Implementation of Parallel Bayesian Network Learning

Joseph S. Haddad, Timothy W. O'Neil, Anthony Deeter, and Zhong-Hui Duan

Volume 8, Issue 2 (August 2017), pp. 24–28

PDF icon Download PDF

Bayesian networks may be utilized to infer genetic relations among genes. This has proven useful in providing information about how gene interactions influence life. However, Bayesian network learning is slow due to the nature of the algorithm. K2, a search space reduction, helps speed up the learning process but may introduce bias. To eliminate this bias, multiple Bayesian networks must be computed. This paper evaluates and realizes parallelization of network generation and the reasoning behind the choices made. Methods are developed and tested to evaluate the results of the implemented accelerations. Generating networks across multiple cores results in a linear speed-up with negligible overhead. Distributing the generation of networks across multiple machines also introduces linear speed-up, but results in additional overhead.

Interactive Analytics for Complex Cognitive Activities on Information from Annotations of Prokaryotic Genomes

Raphael D. Isokpehi, Kiara M. Wootson, Dominique R. Smith-McInnis, and Shaneka S. Simmons

Volume 8, Issue 2 (August 2017), pp. 29–36

PDF icon Download PDF

Several microbial genome databases provide collections of thousands of genome annotation files in formats suitable for the performance of complex cognitive activities such as decision making, sense making and analytical reasoning. The goal of the research reported in this article was to develop interactive analytics resources to support the performance of complex cognitive activities on a collection of publicly available genome information spaces. A supercomputing infrastructure (Blue Waters Supercomputer) provided computational tools to construct information spaces while visual analytics software and online bioinformatics resources provided tools to interact with the constructed information spaces. The Rhizobiales order of bacteria that includes the Brucella genus was the use case for performing the complex cognitive activities. An interesting finding among the genomes of the dolphin pathogen, Brucella ceti, was a cluster of genes with evidence for function in conditions of limited nitrogen availability.

How to Build a Fast HPC n-Body Engine From Scratch

Eric Peterson, Max Kelly, and Dr. Victor Pinks II

Volume 8, Issue 2 (August 2017), pp. 37–45

PDF icon Download PDF

Communicating and transferring computational science knowledge and literacy is a tremendously important concept for students at all levels of education to understand. Computational knowledge is especially important due to the tremendous impact that computer programming has had on all scientific and engineering disciplines. As technology evolves, so must our educational system in order for society to evolve as a whole. We undertook direct instruction of a computational science course, and have developed a curriculum that can be expanded upon to provide students entering technical disciplines with the background that they need to be successful. The course would provide insight into the C programming language as well as how computers function at a more basic level. Students would undertake projects that explore how to program simple tasks and operations and ultimately end in a final project aimed at assessing the knowledge accumulated from the course.

Parallelized Model of Low-Thrust Cargo Spacecraft Trajectories and Payload Capabilities to Mars

Wesley Yu and Hans Mark

Volume 8, Issue 2 (August 2017), pp. 46–53

PDF icon Download PDF

As a Blue Waters Student Internship Program project, we have developed a model of interplanetary low-thrust trajectories from Earth to Mars for spacecraft supplying necessary cargo for future human-crewed missions. Since these cargo missions use ionic propulsion that causes a gradual change in the spacecraft's velocity, the modeling is more computationally expensive than conventional trajectories assuming instantaneous spacecraft velocity changes. This model calculates the spacecraft's time of flight and swept angle at different payload masses with other parameters kept constant and correlates them with known locations of the planets. With parallelization using OpenMP on Blue Waters, its runtime has decreased from 10.55 to 1.53 hours. The program takes a user-selected Mars arrival date and outputs a given range of dates with maximum payload capabilities. This parallelized model will greatly reduce the time required for future mission design projects when other factors like spacecraft solar panel power output may vary with new mission specifications. The internship experience has enhanced the intern's ability to manage a project and will have a positive impact on his future graduate studies or research career.

Implementation of Computational Aids in Diels-Alder Reactions: Regioselectivity and Stereochemistry of Adduct Formation

Jiyoung Jung, Susan Zirpoli, and Glenn Slick

Volume 8, Issue 1 (February 2017), pp. 2–6

PDF icon Download PDF

The Diels-Alder reaction is one of the most well-known organic reactions and is widely used for six-membered ring formation. Regio- and stereo-selective Diels-Alder reactions have been emphasized in various areas including pharmaceutical and polymer industries. However, covering the theoretical background of such reactions in an undergraduate class is challenging because the interactions between molecular orbitals are difficult for students to visualize. The complexity of regio- and stereo-selectivity becomes more pronounced when dealing with polycyclic aromatic hydrocarbons (PAHs) and asymmetric compounds. Herein we utilized web-based computational tools (WebMO) to visualize the HOMO-LUMO of each reaction component and their interaction to form chemical bonds. In this study we demonstrated that the incorporation of computational aids into a Diels-Alder laboratory class dramatically facilitates students' understanding of several important concepts including frontier orbital theory, thermodynamics of the reaction, three-dimensional visualization, and so on. The assessment of teaching effectiveness prior to and after implementation of computational aids into Diels-Alder reactions will also be discussed in this manuscript.

Educational Module on Genomic Sequence Alignment Using HPC

Angela B. Shiflet, George W. Shiflet, Daniel S. Couch, Pietro Hiram Guzzi, and Mario Cannataro

Volume 8, Issue 1 (February 2017), pp. 7–11

PDF icon Download PDF

"Aligning Sequences - Sequentially and Concurrently," an educational computational science module by the authors and available online, develops a sequential algorithm to determine the highest similarity score and the alignments that yield this score for two DNA sequences. Moreover, the module considers several approaches to parallelization and speedup. Besides a serial implementation in C, a parallel program in C/MPI is available. This paper describes the module and details experiences using the material in a bioinformatics course at University "Magna Graecia" of Catanzaro, Italy. Besides being appropriate for such a course, the module can provide a meaningful application for a high performance computing or a data structures class.

VisMo: Augmented Reality Visualization of Scientific Data and Molecular Structures

Max Collins and Dr. Alan B. Craig

Volume 8, Issue 1 (February 2017), pp. 12–15

PDF icon Download PDF

In this paper, we describe the creation and use of our project, which enables augmented reality visualization of data produced using the Blue Waters supercomputer or other high performance computers. While molecular structures have been displayed using augmented reality before [1,6], we created a pipeline for using information from the Protein Data Bank and automatically loading it into an augmented reality scene for further display and interaction. We find it important to create an easy way for students, scientists, and anyone else to be able to visualize molecular structures using augmented reality because it offers an interactive three-dimensional perspective that is typically not available in the classroom. Learning about molecular structures in 2D is much less comprehensive, and our technique for visualization will be free for the end user and offer a great deal of aid to the learning and teaching process. There is no separate purchase required as long as a user has a smart phone or tablet. This is a helpful addition to scientific papers which, if containing the right target image, can be used as the visualization "anchor". The Protein Data Bank (PDB) houses information about proteins, nucleic acids, and more to help scientists and students understand concepts and ideas in biology and chemistry [5]. Our project goal is to open the PDB up to students and people who are not familiar with augmented reality visualization and allow people to learn using the PDB by visualizing molecular structures in different representations, annotating and interacting with the structures, and offering learning modules for common molecular structures. We created a prototype mobile application allowing for molecular visualization of PDB structures, and are continuing to tweak our project for an eventual release to the public.


Venkata Suhas Maringanti, Basileal Imana, and Peter Yoon

Volume 8, Issue 1 (February 2017), pp. 16–19

PDF icon Download PDF

The problem of interconnecting nets with multi-port terminals in VLSI circuits is a direct generalization of the Group Steiner Problem (GSP). The GSP is a combinatorial optimization problem which arises in the routing phase of VLSI circuit design. This problem has been intractable, making it impractical to use in real-world VLSI applications. This paper presents our work on designing and implementing, on a distributed architecture, a parallel approximation algorithm for the GSP based on an existing heuristic. Our implementation uses the CUDA-aware MPI approach to compute the approximate minimum-cost Group Steiner tree for several industry-standard VLSI graphs. Our implementation achieves up to 103x speedup compared to the best known serial work for the same graph. We present the speedup results for graphs up to 3k vertices. We also investigate some performance bottleneck issues by analyzing and interpreting the program performance data.

STUDENT PAPER: GPU Acceleration for SQL Queries on Large-Scale Distributed Systems

Linh Nguyen and Paul Hemler

Volume 8, Issue 1 (February 2017), pp. 20–26

PDF icon Download PDF

General-purpose GPUs are powerful hardware with a number of applications in the realm of relational databases. We further extended a database framework designed to allow for GPU execution of queries. Our technique is novel in that it implements Dynamic Parallelism, a new feature in recent hardware, to accelerate SQL JOINs. Query execution results in a 1.25X speedup on average with respect to a previous method, also accelerated by GPUs, which employs a multi-dimensional CUDA Grid. More importantly, we divided the queries to run on multiple BW nodes to investigate the scalability of both SELECT and JOIN.

Cognitive Aspects of Computational Modeling and Simulation in Teaching and Learning

Osman Yasar

Volume 7, Issue 1 (April 2016), pp. 2–14

PDF icon Download PDF

We discuss cognitive aspects of modeling and simulation in an efficacy study of computational pedagogical content knowledge (CPACK) professional development of K-12 STEM teachers. Evidence includes data from a wide range of educational settings over the past ten years. We present a computational model of the mind based on an iterative cycle of deductive and inductive cognitive processes. The model is aligned with empirical research from cognitive psychology and neuroscience, and it opens the door to a whole series of future studies on computational thinking.

Introducing Teachers to Modeling Water in Urban Environments

Steven I. Gordon, Jason Cervenec, and Michael Durand

Volume 7, Issue 1 (April 2016), pp. 15–20

PDF icon Download PDF

Geoscience educators in K-12 have limited experience with the quantitative methods used by professionals as part of their everyday work. Many science teachers at this level have backgrounds in other science fields. Even those with geoscience or environmental science backgrounds have limited experience with applying modeling and simulation tools to introduce real-world activities into their classrooms. This article summarizes a project aimed at introducing K-12 geoscience teachers to project-based exercises using urban hydrology models that can be integrated into their classroom teaching. The impact of teacher workshops on teachers' confidence and willingness to utilize computer modeling in their classes is also reported.

Computational Thinking as a Practice of Representation: A Proposed Learning and Assessment Framework

Camilo Vieira, Manoj Penmetcha, Alejandra J. Magana, and Eric Matson

Volume 7, Issue 1 (April 2016), pp. 21–30

PDF icon Download PDF

This study proposes a research and learning framework for developing and assessing computational thinking under the lens of representational fluency. Representational fluency refers to individuals' ability to (a) comprehend the equivalence of different modes of representation and (b) make transformations from one representation to another. Representational fluency was used in this study to guide the design of a robotics lab. This lab experience consisted of a multiple-step process in which students were provided with a learning strategy so they could familiarize themselves with representational techniques for algorithm design and the robot programming language. The guiding research question for this exploratory study was: Can we design a learning experience to effectively support individuals' computing representational fluency? We employed representational fluency as a framework for the design of computing learning experiences as well as for the investigation of student computational thinking. Findings from the implementation of this framework to the design of robotics tasks suggest that the learning experiences might have helped students increase their computing representational fluency. Moreover, several participants identified that the robotics activities were engaging and that the activities also increased their interest both in algorithm design and robotics. Implications of these findings relate to the use of representational fluency coupled with robotics to integrate computing skills in diverse disciplines.

Revising and Expanding a Blue Waters Curriculum Module as a Parallel Computing Learning Experience

Ruth Catlett and David Toth

Volume 7, Issue 1 (April 2016), pp. 31–39

PDF icon Download PDF

The party problem is a mathematical problem in the discipline of Ramsey Theory. Because of the problem's embarrassingly parallel nature, its extreme computational requirements, and its relative ease of understanding and implementation with a naive algorithm, it is well suited to serve as an example problem for teaching parallel computing. Years ago, a curriculum module for Blue Waters was developed using this problem. However, delays in the delivery of Blue Waters resulted in the module being released before Blue Waters was accessible. Therefore, performance data and compilation instructions for Blue Waters were not available. We have revised the module to provide source code for new versions of the programs to demonstrate more parallel computing libraries. We have also added performance data and compilation instructions for the code in the old version of the module and for the new implementations, which take advantage of the capabilities of the Blue Waters supercomputer now that it is available.

Abatement of Computational Issues Associated with Dark Modes in Optical Metamaterials

Matthew LePain and Maxim Durach

Volume 7, Issue 1 (April 2016), pp. 39–45

PDF icon Download PDF

Optical fields in metamaterial nanostructures can be separated into bright modes, whose dispersion is typically described by effective medium parameters, and dark fluctuating fields. Such a combination of propagating and evanescent modes poses a serious numerical complication due to poorly conditioned systems of equations for the amplitudes of the modes. We propose a numerical scheme based on a transfer matrix approach, which resolves this issue for a parallel plate metal-dielectric metamaterial, and demonstrate its effectiveness.

Exploring Design Characteristics of Worked Examples to Support Programming and Algorithm Design

Camilo Vieira, Junchao Yan, and Alejandra J. Magana

Volume 6, Issue 1 (July 2015), pp. 2–15

PDF icon Download PDF

In this paper we present an iterative research process to integrate worked examples for introductory programming learning activities. Learning how to program involves many cognitive processes that may result in a high cognitive load. The use of worked examples has been described as a relevant approach to reduce student cognitive load in complex tasks. Learning materials were designed based on instructional principles of worked examples, and were used for a freshman programming course. Moreover, the learning materials were refined after each iteration based on student feedback. The results showed that novice students benefited more than experienced students when exposed to the worked examples. In addition, encouraging students to carry out an elaborated self-explanation of their coded solutions may be a relevant learning strategy when implementing the worked examples pedagogy.

Picky: A New Introductory Programming Language

Francisco J. Ballesteros, Gorka Guardiola Múzquiz, and Enrique Soriano-Salvador

Volume 6, Issue 1 (July 2015), pp. 16–24

PDF icon Download PDF

In the authors' experience the languages available for teaching introductory computer programming courses are lacking. In practice, they violate some of the fundamentals taught in an introductory course. This is often the case, for example, with I/O. Picky is a new open source programming language created specifically for education that enables the students to program according to the principles laid down in class. It solves a number of issues the authors had to face while teaching introductory courses for several years in other languages. The language is small, simple and very strict regarding what is a legal program. It has a terse syntax and it is strongly typed and very restrictive. Both the compiler and the runtime include extra checks to provide safety features. The compiler generates byte-code for compatibility and the programming tools are freely available for Linux, MacOSX, Plan 9 from Bell Labs and Windows. This paper describes the language and discusses the motivation to implement it and its main educational features.

Identification of Inhibitors of Fatty Acid Synthesis Enzymes in Mycobacterium Smegmatis

Alexander Priest, E. Davis Oldham, Lynn Lewis, and David Toth

Volume 6, Issue 1 (July 2015), pp. 25–31

PDF icon Download PDF

Antibiotic-resistant strains of Mycobacterium tuberculosis have rendered some of the current treatments for tuberculosis ineffective, creating a need for new treatments. Today, the most efficient way to find new drugs to treat tuberculosis and other diseases is to use virtual screening to quickly consider millions of potential drug candidates and filter out all but the ones most likely to inhibit the disease. These top hits can then be tested in a traditional wet lab to determine their potential effectiveness. Using supercomputers, we screened over 4 million potential drug molecules against each of two enzymes that are critical to the survival of Mycobacterium tuberculosis. During this process, we determined the top candidate molecules to test in the wet lab.

Characterizing Ligand Interactions in Wild-type and Mutated HIV-1 Proteases

Leyte L. Winfield, Rosalind Gregory-Bass, Jordan Campbell, and Andy Watkins

Volume 5, Issue 1 (August 2014), pp. 2–9

PDF icon Download PDF

A computational module has been developed in which students examine the binding interactions between indinavir and HIV-1 protease. The project is a component of the Medicinal Chemistry course offered to upper level chemistry, biochemistry, and biology majors. Students work with modeling and informatics tools utilized in drug development research while evaluating wild-type and mutated forms of the HIV-1 protease in complex with the inhibitor indinavir. By quantifying the molecular interactions within protease-inhibitor complexes, students can characterize the structural basis for reduced efficacy of indinavir.

Scaling and Visualization of N-Body Gravitational Dynamics with GalaxSeeHPC

David A. Joiner and James Walters

Volume 5, Issue 1 (August 2014), pp. 10–22

PDF icon Download PDF

In this paper, we present GalaxSeeHPC, a new cluster-enabled gravitational N-Body program designed for educational use, along with two potential student experiences that illustrate what students might be able to investigate at larger N than available with earlier versions of GalaxSee. GalaxSeeHPC adds additional force calculation algorithms and input options to the previous cluster-enabled version. GalaxSeeHPC lessons have been developed focusing on two key studies, the structure of rotating galaxies and the large scale structure of the universe. At large N, visualizing the results becomes a significant challenge, and tools for visualization are presented. The canonical lesson in the original version of GalaxSee is the rotation and flattening of a cluster with angular momentum. Model discrepancies that are not obvious at the range of N available in previous versions become quite obvious at large N, and changes to the initial mass and velocity distribution can be seen more readily. For the large scale structure models, while basic clearing and clustering can be seen at around N=5,000, N=50,000 allows for a much clearer visualization of the filamentary structure at large scale, and N=500,000 allows for a more detailed geometry of the knots formed as the filaments combine to form superclusters. For the galactic dynamics simulations, we found that while a flattening due to overall angular momentum can be explored with N=1,000 or smaller, formation of spiral structure requires not only a larger number of objects, typically on the order of 10,000, but also modifications to the default initial mass and velocity distributions used in older versions of GalaxSee.

Introducing Evolutionary Computing in Regression Analysis

Olcay Akman

Volume 5, Issue 1 (August 2014), pp. 23–27

PDF icon Download PDF

A typical upper level undergraduate or first year graduate level regression course syllabus treats model selection with various stepwise regression methods. Here we implement evolutionary computing for subset model selection and accomplish two goals: i) introduce students to the powerful optimization method of genetic algorithms, and ii) transform a regression analysis course into a regression and modeling course without requiring any additional time or software commitment. Furthermore, we employed the Akaike Information Criterion (AIC) as a measure of model fitness instead of the commonly used R-square. The model selection tool uses Excel, which makes the procedure accessible to a very wide spectrum of interdisciplinary students with no specialized software requirement. An Excel macro, to be used as an instructional tool, is freely available through the author's website.

Teaching Students to Program Using Visual Environments: Impetus for a Faulty Mental Model?

Edward Dillon, Monica Anderson-Herzog, and Marcus Brown

Volume 5, Issue 1 (August 2014), pp. 28–43

PDF icon Download PDF

When learning to program, students are typically exposed to either a visual or command line environment. Visual environments are usually adopted to help engage students with programming due to their user-friendly features. This article explores the effect of using visual environments such as Integrated Development Environments and syntax-free tools to teach students how to program. Prior studies have shown that some visual environments can have a productive impact on a student's ability to learn and become engaged with programming. However, the functional behavior of visual environments may cause a student to develop a faulty mental model for programming. One possible reason is the fixed set of skills that a student acquires upon initial exposure to programming while using a visual environment. Two systematic studies were conducted for exposing students to programming in introductory courses using both visual and command line environments. From the first study, it was found that visual environments can initially impose a lower learning curve for students. However, the second study revealed that visual environments may present a challenge for students to directly transfer their acquired skills to other programming environments after initial exposure.

Computational Math, Science, and Technology (C-MST) Approach to General Ed

Osman Yasar

Volume 4, Issue 1 (October 2013), pp. 2–10

PDF icon Download PDF

In this paper, we present a computational approach to teaching general education courses that expose students to science and computing principles in engaging contexts, including modeling and simulation, games, and history. The courses use scalable curriculum modules organized in layers of increasing difficulties in order to balance learning challenges and student abilities. We describe the computational pedagogy followed in these modules and courses, with particular attention to the simulation-based course, namely introduction to computational science, to present a case study for those considering similar initiatives.

Introducing Transition Matrices and Their Biological Applications

Angela B. Shiflet and George W. Shiflet

Volume 4, Issue 1 (October 2013), pp. 11–15

PDF icon Download PDF

The Blue Waters Undergraduate Petascale Education Program (NSF) sponsors the development of educational modules that help students understand computational science and the importance of high performance computing. As part of this materials development initiative, we developed two modules, "Time after Time: Age- and Stage-Structured Models" and "Probable Cause: Modeling with Markov Chains," which develop application problems involving transition matrices and provide accompanying programs in a variety of systems (C/MPI, C, MATLAB, Mathematica). Age- and stage-structured models incorporate the probability of an animal passing from one age or stage to the next as well as the animal's average reproduction at each age or stage. Markov chain models are based on the probability of passing from one state to another. These educational materials follow naturally from another Blue Waters module, "Living Links: Applications of Matrix Operations to Population Studies," which provides a foundation for the use of matrix operations. This paper describes the two modules and details experiences using the resources in classes.
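As a rough illustration of the age-structured idea described in this abstract, the following Python sketch advances a population vector by repeated multiplication with a transition matrix. It is not taken from the modules themselves: the function name `project`, the three age classes, and all reproduction and survival values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def project(matrix, population, steps):
    """Advance a population vector by `steps` matrix-vector products."""
    current = list(population)
    for _ in range(steps):
        # Each new entry is the dot product of one matrix row with the current vector.
        current = [sum(m * x for m, x in zip(row, current)) for row in matrix]
    return current


# Hypothetical three-age-class transition matrix (illustrative values only).
leslie = [
    [0.0, 2.0, 1.0],   # average offspring produced by each age class
    [0.5, 0.0, 0.0],   # probability of surviving from age class 0 to 1
    [0.0, 0.25, 0.0],  # probability of surviving from age class 1 to 2
]
start = [100.0, 0.0, 0.0]  # hypothetical starting population

if __name__ == "__main__":
    for step in range(4):
        print(step, project(leslie, start, step))
```

The first row holds average reproduction per age class and the sub-diagonal holds the probabilities of passing to the next age class, which matches the two ingredients the abstract names. A Markov chain model uses the same kind of repeated multiplication, with a matrix of state-to-state probabilities in place of the reproduction and survival values.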

STEM-Based Computing Educational Resources on the Web

Tatiana Ringenberg and Alejandra Magana

Volume 4, Issue 1 (October 2013), pp. 16–23

PDF icon Download PDF

This paper explores the landscape of computing educational resources found on the web together with teaching and learning materials that can facilitate the integration of computational thinking into the classroom. Specifically, this paper focuses on finding and describing existing learning environments that integrate computational thinking into a STEM discipline. This study provides initial steps toward the goal of providing a comprehensive list of STEM-based computational resources on the web that also provides guiding information, which can help teachers and parents make decisions to evaluate and integrate these resources easily for educational purposes.

Transformation of a Mathematics Department's Teaching and Research Through a Focus on Computational Science

Yanlai Chen, Gary Davis, Sigal Gottlieb, Adam Hausknecht, Alfa Heryudono, and Saeja Kim

Volume 4, Issue 1 (October 2013), pp. 24–29

PDF icon Download PDF

Undergraduate teaching that focuses on student-driven research, mentored by research active faculty, can have a powerful effect in bringing relevance and cohesiveness to a department's programs. We describe and discuss such a program in computational mathematics, and the effects this program has had on the students, the faculty, the department and the university.

STUDENT PAPER: Solving the Many-Body Polarization Problem on GPUs: Application to MOFs

Brant Tudor and Brian Space

Volume 4, Issue 1 (October 2013), pp. 30–34

PDF icon Download PDF

Massively Parallel Monte Carlo, an in-house computer code available at, has been successfully utilized to simulate interactions between gas phase sorbates and various metal-organic materials. In this regard, calculations involving polarizability were found to be critical and computationally expensive. Although GPGPU routines have increased the speed of these calculations immensely, in its original state, the program was only able to leverage a GPU's power on small systems. In order to study larger and ever more complex systems, the program model was modified such that limitations related to system size were relaxed while performance was either increased or maintained. In this project, parallel programming techniques learned from the Blue Waters Undergraduate Petascale Education Program were employed to increase the efficiency and expand the utility of this code.

Parallelization of the Knapsack Problem as an Introductory Experience in Parallel Computing

Michael Crawford and David Toth

Volume 4, Issue 1 (October 2013), pp. 35–39

PDF icon Download PDF

As part of a parallel computing course where undergraduate students learned parallel computing techniques and ran their programs on a supercomputer, one student designed and implemented a sequential algorithm and two versions of a parallel algorithm to solve the knapsack problem. Performance tests of the programs were conducted on the Ranger supercomputer. The performance of the sequential and parallel implementations was compared to determine speedup and efficiency. We observed 82%-86% efficiency for the MPI version and 89% efficiency for the OpenMP version for sufficiently large inputs to the problem. Additionally, we discuss the student's and the faculty member's reflections on the experience.
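For readers unfamiliar with the two metrics named in this abstract, the sketch below uses the standard definitions: speedup is serial run time divided by parallel run time, and efficiency is speedup divided by the number of workers. The function names and the timings are hypothetical and are not measurements from the paper.

```python
def speedup(serial_time, parallel_time):
    """Ratio of serial run time to parallel run time."""
    return serial_time / parallel_time


def efficiency(serial_time, parallel_time, workers):
    """Speedup divided by the number of workers (1.0 would be ideal scaling)."""
    return speedup(serial_time, parallel_time) / workers


if __name__ == "__main__":
    # Hypothetical timings in seconds, chosen only to illustrate the arithmetic.
    serial, parallel, workers = 100.0, 15.625, 8
    print(f"speedup:    {speedup(serial, parallel):.2f}x")
    print(f"efficiency: {efficiency(serial, parallel, workers):.0%}")
```

With these made-up numbers, eight workers give a 6.4x speedup, which corresponds to 80% efficiency.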

Cyber Collaboratory-based Sustainable Design Education: A Pedagogical Framework

Kyoung-Yun Kim, Karl R. Haapala, Gül E. Okudan Kremer, and Michael K. Barbour

Volume 3, Issue 2 (December 2012), pp. 2–10

PDF icon Download PDF

Educators from across the educational spectrum are faced with challenges in delivering curricula that address sustainability issues. This article introduces a cyber-based interactive e-learning platform, entitled the Sustainable Product Development Collaboratory, which is focused on addressing this need. This collaboratory aims to educate a wide spectrum of learners in the concepts of sustainable design and manufacturing by demonstrating the effects of product design on supply chain costs and environmental impacts. In this paper, we discuss the overall conceptual framework of this collaboratory along with pedagogical and instructional methodologies related to the collaboratory-based sustainable design education. Finally, a sample learning module is presented along with methods for assessment of student learning and experiences with the collaboratory.

A Hands-on Education Program on Cyber Physical Systems for High School Students

Vijay Gadepally, Ashok Krishnamurthy, and Umit Ozguner

Volume 3, Issue 2 (December 2012), pp. 11–17

Cyber Physical Systems (CPS) are the conjoining of an entity's physical and computational elements. The development of a typical CPS follows a sequence from conceptual modeling, through testing in simulated (virtual) worlds and testing in controlled (possibly laboratory) environments, to deployment. Throughout each (repeatable) stage, the behavior of the physical entities, the sensing and situation assessment, and the computation and control options have to be understood and carefully represented through abstraction. The CPS Group at the Ohio State University, as part of an NSF-funded CPS project on "Autonomous Driving in Mixed Environments", has been developing CPS-related educational activities at the K-12, undergraduate and graduate levels. The aim of these educational activities is to train students in the principles and design issues in CPS and to broaden the participation in science and engineering. The project team has a strong commitment to impact STEM education across the entire K-20 community. In this paper, we focus on the K-12 community and present a two-week Summer Program for high school juniors and seniors that introduces them to the principles of CPS design and walks them through several of the design steps. We also provide an online repository that aids CPS researchers in providing a similar educational experience.

Using Supercomputing to Conduct Virtual Screen as Part of the Drug Discovery Process in a Medicinal Chemistry Course

David Toth and Jimmy Franco

Volume 3, Issue 2 (December 2012), pp. 18–25

The ever-increasing amount of computational power available has made it possible to use docking programs to screen large numbers of compounds to search for molecules that inhibit proteins. This technique can be used not only by pharmaceutical companies with large research and development budgets and large research universities, but also at small liberal arts colleges with no special computing equipment beyond the desktop PCs in any campus computer laboratory. However, despite the significant quantities of compute time available to small colleges to conduct these virtual screens, such as supercomputing time available through grants, we are unaware of any small colleges that do this. We describe the experiences of an interdisciplinary research collaboration between faculty in the Chemistry and Computer Science Departments in a chemistry course where chemistry and biology students were shown how to conduct virtual screens. This project began when the authors, who had been collaborating on drug discovery research using virtual screening, decided that the virtual screening process they were using in their research could be adapted to fit in a couple of lab periods and would complement one of the instructors' courses on medicinal chemistry. The resulting labs would introduce students to the virtual screening portion of the drug discovery process.

Metadata Management in Scientific Computing

Eric L. Seidel

Volume 3, Issue 2 (December 2012), pp. 26–33

Complex scientific codes and the datasets they generate are in need of a sophisticated categorization environment that allows the community to store, search, and enhance metadata in an open, dynamic system. Currently, data is often presented in a read-only format, distilled and curated by a select group of researchers. We envision a more open and dynamic system, where authors can publish their data in a writeable format, allowing users to annotate the datasets with their own comments and data. This would enable the scientific community to collaborate on a higher level than before, where researchers could for example annotate a published dataset with their citations. Such a system would require a complete set of permissions to ensure that any individual's data cannot be altered by others unless they specifically allow it. For this reason datasets and codes are generally presented read-only, to protect the author's data; however, this also prevents the type of social revolutions that the private sector has seen with Facebook and Twitter. In this paper, we present an alternative method of publishing codes and datasets, based on Fluidinfo, which is an openly writeable and social metadata engine. We will use the specific example of the Einstein Toolkit, a part of the Cactus Framework, to illustrate how the code's metadata may be published in writeable form via Fluidinfo.

Bringing ab initio Electronic Structure Calculations to the Nano Scale through High Performance Computing

James Currie, Rachel Cramm Horn, and Paul Rulis

Volume 3, Issue 2 (December 2012), pp. 34–40

An ab initio density functional theory based method that has a long history of dealing with large complex systems is the Orthogonalized Linear Combination of Atomic Orbitals (OLCAO) method, but it does not operate in parallel and, while the program is empirically observed to be fast, many components of its source code have not been analyzed for efficiency. This paper describes the beginnings of a concerted effort to modernize, parallelize, and functionally extend the OLCAO program so that it can be better applied to the complex and challenging problems of materials design. Specifically, profiling data were collected and analyzed using the popular performance monitoring tools TAU and PAPI as well as standard UNIX time commands. Each of the major components of the program was studied so that parallel algorithms that either modified or replaced the serial algorithm could be suggested. The program was run for a collection of different input parameters to observe trends in compute time. Additionally, the algorithm for computing interatomic interaction integrals was restructured and its performance was measured. The results indicate that a fair degree of speed-up of even the serial version of the program could be achieved rather easily, but that implementation of a parallel version of the program will require more substantial consideration.

A Performance Comparison of a Naïve Algorithm to Solve the Party Problem using GPUs

Michael V.E. Bryant and David Toth

Volume 3, Issue 2 (December 2012), pp. 41–48

The R(m, n) instance of the party problem asks how many people must attend a party to guarantee that at the party, there is a group of m people who all know each other or a group of n people who are all complete strangers. GPUs have been shown to significantly decrease the running time of some mathematical and scientific applications that have embarrassingly parallel portions. A brute force algorithm to solve the R(5, 5) instance of the party problem can be parallelized to run on a number of processing cores many orders of magnitude greater than the number of cores in the fastest supercomputer today. Therefore, we believed that this currently unsolved problem is so computationally intensive that GPUs could significantly reduce the time needed to solve it. In this work, we compare the running time of a naive algorithm to help make progress solving the R(5, 5) instance of the party problem on a CPU and on five different GPUs ranging from low-end consumer GPUs to a high-end GPU. Using just the GPUs' computational capabilities, we observed speedups ranging from 1.9 to over 21 in comparison to our quad-core CPU system.
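
To make the problem statement concrete, here is a small brute-force sketch for tiny instances of the party problem. It enumerates every way the pairs of guests can either know each other or be strangers and checks whether the required group always exists. The function names and the search cap are assumptions made for illustration, and this is not the authors' GPU algorithm; the approach is only feasible for very small cases such as R(3, 3), since the number of configurations doubles with every additional pair of guests.

```python
from itertools import combinations, product


def has_uniform_group(relation, people, size, kind):
    """True if some `size` people have every pairwise relation equal to `kind`."""
    return any(
        all(relation[pair] == kind for pair in combinations(group, 2))
        for group in combinations(people, size)
    )


def guarantee_holds(party_size, m, n):
    """Check every possible party of `party_size` guests by brute force.

    Returns True only if each configuration contains m mutual acquaintances
    or n mutual strangers.
    """
    people = range(party_size)
    pairs = list(combinations(people, 2))
    # 0 = the pair know each other, 1 = the pair are strangers.
    for kinds in product((0, 1), repeat=len(pairs)):
        relation = dict(zip(pairs, kinds))
        if not (has_uniform_group(relation, people, m, 0)
                or has_uniform_group(relation, people, n, 1)):
            return False  # this party has neither required group
    return True


def party_number(m, n, max_party_size=6):
    """Smallest party size up to the (assumed) cap where the guarantee holds."""
    for party_size in range(1, max_party_size + 1):
        if guarantee_holds(party_size, m, n):
            return party_size
    return None  # not found within the cap


print(party_number(3, 3))
```

Each configuration is checked independently of the others, which is the embarrassingly parallel structure the abstract refers to.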

Application of the Occupational Analysis of Computational Thinking-Enabled STEM Professionals as a Program Assessment Tool

Joyce Malyn-Smith and Irene Lee

Volume 3, Issue 1 (June 2012), pp. 2–10

This paper describes the application of findings from the National Science Foundation's project on Computational Thinking (CT) in America's Workplace to program assessment. It presents the process used to define the primary job functions and work tasks of CT-Enabled STEM professionals in today's scientific enterprise. Authors describe three programs developing CT skills among learners in secondary and post secondary programs and how the resulting occupational analysis was used to review these programs. The article presents ways this analysis can be used as a framework to guide the development of STEM learning outcomes and activities, and sets of directions for future work.

Building a Project Methodology to Provide Authentic and Appropriate Experiences in Computational Science for Middle and High School Students

Patricia Jacobs and Jennifer Houchins

Volume 3, Issue 1 (June 2012), pp. 11–18

Shodor, a national resource for computational science education, has successfully developed a model for middle and high school students to gain authentic and appropriate experiences in computational science. As we prepare students for the 21st century workforce, three of the most important skills for advancing modern mathematics and science are quantitative reasoning, computational thinking, and multi-scale modeling. Shodor's Computing MATTERS: Pathways to Cyberinfrastructure program, funded in part by the National Science Foundation Cyberinfrastructure Training, Education, Advancement, and Mentoring (CI-TEAM) program, provides opportunities for middle and high school students to explore all three of these areas. One of the wide range of programs offered through Computing MATTERS is the SUCCEED Apprenticeship Program. The overall goal of the SUCCEED Apprenticeship Program is to provide students with authentic and appropriate experiences in the use of technologies, techniques and tools of Information Technology (IT) with a particular focus on computational science and to produce evidence that students become proficient in these IT technologies, techniques and skills. The program combines appropriate structure (classroom-style training and project-based work experience) with meaningful work content, giving students a wide variety of technical and communication skills. The program uses innovative approaches to get students excited about computational science and enables students to grow from excitement to expertise in science, technology, engineering, and mathematics (STEM). Since its beginning in 2005, the SUCCEED Apprenticeship Program has proven to be a successful model for enabling middle and high school students of both genders and of ethnically and economically diverse backgrounds to gain proficiency in STEM while learning, experiencing, and using information technologies.

A Web Service Infrastructure and its Application for Distributed Chemical Equilibrium Computation

Subrata Bhattacharjee, Christopher P. Paolini, and Mark Patterson

Volume 3, Issue 1 (June 2012), pp. 19–27

W3C standardized Web Services are becoming an increasingly popular middleware technology used to facilitate the open exchange of data and perform distributed computation. In this paper we propose a modern alternative to commonly used software applications such as STANJAN and NASA CEA for performing chemical equilibrium analysis in a platform-independent manner in combustion, heat transfer, and fluid dynamics research. Our approach is based on the next generation style of computational software development that relies on loosely-coupled network accessible software components called Web Services. While several projects in existence use Web Services to wrap existing commercial and open-source tools to mine thermodynamic data, no Web Service infrastructure has yet been developed to provide the thermal science community with a collection of publicly accessible remote functions for performing complex computations involving reacting flows. This work represents the first effort to provide such an infrastructure where we have developed a remotely accessible software service that allows developers of thermodynamics and combustion software to perform complex, multiphase chemical equilibrium computation with relative ease. Coupled with the data service that we have already built, we show how the use of this service can be integrated into any numerical application and invoked within commonly used commercial applications such as Microsoft Excel and MATLAB® for use in computational work. A rich internet application (RIA) is presented in this work to demonstrate some of the features of these newly created Web Services.

Cloud-enabling Scientific Tools and Computational Methods for Invigorating STEM Learning and Research

Bina Ramamurthy, Jessica Poulin, and Katharina Dittmar

Volume 3, Issue 1 (June 2012), pp. 28–33

We present a cloud-enabled comprehensive platform (Pop!World) for experiential learning, education, training and research in population genetics and evolutionary biology. The major goal of Pop!World is to leverage the advances in cyber-infrastructure to improve accessibility of important biological concepts to students at all levels. It is designed to empower a broad spectrum of users with access to cyber-enabled scientific resources, tools and platforms, thus preparing the next generation of scientists. Pop!World offers a highly engaging alternative to currently prevalent textual environments that fail to captivate net-generation audiences. It is also more mathematically focused than currently available tools, allowing it to be used as a basic teaching tool and expanded to higher education levels and collaborative research platforms. The project is a synergistic inter-disciplinary collaboration among investigators from Computer Science & Engineering and Biological Sciences. In this paper we share our multi-disciplinary experience (CSE and BIO) in the design and deployment of the Pop!World platform and its successful integration into the introductory biological sciences course offerings over the past two years. We expect our project to serve as a model for creative use of advances in cyber-infrastructure for engaging the cyber-savvy net-generation [11] students and invigorating STEM education.

Transforming the Primary Research Process through a Virtual Linguistic Lab for the Study of Language Acquisition and Use: Challenges and Accomplishments

Maria Blume and Barbara Lust

Volume 3, Issue 1 (June 2012), pp. 34–46

This project involves both the development of a community of scholars committed to cross-institution, interdisciplinary and cross-linguistic collaboration (a Virtual Center for Language Acquisition, VCLA) and the creation of a web-based infrastructure through which a new generation of scholars can learn concepts and technologies empowered through this CI environment. These technologies, constituting a Virtual Linguistic Lab (VLL), provide the student with the structure for data creation, data management and data analysis as well as the tools for collaborative data sharing. This infrastructure, informed and executed through computational science, involves the coherent integration of an open web-based gateway (The VCLA website), linked to a specialized web-based VLL portal which includes not only real world examples and visualizations of data creation and analyses, but several cybertools by which these data can be managed and analyzed. This infrastructure subserves both the beginning student and the researcher pursuing calibrated methods and structured data sharing for collaborative purposes. Students continually engage in the development of the cybertools involved and in the scientific method involved in primary research. In this paper we summarize our objectives, the challenges we face and the solutions we have developed to these challenges. At this point, the project is just completing an implementation stage and is being readied to move to a diffusion stage.

Institutional and Individual Influences on Scientists' Data Sharing Practices

Youngseek Kim and Jeffrey M. Stanton

Volume 3, Issue 1 (June 2012), pp. 47–56

Many contemporary scientific endeavors now rely on the collaborative efforts of researchers across multiple institutions. As a result of this increase in the scale of scientific collaboration, sharing and reuse of data using private and public repositories has increased. At the same time, data sharing practices and capabilities appear to vary widely across disciplines and even within some disciplines. This research sought to develop an understanding of this variation through the lens of theories that account for individual choices within institutional contexts. We conducted a total of 25 individual semi-structured interviews to understand researchers' current data sharing practices. The main focus of our interviews was: (1) to explore domain specific data sharing practices in diverse disciplines, and (2) to investigate the factors motivating and preventing the researchers' current data sharing practices. Results showed support for an institutional perspective on data sharing as well as a need for better understanding of scientists' altruistic motives for participating in data sharing and reuse.

Sustain City - A Cyberinfrastructure-Enabled Game System for Science and Engineering Design

Ying Tang, Sachin Shetty, Talbot Bielefeldt, Kauser Jahan, John Henry, and S. Keith Hargrove

Volume 3, Issue 1 (June 2012), pp. 57–65

The emergence of transformative technological advances in science and engineering practice has necessitated the integration of these advances in engineering classrooms. In this paper, we present the design and implementation of a virtual reality game system that infuses cyberinfrastructure (CI) learning experiences into the Project-Lead-The-Way (PLTW) pre-engineering classrooms to promote metacognition for science and engineering design in context. The CI features, metacognitive strategies, context-oriented approaches as well as their seamless integration in the game system are elaborated in detail through two game modules, Power Ville and Stability. Both games involve students in the process of decision-making that contributes to different aspects of city infrastructures (energy and transportation). The evaluation of Power Ville deployment in a PLTW classroom is also presented. The preliminary assessment confirms the usability of CI and metacognitive tools in science and engineering design.

Using Spreadsheets to Visualize Virus Concentration

Jyoti Champanerkar and Christina Dizzia

Volume 2, Issue 1 (December 2011), pp. 1–8

In this paper, we model the growth of a virus in an infected person, taking into account the effect of antibiotics and the person's immunity. We use discrete dynamical systems, or difference equations, to model the situation, and Excel to obtain the numerical solutions and to visualize them using its graphing capabilities.
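
As a rough illustration of what a difference-equation model of this kind looks like, the sketch below iterates a simple linear update rule, which is the same calculation a spreadsheet performs when a formula is filled down a column. The update rule, the parameter names, and the numbers are hypothetical and chosen only for illustration; the abstract does not state the authors' actual equations.

```python
def simulate_virus(initial_level, growth, immune_kill, drug_kill, steps):
    """Iterate V[t+1] = V[t] * (1 + growth - immune_kill - drug_kill).

    Hypothetical model: each rate is a per-step fraction of the current level.
    """
    levels = [initial_level]
    for _ in range(steps):
        next_level = levels[-1] * (1 + growth - immune_kill - drug_kill)
        levels.append(max(next_level, 0.0))  # a concentration cannot go negative
    return levels


# Invented parameters: untreated growth versus growth with treatment.
untreated = simulate_virus(100.0, growth=0.5, immune_kill=0.25, drug_kill=0.0, steps=5)
treated = simulate_virus(100.0, growth=0.5, immune_kill=0.25, drug_kill=0.5, steps=5)
for step, (u, t) in enumerate(zip(untreated, treated)):
    print(step, u, t)
```

In this toy rule the level grows when the growth rate exceeds the combined removal rates and decays otherwise, so plotting the two columns shows the qualitative effect of treatment.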

Preparing Teachers to Infuse Computational Science into their Classroom Instruction

Susan J. Ragan, Cheryl Begandy, Nancy R. Bunt, Charlotte M. Trout, and Scott A. Sinex

Volume 2, Issue 1 (December 2011), pp. 9–14

Establishing consistent use of computer models and simulations in K-12 classrooms has been a challenge for the computational science education community. Scaling successful local efforts has been particularly difficult. In this article we describe how a training model from one place and time can be translated into a training model for another very different place and time if critical factors such as school system culture, professional development organization, local learning standards and goals, and collaboration between STEM disciplines are taken into account.

Introducing Matrix Operations through Biological Applications

Angela B. Shiflet and George W. Shiflet

Volume 2, Issue 1 (December 2011), pp. 15–20

For the Blue Waters Undergraduate Petascale Education Program (NSF), we developed a computational science module, "Living Links: Applications of Matrix Operations to Population Studies," which introduces matrix operations using applications to population studies and provides accompanying programs in a variety of systems (C/MPI, MATLAB, Mathematica). The module provides a foundation for the use of matrix operations that are essential to modeling numerous computational science applications from population studies to social networks. This paper describes the module; details experiences using the material in two undergraduate courses (High Performance Computing and Linear Algebra) in 2010 and 2011 at Wofford College and two workshops for Ph.D. students at Monash University in Melbourne, Australia, in 2011; and describes refinements to the module based on suggestions in student and instructor evaluations.
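
One common way matrix operations appear in population studies is an age-structured projection, in which a matrix of fertility and survival rates is multiplied repeatedly by a vector of age-class counts. The sketch below shows that idea in plain Python. Whether the module uses this exact model is an assumption, and the rates, names, and class structure here are invented for illustration.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def project(matrix, population, years):
    """Apply the projection matrix once per year and return the final counts."""
    for _ in range(years):
        population = mat_vec(matrix, population)
    return population


# Hypothetical three-age-class model (values invented for this sketch).
projection = [
    [0.0, 2.0, 1.0],  # offspring produced per individual in each age class
    [0.5, 0.0, 0.0],  # fraction of class 0 surviving into class 1
    [0.0, 0.5, 0.0],  # fraction of class 1 surviving into class 2
]
print(project(projection, [100.0, 0.0, 0.0], 2))
```

Each yearly step is one matrix-vector product, which is the kind of operation that parallel implementations such as the C/MPI programs mentioned above can distribute across processes.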

Accelerating Geophysics Simulation using CUDA

Brandon Holt and Daniel Ernst

Volume 2, Issue 1 (December 2011), pp. 21–27

CitcomS, a finite element code that models convection in the Earth's mantle, is used by many computational geophysicists to study the Earth's interior. In order to allow faster experiments and greater simulation capability, there is a push to increase the performance of the code to allow more computations to complete in the same amount of time. To accomplish this we leverage the massively parallel capabilities of graphics processors (GPUs), specifically those using NVIDIA's CUDA framework. We translated existing functions to run in parallel on the GPU, starting with the functions where the most computing time is spent. Running on NVIDIA Tesla GPUs, initial results show an average speedup of 1.8 that stays constant with increasing problem sizes and scales with increasing numbers of MPI processes. As more of the CitcomS code is successfully translated to CUDA, and as newer general purpose GPU frameworks like Fermi are released, we should continue to see further speedups in the future.

Understanding the Structural and Functional Effects of Mutations in HIV-1 Protease Mutants Using 100ns Molecular Dynamics Simulations

Christopher D. Savoie and David L. Mobley

Volume 2, Issue 1 (December 2011), pp. 28–34

The Human Immunodeficiency Virus type 1 protease (HIV-1 PR) performs a vital role in the lifecycle of the virus, specifically in the maturation of new viral particles. Therefore, delaying the onset of AIDS, the primary goal of HIV treatment, can be achieved by inhibiting this protease.[1] However, the rapidly mutating virus quickly develops drug resistance to current inhibitors, so novel protease inhibitors are needed. Here, 100ns molecular dynamics (MD) simulations were done for the wild type and two mutant proteases to gain insight into the mechanisms by which the mutations confer drug resistance. Several different metrics were used to search for differences between the wild type and mutant proteases, including flap tip distance and root-mean-square deviation (RMSD), mutual information, and Kullback-Leibler divergence. Finally, we find that at the 100ns timescale there are no large differences in the structure, flexibility and motions of the wild type protease relative to the mutants, and longer simulations may be needed to identify how the structural changes imparted by the mutations affect the protease's functionality.

Introduction to the First Issue

Steven I. Gordon

Volume 1, Issue 1 (December 2010), pp. 1–1

It is with great pleasure that we release the first issue of the Journal of Computational Science Education. The journal is intended as an outlet for those teaching or learning computational science to share their best practices and experiences with the community. Included are examples of programs and exercises that have been used effectively in the classroom to teach computational science concepts and practices, assessments of the impact of computational science education on learning outcomes in science and engineering fields, and the experiences of students who have completed significant computational science projects. With a peer-reviewed journal, we hope to provide a compendium of the best practices in computational science education along with links to shareable educational materials and assessments.

Computational Algebraic Geometry as a Computational Science Elective

Adam E. Parker

Volume 1, Issue 1 (December 2010), pp. 2–7

This paper presents a new mathematics elective for an undergraduate Computational Science program. Algebraic Geometry is a theoretical area of mathematics with a long history, often highlighted by extreme abstraction and difficulty. This changed in the 1960s when Bruno Buchberger created an algorithm that allowed Algebraic Geometers to compute examples for many of their theoretical results and gave birth to a subfield called Computational Algebraic Geometry (CAG). Moreover, it introduced many rich applications to biology, chemistry, economics, robotics, recreational mathematics, etc. Computational Algebraic Geometry is usually taught at the graduate or advanced undergraduate level. However, with a bit of work, it can be an extremely valuable course to anyone with decent algebra skills. This manuscript describes Math 380: Computational Algebraic Geometry and shows the usefulness of the class as an elective to a Computational Science program. In addition, a module that gives students a high-level introduction to this valuable computational method was constructed for our Introductory Computational Science course.

Using WebMO to Investigate Fluorescence in the Ingredients of Energy Drinks

Mark Smith, Emily Chrisman, Patty Page, and Kendra Carroll

Volume 1, Issue 1 (December 2010), pp. 8–12

With computers gaining more powerful processors, computational modeling can be introduced gradually to secondary students allowing them to visualize complex topics and gather data in the different scientific fields. In this study, students from four rural high schools used computational tools to investigate attributes of the ingredients that might cause fluorescence in energy drinks. In the activity, students used the computational tools of WebMO to model several ingredients in energy drinks and gather data on them, such as molecular geometry and ultraviolet-visible absorption spectra (UV-Vis spectra). Using the data they collected, students analyzed and compared their ingredient molecules and then compared them to molecules that are known to fluoresce to determine any patterns. After students participated in this activity, data from testing suggest they were more aware of fluorescence, but not more aware of how to read a UV-Vis spectrum.

The Use of Spreadsheets and Service-Learning Projects in Mathematics Courses

Morteza Shafii-Mousavi and Paul Kochanowski

Volume 1, Issue 1 (December 2010), pp. 13–27

In the Indiana University system, as well as many other schools, finite mathematics is a prerequisite for most majors, especially business, public administration, social sciences, and some life science areas. Statisticians Moore, Peck, and Rossman (2002) articulate a set of goals for mathematics prerequisites, including instilling an appreciation of the power of technology and developing skills necessary to use appropriate technology to solve problems, developing understanding, and exploring concepts. The paper describes the use of Excel spreadsheets in the teaching and learning of finite mathematics concepts in the linked courses Mathematics in Action: Social and Industrial Problems and Introduction to Computing taught for business, liberal arts, science, nursing, education, and public administration students. The goal of the linked courses is to encourage an appreciation of mathematics and promote writing as students see an immediate use for it in completing actual real-world projects. The courses emphasize learning and writing about mathematics and the practice of computer technology applications through completion of actual industrial group projects. Through demonstration of mathematical concepts using Excel spreadsheets, we stress synergies between mathematics, technology, and real-world applications. These synergies emphasize learning goals such as quantitative skill development, analytical and critical thinking, information technology and technological issues, innovative and creative reasoning, and writing across the curriculum.

Computational Chemistry for Chemistry Educators

Shawn C. Sendlinger and Clyde R. Metz

Volume 1, Issue 1 (December 2010), pp. 28–32

In this paper we describe an ongoing project where the goal is to develop competence and confidence among chemistry faculty so they are able to utilize computational chemistry as an effective teaching tool. Advances in hardware and software have made research-grade tools readily available to the academic community. Training is required so that faculty can take full advantage of this technology, begin to transform the educational landscape, and attract more students to the study of science.

Testing the Waters with Undergraduates (If you lead students to HPC, they will drink)

Angela B. Shiflet and George W. Shiflet

Volume 1, Issue 1 (December 2010), pp. 33–37

For the Blue Waters Undergraduate Petascale Education Program (NSF), we developed two computational science modules, "Biofilms: United They Stand, Divided They Colonize" and "Getting the 'Edge' on the Next Flu Pandemic: We Should'a 'Node' Better." This paper describes the modules and details our experiences using them in three courses during the 2009-2010 academic year at Wofford College. These courses, from three programs, included students from several majors: biology, chemistry, computer science, mathematics, physics, and undecided. Each course was evaluated by the students and instructors, and many of their suggestions have already been incorporated into the modules.

Parallelization of Particle-Particle, Particle-Mesh Method within N-Body Simulation

Nicholas Nocito

Volume 1, Issue 1 (December 2010), pp. 38–43

The N-Body problem has become an integral part of the computational sciences, and it has given rise to many methods to solve and approximate the problem. The solution potentially requires on the order of N^2 calculations each time step, so efficient performance of these N-Body algorithms is very significant [5]. This work describes the parallelization and optimization of the Particle-Particle, Particle-Mesh (P3M) algorithm within GalaxSeeHPC, an open-source N-Body Simulation code. After profiling, MPI (Message Passing Interface) routines were implemented in the population of the density grid in the P3M method in GalaxSeeHPC. Each problem size recorded different results, and for a problem set dealing with 10,000 celestial bodies, speedups up to 10x were achieved. However, according to Amdahl's Law, maximum speedups for the code should have been closer to 16x. To achieve maximum optimization, additional research is needed, and parallelization of the Fourier Transform routines could prove to be rewarding. In conclusion, the GalaxSeeHPC Simulation was successfully parallelized and obtained very respectable results, while further optimization remains possible.
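
Amdahl's Law, cited in this abstract, bounds the speedup of a program by the fraction of its work that can run in parallel: with parallel fraction f on p processors, the speedup is 1 / ((1 - f) + f / p). The sketch below evaluates that formula. The function names and the example fractions are assumptions for illustration; the abstract does not report the parallel fraction measured for GalaxSeeHPC.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Predicted speedup when `parallel_fraction` of the work scales perfectly."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def amdahl_limit(parallel_fraction):
    """Upper bound on speedup as the processor count grows without limit."""
    if parallel_fraction >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - parallel_fraction)


# Hypothetical parallel fractions, chosen only to show the shape of the curve.
for fraction in (0.5, 0.9, 0.99):
    print(fraction, round(amdahl_speedup(fraction, 16), 2), round(amdahl_limit(fraction), 2))
```

The gap between a predicted and an observed speedup, as in the 16x versus 10x figures above, is typically attributed to overheads the formula ignores, such as communication and load imbalance.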

An Automated Approach to Multidimensional Benchmarking on Large-Scale Systems

Samuel Leeman-Munk and Aaron Weeden

Volume 1, Issue 1 (December 2010), pp. 44–50

High performance computing raises the bar for benchmarking. Existing benchmarking applications such as Linpack measure raw power of a computer in one dimension, but in the myriad architectures of high performance cluster computing an algorithm may show excellent performance on one cluster while performing poorly on another cluster with the same benchmark score. For a year a group of Earlham student researchers worked through the Undergraduate Petascale Education Program (UPEP) on an improved, multidimensional benchmarking technique that would more precisely capture the appropriateness of a cluster resource to a given algorithm. We planned to measure cluster effectiveness according to the thirteen dwarfs of computing as published in Berkeley's parallel computing research paper. To accomplish this we created PetaKit, a software stack for building and running programs on cluster computers.