Machine-learning-augmented block play

Michael Rumbelow is a PhD student in the School of Education, researching early concepts of numbers in children through the medium of block play. With seedcorn funding from the Bristol Digital Futures Institute, Michael, his PhD supervisor Alf Coles, and computer vision consultant PySource collaborated to create a prototype application that augments and analyses block play in real time. The Python application uses a machine learning model to recognise blocks in a webcam's video feed, and responds to block arrangements in real time with interactive educational elements. In one mode of operation, for example, blocks represent mathematical objects, and arranging them triggers audio output such as "two plus three equals five".
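As an illustration of the arithmetic mode described above, the following minimal sketch turns a list of block values into the sentence that would be spoken. It is not taken from the prototype: the function name addition_sentence, the NUMBER_WORDS table, and the assumption that each recognised block maps to a whole number between 0 and 20 are all hypothetical.

```python
# Hypothetical sketch: names and the 0-20 range are assumptions, not the prototype's code.
NUMBER_WORDS = [
    "zero", "one", "two", "three", "four", "five", "six", "seven",
    "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
    "fifteen", "sixteen", "seventeen", "eighteen", "nineteen", "twenty",
]


def addition_sentence(block_values):
    """Build a sentence such as 'two plus three equals five' from block values."""
    if not block_values:
        raise ValueError("at least one block value is required")
    for value in block_values:
        # Only whole numbers that have an entry in NUMBER_WORDS can be named.
        if not isinstance(value, int) or not 0 <= value < len(NUMBER_WORDS):
            raise ValueError(f"unsupported block value: {value!r}")
    total = sum(block_values)
    if total >= len(NUMBER_WORDS):
        raise ValueError(f"total {total} is outside the supported range")
    words = [NUMBER_WORDS[value] for value in block_values]
    return " plus ".join(words) + " equals " + NUMBER_WORDS[total]
```

In a real application the resulting string would be passed to a text-to-speech component; that step is omitted here because the source does not say how the audio is generated.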

The researchers contacted the RSE team's ask-rse mailbox to request assistance with (i) training the machine learning model used in the prototype application and (ii) installing the application on a new machine. A member of the RSE team, James Womack, assisted the researchers through a series of video consultations and independent investigative work.

The technical assistance provided by the RSE team enabled the researchers to make more effective use of the prototype application, giving them the capability to train the machine learning model on new data and to run the application on multiple machines.

The advice from ask-rse has been incredibly valuable for our project to develop an innovative app to use AI to detect children’s block play for our research. It has significantly improved the quality and value for money we have achieved on the project, for example in deciding which direction to go in the spec, and exploring the potential for AI and training methods, and there were several times when we would have been stuck without ask-rse’s practical help, for example in re-installing the app on other devices. To have ask-rse on hand has greatly increased our confidence in applying novel software and hardware to our research.

- Michael Rumbelow

Image provided by Michael Rumbelow to illustrate the ML-augmented block play case study.

Motivation

When the researchers first contacted ask-rse, they had a copy of the prototype block play application running on a single laptop using a pre-trained Mask R-CNN model for block recognition. To improve the accuracy and robustness of block recognition, the model would require re-training with a more extensive set of training images. The computer vision consultant had provided outline instructions for training the model, for use with the application, in a Jupyter notebook on Google's Colab cloud platform. As the application was still in the early stages of development, the installation process had not been documented and the supporting software packages it required had not been fully specified.

The researchers were new to software development and machine learning and requested the assistance of the RSE team in (i) working through the model training procedure, and (ii) installing the prototype application on other computers.

Solution

Training the machine learning model

The RSE investigated and tested the training process outlined by PySource, then led Michael through the process.

The outline instructions suggested using Google's Colab to run a Jupyter notebook containing the Mask R-CNN training code. While investigating the process, the RSE found that Colab was not a stable or robust way to train the model, because Colab's free offering has dynamic usage limits, which vary depending on availability. A training run on Colab using a pre-annotated dataset could not be completed, because the Colab instance reached its maximum duration before training finished.

The RSE recommended converting the Jupyter notebook into a non-interactive script that could be run on the ACRC's high performance computing (HPC) facilities. The HPC facilities would provide a more reliable compute resource with predictable limits suitable for training the computer vision model. Running the training as a non-interactive script would have the additional advantage of allowing training to be easily re-run in the future, for example where additional training data became available.
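The general shape of such a non-interactive script can be sketched as follows. This is an illustration only, not the project's training code: the argument names, their defaults, and the train placeholder are hypothetical, and the actual Mask R-CNN training steps from the notebook would replace the body of train.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Read settings from the command line instead of editing notebook cells."""
    parser = argparse.ArgumentParser(
        description="Train a block-recognition model (illustrative sketch)."
    )
    # All option names and defaults below are assumptions for illustration.
    parser.add_argument("--dataset", type=Path, default=Path("dataset"),
                        help="directory of annotated training images")
    parser.add_argument("--output", type=Path, default=Path("weights"),
                        help="directory in which to save trained weights")
    parser.add_argument("--epochs", type=int, default=10,
                        help="number of training epochs")
    return parser.parse_args(argv)


def train(dataset, output, epochs):
    """Hypothetical placeholder: the notebook's training cells would move here."""
    for epoch in range(1, epochs + 1):
        # Flushing progress messages keeps a batch job's log file up to date.
        print(f"epoch {epoch}/{epochs}: dataset={dataset} output={output}", flush=True)
    return epochs


def main(argv=None):
    args = parse_args(argv)
    return train(args.dataset, args.output, args.epochs)


if __name__ == "__main__":
    main()
```

Because every setting is a command-line argument, the same script can be submitted unchanged to a batch scheduler and re-run later with a different dataset directory.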

Once he had been made aware of the university's HPC resources, Michael was enthusiastic about using them in his research. The RSE offered to support and advise Michael in making effective use of the HPC facilities, and started by signposting the ACRC's HPC training courses and the HPC account application form.

Installation of the prototype application

The RSE examined the prototype application to determine how to reliably install it, then guided the researchers through the process on Michael's supervisor's laptop.

Michael had a working copy of the application running on his Windows laptop, but the application's dependencies and the procedure to install it had not been fully documented. The prototype had been installed on Michael's laptop in an ad hoc way: dependencies were installed as they were needed, and minor machine-specific modifications were made to the source code.

The RSE investigated the process of installing the prototype application in a Windows development virtual machine. He successfully installed a working copy of the prototype in the virtual machine, identified the software's dependencies, and fully documented the process in a GitHub repository. Since the researchers were primarily concerned with running the application on Windows, the documentation was written specifically for Windows, making use of Windows' native command line shell, PowerShell.

During his investigation, the RSE codified the application's dependencies in Conda environment and pip requirements files, which were included in the GitHub repository. The application required the older TensorFlow 1 library, and the most recent version 1 release of the tensorflow package on PyPI supported only older versions of Python (up to 3.7, released in 2018). The documented installation process ensures the availability of the correct Python 3.7 interpreter, while keeping the older Python 3.7 isolated from other versions of Python on the same computer. This is achieved by installing Python 3.7 into a Conda environment, then installing the required Python packages within that environment using pip.
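The version constraint described above can also be checked from within Python. The sketch below is not part of the documented installation process; it is a hypothetical guard (the names MAX_SUPPORTED, python_is_supported, and version_message are assumptions) that an application could call at start-up to confirm it is running inside the intended Python 3.7 environment before importing TensorFlow 1.

```python
import sys

# Assumption taken from the text: TensorFlow 1 supports Python up to 3.7.
MAX_SUPPORTED = (3, 7)


def python_is_supported(version_info=sys.version_info, max_supported=MAX_SUPPORTED):
    """Return True for a Python 3 interpreter no newer than max_supported."""
    major_minor = (version_info[0], version_info[1])
    return (3, 0) <= major_minor <= max_supported


def version_message(version_info=sys.version_info):
    """Describe whether the running interpreter suits the TensorFlow 1 environment."""
    found = f"{version_info[0]}.{version_info[1]}"
    if python_is_supported(version_info):
        return f"Python {found} is supported."
    wanted = f"{MAX_SUPPORTED[0]}.{MAX_SUPPORTED[1]}"
    return f"Python {found} is not supported; activate the Python {wanted} environment."
```

Calling version_message() before the TensorFlow import gives a clear message when the application is started outside the Conda environment, rather than a less obvious import failure.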

After testing and documenting the installation process, the RSE led the researchers through the process in a video call. With the guidance provided during the video call, the researchers were able to successfully install and run the application on Alf's laptop. This had the immediate practical value of providing the team with an additional laptop with a working copy of the application to use in real-world testing (e.g. in schools).

Conclusion

In this ask-rse interaction, the RSE team assisted researchers in the early development of an innovative machine-learning-augmented educational application. A combination of investigative work and tailored support was provided, enabling the researchers to install their prototype application on multiple computers and understand the process of training the underlying computer vision model.

Since the researchers' project was at a prototype stage when they approached ask-rse, the emphasis of the support offered was on enabling the researchers to effectively plan and navigate the technical aspects of the next phase of their project, in particular: improving the robustness of the software and making effective use of the compute resources available at the university (e.g. HPC).

In feedback to the RSE team, the researchers indicated they greatly valued the assistance and support offered through ask-rse. They indicated that their interaction with ask-rse had positive impacts beyond the practical support provided, crediting ask-rse with:

Making the research team aware of the HPC resources available at the university and the possibility of using these to support their work
Providing a "second opinion" on technical issues and development directions for the project, to complement the views of other technical collaborators
Helping the researchers develop confidence in talking about the technology used in their work, through discussion and informal technical mentoring

Following this interaction, the RSE team supported and advised the researchers in their successful application to the 2021-22 round of Jean Golding Institute (JGI) seedcorn funding to improve the robustness and expand the functionality of the block play recognition application. As part of this project, the RSE team are supporting the researchers to adapt their machine learning model training code to make use of the university's HPC resources.

Appendix

The process of installing the prototype application on Windows was fully documented as a Markdown document in a GitHub repository. The repository was originally privately shared with the researchers for internal use. With the agreement of the researchers, the repository has been made publicly available (with CC BY-SA 4.0 and MIT licenses) to accompany this case study: BristolRSE/block_play_ml on GitHub.