A new curriculum has been designed by MIT researchers and collaborators to teach middle school students about artificial intelligence (AI). It aims to bring awareness of the technology to the sector of the population that is growing up surrounded by AI.
The open-source educational material was piloted at Massachusetts STEM week in the fall of 2019. It covers aspects of the technology such as how AI systems are designed, ways they can be used to influence the public, and their role within the future job market.
In October, during Mass STEM Week, many middle schools within the commonwealth swapped their usual curriculum for an immersive week of hands-on learning. The week was led by a team consisting of Cynthia Breazeal, associate professor of media arts and sciences at MIT; Randi Williams ’18, graduate research assistant in the Personal Robots Group at the MIT Media Lab; and i2 Learning, a nonprofit organization.
“Preparing students for the future means having them engage in technology through hands-on activities. We provide students with tools and conceptual frameworks where we want them to engage with our materials as conscientious designers of AI-enabled technologies,” Breazeal says. “As they think through designing a solution to address a problem in their community, we get them to think critically about the ethical implications of the technology.”
The idea to bring awareness of the technology to young students began three years ago with the Personal Robots Group. The group started a program meant to teach AI concepts to preschoolers, which then spread to other learning experiences and more children. Eventually, the group developed a curriculum for middle school students. An AI curriculum was piloted in Somerville, Massachusetts, last spring.
“We want to make a curriculum in which middle-schoolers can build and use AI — and, more importantly, we want them to take into account the societal impact of any technology,” says Williams.
The curriculum is called How to Train Your Robot, and it was first piloted during an i2 summer camp in Boston. It was then presented to teachers by students during Mass STEM Week, and some of the teachers took part in two days of professional development training. The training was aimed at preparing the teachers to give more than 20 class hours of AI content to students. The curriculum was used within three schools across six classrooms.
Blakeley Hoffman Payne, a graduate research assistant in the Personal Robots Group, was responsible for some of the work in the AI curriculum. Payne’s research focuses on the ethics of artificial intelligence and how to teach children to design, use, and think about AI. Students took part in discussions and creative activities, such as designing robot companions and deploying machine learning to solve problems. Students then shared their inventions with their communities.
“AI is an area that is becoming increasingly important in people’s lives,” says Ethan Berman, founder of i2 Learning and MIT parent. “This curriculum is very relevant to both students and teachers. Beyond just being a class on technology, it focuses on what it means to be a global citizen.”
One of the projects involved students building a “library robot” designed to locate and retrieve books for people with mobility challenges. Students had to take into account considerations such as how the technology would affect the job of a librarian.
The curriculum could be expanded to more classrooms and schools, and other disciplines could be added. Some other possible disciplines include social studies, math, science, art, and music, and the ways in which these can be implemented into the AI projects will be explored.
“We hope students walk away with a different understanding of AI and how it works in the world,” says Williams, “and that they feel empowered to play an important role in shaping the technology.”
Researchers Develop Tool Able to Turn Equations Into Illustrations
Researchers at Carnegie Mellon University have created a tool that is able to turn the abstractions of mathematics into illustrations and diagrams through software.
The process works by users typing ordinary mathematical expressions which are then turned into illustrations by the software. One of the major developments in this project is that the expressions are not required to be basic functions, as in the case of a graphing calculator. Instead, they can be complex relationships coming from various different fields within mathematics.
The researchers named the tool Penrose after the mathematician and physicist Roger Penrose, who is known for conveying complex mathematical and scientific ideas through diagrams and drawings.
Penrose will be presented by researchers at the SIGGRAPH 2020 Conference on Computer Graphics and Interactive Techniques. The conference will take place virtually this year due to the COVID-19 pandemic.
Keenan Crane is an assistant professor of computer science and robotics.
“Some mathematicians have a talent for drawing beautiful diagrams by hand, but they vanish as soon as the chalkboard is erased,” Crane said. “We want to make this expressive power available to anyone.”
Diagrams are not used as much as they could be in technical communication because producing them requires a large amount of highly skilled and tedious work. To get around this, the Penrose tool allows experts to encode the steps in the system, and other users are then able to access this by using mathematical language. All of this means that the computer is doing most of the work.
Katherine Ye is a Ph.D. student in the Computer Science Department.
“We started off by asking: ‘How do people translate mathematical ideas into pictures in their head?’” said Ye. “The secret sauce of our system is to empower people to easily ‘explain’ this translation process to the computer, so the computer can do all the hard work of actually making the picture.”
The computer first learns how the user wants the mathematical objects visualized, such as an arrow or a dot, and it then draws up multiple diagrams. The user selects and edits one of those diagrams.
According to Crane, mathematicians should have no problem learning the special programming language that the team developed.
“Mathematicians can get very picky about notation,” he said. “We let them define whatever notation they want, so they can express themselves naturally.”
Penrose is seen as a step towards something even bigger.
“Our vision is to be able to dust off an old math textbook from the library, drop it into the computer and get a beautifully illustrated book — that way more people understand,” Crane said.
The team that developed Penrose also included Nimo Ni and Jenna Wise, who are Ph.D. students in CMU’s Institute for Software Research (ISR); Jonathan Aldrich, professor in ISR; Joshua Sunshine, ISR senior research fellow; Max Krieger, cognitive science undergraduate; and Dor Ma’ayan, former master’s student at the Technion-Israel Institute of Technology.
The research was supported by the National Science Foundation, Defense Advanced Research Projects Agency, the Sloan Foundation, Microsoft Research, and the Packard Foundation.
How Riiid! is Helping to Bring in New Era of AI-Education
Riiid is a South Korea-based Series C startup with $31.3 million in funding. The company develops and provides AI-powered solutions for the education sector, with a specialized focus on standardized testing.
The team at Riiid also conducts research in order to develop the AI models, which are then put on the company’s commercialized platform called “Santa.”
Santa and ITSs
Santa is a multi-platform English Intelligent Tutoring System (ITS), and it contains an AI tutor that provides a one-on-one curriculum for users.
ITSs are receiving a lot of attention in both the AI and education sectors, mostly due to their ability to provide students with personalized learning experiences through the use of deep-learning algorithms. ITSs suggest certain studying strategies for individuals.
Santa is a test prep platform for the Test of English for International Communication (TOEIC), and it has over one million users in South Korea.
After establishing its first United States office in early 2020, Riiid is looking to expand further and go beyond just the TOEIC, with a plan to target other test areas such as the ACT, SAT, and GMAT.
Recent Studies and Research
The company has two recent research papers, with one of the key findings being that deep learning algorithms can help improve student engagement.
One of the papers is titled “Prescribing Deep Attentive Score Prediction Attracts Improved Student Engagement,” which was accepted into the top-tier AI education conference Educational Data Mining (EDM).
The team ran a controlled A/B test on the ITS with two separate models based on collaborative filtering and deep-learning algorithms.
After testing on 78,000 users, the team determined that the deep learning algorithms resulted in higher student morale, reflected in a higher diagnostic test completion ratio and a larger number of questions answered. They also resulted in more active engagement on Santa, shown through a higher purchase rate and improved total profit.
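As a rough illustration of how a difference in completion ratio between two A/B groups might be checked, the sketch below runs a standard two-proportion z-test. The article does not say which statistical method the Riiid team used, so the function name `two_proportion_z` and all the counts are hypothetical and chosen only for demonstration.

```python
import math


def two_proportion_z(success_a: int, n_a: int, success_b: int, n_b: int):
    """Two-proportion z-test: returns (z, two-sided p-value).

    Group A and group B are the two arms of an A/B test; "success" is any
    yes/no outcome, e.g. a user finishing the diagnostic test.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that both arms behave the same.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts (NOT figures from the paper): completions out of users per arm.
z, p = two_proportion_z(success_a=4000, n_a=10000, success_b=4300, n_b=10000)
print(f"z = {z:.2f}, p = {p:.4g}")
```

A positive z here means arm B had the higher completion ratio, and a small p-value suggests the gap is unlikely to be chance alone.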
The second paper was titled “Deep Attentive Study Session Dropout Prediction in Mobile Learning Environment.” It was accepted by the global AI education conference CSEDU.
The paper focused on student engagement, and the team sought insight into student dropout prediction, specifically in regard to study session dropout prediction in a mobile learning environment. By observing this problem, the team believed there was a chance to increase student engagement.
The research suggested a method for maximizing learning effects by observing the dropout probability of individual users within the mobile learning environment. Their model is called DAS, or Deep Attentive Study Session Dropout Prediction in Mobile Learning Environment.
Through the use of deep attentive computations that extract information out of student interactions, Riiid’s model can accurately predict dropout probability.
The model was incorporated into the Santa platform, which then served questions predicted to have a low dropout probability. By recommending these questions, the platform made students more likely to stay engaged and continue studying rather than dropping out of the session.
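The recommendation step can be pictured as choosing, from a pool of candidate questions, the one a model expects is least likely to end the session. The sketch below is only a toy version of that idea: `pick_next_question` is a hypothetical helper, and the lookup table of probabilities stands in for a trained predictor such as DAS, whose real interface is not described in the article.

```python
from typing import Callable, Sequence


def pick_next_question(
    candidates: Sequence[str],
    predict_dropout: Callable[[str], float],
) -> str:
    """Return the candidate question with the lowest predicted dropout probability."""
    if not candidates:
        raise ValueError("no candidate questions to choose from")
    return min(candidates, key=predict_dropout)


# Toy stand-in for a trained model: fixed, made-up dropout probabilities.
toy_probs = {"q1": 0.42, "q2": 0.08, "q3": 0.27}

chosen = pick_next_question(list(toy_probs), lambda q: toy_probs[q])
print(chosen)  # q2, the question with the lowest made-up probability
```

A production system would presumably balance this against learning value, since the easiest question to stay on is not always the most useful one to study.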
According to the research team, “To the best of our knowledge, this is the first attempt to investigate study session dropout in a mobile learning environment.”
Riiid is one of the world’s leading startups for developing ITSs and providing AI-solutions in the education sector. As education and AI technology become more interconnected, companies like Riiid will usher in a new era of learning methods and systems, while trying to overcome the current challenges surrounding student engagement.
The Future of Speech Scoring – Thought Leaders
Across the world, the number of English language learners continues to rise. Educational institutions and employers need to be able to assess the English proficiency of language learners – in particular, their speaking ability, since spoken language remains among the most essential language abilities. The challenge, for both assessment developers and end users, is finding a way to do so that is accurate, fast and financially viable. As part of this challenge, scoring these assessments comes with its own set of factors, especially when we consider the different areas (speech, writing, etc.) one is being tested on. With the demand for English-language skills across the globe only expected to increase, what would the future of speech scoring need to look like in order to meet these needs?
The answer to that question, in part, is found in the evolution of speech scoring to date. Rating constructed spoken responses has historically been done using human raters. This process, however, tends to be expensive and slow, and has additional challenges including scalability and various shortcomings of human raters themselves (e.g., rater subjectivity or bias). As discussed in our book Automated Speaking Assessment: Using Language Technologies to Score Spontaneous Speech, in order to address these challenges, an increasing number of assessments now make use of automated speech scoring technology as the sole source of scoring or in combination with human raters. Before deploying automated scoring engines, however, their performance needs to be thoroughly evaluated, particularly in relation to the score reliability, validity (does the system measure what it is supposed to?) and fairness (i.e., the system should not introduce bias related to population subgroups such as gender or native language).
Since 2006, ETS’s own speech scoring engine, SpeechRater®, has been operationalized in the TOEFL® Practice Online (TPO) assessment (used by prospective test takers to prepare for the TOEFL iBT® assessment), and since 2019, SpeechRater has also been used, along with human raters, for scoring the speaking section of the TOEFL iBT® assessment. The engine evaluates a wide range of speaking proficiency for spontaneous non-native speech, including pronunciation and fluency, vocabulary range and grammar, and higher-level speaking abilities related to coherence and progression of ideas. These features are computed by using natural language processing (NLP) and speech processing algorithms. A statistical model is then applied to these features in order to assign a final score to a test taker’s response.
While this model is trained on previously observed data scored by human raters, it is also reviewed by content experts to maximize its validity. If a response is found to be non-scorable due to audio quality or other issues, the engine can flag it for further review to avoid generating a potentially unreliable or invalid score. Human raters are always involved in the scoring of spoken responses in the high-stakes TOEFL iBT speaking assessment.
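The flow described above (compute features, apply a statistical model, and flag responses that cannot be scored) can be sketched in a few lines. This is not SpeechRater's actual model, which the text describes only as "a statistical model": the linear form, the feature names, the weights, and the 0 to 4 score range are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical feature weights for a simple linear scoring model.
WEIGHTS: Dict[str, float] = {
    "fluency": 0.5,
    "pronunciation": 0.25,
    "vocabulary": 0.25,
}
MIN_SCORE, MAX_SCORE = 0.0, 4.0  # assumed score range


@dataclass
class Response:
    features: Dict[str, float]  # feature name -> value on the assumed 0-4 scale
    audio_ok: bool              # False if audio quality checks failed


def score_response(response: Response) -> Optional[float]:
    """Return a score, or None to flag the response for further review."""
    missing = [name for name in WEIGHTS if name not in response.features]
    if not response.audio_ok or missing:
        return None  # non-scorable: do not emit a potentially invalid score
    raw = sum(weight * response.features[name] for name, weight in WEIGHTS.items())
    # Clamp to the assumed reporting range.
    return max(MIN_SCORE, min(MAX_SCORE, raw))


good = Response({"fluency": 3.0, "pronunciation": 2.0, "vocabulary": 4.0}, audio_ok=True)
noisy = Response({"fluency": 3.0, "pronunciation": 2.0, "vocabulary": 4.0}, audio_ok=False)
print(score_response(good))   # 3.0
print(score_response(noisy))  # None, i.e. flagged for review
```

Returning `None` instead of a number keeps the "flag rather than guess" behavior explicit, so a caller has to decide what happens to a flagged response.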
As human raters and SpeechRater are currently used together to score test takers’ responses in high-stakes speaking assessments, both play a part in what the future of scoring English language proficiency can be. Human raters have the ability to understand the content and discourse organization of a spoken response in a deep way. In contrast, automated speech scoring engines can more precisely measure certain detailed aspects of speech, such as fluency or pronunciation, exhibit perfect consistency over time, can reduce overall scoring time and cost, and are more easily scaled to support large testing volumes. When human raters and automated speech scoring systems are combined, the resulting system can benefit from the strengths of each scoring approach.
In order to continuously evolve automated speech scoring engines, research and development needs to focus on the following aspects, among others:
- Building automatic speech recognition systems with higher accuracy: Since most features of a speech scoring system rely directly or indirectly on this component of the system that converts the test taker’s speech to a text transcription, highly accurate automatic speech recognition is essential for obtaining valid features;
- Exploration of new ways to combine human and automated scores: In order to take full advantage of the respective strengths of human rater scores and automated engine scores, more ways of combining this evidence need to be explored;
- Accounting for abnormalities in responses, both technical and behavioral: High-performing filters capable of flagging such responses and excluding them from automated scoring are necessary to help ensure the validity and reliability of the resulting assessment scores;
- Assessment of spontaneous or conversational speech that occurs most often in day-to-day life: While automated scoring of such interactive speech is an important goal, these items present numerous scoring challenges, including overall evaluation and scoring;
- Exploring deep learning technologies for automated speech scoring: This relatively recent paradigm within machine learning has produced substantial performance increases on many artificial intelligence (AI) tasks in recent years (e.g., automatic speech recognition, image recognition), and therefore it is likely that automated scoring also may benefit from using this technology. However, since most of these systems can be considered “black-box” approaches, attention to the interpretability of the resulting score will be important to maintain some level of transparency.
To accommodate a growing and changing English-language learner population, next-generation speech scoring systems must expand automation and the range of what they are able to measure, enabling consistency and scalability. That is not to say the human element will be removed, especially for high-stakes assessments. Human raters will likely remain essential for capturing certain aspects of speech that will remain hard to evaluate accurately by automated scoring systems for a while to come, including the detailed aspects of spoken content and discourse. Using automated speech scoring systems in isolation for consequential assessments also runs the risk of failing to identify problematic responses by test takers, such as responses that are off-topic or plagiarized, which in turn can reduce validity and reliability. Using both human raters and automated scoring systems in combination may be the best way to score speech in high-stakes assessments for the foreseeable future, particularly if spontaneous or conversational speech is evaluated.
ETS works with education institutions, businesses and governments to conduct research and develop assessment programs that provide meaningful information they can count on to evaluate people and programs. ETS develops, administers and scores more than 50 million tests annually in more than 180 countries at more than 9,000 locations worldwide. We design our assessments with industry-leading insight, rigorous research and an uncompromising commitment to quality so that we can help education and workplace communities make informed decisions. To learn more visit ETS.