How Data Works to Support DIY Learning


Jim Goodell

Noah is a 17-year-old multi-lingual student. He can speak six languages even though his family speaks only English and his public high school offers classes in only two non-English languages. Noah didn’t have formal opportunities to pursue his linguistic interests, so he took matters into his own hands, discovering online tools and social networks for self-directed learning.

Noah’s story provides some good examples of the kinds of data and technology enabling do-it-yourself learning.

Generation Do-It-Yourself (GenDIY) has unprecedented opportunities to chart their own course for lifelong learning as part of a career pathway, to reach a personal academic goal, or just to satisfy a curiosity.

The data used to match learning experiences with personal needs, preferences, and ability levels, and data within online learning applications to provide continuous feedback, are empowering learners like Noah to move beyond the constraints of traditional education.

Do-it-yourself learning is taking place on two levels:

  1. Formal systems of education are adopting student-centered options, giving students voice and choice, along with visibility into how short-term choices support longer-term career goals, and
  2. Learners of all ages are acting on their own, discovering and using technology-enabled tools to reach their own learning goals.


Prior to high school Noah took an online course in Latin. He worked through a book and viewed videos at his own pace. At the time Noah was home schooled, but schools across the U.S. and around the world are also leveraging a rich set of online options to offer courses that they cannot staff. Course choice opens doors for students, especially in communities that cannot attract teachers with specialized subject matter expertise, or cannot fill a class with enough students to justify the course.

After Noah discovered his interest in language learning, a friend told him about a free language-learning tool that the friend had happened to read about in a technology blog. That tool was Duolingo, the award-winning free website and app.

Data for Discovery

Noah was fortunate to have a friend point him toward Duolingo, but data is also helping the GenDIY self-discover the right DIY learning tools and opportunities.

Linked data on the Web supports discovery of learning resources (courses, apps, learning experiences, and social learning opportunities). The major search engines use metadata (data about data) to filter search results to meet learner needs and preferences. Publishers of learning resources tag web pages with metadata attributes, such as the specific competencies addressed and the intended audience, in a format that the search engines can read. Metadata may also include tags about the accessibility of the resource, such as whether a video is closed captioned for the hearing impaired. This helps the self-directed learner find resources that fit personal needs and preferences. Schema.org is a standard for tagging web content developed through collaboration of the major search engines, including Google, Yahoo, Bing, and Yandex.
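As a rough illustration of this kind of tagging, the sketch below describes two learning resources as JSON-LD-style records and filters them by a learner's needs. The property names (educationalAlignment, audience, accessibilityFeature) come from the Schema.org vocabulary; the resources, URLs, competency labels, and the matches helper are invented for this example and do not describe any real catalog or search engine.

```python
import json

# Hypothetical catalog entries. Property names follow Schema.org vocabulary;
# the resources, URLs, and competency labels are made up for illustration.
catalog = [
    {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": "Intro to Portuguese Greetings",
        "url": "https://example.org/pt-greetings",
        "accessibilityFeature": ["captions"],
        "audience": {"@type": "EducationalAudience", "educationalRole": "student"},
        "educationalAlignment": {
            "@type": "AlignmentObject",
            "alignmentType": "teaches",
            "targetName": "Greet someone in Portuguese",
        },
    },
    {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": "Portuguese Greetings, Uncaptioned",
        "url": "https://example.org/pt-greetings-2",
        "accessibilityFeature": [],
        "audience": {"@type": "EducationalAudience", "educationalRole": "student"},
        "educationalAlignment": {
            "@type": "AlignmentObject",
            "alignmentType": "teaches",
            "targetName": "Greet someone in Portuguese",
        },
    },
]


def matches(resource, competency, required_features=()):
    """Return True if the resource teaches the competency and offers
    every accessibility feature the learner requires."""
    alignment = resource.get("educationalAlignment", {})
    teaches = competency.lower() in alignment.get("targetName", "").lower()
    features = set(resource.get("accessibilityFeature", []))
    return teaches and set(required_features) <= features


# A learner who needs closed captions searches for one competency.
captioned = [
    r["name"]
    for r in catalog
    if matches(r, "greet someone in portuguese", ["captions"])
]

print(json.dumps(catalog[0], indent=2))  # what a publisher might embed in a page
print(captioned)
```

A real search engine applies far more signals than this, but the filter shows why precise tags for competency and accessibility make resources easier to find.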

“Paradata” gives DIY learners indicators of a learning resource’s usefulness. For example, many Facebook “likes” for a language learners group increase the visibility of the group and become a paradata assertion about its usefulness. Likewise, social media posts with links to a page describing a learning resource say something about its popularity, and a formal endorsement of the resource by an organization (such as a state education agency) may be captured in a public repository, such as the Learning Registry.
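One simple way to picture paradata is as a stream of small assertions about a resource that can be rolled up into a usefulness indicator. The sketch below does that with a weighted count. The resource names, signal types, and weights are all assumptions chosen for illustration; they are not taken from the Learning Registry or any social network.

```python
from collections import Counter

# Hypothetical paradata records: (resource_id, signal).
events = [
    ("language-learners-group", "like"),
    ("language-learners-group", "like"),
    ("language-learners-group", "share"),
    ("latin-course", "like"),
    ("latin-course", "endorsement"),
]

# Assumed weights: a formal endorsement counts for more than a single like.
WEIGHTS = {"like": 1, "share": 2, "endorsement": 5}


def usefulness_scores(events, weights=WEIGHTS):
    """Sum the weighted signals for each resource; unknown signals count as 0."""
    scores = Counter()
    for resource_id, signal in events:
        scores[resource_id] += weights.get(signal, 0)
    return scores


scores = usefulness_scores(events)
ranked = [resource_id for resource_id, _ in scores.most_common()]
print(ranked)  # ['latin-course', 'language-learners-group']
```

How to weight popularity against formal endorsement is a judgment call, which is one reason paradata is better treated as an indicator than as a verdict.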

Gamification and Intelligent Tutoring Data

With the help of Duolingo, Noah learned Spanish, Portuguese, French, and Irish well enough to engage in conversations, and a bit of 11 other languages. Apart from Duolingo, he is also learning Haitian Creole using other web resources and with a friend at school who speaks the language.

Factors that make Duolingo an effective tool include its bite-sized assessment-as-learning lessons and continuous game-like feedback. This is competency-based tutoring at its best. Learners advance only after demonstrating mastery of granularly defined competencies, such as translating a specific word or phrase. Feedback is instantaneous and focused on correcting specific weaknesses. I see a lot of similarities between the principles of gamification and those of the learning sciences; both draw from an expanding knowledge of how the human brain develops and adapts to new challenges. Game mechanics address learner motivation by providing the right level of challenge at the right time (zone of proximal development), building new knowledge and skills on existing knowledge and skills (constructivism), and supporting goal setting and visibility into thinking and progress (learner agency).

To deliver this kind of experience for the learner requires a rich set of data behind each assessment item (the granular competency being assessed, what a correct or incorrect answer means and what remedial feedback to give, etc.), detailed data collected every time the learner attempts to answer to guide feedback and progress, and data about the competencies and competency-based pathway.
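The data behind such an experience can be sketched as two small structures: an assessment item that carries its competency and remedial feedback, and a learner record that logs every attempt. This is a minimal illustration, not Duolingo's actual data model. The field names, the competency label, and the mastery rule (three correct answers in a row) are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class AssessmentItem:
    competency: str         # the granular competency being assessed
    prompt: str
    answer: str
    remedial_feedback: str  # shown when the response is incorrect


@dataclass
class LearnerRecord:
    # Maps competency -> list of True/False results, one per attempt.
    attempts: dict = field(default_factory=dict)

    def record(self, item, response):
        """Log the attempt and return immediate feedback."""
        correct = response.strip().lower() == item.answer.lower()
        self.attempts.setdefault(item.competency, []).append(correct)
        return "Correct!" if correct else item.remedial_feedback

    def mastered(self, competency, streak=3):
        """Assumed rule: mastery means the last `streak` attempts were all correct."""
        history = self.attempts.get(competency, [])
        return len(history) >= streak and all(history[-streak:])


item = AssessmentItem(
    competency="pt:translate:bread",  # hypothetical competency label
    prompt="Translate 'bread' into Portuguese",
    answer="pão",
    remedial_feedback="Not quite: 'bread' is 'pão'.",
)

learner = LearnerRecord()
first_feedback = learner.record(item, "pao")  # wrong: missing the tilde
learner.record(item, "pão")
learner.record(item, "pão")
mastered_early = learner.mastered(item.competency)  # only two correct in a row
learner.record(item, "pão")
mastered_later = learner.mastered(item.competency)  # three correct in a row
```

Because every attempt is stored, the same record can drive instant feedback, the decision to advance, and later progress reporting.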

“Big Data” and a Warning about Learning Styles Data

“The theory of learning styles has been intensely reviewed, tested and debunked,” but well-meaning organizations still offer learning style assessments and attempt to use the data to personalize learning.

Yes, big data sets can be used by recommendation engines to help filter all possible learning activities down to a few that are a good fit, just as Google targets advertising and Amazon suggests products “you also might like.” However, the notion that a person is a fixed type of learner who can be classified using a one-time assessment is oversimplified. Preferences change over time, the “best” instructional and study methods vary based on context, and students may need to try multiple modes of instruction (seeing a concept in different ways) before mastering some learning objectives. It may be helpful for learners to think about what kind of learning mode they generally prefer, but multiple options for each lesson allow learners to choose how they want to learn right now. Even Google search results give a list of options and let the user pick. I don’t know anyone who regularly uses the “I’m feeling lucky” option.

The mode of presentation (visual, auditory, kinesthetic, etc.) is just one of many variables that factor into selecting a learning activity. Being precise about the granular competency that the learning activity addresses, and about the quality of the resource, is more important than the mode of presentation.
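The priorities described here can be sketched as a tiny recommender that returns a short list instead of a single answer. In this illustration the competency is a hard filter, resource quality drives the ranking, and the learner's current mode preference only nudges the order. The activity data, the quality scores, and the size of the mode bonus are all assumptions.

```python
# Hypothetical activities; the "quality" values are invented scores from 0 to 1.
activities = [
    {"title": "Verb drill video", "competency": "fr:past-tense", "mode": "visual", "quality": 0.6},
    {"title": "Podcast dialogue", "competency": "fr:past-tense", "mode": "auditory", "quality": 0.9},
    {"title": "Flashcard game", "competency": "fr:past-tense", "mode": "visual", "quality": 0.85},
    {"title": "Numbers song", "competency": "fr:numbers", "mode": "auditory", "quality": 0.95},
]


def shortlist(activities, competency, preferred_mode=None, k=3):
    """Return up to k options for the learner to choose from.

    Competency match is required, quality dominates the ranking, and the
    preferred mode adds only a small bonus (an assumed value of 0.1).
    """
    candidates = [a for a in activities if a["competency"] == competency]

    def score(activity):
        mode_bonus = 0.1 if activity["mode"] == preferred_mode else 0.0
        return activity["quality"] + mode_bonus

    return sorted(candidates, key=score, reverse=True)[:k]


options = shortlist(activities, "fr:past-tense", preferred_mode="visual", k=2)
titles = [a["title"] for a in options]
print(titles)  # ['Flashcard game', 'Podcast dialogue']
```

The high-quality podcast still makes the list even though the learner said they prefer visual material today, and the final choice stays with the learner.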

Analytics engines, informed by big data, can do more than predict how well a learning activity will work for a student.  They can help create conditions for motivation and engagement to help the learner reach personal goals.

Social Learning

Noah learns with friends on social media including Google hangouts and Facebook language learners groups. He also seeks out native speakers of the languages he is learning. When visiting the city where a relative lives, he made it a point to walk into a Portuguese bakery and start a conversation with the people working there.

Through school choice, he is attending a high school outside of his home district and enrolled in a French class just to get the credit required for graduation, but he doesn’t think he’s learning anything there that he hasn’t learned or couldn’t learn on his own initiative. His friends on social media are also more at his level for conversations in French. So next semester his high school teacher will create a special “French 5” independent study option in which Noah will help teach French to freshmen.

Peer assessment can be an effective part of DIY learning. For some subjects, data may be collected with online rubric-based peer assessment tools. Assessment-for-learning data informs feedback.

Data for Feedback

There are three levels of feedback to support student-centered learning:

  1. Immediate feedback given during the learning activity after each click/response,
  2. Feedback at the end of a lesson that answers the question “What next?”
  3. Dashboards and progress maps that answer the question “How am I doing in reaching short and long-term goals?”

The third kind of feedback allows learners to carry out personal learning plans, acting as a kind of GPS that guides them to longer-term goals.

Data for Planning and Decision-Making

DIY learners are motivated by a purpose. Noah’s fascination with linguistics motivated him to take ownership of his own learning. That interest is leading to decisions about college and career. Often the purpose for learning is to gain abilities needed to support a cause, calling, or career goal. Noah sees himself pursuing a career as a translator, but realizes that his interests and goals may change in the future.

Emerging sources of data will help DIY learners map backwards to identify credentials needed to support cause or career, and the competencies required to attain each credential. There is a trend in higher education and workforce training to offer stackable credentials such as a certificate that counts toward a degree. Projects such as the Credential Registry plan to provide data to help DIY learners make informed decisions about long-term learning goals and alternative pathways to reaching those goals.
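Mapping backward from a goal can be pictured as walking a graph of requirements: a career points to credentials, and each credential points to competencies. The sketch below is one possible illustration. The career, credential, and competency names are invented, and the data does not come from the Credential Registry or any real program.

```python
# Hypothetical requirement graph: each key requires everything in its list.
REQUIRES = {
    "career:translator": [
        "credential:ba-linguistics",
        "credential:translation-certificate",
    ],
    "credential:ba-linguistics": [
        "competency:french-c1",
        "competency:linguistic-analysis",
    ],
    "credential:translation-certificate": [
        "competency:french-c1",
        "competency:portuguese-b2",
    ],
}


def map_backward(goal, requires=REQUIRES):
    """Depth-first walk from the goal; returns each prerequisite once,
    in the order it is first encountered."""
    seen, order = set(), []

    def visit(node):
        for prereq in requires.get(node, []):
            if prereq not in seen:
                seen.add(prereq)
                order.append(prereq)
                visit(prereq)

    visit(goal)
    return order


def remaining_competencies(goal, achieved):
    """Competencies still needed for the goal, given what is already achieved."""
    return [
        node
        for node in map_backward(goal)
        if node.startswith("competency:") and node not in achieved
    ]


pathway = map_backward("career:translator")
todo = remaining_competencies("career:translator", {"competency:french-c1"})
print(todo)  # ['competency:linguistic-analysis', 'competency:portuguese-b2']
```

A competency shared by two credentials, such as the French one here, appears only once, which is the kind of overlap that makes stackable credentials attractive.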

With the right data about achievements, the DIY learner can then track progress toward goals. Most of the time, progress data is not under the learner’s control and is confined to a specific context, such as language learning data within Duolingo, mathematics data in Khan Academy, or course transcript data in a high school or college information system. However, several initiatives are working to give students control of their data. Initiatives like the Badge Alliance have published standards for the data representing achievements, and other organizations are building on previous work toward student-centered, secure, verifiable claims and credentials.

Data about pathways, plans, and progress can be combined and presented in a dashboard for the DIY learner. This is already available within silos, but someday learners will be able to get a more complete picture.

Finally, the same kind of “paradata” used to rate quality and fit of individual learning resources can also be used to inform bigger decisions, such as quality, fit, and cost-effectiveness of college programs.

Noah is now considering a college that has a large language department with a good reputation, but that doesn’t tell him whether the program is better than other options at preparing people to do what he wants to do after college. It also doesn’t tell him whether the program is the most cost-effective way of reaching his long-term goals. Some of this information can be discovered or collected from unstructured data, for example within social media and surveys. Other data might be generated through “big data” analytics. (Existing “college recommendation engines” tend to be more about evaluating the student’s chances of being accepted than about evaluating the value that a college program offers its graduates.)

A Vocabulary for Talking about GenDIY Education Data

The Common Education Data Standards (CEDS) define the meaning of data elements used to support DIY learning, including data for discovery of learning resources and opportunities, data used in assessment-as-learning and intelligent tutoring systems, and data for planning and decision-making (including competency and credential definitions and achievement tracking). CEDS includes a searchable glossary of data “vocabulary” that is aligned to many of the other standards mentioned in this article. Other standards address the protocols and technical details for interoperability of systems and content for each of these kinds of data.


About “GenDIY”
eduInnovation and Getting Smart have partnered with The J.A. and Kathryn Albertson Family Foundation to produce a thought leadership campaign on The Huffington Post called Generation Do-It-Yourself (GenDIY), about how young people are hacking a pathway to a career they love. This campaign about reimagining secondary and postsecondary education and career skills will explore the new generation building a global economy and experiences that are impact driven and entrepreneurial.

Jim Goodell is a Senior Analyst at Quality Information Partners, Inc. (QIP). Follow Jim on Twitter, @jgoodell2.



Jessie Chuang /

Great writing!
Presenting data in a meaningful way to learners in real-time and in a bigger picture of pathway later is the best use of learning data, especially learning “process” data so that learners can adjust their behaviors, like driving with a dashboard. With gamification and social elements, it could influence learners’ actions to maximize data’s impact.
Designing the presentation of data with purposes in mind, and helping learners know how to look at their own data are the priorities.