As ITS’s Marketing Director and a self-proclaimed generative AI enthusiast, I’ve made it my mission over the last year to learn everything I can about automatic item generation (AIG) and begin implementing it in our business processes. At this point, I consider myself an “AI apprentice,” developing more every day. Sort of like a Jedi Padawan, but with fewer “force-sensitive” powers 😉.
While I understand the strengths that AI brings to a corporate setting, test development is another story. In the assessment industry, generative AI can in some cases hinder the test development process, especially when you consider the copyright and accreditation repercussions (and probably other issues outside my expertise; I am not a lawyer, and this is not legal advice). Still, I’m convinced there are other areas of the credentialing process where generative AI has powerful applications. That’s why, when ITS built our latest product, SparkAI™, we envisioned it encompassing all things AI, across all ITS products, not just content generation.
That’s not to say that automatic item generation (AIG) doesn’t have its time and place! In the right setting, testing programs can significantly expedite item writing with AIG, which is quite attractive given that test development cycles can span well over a year. Practice exams, where simple recall items are common, are also a great use case for AIG.
But perhaps you’re in a similar category as me: riding the AI high while still rightfully curious. Is the reward worth the risk? This blog explores common questions around AIG to help you decide whether you’re ready to get started.
The Promised FAQ
Before we start, one caveat: my answers describe what our AIG feature, SparkAI, offers. There are other products on the market, each with its own development team, configuration options, and so on. So, for the benefit of ITS customers and fans, we’ll explore the answers as they relate to SparkAI in Item Workshop (our item banking solution).
What types of generative AI models does SparkAI support?
SparkAI is “foundation-model agnostic,” meaning it isn’t tied to a specific foundation (or base) language model and supports a wide range of providers and their models. In generative AI, a foundation model is a pre-trained language model that serves as the starting point for various downstream tasks. For example, models like OpenAI’s GPT or Google’s BERT are considered foundation models. They are pre-trained on large datasets to learn general language representations and can then be fine-tuned for specific tasks.
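To make the idea of “foundation-model agnostic” concrete, here’s a toy sketch of what routing one prompt to interchangeable providers can look like. This is purely illustrative: the provider names and generate functions are hypothetical stand-ins, not SparkAI’s actual internals or any real vendor API.

```python
# Illustrative only: a toy "model-agnostic" layer. The provider names and
# generate functions are hypothetical stand-ins, not SparkAI's real API.

def gpt_style_generate(prompt: str) -> str:
    """Pretend call to an OpenAI-style model."""
    return f"[GPT-style draft item for: {prompt}]"

def google_style_generate(prompt: str) -> str:
    """Pretend call to a Google-style model."""
    return f"[Google-style draft item for: {prompt}]"

# The agnostic layer is just a registry: swapping models means swapping
# an entry here, not rewriting the item-generation workflow.
PROVIDERS = {
    "openai-gpt": gpt_style_generate,
    "google-model": google_style_generate,
}

def generate_item(prompt: str, model: str = "openai-gpt") -> str:
    """Route the same prompt to whichever foundation model is enabled."""
    return PROVIDERS[model](prompt)

print(generate_item("Write a multiple-choice item on cloud architecture"))
```

The point of the abstraction is that the authoring workflow stays the same no matter which underlying model a program enables.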
Can I bring my proprietary model?
Sure can! Clients can bring their already trained, proprietary models and use them in Item Workshop. We can even let you control which models are available in which content folders. For example, if you have an IT exam on cloud architecture, you may want a model specifically trained on that topic to write those items. If you also have a content folder on database administration, you may want a model trained on that topic, not cloud architecture. Enabling the appropriate model per folder controls which models people select and what they use to generate their items.
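The folder-to-model mapping described above can be pictured as a simple lookup. This is a hypothetical sketch, not SparkAI’s actual settings format; the folder paths and model names are made up for illustration.

```python
# Hypothetical per-folder model configuration (not SparkAI's actual
# settings format): each content folder lists the models enabled for
# item generation within it.
folder_models = {
    "IT/Cloud Architecture": ["cloud-arch-tuned-model"],
    "IT/Database Administration": ["dba-tuned-model"],
}

def allowed_models(folder: str) -> list[str]:
    """Return the models authors may select in a given content folder."""
    return folder_models.get(folder, [])

print(allowed_models("IT/Cloud Architecture"))  # ['cloud-arch-tuned-model']
```

Scoping models to folders this way keeps authors from accidentally generating database items with a cloud-architecture model, and vice versa.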
What type of test items can be created with SparkAI?
Currently, SparkAI can generate multiple choice (single or multiple selection) items, reading passages, and input text (fill in the blank) items. Clients can generate reading passages on their own, and they can also generate new multiple choice or input items based on stimuli or existing reading passages.
Are there biases that need to be addressed when using AI?
Many in the industry are raising concerns about potential biases in training data. It’s like having a super-smart assistant—sometimes it might unintentionally bring in some baggage. If the data used to train the AIG model is biased, the generated assessment items might also reflect those biases. Imagine if your assistant learned language from a narrow set of sources—it might not fully grasp the rich diversity of expressions.
How does AI handle diverse subject areas or domains in assessment creation, such as science, mathematics, language, etc.?
Large language models (LLMs) are super versatile! They can handle diverse subject areas and can be further trained on a domain-specific body of text (known as a “corpus”) to specialize in that domain. LLMs aren’t great at math today, but they’re improving over time.
What role does human intervention or review play in the AI process? How much oversight is necessary?
It’s important to validate the quality and effectiveness of AI item development through expert reviews, pilot testing, and continuous improvement based on user feedback. Incorporating educational expertise and domain knowledge into the process also enhances the overall quality of the generated items. SparkAI requires human intervention as part of the review and approval process, ensuring that relevance, fairness, and ethical considerations are maintained.
Are there specific technical requirements or resources needed to implement AI effectively?
No, and that’s the beauty of generative AI and why it’s become so popular. It’s simple enough that anyone, regardless of technical background, can use it to generate items. SparkAI guides you through selecting classifications and adjusting the model all in one place, so you don’t need to be a prompt engineer.
What’s next for SparkAI?
Item generation is just the beginning! Our Product team is hard at work planning future releases in areas like security, analysis, and more. Be the first to hear about new releases by joining our mailing list.
Always Changing
The landscape of generative AI changes daily. Almost every day, our favorite AI tools gain new features, YouTubers demonstrate alternative use cases, or AI companies make global headlines. While these FAQs hold true today, expect them to evolve tomorrow.
Have more questions not covered here? Help us grow our FAQ library by contacting me at acrowley@testsys.com.
About the Author

Amanda Crowley is the Director of Marketing at ITS. She has ten years of experience in assessment and is passionate about helping businesses with marketing strategy and brand building. Amanda holds a Bachelor of Science in Digital Audiences from Arizona State University and has earned multiple certifications from Google, LinkedIn, and Microsoft. As Marketing Chair for the Association of Test Publishers’ (ATP) 2024 and 2025 Innovations in Testing Conference, she collaborates with industry leaders and experts to promote innovation and best practices in testing. Amanda is a strong supporter of industry organizations such as ATP, ITCC, and I.C.E., and enjoys consulting locally in the Baltimore area.