UX Designer
Researcher
08/2023 - 12/2023
Figma
FigJam
Ashley Frith
Margot Lin
Benedicte Knudson
Katie McIntyre
User Research
Ideation
Prototype
Large language models (LLMs), capable of analyzing extensive datasets to comprehend and generate text, have revolutionized the artificial intelligence (AI) industry.
However, despite their growing popularity, a notable barrier persists: many individuals and organizations who want to leverage LLMs for diverse purposes lack the technical expertise to develop customized LLM applications tailored to their specific needs and datasets, particularly in two techniques: fine-tuning LLMs with custom data and prompt engineering.
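To make the two techniques concrete, here is a minimal, illustrative Python sketch of what each looks like in practice. The function names and record format are our own assumptions for illustration (the JSONL chat format shown is one common convention for fine-tuning data), not part of any specific vendor's API.

```python
import json

# Prompt engineering: steer a general-purpose model by carefully
# structuring its input, without changing the model's weights.
def build_prompt(task: str, context: str, question: str) -> str:
    """Assemble a structured prompt from user-supplied pieces."""
    return (
        f"You are an assistant for the following task: {task}\n"
        f"Use only this context to answer:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )

# Fine-tuning: adapt the model's weights with custom examples,
# typically supplied as JSONL records of question/answer pairs.
def to_finetune_record(question: str, answer: str) -> str:
    """Serialize one training example as a JSONL chat-style record."""
    return json.dumps({
        "messages": [
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    })

if __name__ == "__main__":
    print(build_prompt(
        "customer support",
        "Our return window is 30 days.",
        "Can I return an item after 45 days?",
    ))
    print(to_finetune_record(
        "Can I return an item after 45 days?",
        "No, returns are accepted within 30 days of purchase.",
    ))
```

Prompt engineering needs no training run and works with any hosted model; fine-tuning requires preparing many such records and a training job, which is exactly the kind of hidden complexity our target users struggle with.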
We designed sandbox.ai, an LLM application builder that empowers less-experienced users without a coding background to craft custom LLM applications.
An intuitive interface that minimizes frustration and maximizes engagement with the tool.
Users can look up unfamiliar terminology encountered during the LLM creation process at any time, without switching pages to search on their own.
Integration with third-party applications not only meets users' high customization needs but also enhances the platform's adaptability.
Novice users can build an understanding of LLMs from scratch. Additionally, they can access inspiration and guidance for LLM application use cases.
To better understand the problem space, we started by investigating existing papers and threads on social media such as Reddit. We discovered the business and personal context as well as novice and expert users within the space, and we decided to focus on the needs of the novice users who are struggling most, while supporting expert users.
We also looked into the market to examine whether existing products already help less-experienced users build custom LLM applications.
Basic Feature
We found that:
We first used a survey to (1) help us recruit participants for the interviews and contextual inquiries, and (2) quickly learn about users' needs, preferences, and ideas. We received 22 responses in total and found that:
We conducted semi-structured interviews with 5 industry experts working in the field of AI to gain insights into the problem space and use cases.
Meanwhile, we conducted contextual inquiries with 5 novice users, who were guided through creating a chatbot using an LLM application builder of their choice, to better understand our target users' needs and challenges.
Through transcript coding and affinity mapping, we found that:
These pain points hinder many people from achieving their goals and leave them frustrated with the process, leading to disengagement or abandonment of building their own LLM applications.
How might we support less-experienced users in crafting custom LLM applications using their own data?
Based on the insights we collected from user research, our team brainstormed separately first, then convened to share all our ideas and settled on the following core functionalities:
We also used moodboarding, metaphor, and SCAMPER methods to generate innovative solutions, and drew sketches containing these core functionalities.
Then, based on group discussion, we narrowed down to two initial variations and created the following wireframes.
We held a feedback session with 4 novice users and our industry contact from Databricks. We found that:
As a result, we ultimately decided to implement both variations, categorized as Basic and Advanced modes, giving users the freedom to choose:
We held another feedback session with 4 different novice users and our industry contact, and iterated based on their feedback:
To evaluate our final prototypes, we conducted a cognitive walkthrough and usability testing with 7 users.
The aim of our evaluation sessions was to assess the prototype's accessibility, alignment with user needs, efficiency in core functional tasks, and the learning curve for novice users.
With these goals in mind, we moderated the cognitive walkthrough with 3 expert users and the usability testing with 4 novice users. Here's a brief overview of the process for both methods:
1. Have the users walk through the application, think aloud, and answer guiding questions for each screen
2. Invite users to fill out a post-task survey
3. Code the transcript, categorize codes, and analyze data
4. Group discussion & Data visualization of survey results
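The transcript-coding step above can be sketched programmatically. This is an illustrative example with hypothetical utterances and code labels, not our actual data; in practice the coding itself was done by hand, and tallies like this fed into affinity mapping.

```python
from collections import Counter

# Hypothetical coded transcript excerpts: during transcript coding,
# each participant utterance is tagged with one or more analysis codes.
coded_utterances = [
    {"quote": "I don't know what 'temperature' means here", "codes": ["jargon"]},
    {"quote": "Where do I upload my data?", "codes": ["navigation", "data-upload"]},
    {"quote": "What is fine-tuning exactly?", "codes": ["jargon"]},
]

# Tally how often each code appears across the transcripts; the most
# frequent codes point to the most common pain points.
code_counts = Counter(c for u in coded_utterances for c in u["codes"])
print(code_counts.most_common())
```

Sorting codes by frequency gives a quick first read on which categories deserve a dedicated affinity-map cluster.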
Here are the data visualizations of the survey results:
From the evaluation sessions and the survey, we found that:
Based on the evaluation sessions, while there were successes, we also pinpointed areas for improvement:
Accessibility Awareness
We learned to start with accessibility in mind so that the design is usable for everyone from the outset: constantly check that designs comply with accessibility standards such as color contrast and keyboard navigation, and regularly gather user feedback to identify and address clarity issues promptly.
Feedback Integration
During the design process, we acknowledged that user feedback can sometimes be conflicting. As designers, we learned to actively seek diverse feedback to better understand user needs, and to balance simplicity and complexity in features to accommodate users with different backgrounds.