Enhancing Educational Support with GenAI: How Lyceo is Integrating AI into its Learning Framework
18 October 2024
The ongoing teacher shortage in the Netherlands is a growing concern, creating disruptions that impact the quality of education and limit students’ future opportunities. With some classes and even entire school days canceled, and certain subjects no longer taught, education has taken a hit. In response, many parents have turned to private tutoring or homework assistance for their children, while schools increasingly seek external educational services. Among these providers, Lyceo has emerged as the largest.
As more and more schools rely on Lyceo, the company is able to leverage AI technology to address various educational challenges and automate tasks. With the introduction of the Lyceo GenAI learning tool, the company’s virtual tutors will be able to support students by answering questions and providing timely feedback on assignments. The tool will offer personalized insights, highlighting students’ strengths and identifying areas where they can improve. By considering diverse learning preferences and abilities, Lyceo can create tailored teaching strategies and resources for each student. This technology not only provides real-time explanations but also extends continuous support, even during late-night study sessions. This self-paced approach is particularly beneficial for those students who prefer to study according to their own schedules.
Additionally, Lyceo’s GenAI-powered chatbots will enhance customer service by assisting parents in obtaining answers immediately. The chatbots are designed to provide information and perform tasks. The informative chatbots will deliver pre-set information to help parents with questions about pricing or suitable programs tailored to a student’s needs. In contrast, task-based chatbots are programmed to handle specific requests, such as scheduling tutoring sessions for students.
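The split between informative and task-based chatbots can be sketched as a simple router. Everything below (FAQ topics, handler names, replies) is an illustrative assumption, not Lyceo's actual system:

```python
# Minimal sketch of routing between informative and task-based chatbot
# behaviors. All topics, handlers, and replies are invented for illustration.

FAQ_ANSWERS = {
    "pricing": "Tutoring packages start from a fixed monthly fee; see our price list.",
    "programs": "We offer homework support, exam training, and subject tutoring.",
}

def schedule_session(student, subject):
    # A real task-based handler would call a booking backend; we just confirm.
    return f"Scheduled a {subject} tutoring session for {student}."

def route(message, student="a student"):
    text = message.lower()
    if "schedule" in text or "book" in text:      # task-based request
        subject = "math" if "math" in text else "general"
        return schedule_session(student, subject)
    for topic, answer in FAQ_ANSWERS.items():     # informative request
        if topic in text:
            return answer
    return "Let me connect you with a staff member."
```

A production system would replace the keyword checks with an LLM-based intent classifier, but the division of labor stays the same.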
However, integrating GenAI into Lyceo’s business model involves considerable investment. The costs for implementing generative AI can range from minimal to several million euros, depending on the specific use case and scale. While smaller companies may benefit from free versions of generative AI tools, like ChatGPT, Lyceo will likely need to invest in customized AI services to develop the online learning tool and sophisticated chatbots tailored to their needs.
The potential benefits make this investment worthwhile, enabling Lyceo to improve its educational support services and continue to meet the evolving demands of schools, students and parents.
NutriNet – a personal assistant for your grocery shopping
18 October 2024
Have you ever spent hours browsing supermarket aisles searching for the most nutritious options? Did it take you too long to figure out which recipes to cook with the groceries you bought? The struggles of dealing with food are numerous, ranging from choosing products with the right nutrients to simply not knowing enough recipes.
Therefore, we introduce NutriNet – a mobile application that is going to make shopping and meal planning easier!
NutriNet simplifies grocery shopping and meal planning by analyzing which products and recipes best fit the user’s grocery wishes, nutrition goals, and store preferences. It addresses the challenges inherent in personalized nutrition and helps consumers make healthier food choices by streamlining the grocery shopping and meal planning process, eliminating the need to understand every nutrition label or search through countless recipes. NutriNet completes these tasks for you in real time and provides clear, accessible recommendations.
• Real-time personal assistant: NutriNet is a multifunctional application that solves various food-related problems in real time. Presented as a chatbot, it takes a general grocery list or meal wish as an input query, combined with preferences for food characteristics (e.g., nutrients, allergies) and grocery stores. It then returns brand-specific, personalized shopping lists and, if desired, meal recommendations. Grocery items can also be added to the initial shopping list by scanning them with the integrated AR tool: the product is detected visually and added to the shopping list query.
• Long-term customer engagement: NutriNet distinguishes itself from the competition through personalized, customized advice. This is possible because NutriNet is built on a database incorporating stock and product information from the major supermarkets in the Netherlands. Classic applications that target similar needs (e.g., meal recommendation) focus instead on counting nutrients for short-term weight loss rather than personalizing shopping lists and meal recommendations to enable a healthier lifestyle. Such applications may be usable in the short term but have been shown to suffer from low adherence over time (Chen et al., 2019).
• Personalized recommendations: NutriNet leverages generative AI to offer accurately personalized recommendations. Users simply state their preferences when prompting for a grocery list or a meal, such as gluten-free or high in protein, and the generative-AI-powered application returns tailored results. Personalization and raising awareness of healthy food are not NutriNet’s only purposes, however. It also addresses sustainability issues that supermarkets face: by gathering consumer purchase and search data in the application, consulting services can be offered to supermarkets, enabling them to plan ordering and stockholding more efficiently. Supermarkets should thus be able to reduce food waste from overstocking in the long term.
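The core matching step described above can be sketched as a filter over a product database. The product data and preference keys below are invented for the example; real store data would come from NutriNet's supermarket database:

```python
# Illustrative sketch of NutriNet-style product filtering. Products, stores,
# and tag names are made up for the example, not real catalog data.

PRODUCTS = [
    {"name": "Oat bread",     "store": "AH",    "tags": {"vegan"},               "protein_g": 9},
    {"name": "Greek yoghurt", "store": "Jumbo", "tags": {"vegetarian"},          "protein_g": 10},
    {"name": "Tofu block",    "store": "AH",    "tags": {"vegan", "gluten-free"}, "protein_g": 12},
]

def recommend(preferred_store, required_tags, min_protein=0):
    """Return product names matching the user's store, dietary tags, and protein floor."""
    return [
        p["name"] for p in PRODUCTS
        if p["store"] == preferred_store
        and required_tags <= p["tags"]          # all required tags present
        and p["protein_g"] >= min_protein
    ]
```

In the real app, a generative model would translate the free-text query ("vegan, high protein, at AH") into these structured constraints before filtering.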
NutriNet is a two-sided network platform, delivering significant value to both consumers and supermarkets.
Contributors
574051 – Duong Dao
728070 – David Wurzer
738898 – David Do
562387 – Roxi Ni
References
Chen, J., Berkman, W., Bardouh, M., Ng, C. Y. K., & Allman-Farinelli, M. (2019). The use of a food logging app in the naturalistic setting fails to provide accurate measurements of nutrients and poses usability challenges. Nutrition, 57, 208-216.
Innovating Learning with Canv-AI: A GenAI Solution for Canvas LMS
17 October 2024
In today’s educational landscape, generative AI (GenAI) is reshaping how students and instructors interact with learning platforms. A promising example is Canv-AI, an AI-powered tool designed to integrate into the widely used Canvas Learning Management System (LMS). This tool aims to transform both student learning and faculty workload by leveraging advanced AI features to provide personalized, real-time support.
The integration of Canv-AI focuses on two primary groups: students and professors. For students, the key feature is a chatbot that can answer course-specific questions, provide personalized feedback, and generate practice quizzes or mock exams. These features are designed to enhance active learning, where students actively engage with course material, improving their understanding and retention. Instead of navigating dense course content alone, students have instant access to interactive support tailored to their learning needs.
Professors benefit from Canv-AI through a dashboard that tracks student performance and identifies areas where students struggle the most. This insight allows instructors to adjust their teaching strategies in real-time, offering targeted support without waiting for students to seek help. Additionally, the chatbot can help reduce the faculty workload by answering common questions about lecture notes or deadlines, allowing professors to focus more on core teaching tasks.
From a business perspective, Canv-AI aligns with Canvas’s existing subscription-based revenue model. It is offered as an add-on package, giving universities access to AI-driven tools for improving educational outcomes. The pricing strategy is competitive, with a projected $2,000 annual fee for universities already using Canvas. The integration also brings the potential for a significant return on investment, with an estimated 29.7% ROI after the first year. By attracting 15% of Canvas’s current university customers, Canv-AI is expected to generate over $700,000 in profit during its first year.
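The ROI and profit figures above depend on inputs the post does not spell out (Canvas's customer count and Canv-AI's cost base). As a sketch of how such a figure would be computed, with the missing inputs replaced by clearly hypothetical numbers:

```python
# First-year ROI calculation. The fee comes from the post; the adopter count
# and cost base are hypothetical stand-ins, so the resulting ROI is
# illustrative and will not match the post's 29.7% exactly.

def roi(profit, investment):
    """Return on investment as a percentage."""
    return 100 * profit / investment

fee = 2_000          # annual add-on fee per university (from the post)
adopters = 1_000     # assumed number of subscribing universities
cost = 1_500_000     # assumed development + operating cost for year one

revenue = fee * adopters
profit = revenue - cost
first_year_roi = roi(profit, cost)
```

With these stand-ins, revenue is $2,000,000, profit $500,000, and first-year ROI about 33%; plugging in Canvas's real customer and cost figures would reproduce the post's estimates.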
The technological backbone of Canv-AI relies on large language models (LLMs) and retrieval-augmented generation (RAG). These technologies allow the system to understand and respond to complex queries based on course materials, ensuring students receive relevant and accurate information. The system is designed to be scalable, using Amazon Web Services (AWS) to handle real-time AI interactions efficiently.
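A retrieval-augmented generation loop can be illustrated in miniature. Real systems embed text with an LLM embedding model and search a vector store; here, word overlap stands in for similarity so the sketch stays self-contained, and the course snippets are invented:

```python
# Toy RAG pipeline: retrieve the most relevant course chunk, then build the
# augmented prompt an LLM would answer from. Keyword overlap replaces real
# embeddings purely to keep the example runnable.

COURSE_CHUNKS = [
    "The midterm covers chapters 1-4 and is closed book.",
    "Office hours are on Tuesdays from 2 to 4 pm.",
    "Assignment 2 is due on Friday at midnight.",
]

def retrieve(question, chunks, k=1):
    """Rank chunks by word overlap with the question; return the top k."""
    q_words = set(question.lower().split())
    scored = sorted(chunks, key=lambda c: -len(q_words & set(c.lower().split())))
    return scored[:k]

def build_prompt(question, chunks):
    """Compose the context-grounded prompt for the LLM."""
    context = "\n".join(retrieve(question, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Grounding answers in retrieved course material, rather than the model's general knowledge, is what keeps responses "relevant and accurate" in the sense the post describes.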
However, the integration of GenAI into educational systems does come with challenges. One concern is data security, especially the protection of student information. To address this, Canv-AI proposes the use of Role-Based Access Control (RBAC), ensuring that sensitive data is only accessible to authorized users. Another challenge is AI accuracy. To avoid misinformation, Canv-AI offers options for professors to review and customize the chatbot’s responses, ensuring alignment with course content.
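The RBAC idea can be made concrete with a role-to-permission mapping; the roles and permission names below are illustrative, not Canv-AI's actual policy:

```python
# Minimal role-based access control: an action is allowed only if the role's
# permission set contains it. Roles and permissions are invented examples.

PERMISSIONS = {
    "student":   {"view_own_grades", "chat_with_bot"},
    "professor": {"view_own_grades", "chat_with_bot", "view_class_analytics"},
    "admin":     {"view_own_grades", "chat_with_bot", "view_class_analytics",
                  "export_data"},
}

def can(role, action):
    """Check whether a role is authorized for an action; unknown roles get nothing."""
    return action in PERMISSIONS.get(role, set())
```

The point of RBAC is exactly this shape: sensitive operations (like exporting student data) are tied to roles, never to individual requests.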
In conclusion, Canv-AI offers a transformative solution for Canvas LMS by enhancing the learning experience for students and reducing the workload for professors. By integrating GenAI, Canvas can stay competitive in the educational technology market, delivering personalized, data-driven learning solutions. With the right safeguards in place, Canv-AI represents a promising step forward for digital education.
Authors: Team 50
John Albin Bergström (563470jb)
Oryna Malchenko (592143om)
Yasin Elkattan (593972yk)
Daniel Fejes (605931fd)
From Dense Texts to Dynamic Videos: The Synopsis.ai Web App
17 October 2024
Team 6: Noah van Lienden, Dan Gong, Ravdeep Singh & Maciej Wiecko.
Ever found yourself staring blankly at a 50-page academic paper, wondering if there’s a faster, more engaging way to grasp the key points? What if that dense text could transform into a lively video, complete with animations and a friendly narrator? Welcome to the future of learning with our Synopsis.ai web app!
The Education Technology (EdTech) market is skyrocketing. In 2023, the global EdTech market hit a whopping $144.6 billion and is projected to triple by 2032. With advancements in AI, augmented reality (AR), virtual reality (VR), and more, the way we learn is evolving faster and changing day to day. Generative AI is the new superstar in the EdTech universe. Tools like Scholarcy are helping students by turning lengthy texts into bite-sized summaries. But let’s face it—reading summaries can still feel like, well, reading. How great would it be if you could watch a video instead?
Enter Synopsis, the groundbreaking web app that’s set to revolutionize how we digest academic content. Synopsis uses advanced AI to convert scholarly articles into short, engaging videos. It’s like having your own personal explainer video for every complex paper you need to read. You can customize these videos and choose either a lecture format or an animated video format. Furthermore, users can select their desired video length, content granularity and even add subtitles!
All this new content is valuable not only for students learning with our web app, but also for researchers, educators, and even content creators! Each of these user groups can use the platform differently, bringing value to themselves and to others in new ways.
So how does this magic work behind the scenes? Synopsis leverages state-of-the-art AI models like GPT-4 and BERT, fine-tuned on vast academic datasets. It collaborates with AI research institutions to stay ahead of technological advancements and works with designers to create customizable templates and animations. While there are tools that summarize texts or create videos, none combine both in an educational context. Synopsis fills this market gap by offering a seamless solution that transforms academic articles into personalized video summaries.
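The stages of that pipeline can be sketched with stubs. The real product would call LLM and text-to-video services at each step; these placeholder functions only show how the user's options (format, length, granularity, subtitles) flow through, and every name here is an assumption:

```python
# Sketch of the article-to-video pipeline. summarize() stands in for an LLM
# call; make_video_plan() shows how user options shape the output. All
# function names and option values are illustrative.

def summarize(article_text, granularity="medium"):
    # Placeholder for an LLM summarization call: keep more or fewer sentences
    # depending on the requested granularity.
    sentences = article_text.split(". ")
    keep = {"coarse": 1, "medium": 2, "fine": 3}[granularity]
    return ". ".join(sentences[:keep])

def make_video_plan(article_text, fmt="animated", minutes=2,
                    granularity="medium", subtitles=True):
    """Bundle the generated script with the user's rendering options."""
    return {
        "format": fmt,              # "lecture" or "animated"
        "target_minutes": minutes,
        "script": summarize(article_text, granularity),
        "subtitles": subtitles,
    }
```

A downstream renderer would turn the plan's script into narration and visuals; the key design choice is that summarization and rendering stay separate, so either stage can be swapped out.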
In a world where attention spans are dwindling, and visual content reigns supreme, Synopsis is poised to make a significant impact. By making learning more accessible and enjoyable, it’s not just keeping up with the future of education—it’s helping to shape it!
Learning how to code? Let Generative AI help you!
12 October 2024
When I started on my Python learning journey with Datacamp, I was excited, but I also faced challenges that tested my patience. As someone from a non-technical background, the structured logic of coding felt overwhelming. Even Python’s supposedly beginner-friendly syntax often appeared complex, especially when I encountered errors that I couldn’t quickly resolve.
Early on, one of the most frustrating issues I faced was debugging. Simple mistakes like indentation errors or mismanaged variables often became roadblocks. Despite Datacamp’s comprehensive learning modules, I frequently struggled to understand why my code wasn’t functioning as expected, and traditional resources usually offered solutions that didn’t quite match my specific problem.
Moreover, applying theoretical concepts like loops, functions, and list comprehensions in practice was a significant challenge. While I could follow along with the lessons, I often found myself lost when it came time to solve problems independently. It became clear that I needed more personalized explanations to bridge the gap between theory and application.
That’s when I began using Datacamp’s integrated AI assistant, which proved to be a lifesaver. The AI provided on-demand explanations of the coding assignments, breaking down what each line of code was doing and helping me understand the purpose behind every function and operation. For example, when working on loops, the AI would offer examples and explain them in simpler terms, helping me grasp how to apply these concepts to real-world problems. It even helped me understand more complex concepts like recursion by providing step-by-step explanations and visualizations.
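The kind of step-by-step breakdown the assistant gave for recursion might look something like this (my own reconstruction of the style, not Datacamp's actual output):

```python
def factorial(n):
    """Each call handles one step and delegates the rest to a smaller call."""
    if n <= 1:                        # base case: nothing left to multiply
        return 1
    return n * factorial(n - 1)       # recursive case: n times everything below n

# Unrolled, factorial(4) expands like the AI's walkthroughs did:
# factorial(4) = 4 * factorial(3)
#              = 4 * 3 * factorial(2)
#              = 4 * 3 * 2 * factorial(1)
#              = 4 * 3 * 2 * 1 = 24
```

Seeing the call chain written out, one level at a time, is what finally made recursion click for me.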
The AI didn’t just solve problems for me—it taught me how to approach coding challenges. Offering multiple ways to write a function or fix an error encouraged me to think critically about my coding style and improved my overall understanding.
I know this may sound like a promotion, but I genuinely recommend Datacamp to anyone interested in learning to code. It provides the most interactive learning experience, and the integrated AI makes the journey much smoother and more enjoyable.
Do Easy-To-Use Local Image Generation AI Applications Have Commercial Potential?
10 October 2024
There are many AI applications for image generation, but most are internet- and cloud-based, charging by subscription or per use. Unlike these, Stable Diffusion WebUI, an open-source and free image generation tool, has attracted widespread attention with its powerful capabilities. However, it also has a relatively high barrier to entry. Based on my experience using Stable Diffusion WebUI, I will briefly discuss the commercial prospects of low-threshold, easy-to-use local image generation AI applications.
Advantages of Stable Diffusion WebUI
The advantage of Stable Diffusion WebUI is not only that it is completely open source and free, but also that its model is deployed locally (similar to the on-device AI highlighted at the iPhone 16 launch event). Since it runs directly on local hardware, it does not need to upload any data to the cloud, meaning users can fully use the computing resources of their own devices without a network connection.
Compared with cloud applications such as MidJourney, local applications have a major advantage in data privacy. Neither the user’s input nor the generated images are uploaded to a server; everything is processed locally, which suits users who are sensitive about data security and privacy.
At the same time, because it runs on local hardware, performance can be very high. It can flexibly harness the computing power of a high-end GPU, making it especially suitable for users with powerful hardware: unconstrained by network bandwidth, it can exercise its full image generation capabilities. All of this makes it an excellent tool.
The Threshold of Using Stable Diffusion WebUI
Although Stable Diffusion WebUI is powerful, its barrier to entry discourages many ordinary users. This was my personal experience with installation, model import, and parameter tuning. First, the installation and setup process is relatively complicated: you need to download tools and deal with many environment dependencies, such as Python and other required libraries. These steps are daunting for new users; without the help of various forums, blogs, and GPT, I would certainly not have managed. In addition, selecting and importing models is a big challenge. Although there are plenty of free model resources online, finding a model that suits your needs is not easy, and choosing a suitable LoRA model also takes a lot of time. Finally, you need to adjust many parameters yourself, including the number of steps, the sampling method, and the resolution, to achieve the desired effect. This complexity makes the tool difficult to master.
Launch Simple Applications to Attract More Users
If we could launch a simpler, easier-to-use app based on Stable Diffusion WebUI, aimed at the consumer market and the general public, we could push it to a much wider market. The key is lowering the barrier to entry. By bundling the environment dependencies, such as pre-installing all necessary libraries and runtimes, users could skip the configuration process entirely. The application could also provide one-click model download and purchase channels to help users quickly obtain high-quality generation models. A user-friendly interface is equally essential: following established interaction design principles, the UI and UX should be optimized and complex parameter adjustments condensed into a few key options, letting ordinary users easily generate the pictures they want without losing flexibility. For example, users could control the quality and style of generated images through simple sliders or preset modes, avoiding complex technical details.
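The preset idea is easy to sketch: one user-facing choice expands into the several parameters WebUI exposes. The parameter names below mirror common WebUI settings (steps, sampler, resolution), but the preset values are my own guesses, not recommended defaults:

```python
# Sketch of a "one slider instead of many knobs" front end. A single preset
# choice expands into the full parameter set a generation backend would need.
# Preset values are illustrative assumptions.

PRESETS = {
    "fast":     {"steps": 15, "sampler": "Euler a",   "width": 512,  "height": 512},
    "balanced": {"steps": 25, "sampler": "DPM++ 2M",  "width": 768,  "height": 768},
    "quality":  {"steps": 40, "sampler": "DPM++ SDE", "width": 1024, "height": 1024},
}

def generation_settings(preset, prompt):
    """Turn a single preset choice plus a prompt into the backend's parameters."""
    params = dict(PRESETS[preset])   # copy so presets stay unmodified
    params["prompt"] = prompt
    return params
```

An "advanced" panel could still expose the raw parameters for power users, so the simplification does not cost flexibility.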
Commercial Potential
From a commercial perspective, local image generation applications based on Stable Diffusion WebUI have broad prospects. Building on Stable Diffusion WebUI not only avoids consuming cloud resources but also allows more flexible pricing. Compared with the payment models of conventional online image generation AI such as MidJourney, a local Stable Diffusion application could adopt a variety of pricing strategies, such as a one-time software purchase, paid premium models, or a subscription granting unlimited generations within a given period. Overall, through a simplified user experience and flexible, low pricing, we might be able to build a breakout image generation AI application on top of Stable Diffusion WebUI, attract a wide audience, and look forward to its large-scale adoption.
My Personal Chef
10 October 2024
Since its launch, I’ve mainly used ChatGPT to help me summarize academic articles and brainstorm ideas. Yesterday, I decided to let the LLM help me decide what to eat that night (OpenAI, 2024).
I rearranged the ingredients in my fridge so the main items were visible, moving condiments toward the back. I took a picture and uploaded it to ChatGPT with the prompt:
“What can I make for dinner tonight with these ingredients? I’m willing to go to the store for some additional ingredients needed.”
That’s it—just as if I asked a friend on the phone for a suggestion on what to make for dinner tonight, I asked the OpenAI model.
What it returned blew me away (note that I formatted the output for visual presentation in this blog post):
“Here’s a simple recipe using the ingredients you have, plus some optional additions you could pick up from the store.
Butterscotch Pumpkin and Vegetable Curry
Ingredients from your pantry/fridge:
Butterscotch pumpkin
Carrots
Zucchini
Red bell peppers
Onions
Potatoes (optional, for a heartier curry)
Curry powder or paste (red or yellow)
Additions from the store (optional):
Coconut milk (for a creamier curry)
Fresh cilantro (for garnish)
Protein like chickpeas, tofu, or chicken
Instructions:
Prepare the Veggies: Peel and cube the butterscotch pumpkin, and chop the zucchini, bell peppers, carrots, and potatoes.
Sauté the Aromatics: In a large pot, sauté the onions in oil until they’re translucent.
Add Spices: If you have curry powder or paste, add it to the onions and cook for a minute to release the flavours. If you’re using the peanut sauce as a curry base, add it now along with a dash of water.
Add Veggies: Add the butterscotch pumpkin, carrots, and potatoes (if using). Let them cook for a few minutes.
Simmer: Add coconut milk (if using) or water/stock, and simmer until the pumpkin and other veggies are tender (about 20 minutes).
Add Other Veggies: Toss in zucchini and bell peppers, letting them cook for the last 5-7 minutes.
Season: Taste and adjust seasoning with salt, pepper, or a touch of sugar if needed.
Serve: Serve your curry over rice (you have poke rice) and garnish with fresh cilantro if you like.
This will result in a rich, creamy, and slightly sweet curry with the pumpkin flavor shining through! Would you like to go ahead with this, or need adjustments?”
Amazing, right?! I was instantly inspired to cook this delicious-sounding meal and ran to the store for some chicken thighs and cilantro. Lucky for me, I already had the coconut milk, which the AI didn’t recognize in the picture.
For this blog post, I asked GPT to create an image inspired by this recipe:
While writing this blog post, I wondered if ChatGPT could give me an accurate estimate of the nutritional value of the dish, so I asked:
Caloric breakdown (per serving):
B. Pumpkin (100g)
Calories: 45
Carbohydrates: 11g
Protein: 1g
Fat: 0g
Carrots (50g)
Calories: 20
Carbohydrates: 4.5g
Protein: 0.5g
Fat: 0g
Zucchini (50g)
Calories: 8
Carbohydrates: 1.5g
Protein: 0.5g
Fat: 0g
Bell Pepper (50g)
Calories: 15
Carbohydrates: 3g
Protein: 0.5g
Fat: 0g
Onion (30g)
Calories: 12
Carbohydrates: 3g
Protein: 0.3g
Fat: 0g
Coconut Milk (50ml)
Calories: 90
Carbohydrates: 2g
Protein: 1g
Fat: 9g
Curry Paste (15g)
Calories: 25
Carbohydrates: 5g
Protein: 0.5g
Fat: 0g
Rice (200g cooked)
Calories: 260
Carbohydrates: 56g
Protein: 5g
Fat: 0.5g
Chicken Thighs (100g)
Calories: 130
Carbohydrates: 0g
Protein: 19g
Fat: 6g
Total:
Calories: 605 kcal
Carbohydrates: 86g
Protein: 28.8g
Fat: 15.5g
I am stunned! I checked the values online, and the caloric values only differed slightly (Grutto, 2024a; Voedingscentrum, 2024b). Not only does it give dinner suggestions and inspire me to make dinner with the ingredients I already have at home, but I can also keep a rough estimate of the macros I’ve consumed. Imagine being on a diet and going out to eat with a friend. You want to keep track of your caloric intake but don’t want to bother bringing a scale to weigh your food with every new dish. Now you just take a picture, upload it to your Instagram story, and later upload it to ChatGPT to make a nutritional value estimate for you.
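The tally itself is just addition, which is easy to verify. Summing the per-ingredient values from ChatGPT's own output reproduces the calorie, carbohydrate, and fat totals, and interestingly shows the protein figures actually sum to 28.3 g rather than the 28.8 g it reported, a small example of exactly the imprecision discussed below:

```python
# Per-ingredient values copied from ChatGPT's breakdown above; totals() just
# adds them up, rounding to one decimal to avoid float noise.

INGREDIENTS = {
    "pumpkin":      {"kcal": 45,  "carbs": 11,  "protein": 1,   "fat": 0},
    "carrots":      {"kcal": 20,  "carbs": 4.5, "protein": 0.5, "fat": 0},
    "zucchini":     {"kcal": 8,   "carbs": 1.5, "protein": 0.5, "fat": 0},
    "bell pepper":  {"kcal": 15,  "carbs": 3,   "protein": 0.5, "fat": 0},
    "onion":        {"kcal": 12,  "carbs": 3,   "protein": 0.3, "fat": 0},
    "coconut milk": {"kcal": 90,  "carbs": 2,   "protein": 1,   "fat": 9},
    "curry paste":  {"kcal": 25,  "carbs": 5,   "protein": 0.5, "fat": 0},
    "rice":         {"kcal": 260, "carbs": 56,  "protein": 5,   "fat": 0.5},
    "chicken":      {"kcal": 130, "carbs": 0,   "protein": 19,  "fat": 6},
}

def totals(ingredients):
    """Sum each macro across all ingredients."""
    keys = ("kcal", "carbs", "protein", "fat")
    return {k: round(sum(v[k] for v in ingredients.values()), 1) for k in keys}
```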
While this experience was undeniably impressive for me, it’s important to consider some limitations. For instance, the AI didn’t recognize the coconut milk in my picture, which was a key ingredient in the recipe it suggested. This highlights that image recognition technology isn’t foolproof, and you cannot rely on it 100%. Also, while the nutritional estimates were close to official sources, they weren’t exact. For those with strict dietary requirements or allergies, relying solely on AI for nutritional information might not be a wise idea. Lastly, uploading photos of your fridge or meals means sharing personal data with an AI service and thus can be a privacy concern. It’s important to be mindful of what you’re sharing and know how that data might be used or stored.
My experiment with using ChatGPT as a personal AI chef was both enlightening and exciting. The ease of requesting a tailored dinner suggestion and a nutritional breakdown based on the contents of my fridge shows the potential of AI in everyday life. While there are limitations to consider, the benefits offer a glimpse of the exciting future to come.
In the end, I find it amazing how a technology I first used only as a study and search tool can also inspire me in other parts of everyday life. With my personal AI chef, dinner dilemmas are a thing of the past.
My Experience with GenAI: Improving Efficiency or Becoming Stupid?
9 October 2024
I work as a part-time data analyst at a software company, where I analyze sales data. My 9-to-5 mainly consists of writing code, specifically SQL in Google BigQuery, and creating dashboards in Power BI. I love using GenAI to help me write queries that would otherwise have taken me a long time to compose by myself. Additionally, I am a student and use GenAI to better understand course content or to inspire me on what to write about in assignments. Generally, I would say GenAI benefits my life, as I get more done in less time; however, from time to time I start to question whether I am not just becoming lazy.
I use GenAI on a daily (almost hourly) basis and rely on it in many ways. I mainly use ChatGPT 3.5, when ChatGPT 4o’s free limit has been reached, and Gemini, when ChatGPT is down. Based on my own experience, I can say that being good at ‘AI prompting’ is a real skill in the field of data analytics as it can drastically improve the efficiency with which you write queries, and therefore, the speed with which you finish tasks. My manager recently even held a knowledge-sharing meeting in which he discussed the best practices to use for data analysts when interacting with ChatGPT. Using GenAI has become a real thing in the field of data analytics, and is not something to be ashamed of.
However, I cannot help but sometimes be slightly embarrassed when I read back the questions I’ve asked ChatGPT. It seems that with any task that requires a little bit of effort or critical thinking, I automatically open the ChatGPT tab in my browser to help me come up with the right approach to solve the task at hand. I don’t even try to solve things by myself anymore, which makes me question: is this something to be desired?
As explained by ChatGPT in the image, using GenAI indeed frees up more brain space for things that are important. If I can use less time to get more work done, this improves my work efficiency and also gives me more time for things that I find more valuable, such as spending time with family or friends. Right now, it is still too soon to be able to determine the impact that using GenAI will have on our own (human) intelligence. In the meantime, we should just continue using it for repetitive tasks that would normally take much of our valuable time and hope that it is not ChatGPT’s plan to stupidify humanity before it can take over the world.
The Anatomy of an AI generated TikTok post
9 October 2024
Many of us like to indulge in scrolling our phones in our free time. It’s a quick, guaranteed way to kill some time while waiting around, or while gathering the strength to finally get out of bed. While our options were limited to classic picture-and-text posts until a few years ago, the meteoric rise of short-form video content has come to dominate the doomscrolling niche. After repeatedly seeing a specific type of content (rehashed Reddit drama stories), I began to wonder: could you automate this with GenAI?
Down the rabbit hole
I first began by analyzing the catalyst for my idea, hoping to identify the exact mental tasks GenAI would need to automate; however, I got sidetracked. Listening to the specific video closely, I noticed that the voice and sound design were extremely high quality, as if someone who specialized in narrating commercials had taken time out of their working hours and used their studio setup to record the voiceover. The display of text and the editing also seemed off for a typical TikTok post: there were a few errors that in my mind were “no-brainers”, editing choices that made the story less understandable. That’s when I realized I wasn’t looking at someone’s genuine effort; I was staring right at my own idea, implemented before I got to it.
Shifting gears
At that point, I was both disappointed and amazed. My original idea would probably not find the success I had hoped for, since the market for AI-generated short stories in video form was presumably already saturated; still, I wanted to know more about the topic. I never found this type of content to be of great artistic quality, but it could be a springboard for future GenAI development in entertainment. I decided to carry the project through to a proof-of-concept stage, to capitalize on the learning opportunity.
When I finally had the code up and running, churning out videos with little to no human input in a fraction of the time (and with a fraction of the artistic integrity) a human would need, I felt I had stumbled onto the crux of the emotional manipulation of contemporary social media. I felt I had found something every one of my friends should at least know about. Even more disturbingly, during my research for this project I found no complete documentation of what this content is and how it is made. I had only my own experiences and thoughts to go on.
This blog post outlines my findings and opinions on how this content is made, in the hope that getting this knowledge out there will restore some fairness to knowing what you consume on social media.
However, there is still one topic of discussion I could not resolve even at the end of the project. Is this truly a new form of entertainment? Am I too quick to condemn something new and exciting? Throughout this blog post I will therefore give as much practical information as possible (without encouraging or condemning the non-illegal parts) and hope the question will sort itself out in due time as more people gain the ability to do this.
How the sausage is made
After conducting some research (if you can call scrolling TikTok for a few hours research), I have found some elements that are common across many of these videos:
Profile picture
Automating the selection of a profile picture seems like an exercise in futility; after all, why spend any time automating something that happens only once, right? Well, online speculation centers on the conjecture that most platforms purposefully de-amplify AI-generated content, so a channel will hit a low ceiling on how high it can climb in the algorithm. I therefore speculate that most organizations running channels like the one I was trying to create actually run multiple channels concurrently, cheating the “algorithmic ceiling” imposed on them.
Voice
The main way that the underlying story is conveyed to the viewer, is the AI generated voiceover. It is extremely common, with only a small percentage of these videos opting for a musical background or simply no voice. These voices can be created by training a neural network on a few thousand hours of publicly available speeches of one specific person and the transcript. The main objective is to find a voice that is both soothing and fitting for the type of story. Voice can only be done “wrong” after all, by selecting a voice that is irritating or hard to understand, excluding potential viewers from consuming the content.
Most posters use a third-party service (most commonly Elevenlabs) for this part of the post. Most AI voiceover services (including Elevenlabs) charge between 30 and 100 euros per month for API access (against a rumored income of $20–50 per million views), but offer a free tier, provided the user registers with a Google account. This creates a strong incentive for unscrupulous (but strongly business-minded) posters to buy stolen Google accounts to generate the voiceover, which amounts to supporting the theft of Gmail accounts. Free alternatives exist, but they require a fairly powerful PC to run and generally underperform slightly compared to paid services.
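To see why these economics are so tight, here is a back-of-envelope sketch using only the rough figures quoted above (treating euros and dollars as roughly equal for simplicity; none of these numbers are measured data):

```python
def monthly_margin(views: int, payout_per_million: float,
                   voiceover_cost: float) -> float:
    """Monthly income from views minus the voiceover subscription.
    Treats euros and dollars as roughly equal; all inputs are the
    rough ranges quoted in the text, not measured data."""
    income = views / 1_000_000 * payout_per_million
    return income - voiceover_cost

# Pessimistic: 1M views/month at $20 per million, a 100/month service.
print(monthly_margin(1_000_000, 20, 100))  # -80.0, a clear loss
# Optimistic: 5M views/month at $50 per million, a 30/month service.
print(monthly_margin(5_000_000, 50, 30))   # 220.0, a modest profit
```

Even before any editing or hosting costs, the voiceover subscription alone can wipe out the payout of a million views, which goes some way toward explaining the appetite for stolen accounts.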
Story
Finding an interesting story to put into video form is hard work. However, since most of these stories are selected for strong language, uncontroversial interpretability, and general relatability, there is a very good proxy to look for when selecting them. Because Reddit has a rating system, these stories are easy to find: go to a subreddit (a sub-forum where related topics are discussed), sort by rating, set the filter to “top of all time”, and just like that, you have a scoop… or at least that is what I initially thought.
The problem with this approach is that these stories are “internet famous”: there is a high likelihood that your viewer has heard the story before. They’ll listen for the first few seconds, conclude that they don’t need to hear it again, swipe down, and tank your video’s rating to the bottom of the algorithm.
Thankfully, if you are willing to abandon artistic integrity (or don’t view it as such), you can fine-tune a generative AI model to write the stories for you. By collecting the top stories from a given story niche (or better yet, collecting them from many niches and using an unsupervised machine learning algorithm to classify them into niches), you can make sure your generative model is able to create tall tales to entertain the world.
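As a toy illustration of that niche-sorting step, stories can be grouped by their dominant keyword. This is only a stand-in for a real unsupervised method (such as k-means over TF-IDF vectors), and the story snippets below are invented examples:

```python
from collections import Counter, defaultdict

# Words too common to tell us anything about the niche.
STOPWORDS = {"the", "a", "my", "i", "to", "of", "and", "in"}

def niche_of(story: str) -> str:
    """Label a story by its most frequent non-trivial word —
    a crude proxy for clustering stories into niches."""
    words = [w for w in story.lower().split() if w not in STOPWORDS]
    return Counter(words).most_common(1)[0][0]

# Invented examples, not real Reddit posts.
stories = [
    "my boss fired me over a typo boss",
    "my wedding was ruined by the wedding band",
]
niches = defaultdict(list)
for s in stories:
    niches[niche_of(s)].append(s)

print(sorted(niches))  # ['boss', 'wedding']
```

A real pipeline would embed each story and cluster the embeddings, but the principle is the same: the niche labels then decide which fine-tuned model (or prompt) generates the next batch of tall tales.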
Scenery
A defining feature of these short-form videos is the background video. Most posters strive to find something that occupies the part of the viewer’s brain that isn’t actively engaged in listening to the story. For this, bright colors and easy-to-interpret imagery are a must. The end goal is to engage the viewer on all levels, keeping them focused on the content. Familiar video content (the games Subway Surfers, Minecraft, and GTA 5 work exceptionally well) makes interpretation more fun for the viewer, evoking either nostalgia or active interest.
Generative AI does not play a role in this part of the content. While it is theoretically possible to train an AI model to play games for background video, it is currently easier to find a large chunk of video content that can be cut up into many small pieces.
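Cutting a large chunk of footage into many small pieces is simple arithmetic. A minimal sketch (my own illustration; the resulting (start, end) pairs would be fed to a video editor’s subclip function, and the numbers are made up):

```python
def clip_offsets(total_len: float, clip_len: float) -> list[tuple[float, float]]:
    """Split a long gameplay recording into consecutive (start, end)
    windows of clip_len seconds each, dropping the short remainder."""
    offsets = []
    start = 0.0
    while start + clip_len <= total_len:
        offsets.append((start, start + clip_len))
        start += clip_len
    return offsets

# A 125-second recording yields two full 60-second background clips.
print(clip_offsets(125.0, 60.0))  # [(0.0, 60.0), (60.0, 120.0)]
```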
Again, the problem of the most obvious solution being suboptimal arises. There is a finite amount of explicitly royalty-free content that fits this medium well. If viewers recognize the specific background video from a competing channel, it engages them less. Therefore, some posters opt to either pay royalties (again, hopelessly eating away at the profit margin) or simply steal content that wasn’t royalty-free.
As most of the footage originates from YouTube while the finished videos are posted on TikTok, this theft leverages a gap in content moderation policy. The two sites rarely coordinate on copyright issues, especially for low-value content (for example, someone’s ages-old Minecraft gameplay), so this method carries little risk.
Editing
The last component of a post of this nature is the editing. Three main tasks have to be covered: cutting the background footage to match the length of the story, displaying subtitles, and animating those subtitles. Many Python libraries currently offer rudimentary video editing that fits this purpose, such as MoviePy.
A simple approach would define static rules for these tasks. The subtitles in this type of content usually use a “pop” effect, where the text quickly enlarges and then slightly shrinks. Since humans are hard-wired to pay attention to fast-moving objects, this naturally draws the viewer’s attention to the subtitles.
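As a sketch of what such a static rule could look like (my own toy function, not taken from any real channel’s pipeline), the pop effect boils down to a scale factor over time that a library like MoviePy could apply to the subtitle clip:

```python
def pop_scale(t: float, pop_dur: float = 0.15, peak: float = 1.25) -> float:
    """Scale factor for a subtitle at t seconds after it appears:
    starts enlarged at `peak` and shrinks linearly back to 1.0
    over `pop_dur` seconds, producing the "pop"."""
    if t >= pop_dur:
        return 1.0
    return peak - (peak - 1.0) * (t / pop_dur)

print(pop_scale(0.0))   # 1.25 — fully enlarged at the moment it appears
print(pop_scale(0.15))  # 1.0  — back to normal size
```

The durations and peak size here are guesses; in practice they would be tuned by eye against whatever the algorithm currently rewards.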
This leaves only two tasks: cutting the video to fit the story (showing only attention-grabbing sequences) and displaying the subtitles (keeping related words together). Here, GenAI outperforms static rules: most general-knowledge large language models that support video RAG (retrieval-augmented generation) can be prompted to accomplish these tasks. Better yet, they are able to write code that they can then interact with, setting the appropriate parameters for each video.
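For contrast, a purely static rule for the word-grouping task might look like the sketch below (my own simplification): it groups a fixed number of words and breaks early at punctuation. It works, but it has no notion of which words actually belong together, which is exactly where the LLM-driven approach has the edge.

```python
def subtitle_chunks(text: str, max_words: int = 3) -> list[str]:
    """Greedily group a story into short subtitle chunks, breaking
    early at punctuation so clause boundaries are respected."""
    chunks, current = [], []
    for word in text.split():
        current.append(word)
        if len(current) == max_words or word[-1] in ".,!?":
            chunks.append(" ".join(current))
            current = []
    if current:  # flush any trailing words
        chunks.append(" ".join(current))
    return chunks

print(subtitle_chunks("I never thought my cat, of all creatures, would do this."))
# ['I never thought', 'my cat,', 'of all creatures,', 'would do this.']
```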
Running these models locally might not make business sense, as they require a powerful PC, which could prohibitively eat away at the profit margin. Third-party solutions do exist, and I trust that by now you have noticed a pattern: the token pricing is too high to be sustainable for the enterprise. All that is needed to acquire this capability below market price is to get hold of a stolen API key for any state-of-the-art model, offloading the computational work to a server far away.
FIN
With this, you now know all the details I uncovered behind this type of content. I found that it is very troublesome to make. My overarching theory about the creation pipeline is that it is simply too expensive to run compared to the little income it brings in for any organization: TikTok reportedly pays around $20–50 per million views, which is simply not enough to support ethical creation of this type of content right now. However, I sincerely believe this will change, at which point the internet will have to collectively decide the fate of short-form storytime content. We all play a part in this conversation, so I encourage you to leave your opinion in the comments section below.
Can GenAI Manage My Day? A One-Day Personal Assistant Experiment
6
October
2024
No ratings yet.
With the increasing applications of generative AI, I got interested in whether it could help me organise my life. So I used it as a personal assistant for an entire day. My goal was to improve my productivity, reduce my procrastination, and make sense of my schedule. For this experiment, I used ChatGPT-4o, Monica AI, and Taqtiq to assist me throughout the day.
To create a starting structure, I used GPT-4o to prepare a daily schedule incorporating time for my work, studies, and sports. In addition, I requested a workout plan and some recipes for meals. The result was a comprehensive and personalised outline that made my day well-structured and easy to follow.
For my work, I used the Monica AI extension as my all-in-one assistant. Powered by LLMs like GPT-4, Claude, and Gemini, it provides an in-screen assistant with several functions: an AI mind map, a writing agent, a search engine, summary options for both text and video, a translator, and an image generator. Of these, I mainly used the writing agent to help me write emails in the right tone. This significantly reduced the time I usually spend answering emails and improved the clarity of my communication.
As a supporting feature for my work meetings, the Taqtiq extension was of great use. An AI transcriber, it processes conversations in real time and summarises the important content, converting discussions into concise pieces of text. This made my meetings more effective, as I didn’t have to worry about missing important points.
For my studies, searching for content on different economic concepts with GPT-4o was of great use, as the feedback made my study materials easier to understand and helped me memorise theoretical frameworks.
By the end of the day, using these tools reduced my procrastination and made me feel more in control of my schedule. I completed tasks more quickly and had extra time to focus on projects that were previously on the backburner.
My experiment reflects the continued expansion of AI in personal work. As highlighted by Murgia (2024) in the Financial Times, major companies like Google, OpenAI, and Apple are racing to develop advanced AI-powered personal assistants. The development of “multimodal” AI tools, which interpret voice, video, images, and code within a single interface, marks a revolutionary advance in understanding and executing complex tasks. Our interaction with the digital world is strongly enhanced as these systems support our daily planning.
Despite the benefits, there are shortcomings to consider, particularly regarding privacy and data security. Relying heavily on AI assistants involves sharing personal and potentially sensitive information, raising concerns about how this data is stored and used. For example, I couldn’t use Monica AI for certain email responses because the emails contained clients’ personal information. ChatGPT is already vague about its data storage policies, and these extensions are even vaguer. The same applied to meetings: I asked for permission to record beforehand, but it is entirely possible to record without consent, potentially violating my colleagues’ privacy.
Currently, interactions with AI assistants are still mostly text-based, but I believe the future holds real-life AI assistants that we can speak to directly and that respond without delay. My experience using AI tools as a personal assistant was largely positive; they significantly boosted my productivity and helped me stay organised. However, due to privacy concerns, it’s not something I will rely on extensively just yet.
As AI continues to advance, the possibilities seem endless; but would you be comfortable using an AI assistant in your daily life given the current privacy risks? And what features would make an AI truly indispensable to you?