Team Quest: Developing a cutting-edge solution in just 9 hours
To begin is the most important part of any quest and by far the most courageous. - Plato
On July 28, 2022, we had a one-day team challenge to explore and develop a proof of concept (POC). Each team of data scientists and interns could choose one specific cutting-edge technology to work with over nine hours (e.g., GPT3, web scraping at scale, computer vision, IoT, machine learning, or explainable AI).
What was the mission? To develop a POC in cutting-edge technology that could benefit Agilytic and our clients.
“Planning the first Agilytic Quest stemmed from an enthusiastic desire from colleagues to try a new challenge. That was the starting point. Also, our project accelerator, the Agilytic Lab, is a growing initiative. So, it was an excellent opportunity to combine the kick-start of the Lab, pursue new challenges, and explore cutting-edge technologies,” noted Alex Schouleur, the Quest’s organizer.
So, why in just 9 hours?
“We wanted to find the tradeoff. The idea is to give enough time for people to develop a POC, but at the same time, we wanted to create an intensive sprint. That's an essential part of making it challenging,” said Alex.
The morning started with a kick-off and brainstorming to choose a topic using one of the cutting-edge solutions. By the end of the challenge, each team had to present its solution. Agilyticers were encouraged to explore some concepts on their own to maximize their impact. Here’s what each team developed and experienced together!
Team 1 - Automated code correction and suggestions with Alex Schouleur and Guillaume Carton
In your words, what did you develop as a team, and why did you decide to work on this?
Alex: We decided to develop two use cases of GPT3, a natural language processing algorithm developed by OpenAI, an artificial intelligence research laboratory. They built GPT3 from web data, everything on Wikipedia, websites, Reddit, social media, etc., aggregating it all into one huge dataset and training algorithms on it. First, we built a tool that can help coders document their code. This task can be tedious and boring, and documentation is often incomplete or not of high enough quality. With GPT3, we ran some tests, and the results were impressive. With the click of a button, we could generate excellent code documentation. The second case was automatic bug fixing: we could copy-paste Python code into our tool, and with the click of a button, the tool gave back the code with the bugs fixed. Then, we did a broader exploration of GPT3 and identified a lot of potential.
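To make the documentation use case concrete, here is a minimal sketch of how such a tool might wrap a code snippet in a prompt and pass it to a text-completion model. This is not the team's actual implementation: the prompt wording, the function names, and the `complete` callable are all assumptions for illustration, and `fake_complete` is a stand-in so the example runs without any model access.

```python
from typing import Callable

# Hypothetical prompt template; the wording the team used is not known.
DOC_PROMPT = (
    "Write a clear docstring for the following Python function.\n\n"
    "{code}\n\n"
    "Docstring:\n"
)


def build_doc_prompt(code: str) -> str:
    """Wrap a code snippet in an instruction asking for documentation."""
    return DOC_PROMPT.format(code=code.strip())


def document_code(code: str, complete: Callable[[str], str]) -> str:
    """Send the prompt to a text-completion function and return its reply.

    `complete` is a placeholder for whatever client calls the language model.
    """
    return complete(build_doc_prompt(code)).strip()


def fake_complete(prompt: str) -> str:
    # Stand-in for a real model call so the sketch runs offline.
    return '"""Return the sum of a and b."""'


snippet = "def add(a, b):\n    return a + b"
print(document_code(snippet, fake_complete))
```

Keeping the model call behind a plain callable means the same prompt-building logic could be reused for the bug-fixing use case by swapping the template.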
What element of the Quest did you like the most?
Alex: It was cool to see everybody working in teams, building something together and sharing knowledge and expertise. It was nice to hear the other teams’ ideas. All our projects were diverse and very interesting. We didn’t want to make it feel too competitive, as the goal was to build something great for Agilytic. This made it more of a team-building exercise rather than a pure competition.
Guillaume: It allowed me to work on a new technology I had never worked on before. Technologies are evolving so fast nowadays that it is interesting and important to be able to focus on cutting-edge technologies from time to time. And being able to share this moment with other colleagues made it even more fun.
What was the biggest challenge or roadblock you faced?
Alex: The biggest challenge was keeping the business objective in mind. While GPT3 is impressive, fun to use, and has a wow effect, all this is fluff if we don't use it for the right purposes and business objectives. So, the most challenging part was translating good technology into real projects.
What did you learn that you didn't know before?
Alex: I didn’t know how powerful GPT3 was. I expected it to be okay or average, maybe not the best quality, but it’s quite easy to confuse its responses with those of a human.
Guillaume: While I had only vaguely heard about GPT3, the Quest allowed me to learn a lot more about the subject and to see what the potential use cases for this technology could be.
Was there a funny or surprising moment you shared as a team?
Alex: Yes! You can ask any question to GPT3. We were amused by answers to technical questions we usually ask people who apply to Agilytic, and GPT3 aced them!
What would be some use cases for this solution if put into production?
Alex: There are a lot of different applications. Already, many companies are using it as their primary algorithm. For example, Duolingo uses it, and other companies use it for spelling and grammar correction tools, chatbots, and everything related to document classification, summarization and even code generation.
Guillaume: One cool thing is that we can already use the first use case internally to reduce the time spent on documentation while ensuring the quality of the documentation generated.
Team 2 - Analyzing athletic performance data with Adrien Debray, Javier Tarrio, and Nico Grassetto
In your words, what did you develop as a team, and why did you decide to work on this?
Javier: We wanted to investigate the potential of using publicly available information from Strava on the performance of professional athletes. We chose to focus on the recently held Tour de France. This topic appeared naturally since the race had finished four days before, and one of us was passionate about cycling.
Nico: We wanted to see how far we could go with publicly available data and try our best to develop analysis or prediction.
Adrien: As a cycling fan and Strava user, I knew there was some publicly available data about cyclists and other athletes on the platform. I was curious to see if this data could give us a deeper understanding of the riders' profiles, which tactics might be successful, and many other aspects.
What element of the Quest did you like the most?
Javier: I liked the openness of the topic choices, which allowed us to see three takes on entirely different ideas by the end of the day.
Nico: I liked the idea that we could use this for clients and the Quest's openness towards potential topics.
What was the biggest challenge or roadblock you faced?
Javier: We overestimated the ease of scraping data from the source. We tried different possibilities to do the job, but in the end, it took a lot of our most limited resource: time. Organization-wise, we could have done better with the code-sharing process.
Nico: We underestimated the challenge of getting data about the Tour de France (or just data, for that matter).
What did you learn that you didn't know before?
Javier: The sheer number of professional cyclists who publicly share their performance data. This is just one aspect of the vast amount of data that the Internet of Things (in this case, a wearable) is producing. Data is information, and information is power.
Nico: The importance of good planning before starting a project, as well as the importance of GitHub in such moments. At first, we decided to each write our code in separate notebooks, but in the end, this turned out to be a major roadblock time-wise.
Was there a funny or surprising moment you shared as a team?
Adrien: I remember Javier writing a to-do list of everything we still had to do in a very limited amount of time. We realized it would be hard to arrive at a final solution within the day. But at least we all agreed on the last item on the list: “18:30: Have a beer”.
What would be some use cases for this solution if put into production?
Javier: For individual athletes: identify segments of their activity in which they should improve by comparing to other athletes that outperform them in those segments; for sports teams: scout for promising young athletes; for gambling houses/apps: help to calculate the pay-off of a bet; for gamblers: help identify good bets.
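The first use case, finding the segments where an athlete loses the most time to a faster rider, can be sketched as follows. This is an illustrative example only: the segment names and times are invented, not real Strava data, and the function name `segments_to_improve` is hypothetical.

```python
# Illustrative segment times in seconds; not real Strava data.
my_times = {"climb_1": 1260.0, "flat_2": 840.0, "descent_3": 410.0}
rival_times = {"climb_1": 1175.0, "flat_2": 845.0, "descent_3": 395.0}


def segments_to_improve(mine, reference):
    """Return (segment, relative_gap) pairs where `mine` is slower, largest gap first."""
    gaps = []
    for segment, my_time in mine.items():
        ref_time = reference.get(segment)
        if ref_time is None or ref_time <= 0:
            continue  # skip segments with no usable reference time
        gap = (my_time - ref_time) / ref_time
        if gap > 0:
            gaps.append((segment, gap))
    return sorted(gaps, key=lambda item: item[1], reverse=True)


for segment, gap in segments_to_improve(my_times, rival_times):
    print(f"{segment}: {gap:.1%} slower")
```

Using the relative gap rather than the absolute time difference keeps short and long segments comparable.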
Team 3 - Document Layout Analysis with Guillaume Lamine and Arnaud Briol
In your words, what did you develop as a team, and why did you decide to work on this?
Arnaud: We decided to work on document layout analysis. It is a mix of Natural Language Processing and Computer Vision. It is a topic that both of us have been working on in our previous projects. As it constantly evolves, we wanted to evaluate some of the latest developments. The goal was to identify if it was worth investing more in these new models and if they could be useful in some projects.
What element of the Quest did you like the most?
Arnaud: Trying out new technologies that have just been released. That’s exciting!
What was the biggest challenge or roadblock you faced?
Arnaud: Training these kinds of models is challenging because they require a lot of data and GPUs. Moreover, we lost time configuring an environment to run these models.
What did you learn that you didn't know before?
Arnaud: I first discovered quite a few of these document layout analysis models that are open-source and sometimes pre-trained. I also learned about an interesting open-source tool called LabelImg recommended by a colleague. It allows for labeling images to create a dataset.
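As a rough illustration of what a labeled dataset looks like downstream, the sketch below reads bounding boxes from an annotation in Pascal VOC XML, one of the formats LabelImg can save. It assumes that format was chosen; the file name, labels, and coordinates in the sample are made up, and `read_boxes` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Illustrative annotation; the file name, labels, and coordinates are made up.
SAMPLE_XML = """
<annotation>
  <filename>invoice_001.png</filename>
  <object>
    <name>table</name>
    <bndbox><xmin>40</xmin><ymin>300</ymin><xmax>560</xmax><ymax>620</ymax></bndbox>
  </object>
  <object>
    <name>logo</name>
    <bndbox><xmin>30</xmin><ymin>20</ymin><xmax>150</xmax><ymax>90</ymax></bndbox>
  </object>
</annotation>
"""


def read_boxes(xml_text):
    """Return a list of (label, (xmin, ymin, xmax, ymax)) from a VOC-style annotation."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = tuple(
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name"), coords))
    return boxes


for label, coords in read_boxes(SAMPLE_XML):
    print(label, coords)
```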
Was there a funny or surprising moment you shared as a team?
Arnaud: During the first three hours of the Quest, we had a few funny team meetings to decide whether we were going to change the subject or really tackle this behemoth in one day.
What would be some use cases for this solution if put into production?
Arnaud: It could help on all projects with scanned documents to automatically extract information such as prices, names, signatures, logos, tables with data, and even equations!
What’s next? To the Lab!
The POC solutions developed during the Quest day will kick-start our Agilytic Lab, an incubator to share and internalize knowledge in the team, helping us grow through collective experience and individual practice.
The Lab functions as a project accelerator to facilitate and deliver projects, giving a stable centralized hub for knowledge, making it possible to explore new ideas and increase the range of valuable services we offer.