Introducing Otto the Audit Bot
“Nothing is so painful to the human mind as a great and sudden change.”
― Mary Shelley, Frankenstein
I tried to stifle images of Dr. Frankenstein’s laboratory on that stormy night when his creation came to life. The mix of awe and terror as his creation’s eyes flicker awake …
But there I was.
After just a few short hours of bumbling my way through a handful of coding templates I found online and several rounds of working with ChatGPT and Bard to QC my sloppy code, I was conversing with my own fine-tuned audit expert robot, whom I’d just named Otto.
I ran through a series of highly specific questions and compared Otto’s answers to those of the off-the-shelf LLM-powered chatbots. And time and time again, Otto was providing better answers.
In the two years since the publication of my book Deep Finance: Corporate Finance in the Information Age, I have been on a quest to show how new technologies can and will transform our profession. From my first rule-based (and quite limited) finance chatbot built using Amazon’s Lex language, through a couple of iterations of Python applications that could perform FP&A at scale, and now to the creation of Otto, my programming skills have gotten no better. What has changed over that time is the underlying technology.
Why is a finance guy building his own bots?
And it has become increasingly clear that the world is about to be divided into two groups: those who harness the power of this new technology, and those who fade into obsolescence. If that sounds harsh, two recent studies shed light on the rapidly advancing future:
Awe and terror, right?
- The Effects of AI on Knowledge Worker Productivity and Quality
- Team of AI bots develops software in 7 minutes instead of 4 weeks
I’ve spent the past 20 years at the intersection of finance and technology. Having realized early in my career that the secret to providing value to the organizations I served was by moving the nature of my work from mindless tasks to mindful thought. And with limited resources, the only way I was going to accomplish that was through automation and efficiency. As technology has evolved over the course of my career, these efficiencies have become increasingly easy to come by — to the point today where many may wonder where they and their skillset fit in an AI-driven future.
Finance and accounting professionals are no strangers to the extensive research and analysis required to address industry-specific inquiries. Traditionally, this has involved navigating through a sea of regulations, guidelines, and historical data using digital repositories and spreadsheet tools. This manual approach is time-consuming and susceptible to human error — a concern that holds significant weight in a field where accuracy is paramount.
But now with the burgeoning technology behind AI, there is a whole new pathway to streamline research and inquiry handling within the finance and accounting sector. Unlike static digital tools, AI has the potential to interact, learn, and adapt dynamically.
But one can only talk about these kinds of promises in the abstract for so long without backing up those claims with concrete evidence. I figured if I was going to keep talking about the importance of embracing these tools and technologies, I needed to be able to demonstrate how they work.
But I couldn’t do this alone. I needed an assistant who could do the heavy lifting for me.
So who is Otto? Otto is a prototype accounting chatbot with a pretty narrow scope. I trained him to speak like an accountant and gave him an encyclopedic knowledge of some very specific subset of the field. I decided to train him as an expert at everyone’s favorite subjects: audit and compliance.
If standard off-the-shelf chatbots like OpenAI’s ChatGPT, Google’s Bard, and Anthropic’s Claude are generalist assistants who can answer questions on a vast universe of topics, Otto is a specialist who is laser focused on answering questions very specific to his domain.
To create our specialist, we take a foundation model (GPT-3.5), tweak its focus, and give it a new specific set of information to work with.
For those keen on exploring the technical aspects, I have documented each segment of this project as separate Google Colab projects. The projects offer a hands-on understanding of what’s going on under the hood and hopefully encourage replication and exploration of the process.
The first step in creating a specialist chatbot is to fine tune an LLM for your specific needs.
Fine-Tuning GPT-3.5 Turbo
We used the outline provided by OpenAI to fine tune GPT-3.5 Turbo: https://openai.com/blog/gpt-3-5-turbo-fine-tuning-and-api-updates
Fine-tuning adapts a general-purpose model to specific use cases, enhancing its performance to meet particular requirements.
The process improved steerability for better instruction following, ensured reliable output formatting for consistent responses, and allowed for a custom tone that aligns with the professional demeanor of the sector. Additionally, fine-tuning reduced the size of prompts, which sped up API calls and cut costs — both of which are vital for real-time financial applications.
For the prototype version of this tool, we kept the project pretty simple — selecting 50 questions and answers from previous Certified Public Accountant (CPA) exams, and another 50 from the Chartered Financial Analyst (CFA) exam, covering a wide range of topics within finance and accounting. For an enterprise-level production bot, this fine tuning dataset could be greatly expanded for more nuanced results. (Fine-Tuning Questions.)
With this dataset in hand, the fine-tuning process began. Using OpenAI’s method, GPT-3.5 was trained to better understand and respond to industry-specific queries.
This set the foundation for integrating Retrieval Augmented Generation (RAG), advancing the model’s capability to address the nuanced demands of the finance and accounting domain.
RAG: Enhancing Models With Specificity
Armed with a fine-tuned version of ChatGPT, the next step was to further enhance its ability to provide specific, context-rich responses. This is where Retrieval-Augmented Generation (RAG) came into play. Unlike traditional language models that solely rely on their training data, RAG broadens the model’s horizon by integrating an external knowledge base. This additional layer enables the model to pull in real-time or external information, making the responses more accurate and contextually enriched.
Initially, the RAG model retrieves relevant data from various sources based on user input. This data is then used to augment the prompt, aiding the LLM in generating more accurate, updated responses. This mechanism not only bridges knowledge gaps in LLMs but also aids in providing current information, thereby increasing trust and reducing inaccuracies in responses.
In a second Google Colab project, we took two accounting documents:
Again, in a production version of this tool one could imagine a much broader set of documents such as regulatory and standards manuals, including:
Once these documents were uploaded, we were able to use them to augment the chatbot’s responses, allowing GPT-3.5 to consult this external reservoir of information when generating replies to finance and accounting queries.
- Auditing Standards of the Public Company Accounting Oversight Board
- Internal Control Strategies for Compliance with the Sarbanes-Oxley Act of 2002
- FASB Accounting Standards Codification (ASC)
- International Financial Reporting Standards (IFRS) Handbook
- PCAOB Auditing Standards Manual
- Sarbanes-Oxley Act (SOX) Compliance Guide
However, while RAG significantly upped the ante in terms of response quality, it also brought along a challenge — increased computational costs and slower response times. This trade-off between performance and efficiency is a crucial aspect to consider, especially in a field where timely information can be critical.
The result of this implementation was a more informed GPT-3.5, capable of providing enhanced, context-rich responses to a myriad of financial and accounting inquiries. The RAG-augmented model demonstrated a promising stride towards developing chatbots that can serve as reliable, knowledgeable companions for finance professionals.
While this project is nowhere near production ready, it shows the potential of this type of tool, which could be used to create an infinite number of custom tuned domain experts for industries, professions or companies, and hopefully provides some insight into how tools like this could be used to superpower the workforce.
Identify your path to CFO success by taking our CFO Readiness Assessmentᵀᴹ.
Become a Member today and get 30% off on-demand courses, tools and coaching!
For the most up to date and relevant accounting, finance, treasury and leadership headlines all in one place subscribe to The Balanced Digest.
Follow us on Linkedin!