There is no substitute for old-fashioned detective work in the area of historical research. Historians spend long hours in libraries and archives combing through boxes filled with documents in search of clues. Technology is already helping by allowing organizations to digitize their holdings so people can do genealogy work and other research from anywhere as long as they have a computer and WiFi. But technology can always do better, right?
During the week of March 11th, I was fortunate enough to participate in a program sponsored by my employer, MassMutual, called “Data Days for Good.” The program is a partnership between Boston University (BU) Spark Innovation Lab and MassMutual where MassMutual employees mentor BU undergraduate and graduate computer and data science students as they attempt to solve a real-life problem for a local nonprofit.
The original DRAINS team from left to right: Naman Bagaria, Sindhuja Kumar, Sahir Doshi, Vajay Fish, Valentina Haddad, Alé Lanz, and Cindy Zhang. Credit BU Spark
During this event, I mentored a group of students and tasked them with answering the following question for the benefit of the Historical Society: is it possible to use Generative AI (GenAI) tools such as ChatGPT to identify the location of deeds with racist restrictions within the Registry of Deeds database?
To understand why I was asking the students this question, I think it helps to go back to the History Note my fellow board member Beth Hoff and I wrote entitled “…lot shall not be re-sold to a colored person, an Italian or a Polander.” In that post, there was a map where all of the lots in the Brookline plan that were subject to racist deed restrictions were highlighted. This map took many hours to produce because finding and sorting through the deeds for those with these restrictions was a highly manual process. As I began to gain exposure to GenAI as part of my job, I started to wonder if it was possible to create a solution to this problem with this new technology.
On our first day of working together, I shared with the students the problem my fellow historical researchers and I were facing, and they eagerly began working. They showed great enthusiasm, asked great questions, and were very open to the feedback I had for them. In just three short days, they had built a prototype that would hopefully make the onerous task of fishing for deeds with racist restrictions a thing of the past. We decided to call it DRAINS, which stands for Deed Restriction Artificial Intelligence Notification System.
ChatGPT Logo: Credit Wikipedia
So how does DRAINS work? The user first uploads an image of a deed to the application. The image then undergoes a process called Optical Character Recognition (OCR), which converts the image to text. The text from the deed is then fed into ChatGPT, which is asked three questions: whether there is a racist restriction on the deed, when the deed was received at the registry, and in what book and page the deed was recorded. If the deed has a race-based restriction, DRAINS notifies the user and provides the date received, book, and page. At the same time, a backup system checks the text against a preset list of terms to try to catch any racist restrictions ChatGPT might have missed.
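For readers curious about what the backup check might look like, here is a minimal Python sketch. It is not the students' actual code. The term list, the function names, and the rule for combining the two checks are all assumptions made for illustration; the terms are taken from the deed language quoted in the earlier History Note.

```python
import re
from typing import List

# Hypothetical term list for illustration only; the real DRAINS list is not
# given in this post. These come from the deed language quoted above.
RESTRICTION_TERMS = ["colored person", "italian", "polander"]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so OCR line breaks do not split a phrase."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def find_restriction_terms(deed_text: str, terms: List[str] = RESTRICTION_TERMS) -> List[str]:
    """Return the preset terms that appear in the OCR text, in list order."""
    cleaned = normalize(deed_text)
    found = []
    for term in terms:
        # Word boundaries keep a term from matching inside a longer word.
        if re.search(r"\b" + re.escape(term) + r"\b", cleaned):
            found.append(term)
    return found


def needs_review(model_flagged: bool, deed_text: str) -> bool:
    """Assumed combination rule: flag the deed if either check fires."""
    return model_flagged or bool(find_restriction_terms(deed_text))


if __name__ == "__main__":
    sample = "said lot shall not be re-sold to a colored\nperson, an Italian or a Polander."
    print(find_restriction_terms(sample))
    print(needs_review(False, sample))
```

Flagging a deed when either check fires favors catching every restriction over avoiding false alarms, which seems like the right trade-off when a person will review the flagged deeds anyway.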
The results of our testing were promising. DRAINS achieved an accuracy rate of over 86% and completed its analysis in a fraction of the time that it would take a human. It is still very much a prototype, and it needs some refinement from both a modeling and a software engineering standpoint.
This summer, I have been fortunate to continue this partnership with BU and Public Interest Technology - New England (PIT-NE) to further develop the system. A new group of students from the PIT-NE university consortium, Grace Chong, Hannah Choe, Reginae Echols, and Arnav Sodhani, picked up where the previous students left off.
Their focus was on automating the process of matching a deed with racist restrictions to a current plot of land on a Geographic Information System (GIS) map. When this map is ready, the hope is to have an interactive web-based tool that will enable users to visualize the duration of these restrictions and the ethnic/racial groups they impacted. We hope that this will be a tool for raising awareness of the impact that these restrictions had on Longmeadow and eventually communities all over the country.
Screenshot of the map that is currently being created by the students. Credit BU
I look forward to continuing this journey and hope to share it with you as the students learn and build more.