AWS Machine Learning Blog

Real estate brokerage firm John L. Scott uses Amazon Textract and Amazon Comprehend to strike racially restrictive language from property deeds for homeowners

Founded more than 91 years ago in Seattle, John L. Scott Real Estate’s core value is Living Life as a Contribution®. The firm helps homebuyers find and buy the home of their dreams, while also helping sellers move into the next chapter of their home ownership journey. John L. Scott currently operates over 100 offices with more than 3,000 agents throughout Washington, Oregon, Idaho, and California.

When company operating officer Phil McBride joined the company in 2007, one of his initial challenges was to shift the company’s public website from an on-premises environment to a cloud-hosted one. According to McBride, a world of resources opened up to John L. Scott once the company started working with AWS to build an easily controlled, cloud-enabled environment.

Today, McBride is taking on the challenge of uncovering and modifying decades-old discriminatory restrictions in home titles and deeds. What he didn’t expect was enlisting the help of AWS for the undertaking.

In this post, we share how John L. Scott uses Amazon Textract and Amazon Comprehend to identify racially restrictive language in such documents.

A problem rooted in historic discrimination

Racial covenants restrict who can buy, sell, lease, or occupy a property based on race (see the following example document). Although no longer enforceable since the Fair Housing Act of 1968, racial covenants became pervasive across the country during the post-World War II housing boom and are still present in the titles of millions of homes. Racial covenants are direct evidence of the real estate industry’s complicity and complacency when it came to the government’s racist policies of the past, including redlining.

In 2019, McBride spoke in support of Washington state legislation that served as the next step in correcting the historic injustice of racial language in covenants. In 2021, a bill was passed that required real estate agents to provide notice of any unlawful recorded covenant or deed restriction to purchasers at the time of sale. A year after the legislation passed and homeowners were notified, John L. Scott discovered that only five homeowners in the state of Washington acted on updating their own property deeds.

“The challenge lies in the sheer volume of properties in the state of Washington, and the current system to update your deeds,” McBride said. “The process to update still is very complicated, so only the most motivated homeowners would put in the research and legwork to modify their deed. This just wasn’t going to happen at scale.”

Initial efforts to find restrictive language relied on university students and community volunteers manually reading documents and recording their findings. But in Washington state alone, millions of documents needed to be analyzed. A manual approach wouldn’t scale effectively.

Machine learning overcomes manual and complicated processes

With the support of AWS Global Impact Computing Specialists and Solutions Architects, John L. Scott has built an intelligent document processing solution that helps homeowners easily identify racially restrictive covenants in their property title documents. This intelligent document processing solution uses machine learning to scan titles, deeds, and other property documents, searching the text for racially restrictive language. The Washington State Association of County Auditors is also working with John L. Scott to provide digitized deeds, titles, and CC&Rs from their database, starting with King County, Washington.

Once these racial covenants are identified, John L. Scott team members guide homeowners through the process of striking the discriminatory restrictions from their home’s title, with the support of online notary services such as Notarize.

With a goal of building a solution that the lean team at John L. Scott could manage, McBride’s team worked with AWS to evaluate different services and stitch them together in a modular, repeatable way that met the team’s vision and principles for speed and scale. To minimize management overhead and maximize scalability, the team worked together to build a serverless architecture for handling document ingestion and restrictive language identification using several key AWS services:

  • Amazon Simple Storage Service (Amazon S3) – Documents are stored in an Amazon S3 data lake for secure and highly available storage.
  • AWS Lambda – Documents are processed by Lambda as they arrive in the S3 data lake. Original document images are split into single-page files and analyzed with Amazon Textract (text detection) and Amazon Comprehend (text analysis).
  • Amazon Textract – Amazon Textract automatically converts raw images into text blocks, which are scanned using fuzzy string pattern matching for restrictive language. When restrictive language is identified, Lambda functions create new image files that highlight the language using the coordinates supplied by Amazon Textract. Finally, records of the restrictive findings are stored in an Amazon DynamoDB table.
  • Amazon Comprehend – Amazon Comprehend analyzes the text output from Amazon Textract and identifies useful data (entities) like dates and locations within the text. This information is key to identifying where and when restrictions were in effect.
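To make the Amazon Textract step more concrete, here is a minimal sketch of fuzzy phrase matching over Textract-style LINE blocks, followed by converting a matched block's normalized bounding box into pixel coordinates for highlighting. It is not John L. Scott's actual code: the phrase list, the 0.85 threshold, the function names, and the sample blocks are all assumptions made for illustration, and Python's standard-library difflib stands in for whatever fuzzy matcher the team uses. The block layout (BlockType, Text, Page, and Geometry.BoundingBox with values expressed as ratios of the page size) follows the shape of an Amazon Textract text detection response.

```python
from difflib import SequenceMatcher

# Hypothetical phrase list for illustration; a real list would be curated
# from historical covenant wording.
RESTRICTIVE_PHRASES = ["other than the white or caucasian race"]


def best_window_score(text, phrase):
    """Best similarity (0..1) between phrase and any equally long word window of text."""
    words = text.lower().split()
    n = len(phrase.split())
    if len(words) < n:
        windows = [" ".join(words)]
    else:
        windows = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return max(SequenceMatcher(None, w, phrase).ratio() for w in windows)


def find_restrictive_lines(blocks, threshold=0.85):
    """Return one finding per LINE block that fuzzily matches a phrase."""
    findings = []
    for block in blocks:
        if block.get("BlockType") != "LINE":
            continue  # skip PAGE and WORD blocks
        for phrase in RESTRICTIVE_PHRASES:
            score = best_window_score(block.get("Text", ""), phrase)
            if score >= threshold:
                findings.append({
                    "page": block.get("Page", 1),
                    "text": block["Text"],
                    "phrase": phrase,
                    "score": round(score, 3),
                    "box": block["Geometry"]["BoundingBox"],
                })
                break  # one finding per line is enough
    return findings


def to_pixel_box(box, image_width, image_height):
    """Convert a normalized bounding box to (left, top, right, bottom) pixels."""
    left = round(box["Left"] * image_width)
    top = round(box["Top"] * image_height)
    right = round((box["Left"] + box["Width"]) * image_width)
    bottom = round((box["Top"] + box["Height"]) * image_height)
    return left, top, right, bottom


# Made-up sample shaped like Textract output, with OCR-style misspellings.
SAMPLE_BLOCKS = [
    {"BlockType": "PAGE", "Page": 1},
    {"BlockType": "LINE", "Page": 1,
     "Text": "occupied by any person othcr than the white or caucasion race",
     "Geometry": {"BoundingBox": {"Left": 0.125, "Top": 0.5,
                                  "Width": 0.75, "Height": 0.0625}}},
    {"BlockType": "LINE", "Page": 1,
     "Text": "The grantor conveys and warrants the following described real estate",
     "Geometry": {"BoundingBox": {"Left": 0.125, "Top": 0.25,
                                  "Width": 0.75, "Height": 0.0625}}},
]

if __name__ == "__main__":
    for finding in find_restrictive_lines(SAMPLE_BLOCKS):
        print(finding["page"], finding["score"],
              to_pixel_box(finding["box"], 1000, 2000))
```

Comparing word windows rather than whole lines lets a short phrase match inside a longer line, and the similarity threshold tolerates the character-level errors that are common in scans of old deeds. The pixel box is what an image library would need to draw the highlight, and each finding dictionary is the kind of record that could be written to the DynamoDB table.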

The following diagram illustrates the architecture of the serverless ingestion and identification pipeline.

Building from this foundation, the team also incorporates parcel information (via GeoJSON and shapefiles) from county governments to identify affected property owners so they can be notified and begin the process of remediation. A forthcoming public website will allow property owners to enter their address to see if their property is affected by restrictive documents.
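As an illustration of how parcel data could tie a location to a property, the following sketch looks up which GeoJSON parcel polygon contains a given point, using a plain ray-casting test. The parcel_id property, the coordinates, and the function names are hypothetical, and the source does not say how the team performs this lookup. A production system would more likely use a geospatial library or database, would handle MultiPolygon geometries, and would first convert shapefiles to a common format.

```python
def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the closed ring of [lon, lat] pairs?"""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i][0], ring[i][1]
        xj, yj = ring[j][0], ring[j][1]
        # Does this edge straddle the horizontal line through the point?
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def find_parcel(lon, lat, feature_collection):
    """Return the properties of the first Polygon feature containing the point."""
    for feature in feature_collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Polygon":
            continue  # other geometry types are omitted from this sketch
        exterior, *holes = geometry["coordinates"]
        in_hole = any(point_in_ring(lon, lat, hole) for hole in holes)
        if point_in_ring(lon, lat, exterior) and not in_hole:
            return feature["properties"]
    return None


def square(west, south, east, north):
    """Build a closed rectangular ring (first point repeated at the end)."""
    return [[west, south], [east, south], [east, north], [west, north], [west, south]]


# Made-up parcels; GeoJSON positions are ordered [longitude, latitude].
PARCELS = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"parcel_id": "A-001"},
         "geometry": {"type": "Polygon",
                      "coordinates": [square(-122.34, 47.60, -122.33, 47.61)]}},
        {"type": "Feature", "properties": {"parcel_id": "B-002"},
         "geometry": {"type": "Polygon",
                      "coordinates": [square(-122.33, 47.60, -122.32, 47.61)]}},
    ],
}

if __name__ == "__main__":
    print(find_parcel(-122.335, 47.605, PARCELS))
    print(find_parcel(-122.30, 47.605, PARCELS))
```

A lookup like this could sit behind an address search once the address has been geocoded to a longitude and latitude, with the returned parcel identifier then checked against the stored findings.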

Setting a new example for the 21st Century

When asked about what’s next, McBride said working with Amazon Textract and Amazon Comprehend has helped his team serve as an example to other counties and real estate firms across the country who want to bring the project into their geographic area.

“Not all areas will have robust programs like we do in Washington state, with University of Washington volunteers indexing deeds and notifying the homeowners,” McBride said. “However, we hope offering this intelligent document processing solution in the public domain will help others drive change in their local communities.”

About the authors

Jeff Stockamp is a Senior Solutions Architect based in Seattle, Washington. Jeff helps guide customers as they build well-architected applications and migrate workloads to AWS. Jeff is a constant builder and spends his spare time building Legos with his son.

Jarman Hauser is a Business Development and Go-to-Market Strategy leader at AWS. He works with customers on leveraging technology in unique ways to solve some of the world’s most pressing social, environmental, and economic challenges.

Moussa Koulbou is a Senior Solutions Architecture leader at AWS. He helps customers shape their cloud strategy and accelerate their digital velocity by creating the connection between intent and action. He leads a high-performing Solutions Architects team to deliver enterprise-grade solutions that leverage AWS cutting-edge technology to enable growth and solve the most critical business and social problems.