Center for Text Analytic Methods in Legal Studies

Applying new analytic techniques to new sources of legal text data to evaluate legal hypotheses in ways not previously possible.


With support from a Pitt Momentum Funds 2020-21 Teaming Grants Award, we created the Center for Text Analytic Methods in Legal Studies, a multidisciplinary collaboration among the University of Pittsburgh's School of Law and School of Computing and Information, the RAND Corporation, Worcester Polytechnic Institute (WPI), and Duquesne Law. The Center applies new machine learning (ML) and natural language processing (NLP) techniques to corpora of legal text data to evaluate socially relevant empirical hypotheses in new ways. Growing awareness of social issues in the legal domain calls for deeper investigation of court decisions and other legal texts. The Center develops and applies NLP/ML tools to evaluate hypotheses about court decisions concerning social issues involving bias, racism, gender equality, immigration, public health, crime, and education.

New developments in NLP and ML, together with the availability of large text corpora such as the Harvard Law School Caselaw Access Project’s data comprising 6.7 million federal and state court decisions, make it possible to analyze legal texts as never before. The new tools enable researchers to gather data-supported evidence of entities, patterns, and relationships in legal data, so that hypotheses about law can be assessed with new kinds of empirically based arguments. The Center focuses on developing and applying NLP/ML tools to evaluate hypotheses about systemic aspects of court decisions involving social issues. It engages legal domain experts at RAND, Pitt Law, and Duquesne Law in applying the new techniques and text corpora to investigate hypotheses in their specialty areas. It will also explore the pedagogical potential of engaging law and pre-law students in annotating legal cases, both to improve their case-reading skills and to produce data for training machine learning models.