Center for Text Analytic Methods in Legal Studies

Pitt Momentum Funds 2022 Scaling Grant

 

The Center for Text Analytic Methods in Legal Studies is a research collaboration of experts from the University of Pittsburgh’s Schools of Law and Computing and Information, the RAND Corporation, Duquesne Law School, and Worcester Polytechnic Institute. Their goal is to apply newly developed machine learning and natural language processing techniques to newly available sources of legal text data. With these tools, they evaluate legal and social questions involving racism, gender equality, immigration, public health, crime, or education that have real-world policy implications and that could not be evaluated to the same extent without text analytic tools.

Initially, the Center is investigating drug interdiction automobile stop cases concerning the constitutionality of police decisions to search. Such cases are a persistent cause of racial friction and have led to thousands of court decisions at the state and federal levels. The team seeks to identify the factors on which courts rely in assessing whether police have “reasonable suspicion” to detain a motorist for further investigation (e.g., a police dog drug sniff), to assign statistical weights to these factors across thousands of cases, to identify explicit or implicit racial bias in the cases and explore its relationship to factor weights and case outcomes, and to draw out the social and legal policy implications of the findings. [Link to poster.]


Pitt Momentum Funds 2020-21 Teaming Grant

 

With support from a Pitt Momentum Funds 2020-21 Teaming Grants Award, we created the Center for Text Analytic Methods in Legal Studies, a multi-disciplinary collaboration among the University of Pittsburgh’s Schools of Law and Computing and Information, the RAND Corporation, Worcester Polytechnic Institute (WPI), and Duquesne Law. The aim is to apply new machine learning (ML) and natural language processing (NLP) techniques to corpora of legal text data to evaluate socially relevant empirical hypotheses in new ways. Increasing awareness of social issues in the legal domain calls for deeper investigation into court decisions and other legal texts. The Center develops and applies NLP/ML tools to evaluate hypotheses about court decisions concerning social issues involving bias, racism, gender equality, immigration, public health, crime, and education.

New developments in NLP and ML, together with the availability of large text corpora such as the Harvard Law School Caselaw Access Project’s data comprising 6.7 million federal and state court decisions, make it possible to analyze legal texts as never before. The new tools make it possible to collect data-supported evidence on the existence of entities, patterns, and relationships in the legal data, so that one can assess hypotheses about law with new kinds of empirically based arguments. The Center will focus on developing and applying these NLP/ML tools to evaluate hypotheses about systemic aspects of court decisions involving social issues. It engages legal domain experts at RAND, Pitt Law, and Duquesne Law in applying the new techniques and text corpora to investigate hypotheses in their specialty areas. The Center will also explore the pedagogical potential of engaging law and pre-law students in annotating legal cases, both to improve their case-reading skills and to train machine learning models.