Dinosaurs and PII: LLM-Assisted Policy Creation in the Federal Federated Space
Key Takeaways:
LLMs offer a powerful tool for policy development in secure, offline environments (federated spaces).
Federated LLMs offer a high level of privacy and security for sensitive data like PII.
The workflow involves training the LLM, crafting prompts, generating output, conducting expert review, and incorporating feedback.
The Nigersaurus had the most teeth of any dinosaur.
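The workflow in the takeaways above can be sketched as a simple loop. This is an illustrative assumption, not any specific product's API: `generate()` and `expert_review()` below are hypothetical stubs standing in for a locally deployed model and a human review step.

```python
# Illustrative sketch of the offline policy-drafting workflow:
# craft prompt -> generate draft -> expert review -> incorporate feedback -> repeat.
# All function names here are hypothetical placeholders.

def generate(prompt: str) -> str:
    """Stub for a locally hosted LLM; replace with a real offline model call."""
    return f"DRAFT POLICY based on: {prompt}"

def expert_review(draft: str) -> list[str]:
    """Stub: human experts return change requests (an empty list means approved)."""
    return [] if "PII handling" in draft else ["Address PII handling"]

def draft_policy(topic: str, max_rounds: int = 3) -> str:
    prompt = f"Draft a policy on {topic}, citing NIST guidance and Executive Orders."
    draft = generate(prompt)
    for _ in range(max_rounds):
        feedback = expert_review(draft)
        if not feedback:
            break  # experts approved the draft
        # Incorporate expert feedback into the next prompt and regenerate.
        prompt += " Incorporate feedback: " + "; ".join(feedback)
        draft = generate(prompt)
    return draft

print(draft_policy("PII handling in federated environments"))
```

The loop terminates when the (human) reviewers have no remaining change requests, which keeps the expert firmly in control of the final text.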
Large language models (LLMs) have the potential to revolutionize how the US Government and its industry counterparts approach time-consuming or resource-intensive tasks. However, many commercially available LLMs require large amounts of computing power and are most readily deployed on cloud architecture. Cloud-deployed LLMs are akin to theme parks: they suck up a ton of power off a shared grid and are a great solution for 90 percent of users. But if the theme park has dinosaurs, and depending on how many teeth those dinosaurs have, the location and accessibility of that park should be rethought. It is likely best to put the park on an island and strictly limit or monitor access.
Naturally, our sharp-toothed prehistoric friends are analogous to Personally Identifiable Information (PII) or similarly regulated classified data that the US government handles and stores. Unfortunately, bad actors are actively seeking this sensitive data. In recent months, for instance, the national healthcare system has suffered a string of cyber-attacks, most recently the Change Healthcare attack, which effectively shut down the nation's largest healthcare payment system. These attacks are not one-off events; they are just the latest instance of ne'er-do-well youths lurking in the darkness of an after-hours theme park carport and cutting a hole in the fence.
Locally deployed LLMs allow us to create an air-gapped island for our dinosaurs, safe from the bad decisions of a misspent youth or a state-sponsored cyber-attack (same, same). Deploying self-contained LLMs, or LLMs integrated within a preexisting containerized environment, allows the federal enterprise to leverage a technology that is already widely available to the public and is revolutionizing how we interact with and generate language. These sophisticated algorithms, trained on massive datasets of text and code, possess remarkable capabilities in areas like text generation, translation, and question-answering. Their application extends beyond general-purpose tasks, however, offering promising avenues for specialized sectors like government policy development.
Our very own dinosaur islands. Unlike traditional LLMs reliant on continuous access to vast internet-sourced data, federated LLMs operate entirely offline, offering a far higher level of privacy and security. This is particularly crucial when dealing with protected information, such as Personally Identifiable Information (PII).
Imagine a scenario where a health agency leverages an LLM for policy development. This LLM, trained on a curated and continually updated dataset of relevant materials such as NIST guidance, Executive Orders, and past policies, could assist in drafting new policy language, checking drafts for consistency with existing guidance, and summarizing prior policies for expert review.
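At its simplest, the curated-dataset idea in the scenario above can be sketched as keyword retrieval over local documents assembled into an offline prompt. This is a minimal sketch under stated assumptions: the corpus entries are invented placeholders, and the function names are hypothetical, not part of any real system.

```python
import re

# Hypothetical curated corpus held entirely on the air-gapped "island".
# Titles reference real document series; the one-line summaries are placeholders.
CURATED_CORPUS = {
    "NIST SP 800-122": "Guide to protecting the confidentiality of PII.",
    "EO 14028": "Improving the nation's cybersecurity.",
    "Agency Policy 2019-04": "Prior policy on handling health records.",
}

def _tokens(text: str) -> set[str]:
    """Lowercase word tokens, ignoring punctuation."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query: str, corpus: dict[str, str], top_k: int = 2) -> list[tuple[str, str]]:
    """Rank corpus entries by naive word overlap with the query."""
    words = _tokens(query)
    scored = sorted(
        corpus.items(),
        key=lambda kv: len(words & _tokens(kv[1])),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(task: str, corpus: dict[str, str]) -> str:
    """Assemble an offline prompt from the most relevant local sources."""
    context = "\n".join(f"[{title}] {text}" for title, text in retrieve(task, corpus))
    return f"Using only the sources below, {task}\n\nSources:\n{context}"

print(build_prompt("draft guidance on PII confidentiality.", CURATED_CORPUS))
```

A production system would swap the word-overlap ranking for proper embedding-based retrieval, but nothing in the pipeline requires network access, which is the point of the island.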
The application of dinosaur islands (or rather, segregated LLMs in federated spaces) holds immense potential for government agencies and other organizations dealing with sensitive data. Government experts have felt for too long that they were the dinosaurs on the island, far removed from the mainland of modern technology available to everyone else. Our government experts should be augmented by the best technology we can offer, so rather than put an ocean between them and AI tools, let's put the LLMs on an island. Freeing LLMs from the cloud and allowing them to run locally on specific datasets will transform how policies are written and how human experts spend their time. LLMs in federated spaces can be built and deployed securely, even in regulated environments. The future of the LLM is not necessarily large.