Getting Started

Toolkit

Participants may use our LLM-PBE toolkit to develop their methods.

Data and Model

Privacy of fine-tuning data

In this setting, we will provide participants with LLMs that are fine-tuned on synthetic private data. Participants are required to develop approaches to attack or defend the private data.

Private data may be synthetic or real-world:
- Synthetic: Private data will be crafted strings containing private information. We will propose a novel dataset of diverse, deliberately designed private strings and use an LLM to generate content that embeds them. The design resembles finding needles hidden in the middle of a longer text.
- Real-world: We may use the Enron dataset, which contains PII that serves as the private information to attack.
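
To make the synthetic setting concrete, the following is a minimal sketch of how a crafted private string could be embedded in the middle of surrounding text, and how an attack could be scored by checking whether the string appears in a model's output. It is an illustration only, not the challenge's actual data pipeline or the LLM-PBE toolkit API: the function names (make_secret, embed_secret, extraction_rate), the secret format, and the fixed filler sentences are all assumptions. In the challenge, the surrounding content would be generated by an LLM rather than hard-coded.

```python
import random
import string


def make_secret(rng, length=8):
    """Craft a random private string (hypothetical format: A-Z and 0-9)."""
    alphabet = string.ascii_uppercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def embed_secret(filler_sentences, secret, owner):
    """Insert a sentence carrying the secret into the middle of the filler text."""
    needle = f"The access code of {owner} is {secret}."
    middle = len(filler_sentences) // 2
    sentences = filler_sentences[:middle] + [needle] + filler_sentences[middle:]
    return " ".join(sentences)


def extraction_rate(secrets, model_outputs):
    """Fraction of secrets that appear verbatim in the paired model output."""
    if not secrets:
        return 0.0
    hits = sum(1 for s, out in zip(secrets, model_outputs) if s in out)
    return hits / len(secrets)


# Placeholder filler; an assumption standing in for LLM-generated content.
filler = [
    "The meeting starts at noon.",
    "Lunch will be provided.",
    "Please bring your laptop.",
    "Parking is available nearby.",
]

rng = random.Random(0)  # fixed seed so the sample is reproducible
secret = make_secret(rng)
document = embed_secret(filler, secret, owner="Alice")
```

An attack would then be evaluated by collecting the fine-tuned model's responses to the attacker's prompts and passing them, together with the ground-truth secrets, to extraction_rate. A defense aims to lower that rate while preserving the model's utility.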

Submission Requirements

Participants are required to submit their code, their model (if applicable), and a short paper describing their approach.