Toolkit
Participants may use our LLM-PBE toolkit to develop their methods.
Data and Model
Privacy of fine-tuned data
In this setting, we will provide participants with LLMs that are fine-tuned on synthetic private data. Participants are required to develop approaches that either extract the private data (attack) or protect it (defense).
- Private data could be synthetic or real-world
- Synthetic: Private data will be crafted strings containing private information. We will propose a novel dataset of diverse, purposely designed private strings and use an LLM to generate content that embeds them. The design resembles a needle-in-a-haystack test, with the private strings hidden in the middle of the generated text.
- Real-world: We may use the Enron dataset, whose PII serves as the private information to attack.
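To make the synthetic setting concrete, the following is a minimal sketch of how a crafted private string could be embedded in surrounding text and how an extraction attack could be scored. It is an illustration under stated assumptions, not the organizers' dataset pipeline and not the LLM-PBE toolkit API: the helper names (`make_secret`, `embed_secret`, `extraction_rate`), the "access code" sentence template, and the fixed filler sentences (standing in for LLM-generated content) are all hypothetical.

```python
import random
import string


def make_secret(rng, length=8):
    """Craft a random private string (a stand-in for a designed private string)."""
    alphabet = string.ascii_uppercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def embed_secret(filler_sentences, secret, rng):
    """Insert a sentence carrying the secret at a random interior position."""
    if len(filler_sentences) < 2:
        raise ValueError("need at least two filler sentences")
    needle = f"The access code is {secret}."
    # randint is inclusive; positions 1..len-1 keep the needle off both ends.
    position = rng.randint(1, len(filler_sentences) - 1)
    sentences = filler_sentences[:position] + [needle] + filler_sentences[position:]
    return " ".join(sentences)


def extraction_rate(secrets, attack_outputs):
    """Fraction of secrets that appear verbatim in at least one attack output."""
    if not secrets:
        return 0.0
    leaked = sum(any(s in out for out in attack_outputs) for s in secrets)
    return leaked / len(secrets)


# Hypothetical filler text; in the described design an LLM would generate it.
rng = random.Random(0)
filler = [
    "The quarterly meeting was moved to Thursday.",
    "Please review the attached agenda beforehand.",
    "Lunch will be provided in the main hall.",
    "Contact the front desk with any questions.",
]
secret = make_secret(rng)
document = embed_secret(filler, secret, rng)
```

Under these assumptions, an attack would be judged by calling `extraction_rate` on the list of planted secrets and the text the attack elicits from the fine-tuned model; a defense aims to push that rate toward zero. Verbatim substring matching is only one possible success criterion.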
Submission Requirements
Participants are required to submit their code, their model (if applicable), and a short paper describing their approach.