Use external Google Cloud Storage (GCS) buckets for fine-tuning while keeping your data private. Fireworks creates proxy datasets that reference your external buckets, so the data is accessed only during fine-tuning, inside a secure, isolated cluster.
Your data never leaves your GCS bucket except during fine-tuning.
You need to grant access to three service accounts:
fireworks-control-plane@fw-ai-cp-prod.iam.gserviceaccount.com
storage.buckets.getIamPolicy permission
inference@fw-ai-cp-prod.iam.gserviceaccount.com
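Access is granted on your bucket's IAM policy. As a rough sketch of what that change looks like, the Python snippet below merges a member into a policy document of the shape GCS uses (a `bindings` list of role and members pairs). The helper name `add_member` is illustrative, and the role ID is a hypothetical custom role: this page names only the `storage.buckets.getIamPolicy` permission, not a specific role, so treat the role value as a placeholder. The same call would be repeated for each service account you need to add.

```python
import copy

# Hypothetical custom role ID (an assumption, not taken from this page).
EXAMPLE_ROLE = "projects/my-project/roles/fireworksBucketAccess"
CONTROL_PLANE = (
    "serviceAccount:fireworks-control-plane@fw-ai-cp-prod.iam.gserviceaccount.com"
)


def add_member(policy: dict, role: str, member: str) -> dict:
    """Return a copy of an IAM policy dict with `member` bound to `role`."""
    updated = copy.deepcopy(policy)
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        if binding.get("role") == role:
            members = binding.setdefault("members", [])
            if member not in members:
                members.append(member)
            return updated
    # No binding for this role yet, so create one.
    bindings.append({"role": role, "members": [member]})
    return updated


policy = add_member({"bindings": []}, EXAMPLE_ROLE, CONTROL_PLANE)
```

The function only edits an in-memory document; applying the result to the bucket is left to whichever tool you already use to manage bucket IAM.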
Create a Proxy Dataset
Create a dataset that references your external GCS bucket:
Ensure your gsutil path points directly to the JSONL file. If the file is in a folder, make sure the folder contains only the intended file.
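The path requirement above can be checked before the dataset is created. The following is a minimal Python sketch, not an official tool: `check_jsonl_uri` and `dataset_command` are hypothetical helper names, the bucket and dataset names are placeholders, and the `--external-url` flag is an assumption that should be confirmed with `firectl create dataset -h`.

```python
import shlex


def check_jsonl_uri(uri: str) -> tuple[str, str]:
    """Return (bucket, object_name) if `uri` points at a single .jsonl object."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"expected a gs:// URI, got {uri!r}")
    bucket, _, object_name = uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name in {uri!r}")
    filename = object_name.rsplit("/", 1)[-1]
    if not filename.endswith(".jsonl") or filename == ".jsonl":
        raise ValueError(f"{uri!r} does not point directly to a .jsonl file")
    return bucket, object_name


def dataset_command(dataset_name: str, uri: str) -> list[str]:
    """Build the CLI invocation; the --external-url flag is an assumption."""
    check_jsonl_uri(uri)
    return ["firectl", "create", "dataset", dataset_name, "--external-url", uri]


cmd = dataset_command("my-dataset", "gs://my-bucket/path/to/train.jsonl")
print(shlex.join(cmd))
```

The check is purely syntactic: it cannot tell whether the folder holds other files, so that part of the requirement still needs a manual look at the bucket.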
Start Fine-tuning
Use the proxy dataset to create a fine-tuning job:
For additional options, run: firectl create sftj -h
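For scripted workflows, the job can be launched from Python by assembling the CLI call as an argument list. This is only a sketch under stated assumptions: the `--dataset`, `--base-model`, and `--output-model` flags are assumed rather than taken from this page, `sftj_command` is a hypothetical helper, and every value shown is a placeholder. Check the flag names against the output of `firectl create sftj -h` before relying on them.

```python
import shlex


def sftj_command(dataset: str, base_model: str, output_model: str) -> list[str]:
    """Build a `firectl create sftj` invocation (flag names are assumptions)."""
    for label, value in (
        ("dataset", dataset),
        ("base_model", base_model),
        ("output_model", output_model),
    ):
        if not value:
            raise ValueError(f"{label} must not be empty")
    return [
        "firectl", "create", "sftj",
        "--dataset", dataset,
        "--base-model", base_model,
        "--output-model", output_model,
    ]


job_cmd = sftj_command("my-dataset", "my-base-model", "my-tuned-model")
print(shlex.join(job_cmd))
# To actually launch the job, pass the list to subprocess.run(job_cmd, check=True).
```

Building the command as a list rather than a single string avoids shell quoting problems if a name contains spaces or special characters.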
Your data never leaves your GCS bucket except during fine-tuning
Access is limited to isolated fine-tuning clusters
Reference external data without copying or moving files