Data Management

bioAF provides centralized data management with automatic organization and lifecycle policies.

File browser

The Data & Files section gives you a unified view of all data on the platform:

Browse by experiment, project, or file type
See file metadata (size, upload date, source)
Upload new files via drag-and-drop
Download files individually or in bulk

File browser with experiment-organized file tree, metadata columns, and download buttons

Data lifecycle

bioAF organizes data across four storage tiers that map to the research workflow:

Tier	Purpose	Retention
Ingest	Incoming files from sequencer or upload	Moved to Raw after processing
Raw	Original FASTQ files, untouched	Permanent
Working	Intermediate pipeline outputs	30 days (configurable)
Results	Final outputs (h5ad, plots, QC reports)	Permanent

i Where is my data stored?

All data is stored in Google Cloud Storage (GCS) buckets in your own GCP project. You have full ownership and access, bioAF organizes and manages the data but never moves it outside your project.

Auto-ingest

If your sequencing core or instrument is on the same network, bioAF can automatically detect and import new FASTQ files as they’re produced. This uses Google Cloud Pub/Sub to watch for new files and trigger the import pipeline.

Dataset browser

The Datasets view provides a higher-level view of your data organized by experiment and analysis stage, making it easy to find specific results without navigating the raw file tree.