Mirage automatically finds and fills computer vision data gaps with synthetic data
Filling perception dataset gaps is hard because real data is expensive, hard to gather, and incomplete.
Real data costs an average of $5 per labelled image, yet 80% of it goes unused. It is also slow to gather, with an average turnaround of two months, and it is incomplete, particularly for edge cases.
90% of generated synthetic data is thrown out because it does not improve perception models. In our own experiments, it took over 1.5 years of integrating synthetic data to obtain meaningful improvements.
Out-of-the-box synthetic data WILL NOT integrate into existing perception pipelines.
Mirage tackles the synthetic data problems of redundancy and integration difficulty head-on:
1. Active learning finds dataset weaknesses, so synthetic data is targeted at them and redundancy is reduced
2. Our neural reconstruction (NeRF) constructs synthetic 3D scenes to fill dataset gaps
3. Integration with real data is faster with our generative models, which remove the sim-to-real gap