Each year, the world generates more data than it did the year before. In 2020 alone, an estimated 59 zettabytes of data will be "created, captured, copied, and consumed," according to the International Data Corporation. That is enough to fill about a trillion 64-gigabyte hard drives.
But just because data are proliferating doesn't mean everyone can actually use them. Companies and institutions, rightly concerned with their users' privacy, often restrict access to datasets, sometimes even within their own teams. And now that the Covid-19 pandemic has shut down labs and offices, preventing people from visiting centralized data stores, sharing information safely is even more difficult.
Without access to data, it's hard to make tools that actually work. Enter synthetic data: artificial information that developers and engineers can use as a stand-in for real data.