vVv AMA | Space and Time
Space and Time: A Decentralized Multi-chain Data Platform
February 22, 2023

Space and Time is a decentralized multi-chain data platform supplying analytics for dApps and developers in gaming and DeFi. It offers SQL plus APIs to join on-chain and off-chain data in a single query. All of its data tables are blockchain-secured, and analytics are connected to smart contracts through trustless SQL proofs powered by novel cryptography. In addition, the platform claims to support large analytics workloads that scale to hundreds of terabytes without relying on centralized databases or analytics tools.
Stephen Hilton, head of solutions at Space and Time, joined us for an AMA on February 7th.

vVv: What is your background, and how did you get involved in blockchain?
Stephen: I got into blockchain gradually around 2015, when Ethereum first emerged and Bitcoin was gaining wider attention. I decided to do some self-education, which led me to read the seminal papers on blockchain technology to understand how it worked and the security it provides. I realized that blockchain technology could be used for much more than just cryptocurrency, such as tokenization, governance, DNS, etc. With my 20 years of experience in Silicon Valley working in analytics, data warehousing, and management, particularly with companies that operate at petabyte scale, I was the perfect fit when Scott Dykstra and Nate Holliday approached me. I met Scott and Nate while working at Teradata, where I ran their Solution Engineering and Customer Success organizations, including internal customer analytics and telemetry strategy. I also experienced the big data revolution firsthand when Hadoop first came out and saw how it changed the way companies did business. Open-source projects were popping up everywhere, and in no time, big data became the new standard. When Scott and Nate approached me to work with them on their Web3 company, Space and Time, I was excited to join them. I could see that Web3 had transformative potential similar to the impact big data had.
vVv: Can you elaborate on the functionality of a data warehouse and its importance?
Stephen: People often confuse databases and data warehouses because they both use standard SQL. However, looking beyond the surface, they have different architectures and purposes. Database systems, also known as OLTP (Online Transaction Processing) systems, are designed to handle high volumes of simple queries quickly. Data warehouses, known as OLAP (Online Analytical Processing) systems, are used for large-scale, complex queries. As a result, their optimization strategies differ: database systems typically have lightweight optimizers, while data warehouses tend to have heavier optimizers to maximize the efficiency of larger queries. Additionally, data warehouses are often MPP (Massively Parallel Processing) systems, meaning multiple servers store and process the data, which allows them to work through larger amounts of data quickly.
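To make the OLTP/OLAP distinction concrete, here is a minimal sketch contrasting the two query shapes. The table and column names are invented for illustration only and are not Space and Time’s schema:

```python
import sqlite3

# Hypothetical schema for illustration only; not Space and Time's actual tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transfers (
        tx_hash TEXT PRIMARY KEY,
        wallet  TEXT,
        token   TEXT,
        amount  REAL,
        block   INTEGER
    );
    INSERT INTO transfers VALUES
        ('0xa1', '0xalice', 'ETH',  1.5,   100),
        ('0xa2', '0xbob',   'ETH',  0.3,   101),
        ('0xa3', '0xalice', 'USDC', 250.0, 101);
""")

# OLTP-style query: a simple, high-volume point lookup by key.
# A lightweight optimizer and an index are all it needs.
row = conn.execute(
    "SELECT wallet, amount FROM transfers WHERE tx_hash = ?", ("0xa2",)
).fetchone()
print("point lookup:", row)

# OLAP-style query: a full-table aggregation with grouping, the kind of
# large-scan workload a warehouse's heavier optimizer and MPP layout
# are built to parallelize.
for token, total, txs in conn.execute(
    "SELECT token, SUM(amount), COUNT(*) FROM transfers GROUP BY token"
):
    print("aggregate:", token, total, txs)

conn.close()
```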
Organizations typically have databases and data warehouses, with data flowing from the database to the data warehouse for large-scale analytics and machine learning scoring, then back to the database. This allows organizations to utilize both types of systems efficiently. However, a new movement known as HTAP (Hybrid Transactional Analytic Processing) is emerging, in which one platform can do both and is more efficient, with a simpler tech stack, faster development, and shorter time to market. HTAP combines the power of OLTP and OLAP into one platform, allowing for faster and more efficient data processing and analytics.
Adopting a unified tech stack simplifies processes, reduces costs and labor, and speeds up development, resulting in faster time to market and improved efficiency. This is especially important in industries like Web3, where quick deployment is essential.
vVv: What problem does Space and Time solve with their decentralized approach?
Stephen: Space and Time is a platform that seeks to simplify data collection and ensure data integrity by consolidating transactional analytics, ETL, and blockchain data into one logical unit. This is achieved by indexing blockchain data, collapsing the transactional and analytical engines into one, and generating new tables from the smart contract events emitted by an address. To protect data integrity, Space and Time employs its novel cryptography, Proof of SQL, which takes a SQL statement, turns it into a cryptographic hash, and replicates it on the validator tier. If the database’s answer matches the validator tier’s, the query result can be published to a smart contract with confidence and used to support decentralization across several products.
“Space and Time is a platform that seeks to simplify data collection and ensure data integrity by consolidating transactional analytics, ETL, and blockchain data into one logical unit.”
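As a rough sketch of the workflow Stephen describes, the idea is that the database tier and the validator tier each commit to the query and its result, and the result is only published on-chain if the commitments agree. The hashing scheme and function names below are illustrative stand-ins, not the actual Proof of SQL cryptography (which is SNARK-based):

```python
import hashlib
import json

def commitment(sql: str, result_rows: list) -> str:
    """Illustrative stand-in for a cryptographic commitment: hash the query
    text together with a canonical serialization of its result."""
    payload = json.dumps({"sql": sql, "rows": result_rows}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

# The database tier executes the query and commits to its answer...
sql = "SELECT wallet, SUM(amount) FROM transfers GROUP BY wallet"
db_rows = [["0xalice", 251.5], ["0xbob", 0.3]]
db_commit = commitment(sql, db_rows)

# ...and the validator tier independently re-derives the commitment.
validator_rows = [["0xalice", 251.5], ["0xbob", 0.3]]
validator_commit = commitment(sql, validator_rows)

# Only if the two match is the result considered safe to publish on-chain.
if db_commit == validator_commit:
    print("commitments match; result can be sent to the smart contract")
else:
    print("mismatch: possible tampering, result withheld")
```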
vVv: Can you give us examples of data warehouse uses in our daily lives? In other words, besides in Web3, where do we see data warehouse use cases?
Stephen: A data warehouse enables companies to track transactions, loyalty programs, and even supply chain logistics to identify any potential issues with products. For example, a company released a phone with a higher-than-expected return rate due to a specific issue with the autofocus. By analyzing the data stored in the warehouse, the company could identify the lot number of the affected phones and which stores had them in stock, allowing them to issue an alert to pull the phones and return them before customers experienced any issues. This helps build brand loyalty, as customers are more likely to stay with a company that is proactive in preventing potential problems.
vVv: How does Space and Time’s hybrid data architecture perform compared to other data warehouses? What are the main benefits of using Space and Time compared to other existing solutions?
Stephen: Space and Time’s hybrid data architecture performs strongly compared to other data warehouses, as demonstrated by benchmarking with TPC-H. The main benefits of using Space and Time are its decentralized nature and its plug-and-play capabilities, thanks to its use of SQL. This makes it a great choice for DeFi companies requiring a data warehouse that can quickly provide historical comparisons, rolling averages, and forward-looking projections without the need to centralize the data.
vVv: What are the challenges of building a decentralized data warehouse that is simultaneously transactional and analytical?
Stephen: The major challenge in building a decentralized data warehouse that is simultaneously transactional and analytical is optimizing query speed. It requires careful engineering to send the query straight to the transactional engine and let that engine decide whether it can handle the query or should push it to the analytic engine. Pre-parsing the query adds latency and should be avoided.
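A minimal sketch of that routing idea, under stated assumptions: the engine classes, the fallback signal, and the crude capability check inside the transactional engine are all hypothetical stand-ins, not Space and Time’s implementation. The point is that the query goes straight to the transactional engine, which either answers it or hands it off, rather than passing through a separate pre-parsing layer:

```python
class CannotHandle(Exception):
    """Signal raised by the transactional engine when a query is too heavy."""

class TransactionalEngine:
    # Hypothetical: answers simple point queries quickly, refuses big scans.
    def execute(self, sql: str):
        if "GROUP BY" in sql.upper() or "JOIN" in sql.upper():
            raise CannotHandle(sql)
        return f"OLTP result for: {sql}"

class AnalyticEngine:
    # Hypothetical: heavier optimizer, handles large analytic workloads.
    def execute(self, sql: str):
        return f"OLAP result for: {sql}"

def run(sql: str, oltp=TransactionalEngine(), olap=AnalyticEngine()):
    # No pre-parsing step in front of the engines: that would add latency
    # to every query. The transactional engine itself decides.
    try:
        return oltp.execute(sql)
    except CannotHandle:
        return olap.execute(sql)

print(run("SELECT amount FROM transfers WHERE tx_hash = '0xa2'"))
print(run("SELECT token, SUM(amount) FROM transfers GROUP BY token"))
```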
vVv: Interesting. Does this mean higher costs?
Stephen: No, it costs less. And by cost, I’m talking about latency. Latency is the time it takes for a request to be issued and fulfilled. The goal should be to reduce latency as much as possible, regardless of how much CPU is used in the process. The CPU is only meant to ensure that the request is done in a transactional way, and the amount of computing used should not impact latency.
vVv: Will this decentralized, token-incentivized architecture enable you to provide data warehouse service at a lower price than current Web2 solutions?
Stephen: Space and Time is a decentralized data warehouse solution designed to be cheaper than existing Web2 solutions such as Snowflake. Node operators are incentivized to join the network and can make a profit while using the data warehouse themselves. Startups looking to reduce their analytics costs can join the network as node operators and run the warehouse at net zero, or even turn it into a profit center: they operate more nodes than they need and sell the extra capacity back to the network, which offsets their labor and overhead costs. Space and Time is one of the few data warehouses that offer an opportunity to make money while receiving a data warehouse service.
Cloud computing has also become increasingly popular among smaller companies as a way to manage their data. Space and Time is a data warehouse designed to make it easy for enterprises to get started on their journey into the world of Web3. Its intuitive interface requires no retraining for data analysts, making it a natural value-add for companies. Additionally, larger companies can host Space and Time in their own data centers, allowing them to run a Web3 skunkworks at no additional cost.
“Space and Time is a data warehouse designed to make it easy for enterprises to get started on their journey into the world of Web3.”
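As a back-of-the-envelope illustration of the net-zero idea, the arithmetic looks roughly like the sketch below. All of the figures are hypothetical placeholders, not Space and Time pricing or reward rates:

```python
# Hypothetical monthly figures for a startup running its own nodes.
nodes_run            = 5
nodes_needed         = 2        # capacity the startup actually uses itself
cost_per_node        = 300.0    # hardware + hosting + labor, per node
reward_per_sold_node = 600.0    # network rewards for compute sold back

total_cost   = nodes_run * cost_per_node
total_reward = (nodes_run - nodes_needed) * reward_per_sold_node
net = total_reward - total_cost

print(f"monthly cost: {total_cost:.2f}, reward: {total_reward:.2f}, net: {net:.2f}")
# With these illustrative numbers the operator clears a small surplus;
# with lower rewards the same setup nets out to roughly zero.
```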
vVv: What are the future use cases you’re excited about when it comes to feeding data into smart contracts?
Stephen: There are several use cases for blockchain indexing and data warehousing. For example, businesses can bring their off-chain data, such as customer lists, tie it to wallet addresses, and join it with on-chain data. This can help gauge customer interest in Web3 initiatives or 3D gaming assets such as NFTs. Additionally, game telemetry can be logged in Space and Time and used to drive a dynamic NFT table that is updated before the game ends. This allows the accuracy of a weapon or other in-game collateral to be updated in real time and reflected live on OpenSea or other platforms.
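A hedged sketch of what that kind of off-chain/on-chain join can look like in a single SQL statement. The tables and columns are invented for illustration and are not Space and Time’s actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Off-chain data the business already has (hypothetical).
    CREATE TABLE customers (customer_id INTEGER, email TEXT, wallet TEXT);
    INSERT INTO customers VALUES (1, 'a@example.com', '0xalice'),
                                 (2, 'b@example.com', '0xbob');

    -- Indexed on-chain activity (hypothetical).
    CREATE TABLE nft_mints (wallet TEXT, collection TEXT, minted_at INTEGER);
    INSERT INTO nft_mints VALUES ('0xalice', 'game-skins', 1700000000),
                                 ('0xalice', 'game-skins', 1700000100);
""")

# Single query joining off-chain customer records with on-chain mints,
# e.g. to gauge which customers are already active in a Web3 initiative.
rows = conn.execute("""
    SELECT c.customer_id, c.email, COUNT(m.collection) AS mints
    FROM customers c
    LEFT JOIN nft_mints m ON m.wallet = c.wallet
    GROUP BY c.customer_id, c.email
""").fetchall()
print(rows)
conn.close()
```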
vVv: Space and Time will compete with many existing centralized data warehouses. Do you have any concerns that those legacy data warehouse providers may implement your Proof of SQL technology? And if so, what’s your approach to attracting enterprise clients?
Stephen: Space and Time is a decentralized platform that uses Proof of SQL to generate a cryptographic hash guaranteeing that the SQL queries running on the data have not been tampered with. This is especially useful when a third party is involved – such as when a vendor, node operator, and customer interact – and it provides a way for blockchain data to be publicly transparent and verifiable. That matters for applications such as minting NFTs (non-fungible tokens) on the basis of game telemetry. Proof of SQL also gives a centralized database system such as Snowflake a way to verify that a third party has not modified its data and logic. Space and Time is open source, allowing anyone to access and use the platform. Cloud data warehousing is becoming increasingly popular, and Snowflake was the first cloud database to capitalize on that trend. This has prompted other companies to emulate Snowflake’s success and create their own cloud databases, including decentralized versions, and Proof of SQL is necessary for those decentralized databases to function trustlessly. For the foreseeable future, Snowflake, Google BigQuery, and Teradata will remain centralized, providing a competitive advantage to companies like Space and Time.
vVv: How far are you in development and when can people join the network?
Stephen: We are currently in a controlled release, meaning we are onboarding one customer per week to ensure they have a good user experience and we receive feedback. This process is likely to continue through April. In March or early April, we will be releasing our Data Access Protocol (DAP), which allows users to interface with Space and Time data without joining the network. Instead, they can input their wallet address and run queries on the blockchain data. However, they will not be able to load their own data in this scenario. In May, we are switching to a cost-per-compute model, which allows us to scale more seamlessly with our customers’ needs. Currently, our sales process is a little clunky, as we use fixed contracts, charging a certain amount per compute on a monthly basis. We are working to improve this by creating a tamper-proof, decentralized method of charging per compute, so that node operators are paid and remain happy to run Space and Time, keeping the network fast, reliable, and available. We’ve been running our product since the end of last year, and customers are already benefiting from it. If you’re interested in gaining early access to the product, please contact me. We are slightly oversubscribed, but we will do our best to accommodate everyone.
vVv: When onboarding larger companies, they will probably have high-value data sets assigned to certain high-replication clusters. Does that pose any centralization risks?
Stephen: The centralization risk in node operation is that operators hyper-tune their setups and capture an excessive amount of value from the network. To balance this out, incentive mechanisms must be created that allow node operators to select which nodes they run and be rewarded for running high-performance and high-availability clusters, alongside network-wide incentives that reward decentralization across the network. It is also important to be aware of where nodes are running: too many node operators hosting their nodes with one vendor (e.g. Microsoft or Amazon) leads to centralization. For example, it was reported that Amazon Web Services (AWS) currently runs 52% of the Ethereum network. We will have this balance of performance to value on the network, but we’re still working through, let’s say, the finer details of that plan.
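As a toy illustration of the kind of monitoring this implies, one could track how node hosting is distributed across vendors and flag over-concentration. The node list and the 50% threshold below are hypothetical, not part of Space and Time’s incentive design:

```python
from collections import Counter

# Hypothetical snapshot of which hosting vendor each node runs on.
node_hosts = ["aws", "aws", "aws", "aws", "aws",
              "azure", "azure", "gcp", "bare-metal"]

counts = Counter(node_hosts)
total = len(node_hosts)
for vendor, n in counts.most_common():
    share = n / total
    flag = "  <-- concentration risk" if share > 0.5 else ""
    print(f"{vendor:11s} {n:2d} nodes  {share:5.1%}{flag}")
```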
vVv: It’s not often that we hear projects considering every aspect of decentralization and its importance. This is refreshing to see.
Stephen: Yes, thank you. We are in the process of creating a fully decentralized network. Currently, our software is designed to be decentralized, but we are the only node operator. We aim to reduce our share of the network to less than half beginning in May, when we start allowing new node operators. It is a long journey, but our goal is to provide a high-performance, highly available, and highly decentralized environment.
vVv: Data in the warehouse can be made permissioned or private, but Proof of SQL only verifies that the stored data has not been tampered with. How can you verify that the off-chain data itself is not corrupted?
Stephen: We have created a data warehouse platform that allows customers to insert, delete, and update records. To ensure the customer’s data is not tampered with, we pull blockchain data from different RPC endpoints multiple times to reach a consensus. Additionally, any table can be marked as “public” and shared with the network. To further guard against malicious attacks as we decentralize the network, we have implemented a Python-driven consensus engine called 3TL, which has four phases: extract, transform, machine learning, and load. Through this, we can guarantee the data is cryptographically and on-chain verifiable. We also have the idea of a data provider, where those who provide public data sets are incentivized for each query against their data. To support this, we are working towards a certified partner status, which requires data providers to take sufficient steps to prove their data has not been tampered with.
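A simplified sketch of the multi-endpoint consensus idea: pull the same value from several RPC endpoints and accept it only when a quorum agrees. The endpoint names, the faked responses, and the quorum rule are stand-ins; the real pipeline is the 3TL engine Stephen describes:

```python
from collections import Counter

def fetch_block_hash(endpoint: str, block_number: int) -> str:
    # Stand-in for a real RPC call against `endpoint` (block_number is
    # ignored here). Responses are faked; one endpoint disagrees on purpose
    # to show the mechanism.
    fake = {
        "rpc-a": "0xabc123",
        "rpc-b": "0xabc123",
        "rpc-c": "0xdeadbeef",   # a faulty or malicious endpoint
    }
    return fake[endpoint]

def consensus_value(endpoints, block_number, quorum=2):
    """Accept a value only if at least `quorum` independent endpoints agree."""
    votes = Counter(fetch_block_hash(e, block_number) for e in endpoints)
    value, count = votes.most_common(1)[0]
    if count >= quorum:
        return value
    raise RuntimeError("no consensus across RPC endpoints; data not ingested")

print(consensus_value(["rpc-a", "rpc-b", "rpc-c"], block_number=17_000_000))
```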
vVv: I’ve read that you’re also highly involved in zero knowledge proofs. Will this play a part in this aspect?
Stephen: Proof of SQL is a SNARK-based concept that guarantees query results have not been tampered with. At Space and Time, we have Proof of SQL, but there is not a lot of other ZK (zero-knowledge) machinery built into the system as of now. The Web3 side of the experience can still be foreign for customers who are used to logging in to Snowflake with a username and password, so we are working towards a Web2-style interface to simplify the process and provide a more familiar experience for customers.
vVv: Node ecosystems require scalability. Will adding more nodes provide horizontal scaling?
Stephen: Data warehouses are a type of massively parallel processing (MPP) platform that enables horizontal scalability by increasing the number of nodes in the cluster. With this added capability, queries can be returned in half the time when data is replicated across multiple nodes. The platform also gives users the ability to back up data to a decentralized platform such as IPFS, as well as to centralized platforms like AWS S3 or Azure Blob Store. High availability is also supported, with data being replicated between nodes within the cluster and the secondary cluster providing backups to a Blob Store. This ensures that data is not lost, even if the entire network goes down.
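A toy sketch of why adding nodes helps. This is a simplification under stated assumptions: it hash-partitions rows across a chosen number of worker nodes and merges the per-node partial aggregates, which is one way an MPP engine splits a scan so each node only touches its own slice of the data:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical rows of (wallet, amount) spread across the cluster.
rows = [(f"0x{i % 7}", float(i)) for i in range(10_000)]

def partition(data, node_count):
    """Hash-partition rows across nodes, as an MPP warehouse would."""
    parts = [[] for _ in range(node_count)]
    for wallet, amount in data:
        parts[hash(wallet) % node_count].append((wallet, amount))
    return parts

def local_aggregate(part):
    """Each node computes SUM(amount) GROUP BY wallet on its own slice."""
    totals = {}
    for wallet, amount in part:
        totals[wallet] = totals.get(wallet, 0.0) + amount
    return totals

def mpp_query(data, node_count):
    parts = partition(data, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(local_aggregate, parts))
    # Merge step: combine the per-node partial aggregates into one result.
    merged = {}
    for partial in partials:
        for wallet, total in partial.items():
            merged[wallet] = merged.get(wallet, 0.0) + total
    return merged

print(mpp_query(rows, node_count=4))
```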
vVv: How can developers get started with Space and Time?
Stephen: We are currently in a controlled release phase for our Data Access Protocol (DAP). We’ll be entering an open beta phase in the next few weeks, during which people can write queries against the blockchain data themselves. To get started, our documentation is available at https://docs.spaceandtime.io/ and will be regularly updated with API calls. If you have any questions, please feel free to contact me.
vVv: We’d like to summarize the AMA with a new approach. Can you describe the whole concept of Space and Time to a six-year-old?
Stephen: Space and Time is a company that uses a network of superhero teams to help solve big problems. These teams are made up of different superheroes who have different strengths and weaknesses. For example, some have super strength and can lift hundreds of tons, while others have super speed and can do thousands of things in a second. Together, this team of superheroes can solve problems that a single superhero may not be able to tackle alone. To make sure the superheroes don’t do anything bad, Space and Time uses a special type of math called Proof of SQL. This math has a magical quality and can be used to ensure the superheroes don’t misbehave and lose their powers. In the end, Space and Time helps customers solve big problems by providing them with the foundation of a network of superheroes.