Introduction

S5 is a decentralized network that puts you in control of your data and identity.

At its core, it is a content-addressed storage network similar to IPFS, but with some new concepts and ideas to make it more efficient and powerful.

This website is hosted on S5.

All relevant code can be found here: https://github.com/s5-dev

You can join the Discord Server for updates and discussion: https://discord.gg/Pdutsp5jqR

Discussion will be moved to a decentralized chat app powered by S5 when it's ready :)

Concepts

This section explains some basic concepts used in S5:

Content-addressed data

Cryptographic hashes

A cryptographic hash function is an algorithm that maps files or data of any size to a fixed-length hash value. Such functions have the following properties:

  • They are deterministic, meaning the same input always results in the same hash
  • It is infeasible to generate a message that yields a given hash value (i.e. to reverse the process that generated the given hash value)
  • It is infeasible to find two different messages with the same hash value
  • A small change to a message should change the hash value so extensively that it appears uncorrelated with the old hash value
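
These properties are easy to observe in practice. The sketch below uses SHA-256 from Python's standard library as a stand-in, because BLAKE3 is not part of the standard library. The helper name digest_hex is made up for this illustration.

```python
import hashlib


def digest_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string (stand-in for BLAKE3)."""
    return hashlib.sha256(data).hexdigest()


a = digest_hex(b"hello world")
b = digest_hex(b"hello world")   # same input again
c = digest_hex(b"hello world!")  # one extra byte

# Deterministic: the same input always yields the same hash.
print(a == b)
# Fixed length: 64 hex characters (32 bytes), regardless of input size.
print(len(a), len(digest_hex(b"x" * 1_000_000)))
# A small change to the input produces an unrelated-looking hash.
print(a)
print(c)
```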

The BLAKE3 hash function

BLAKE3 is a cryptographic hash function that is:

  • Much faster than MD5, SHA-1, SHA-2, SHA-3, and BLAKE2.
  • Secure, unlike MD5 and SHA-1. And secure against length extension, unlike SHA-2.
  • Highly parallelizable across any number of threads and SIMD lanes, because it's a Merkle tree on the inside.
  • Capable of verified streaming and incremental updates, again because it's a Merkle tree.
  • A PRF, MAC, KDF, and XOF, as well as a regular hash.
  • One algorithm with no variants, which is fast on x86-64 and also on smaller architectures.

Content-addressing

Content-addressing means that instead of addressing data by its location (for example with protocols like HTTP/HTTPS), data is referenced by its cryptographic hash. This makes it possible to check that you actually received the data you were looking for without trusting anyone except the person who gave you the hash. It also makes all files immutable by default.
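
The check a client performs after downloading can be sketched as follows. This is a minimal illustration, not S5's implementation: SHA-256 stands in for BLAKE3, and the function name verify_content is hypothetical.

```python
import hashlib


def verify_content(data: bytes, expected_hash_hex: str) -> bool:
    """Check that downloaded data matches the hash it was requested by."""
    return hashlib.sha256(data).hexdigest() == expected_hash_hex


# The address is derived from the content itself (SHA-256 as a stand-in for BLAKE3).
original = b"some file content"
address = hashlib.sha256(original).hexdigest()

# Data from an untrusted host is accepted only if it hashes to the address.
honest_response = b"some file content"
tampered_response = b"some file c0ntent"

print(verify_content(honest_response, address))
print(verify_content(tampered_response, address))
```

Because the address depends only on the bytes, it does not matter which host served the data.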

Verified streaming

To make verified streaming of large files possible, S5 uses the Bao implementation for BLAKE3 verified streaming. As mentioned earlier, BLAKE3 is a Merkle tree on the inside. This makes it possible to verify the integrity of small parts of a file without having to download and hash the entire file first.

By default, S5 stores some layers of the Bao hash tree next to every stored file that is larger than 256 KiB (same path, but with the .obao extension). With the default layers, chunks of a file can be verified at a minimum size of 256 KiB. For example, when streaming a large video file, your player only needs to download the first 256 KiB of the file before it can show the first frame. The overhead of storing the tree is a bit less than 256 KiB per 1 GiB of stored data.
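
The overhead figure can be reproduced with a little arithmetic. The sketch below assumes that every parent node in the stored tree takes 64 bytes (two 32-byte child hashes) and that the file starts with an 8-byte length header, as in Bao's outboard format. These sizes and the function name outboard_size are assumptions for illustration, not taken from the S5 code.

```python
GROUP_SIZE = 256 * 1024  # smallest verifiable unit with the default layers
PARENT_NODE_SIZE = 64    # assumed: two 32-byte child hashes per parent node
HEADER_SIZE = 8          # assumed: 8-byte length header of the outboard file


def outboard_size(file_size: int) -> int:
    """Estimate the size in bytes of the stored hash tree for a file."""
    groups = max(1, -(-file_size // GROUP_SIZE))  # ceiling division
    parent_nodes = groups - 1  # a binary tree with n leaves has n - 1 parents
    return HEADER_SIZE + parent_nodes * PARENT_NODE_SIZE


one_gib = 1024 ** 3
print(outboard_size(one_gib))  # 4095 parent nodes * 64 bytes + 8 = 262088
```

262088 bytes is 56 bytes short of 256 KiB (262144 bytes), which matches the "a bit less than 256 KiB per 1 GiB" figure.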

CIDs (Content identifiers)

Hash values produced by the BLAKE3 hash function have a size of 32 bytes, for example c4d27f80613c2dfdc4d9d013b43c181576e21cf9c2616295646df00db09fbd95 (hex-encoded).

Instead of using this value directly, S5 prepends two additional bytes to reference raw files:

0x26 cidTypeRaw: This CID contains a raw file without any additional metadata

0x1f mhashBlake3Default: This CID contains a BLAKE3 hash of the file with the default 256-bit output size

You can find a list of all up-to-date magic bytes here: lib5:constants.dart

In addition to these two magic bytes, the size of the file (in bytes) is encoded as a little-endian integer and appended to the hash bytes. For example, the CID of a file with 18657 bytes (0x48e1, so 0xe1 0x48 in little-endian order) would be encoded like this:

0x26 0x1f 0xc4d27f80613c2dfdc4d9d013b43c181576e21cf9c2616295646df00db09fbd95 0xe148
type hash blake3-256-hash                                                    filesize

So the length of a raw file CID depends on the filesize:

  • Files with a size of less than 256 bytes have a 35-byte CID
  • Files with a size of less than 64 KiB have a 36-byte CID
  • Files with a size of less than 16 MiB have a 37-byte CID
  • Files with a size of less than 4 GiB have a 38-byte CID
  • ...
  • Files with a size of less than 16384 PiB have a 42-byte CID
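
The layout described above can be sketched in a few lines. The function names are hypothetical, and writing a size of 0 as a single byte is an assumption, because the text does not cover empty files.

```python
CID_TYPE_RAW = 0x26
MHASH_BLAKE3_DEFAULT = 0x1F


def encode_size(size: int) -> bytes:
    """Encode the file size as little-endian, using as few bytes as possible.

    Assumption: a size of 0 is still written as one byte.
    """
    length = max(1, (size.bit_length() + 7) // 8)
    return size.to_bytes(length, "little")


def build_raw_cid(blake3_hash: bytes, size: int) -> bytes:
    """Concatenate type byte, hash type byte, 32-byte hash and file size."""
    if len(blake3_hash) != 32:
        raise ValueError("expected a 32-byte BLAKE3 hash")
    prefix = bytes([CID_TYPE_RAW, MHASH_BLAKE3_DEFAULT])
    return prefix + blake3_hash + encode_size(size)


example_hash = bytes.fromhex(
    "c4d27f80613c2dfdc4d9d013b43c181576e21cf9c2616295646df00db09fbd95"
)
cid = build_raw_cid(example_hash, 18657)
print(cid.hex())
print(len(cid))  # 36
```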

Encoding the CID bytes to a human-readable form

S5 uses the multibase standard for encoding the CID bytes. The first character indicates how the bytes are encoded. S5 supports the following encodings (name, prefix character, description):

base32,            b,    rfc4648 case-insensitive - no padding
base58btc,         z,    base58 bitcoin
base64url,         u,    rfc4648 no padding

By default, base58btc with the z prefix is used for newly uploaded files because it's short and easy to copy.

So the CID from the example earlier would be encoded like this:

base58btc: zHnq5PTzaLbboBEvLzecUQQWSpyzuugykxfmxPv4P3ccDcGwnw
base32:    beyp4jut7qbqtylp5ytm5ae5uhqmbk5xcdt44eylcsvsg34anwcp33fpbja
base64url: uJh_E0n-AYTwt_cTZ0BO0PBgVduIc-cJhYpVkbfANsJ-9leFI
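
The base32 and base64url forms can be reproduced with Python's standard library, as sketched below. base58btc is left out because the standard library has no base58 encoder. The helper names are made up for this example.

```python
import base64

# Example CID bytes from above: type, hash type, BLAKE3 hash, file size.
cid = bytes.fromhex(
    "261f"
    "c4d27f80613c2dfdc4d9d013b43c181576e21cf9c2616295646df00db09fbd95"
    "e148"
)


def to_base32(data: bytes) -> str:
    """Multibase base32: 'b' prefix, lowercase, no padding."""
    return "b" + base64.b32encode(data).decode("ascii").lower().rstrip("=")


def to_base64url(data: bytes) -> str:
    """Multibase base64url: 'u' prefix, no padding."""
    return "u" + base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")


print(to_base32(cid))
print(to_base64url(cid))
```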

Media types

To make deduplication as efficient as possible, raw files on S5 do not contain any additional metadata like filenames or media types. You can append a file extension to your CID to stream/share a single file with the correct content type, for example zHnq5PTzaLbboBEvLzecUQQWSpyzuugykxfmxPv4P3ccDcGwnw.txt.
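
How a client or gateway might turn such a name into a CID and a content type can be sketched like this. It is only an illustration: the function name is hypothetical, and the extension lookup uses Python's mimetypes table, which is not necessarily the mapping S5 uses.

```python
import mimetypes
from typing import Optional, Tuple


def split_cid_and_type(name: str) -> Tuple[str, Optional[str]]:
    """Split 'CID.ext' into the CID and a guessed media type (None if unknown)."""
    # Multibase-encoded CIDs never contain a dot, so the first dot starts the extension.
    cid, dot, extension = name.partition(".")
    if not dot:
        return cid, None
    media_type, _encoding = mimetypes.guess_type("file." + extension)
    return cid, media_type


print(split_cid_and_type("zHnq5PTzaLbboBEvLzecUQQWSpyzuugykxfmxPv4P3ccDcGwnw.txt"))
```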

For other use cases, you should use one of the metadata formats.

Registry

The S5 registry is a decentralized key-value store. A registry entry looks like this:

class SignedRegistryEntry {
  // public key with multicodec prefix
  pk: Uint8Array;

  // revision number of this entry, maximum is (256^8)-1
  revision: int;

  /// data stored in this entry, can have a maximum length of 48 bytes
  data: Uint8Array;

  /// signature of this registry entry
  signature: Uint8Array;
}

Every registry entry has a 33-byte key. The first byte indicates which type of public key it is, by default 0xed for ed25519 public keys. The other 32 bytes are the ed25519 public key itself.

Every update to a registry entry must contain a signature created by the ed25519 keypair referenced in the key.

Nodes only keep the highest revision number and reject updates with a lower number.

Because the data has a maximum size of 48 bytes, most types of data can't be stored in it directly. For this reason, registry entries usually contain a CID that refers to the actual data. The data bytes of a registry entry that refers to a CID look like this:

0x5a 0x261fc4d27f80613c2dfdc4d9d013b43c181576e21cf9c2616295646df00db09fbd95e148
link CID bytes
type
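
The key layout, the link data and the revision rule can be sketched together as follows. Signature verification is left out because ed25519 is not part of Python's standard library. All names are hypothetical, and rejecting an update with an equal revision is an assumption, since the text only mentions lower revisions.

```python
from dataclasses import dataclass
from typing import Optional

MKEY_ED25519 = 0xED        # key type prefix for ed25519 public keys
REGISTRY_LINK_TYPE = 0x5A  # marks data that refers to a CID
MAX_DATA_LENGTH = 48


def registry_key(ed25519_public_key: bytes) -> bytes:
    """Build the 33-byte registry key: type byte followed by the public key."""
    if len(ed25519_public_key) != 32:
        raise ValueError("expected a 32-byte ed25519 public key")
    return bytes([MKEY_ED25519]) + ed25519_public_key


def link_data(cid: bytes) -> bytes:
    """Build the data bytes of an entry that refers to a CID."""
    data = bytes([REGISTRY_LINK_TYPE]) + cid
    if len(data) > MAX_DATA_LENGTH:
        raise ValueError("registry data is limited to 48 bytes")
    return data


@dataclass
class Entry:
    revision: int
    data: bytes


def accept_update(current: Optional[Entry], update: Entry) -> bool:
    """Keep only the highest revision (assumption: equal revisions are rejected too)."""
    return current is None or update.revision > current.revision


cid = bytes.fromhex(
    "261fc4d27f80613c2dfdc4d9d013b43c181576e21cf9c2616295646df00db09fbd95e148"
)
print(link_data(cid).hex())
```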

Subscriptions

Nodes can subscribe to specific entries on the peer-to-peer network to get new updates in realtime.

Peer-to-peer

S5 uses a peer-to-peer network to find providers who serve a specific file. Unlike IPFS, S5 does NOT transfer or exchange file data between peers directly. Instead, the p2p network is only meant to find storage locations and download links for a specific file hash. This has some advantages:

  • Because only lightweight queries for hashes (34 bytes) and responses (only a short download link, usually less than 256 bytes) are sent over the p2p network, it's extremely lightweight and very scalable.
  • Existing highly optimized HTTP-based software and infrastructure can be used for the file delivery itself, reducing costs significantly and making downloads more efficient. This also keeps peers lightweight.
  • Because S5 uses the HTTP/HTTPS protocol (support for more is planned), existing download links or file mirrors can be provided on S5 directly without re-uploading them, even if whoever provides them on the network is not the one hosting them.

Peer discovery

Right now S5 uses a configurable list of initial peers with their connection strings (protocol, IP address, port) to connect to the network. After a new peer connects, the existing peers send it a list of all other peers they know about.

Supported P2P Protocols

  • Custom TCP (authenticated, but not encrypted)

Planned P2P Protocols

  • QUIC+TLS or nQUIC
  • WebSocket
  • WebTransport

Node/peer IDs

Every node has a unique (random) ed25519 keypair. This keypair is used to sign specific responses like provide operations, which contain a specific storage location and download link for a queried hash. Because the message itself contains the signature, all peers can also relay queries and responses without being trusted to not tamper with them.

Node scores

Every node keeps a local score for every other node/peer it knows of. This score is calculated based on the number of valid and useful responses by a node compared to the number of bad or invalid responses. The score also depends on the total number of responses. For example, a node with 1000 correct and 50 wrong responses has a better score than a node with 5 correct out of only 5 total responses.

The algorithm can be found here: lib5:score.dart

Node scores are used to decide which download links to try first if multiple are available for the same file hash.
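
One well-known way to get this behaviour is the lower bound of the Wilson score interval, sketched below. This is NOT necessarily the algorithm in lib5:score.dart. It is a stand-in that shows how a score can reward both a high success rate and a large number of responses. The URLs and counts are made-up example data.

```python
import math


def wilson_lower_bound(good: int, bad: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the share of good responses."""
    total = good + bad
    if total == 0:
        return 0.0
    p = good / total
    z2 = z * z
    centre = p + z2 / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z2 / (4 * total * total))
    return (centre - margin) / (1 + z2 / total)


# Download links for the same file hash, with (good, bad) counts of the providing node.
links = [
    ("https://a.example/file", (5, 0)),
    ("https://b.example/file", (1000, 50)),
]

# Try links from the best-scored node first.
ordered = sorted(links, key=lambda item: wilson_lower_bound(*item[1]), reverse=True)
print([url for url, _counts in ordered])
```

With these inputs, the node with 1000 good and 50 bad responses ranks above the node with 5 out of 5, in line with the example in the text.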

Install the S5 node

Right now the only supported way to run an S5 node is using a container runtime like Docker or Podman.

You can install Docker on most operating systems using the instructions here: https://docs.docker.com/engine/install/

If you are on Linux you can use the convenience script: curl -fsSL https://get.docker.com | sudo sh

Podman is a popular alternative to Docker, but it might be harder to install on non-Linux systems. You can find instructions for it here: https://podman.io/getting-started/installation

Run S5 using Docker

Before running this command, you should change the paths ./s5/config and ./s5/db to a storage location of your choice.

docker run -d \
--name s5-node \
-p 127.0.0.1:5050:5050 \
-v ./s5/config:/config \
-v ./s5/db:/db \
--restart unless-stopped \
ghcr.io/s5-dev/node:latest

This will only bind your node to localhost, so you will need a reverse proxy like Caddy to access it from the internet.

If you instead want to expose the HTTP API port to the entire network, you can set -p 5050:5050

If something seems to not work correctly, you can view the logs with docker logs -f s5-node

config path

This path is used to generate and load the config.toml file. You will need to edit that file to configure stores and other options.

db path

This path is used for storing small key-value databases that hold state relevant for the network and node. Do not use a slow HDD for this.

(optional) cache path

The cache stores large file uploads and different downloads/streams. You can use a custom cache location by adding -v ./s5/cache:/cache to your command.

(optional) data path

If you are planning to store uploaded files on your local disk, you should prepare a directory for that and specify it with -v ./s5/data:/data

Using Sia

If you want to use S5 with an instance of renterd running on the same server, you should add the --network="host" flag to grant S5 access to the renterd API.

Stop the container

docker container stop s5-node

Remove the container

docker container rm s5-node

Alternative: Using docker-compose

Create a file called docker-compose.yml with this content:

version: '3'
services:
  s5-node:
    image: ghcr.io/s5-dev/node:latest
    volumes:
      - ./path/to/config:/config
    ports:
      - "5050:5050"
    restart: unless-stopped

The same configuration options apply as with plain Docker or Podman. Start it with docker-compose up -d

S5 Config

You can edit the config.toml file to configure your S5 node. You can apply changes with docker container restart s5-node

This page describes the available sections in the config.

keypair

The seed is generated on first start and should be kept private. It's used for signing messages on the network.

http.api

domain: Configure this value to match the domain you are using to access your node. For example, if you configured your domain example.com to be reverse-proxied to your S5 Node Docker container using Caddy, nginx or another reverse proxy, you should set this to example.com

port: The port the HTTP API binds to and is available on (you should usually keep the default)

store

Check out the Stores documentation for configuring different object stores.

accounts

You can enable the accounts system by adding this part to your config:

[accounts]
enabled = true
[accounts.database]
path = "/db/accounts"

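Registrations are disabled by default. You can enable them by adding alwaysAllowedScopes to the [accounts] table. If your config already contains an [accounts] header, put the key directly under that header instead of repeating it, because TOML does not allow defining the same table twice: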

[accounts]
alwaysAllowedScopes = [
    'account/login',
    'account/register',
    's5/registry/read',
    's5/metadata',
    's5/debug/storage_locations',
    's5/debug/download_urls',
    's5/blob/redirect',
]

Advanced

cache

Configure a custom cache path with path. You likely don't need this if you are using Docker.

database

Configure a custom database path. You likely don't need this if you are using Docker.

p2p.peers

List of initial peers used for connecting to the p2p network.

Caddy reverse proxy

Caddy is an easy-to-use reverse proxy with automatic HTTPS.

You can install it by following the instructions over at https://caddyserver.com/docs/install

You'll also need a domain name with A and AAAA records pointed to your server.

You should also make sure that your firewall doesn't block ports 80 and 443.

Configuration

With the default S5 port of 5050, you can configure your /etc/caddy/Caddyfile like this:

YOUR.DOMAIN {
  reverse_proxy localhost:5050
}

On Debian and Ubuntu you can run sudo systemctl restart caddy to restart Caddy after editing the Caddyfile.

Don't forget to configure http.api.domain in your S5 config.toml after setting up a domain and reverse proxy!

Stores

S5 nodes support multiple storage backends.

S3 is the easiest to set up, while Sia is the cheapest option.

Local stores all files on your server directly, so that usually only makes sense for a home NAS use case or a small number of files.

Arweave provides permanent storage for a high price.

S3-compatible providers

You can use any cloud provider that supports the S3 protocol. See https://s3.wiki for the cheapest ones.

Configuration

[store.s3]
accessKey = "YOUR_ACCESS_KEY"
bucket = "YOUR_BUCKET_NAME"
endpointUrl = "YOUR_S3_ENDPOINT_URL"
secretKey = "YOUR_SECRET_KEY"

Local

Stores uploaded files on the local filesystem.

Configuration

[store.local]
path = "/data" # If you are using the Docker container

[store.local.http]
bind = "127.0.0.1"
port = 8989
url = "http://localhost:8989"

By default, files will only be available on your local node. To make them available on the entire network, you have to forward the port so that it is reachable from the internet and then set url to the URL at which your computer can be reached from the internet.

Sia Network

The Sia network provides decentralized and redundant data storage.

This page shows how to use Sia with the native integration. In most cases you should follow this guide for the S3-based integration instead: /guide/setup-with-sia.html

You will need a fully configured local instance of renterd: https://github.com/SiaFoundation/renterd

Warning: Both renterd and this integration are still experimental. Please report any bugs you encounter.

Configuration

[store.sia]
workerApiUrl = "http://localhost:9980/api/worker"
apiPassword = "test"
downloadUrl = "https://dl.YOUR.DOMAIN"

Using Caddy as a reverse proxy for Sia downloads

This configuration requires a build of Caddy that includes https://github.com/caddyserver/cache-handler. If you don't want to cache Sia downloads, you can remove the first 4 lines (the global options block) and the cache directive.

/etc/caddy/Caddyfile:

{
    order cache before rewrite
    cache
}

dl.YOUR.DOMAIN {
  uri strip_suffix /

  header {
    Access-Control-Allow-Origin *
  }

  cache {
    stale 6h
    ttl 24h
    default_cache_control "public, max-age=86400"
    nuts {
      path /tmp/nuts
    }
  }

  rewrite * /api/worker/objects/1{path}

  reverse_proxy {
    to localhost:9980
    header_up Authorization "Basic OnRlc3Q=" # base64 of ":test", change this to match your renterd API password
  }
}

Arweave

Arweave is expensive, but provides permanent storage for a one-time payment. Check out https://www.arweave.org/

Disabled right now

Guides

Deploy a personal S5 Node with renterd storage on Debian

In this guide, you'll learn how to deploy a production-ready S5 Node backed by Sia renterd storage.

You can then use it with the Vup Cloud Storage app or just play with the S5 API directly to upload and manage files of any size!

Requirements

  • Domain Name (just a subdomain works too)
  • Debian VPS (x86 or arm64) with 8+ GB of RAM and 128+ GB of free disk space (16+ GB of RAM is better for performance)
  • Some SC (siacoin) for forming contracts on the network and renting storage

If you're looking for affordable providers with these specs, I found the new Netcup ARM Servers to be a pretty good choice (https://www.netcup.de/vserver/arm-server/)

  • 7 EUR/month for 8 GB of RAM
  • 12 EUR/month for 16 GB of RAM

Install Sia renterd

Check out the official Sia docs for detailed instructions with screenshots: https://docs.sia.tech/renting/setting-up-renterd/linux/debian

Or just connect to your Debian VPS over SSH and copy-paste these commands:

sudo curl -fsSL https://linux.sia.tech/debian/gpg | sudo gpg --dearmor -o /usr/share/keyrings/siafoundation.gpg
echo "deb [signed-by=/usr/share/keyrings/siafoundation.gpg] https://linux.sia.tech/debian $(. /etc/os-release && echo "$VERSION_CODENAME") main" | sudo tee /etc/apt/sources.list.d/siafoundation.list
sudo apt update
sudo apt install renterd
cd /var/lib/renterd

Run renterd version to verify it was installed correctly.

Configure Sia renterd

Run cd /var/lib/renterd, then sudo renterd config and follow the instructions. Please choose a secure password for the renterd admin UI! You can use pwgen -s 42 1 to generate one.
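
If pwgen is not installed, a comparable password can be generated with Python's standard library. This is a minimal sketch: the function name is made up, and the alphabet (letters and digits) is chosen to resemble what pwgen -s produces.

```python
import secrets
import string


def generate_password(length: int = 42) -> str:
    """Generate a random password from letters and digits using a secure RNG."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(generate_password())
```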

Type yes when you're asked if you want to configure S3 settings.

Keep the S3 Address on the default setting. (You won't need it for this guide, but S3 support is very useful for many other potential use cases for your renterd node.) It might also make sense to write down the generated S3 credentials if you want to use them later.

Finally, start renterd using sudo systemctl start renterd

Then you can re-connect to your VPS using ssh -L localhost:9980:localhost:9980 IP_ADDRESS_OR_DOMAIN to create a secure SSH tunnel to the renterd web UI. After connecting, you can open http://localhost:9980/ in your local web browser and authenticate with the previously set API password.

In the web UI, follow the step-by-step welcome guide and set everything up.

  1. Configure the storage settings
  2. Fund your wallet (https://docs.sia.tech/renting/transferring-siacoins)
  3. Create a new bucket with the name s5 on http://localhost:9980/files (top right button)
  4. Wait for the chain to sync
  5. Wait for storage contracts to form

Install and setup Caddy (reverse proxy)

First, you'll need to decide on two domains you want to use: one for the S5 node (important) and one for the download proxy to your renterd node (less important).

For example if you own the domain example.com, you could run the S5 Node on s5.example.com and the download proxy on dl.example.com.

You should then add DNS A records pointing to the IP address of your VPS for both of these subdomains. Of course AAAA records are nice too if you have IPv6.

Then install Caddy by following these instructions: https://caddyserver.com/docs/install#debian-ubuntu-raspbian

Then edit the Caddy config with nano /etc/caddy/Caddyfile. You can generate the TODO_PASTE_HERE part by encoding your Sia renterd API password to base64 (input: :APIPASSWORD, including the leading colon) on https://gchq.github.io/CyberChef/#recipe=To_Base64('A-Za-z0-9%2B/%3D')

The config should look something like this:

S5_API_DOMAIN {
  reverse_proxy 127.0.0.1:5050
}

DOWNLOAD_PROXY_DOMAIN {
  uri strip_suffix /
  header {
    Access-Control-Allow-Origin *
  }
  rewrite * /api/worker/objects/s5{path}?bucket=s5
  @download {
    method GET
  }
  reverse_proxy @download {
    to 127.0.0.1:9980
    header_up Authorization "Basic TODO_PASTE_HERE"
  }
}
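
The value after "Basic" in the Authorization header above can also be computed locally instead of using a website. The sketch below builds it from an empty username and the API password, which is the :APIPASSWORD input described earlier. The function name is hypothetical.

```python
import base64


def basic_auth_value(api_password: str, username: str = "") -> str:
    """Base64-encode 'username:password' for an HTTP Basic Authorization header."""
    credentials = f"{username}:{api_password}".encode("utf-8")
    return base64.b64encode(credentials).decode("ascii")


# With the example password "test", this gives the value for ":test".
print(basic_auth_value("test"))  # OnRlc3Q=
```

Running this on the server itself avoids pasting the API password into a web page.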

Then restart Caddy with systemctl restart caddy

Install and set up S5

First, install Podman using this command: sudo apt-get -y install podman

Create some needed directories:

mkdir -p /s5/config
mkdir -p /s5/db
mkdir -p /tmp/s5

If you're not root, you might need to run sudo chown -R $USER /tmp/s5 /s5 to set permissions correctly.

Then start an S5 node with this command (the directories created in the previous step must exist first):

podman run -d \
--name s5-node \
--network="host" \
-v /s5/config:/config \
-v /s5/db:/db \
-v /tmp/s5:/cache \
--restart unless-stopped \
ghcr.io/s5-dev/node:latest

Edit the config with nano /s5/config/config.toml

Add these config entries there (the APIPASSWORD is what you used to log in to the renterd web UI):

[http.api]
domain = 'S5_API_DOMAIN'

[accounts]
enabled = true
[accounts.database]
path = "/db/accounts"

[store.sia]
workerApiUrl = "http://127.0.0.1:9980/api/worker"
bucket = "s5"
apiPassword = "APIPASSWORD"
downloadUrl = "https://DOWNLOAD_PROXY_DOMAIN"

Then run podman restart s5-node to restart the S5 Node.

You can visit https://S5_API_DOMAIN/s5/admin/app in your web browser to create and manage accounts manually. The API key for your node can be retrieved by running journalctl | grep 'ADMIN API KEY'

Using your new S5 Node for Vup Storage

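Edit the config with nano /s5/config/config.toml and add authTokensForAccountRegistration to the existing [accounts] table (TOML does not allow a second [accounts] header in the same file), so that it looks like this: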

[accounts]
authTokensForAccountRegistration = ["INSERT_INVITE_CODE_OF_CHOICE_HERE"]
enabled = true

Then run podman restart s5-node to restart the S5 Node.

Now you can use the "Register on S5 Node" button in the Vup "Storage Service" settings, enter the domain of your node and the invite code you just configured, and you should be good to go! You'll likely want to use more than 10 GB of storage, so just use the Admin Web UI to set a higher tier for your newly created account.

Setup With Sia

Sia is a decentralized, affordable and secure cloud storage platform. You can use it as a storage backend for your S5 Node.

First, you'll need a fully configured instance of renterd (the new Sia renter software) running somewhere. Here's a great guide which shows you how to set one up easily on the Sia testnet: https://blog.sia.tech/sia-innovate-and-integrate-christmas-2023-hackathon-9b7eb8ad5e0e

Next, you need to set up an S5 Node using the instructions available at /install/index.html

For configuring the S5 Node to use your Sia renter node, you will need to add this section to your config.toml:

[store.s3]
accessKey = "MY_ACCESS_KEY" # Replace this with the access key from your renterd.yml
bucket = "sfive" # Or just "default"
endpointUrl = "YOUR_S3_ENDPOINT_URL" # http://localhost:7070 if you followed the Sia renterd testnet guide
secretKey = "MY_SECRET_KEY" # Replace this with the secret key from your renterd.yml

And then restart the node with docker container restart s5-node

You might also want to enable the accounts system on your node if it's available on the internet or if you want to use it with Vup, see /install/config/index.html for details.

Tools

This section contains some useful tools for working with S5.

cid.one

https://cid.one/ is a CID explorer for the S5 network.

It supports raw CIDs, all of the metadata formats, resolver CIDs and (soon) encrypted CIDs.

Here are some examples:

Raw file: https://cid.one/#uJh9dvBupLgWG3p8CGJ1VR8PLnZvJQedolo8ktb027PrlTT5LvAY

Resolver CID: https://cid.one/#zrjD7xwmgP8U6hquPUtSRcZP1J1LvksSwTq4CPZ2ck96FHu

Media Metadata: https://cid.one/#z5TTvXtbkQk9PTUN8r5oNSz5Trmf1NjJwkVoNvfawGKDtPCB

Web App Metadata: https://cid.one/#blepzzclchbhwull3is56zvubovg7j3cfmatxx5gyspfx3dowhyutzai

s5.cx

s5.cx is a web-based tool to securely stream files of any size directly from the S5 network. File data is NOT proxied by the s5.cx server.

It works by using a service worker that intercepts all raw file requests, fetches the file data from a host on the S5 network and verifies its integrity. The verification uses BLAKE3/Bao in Rust, compiled to WASM and running directly inside the service worker.

The service worker code can be used by any web app to easily stream files from S5 without needing any additional code or libraries in your project. A repository with setup instructions will be published soon.

The service worker is already being used by https://tube5.app/.

Here's an example file: https://s5.cx/uJh9dvBupLgWG3p8CGJ1VR8PLnZvJQedolo8ktb027PrlTT5LvAY.mp4

Metadata formats

This section contains documentation for all metadata formats used and supported by S5.

All formats have a JSON representation for easy creation, debug purposes and editing.

All formats also have a highly optimized serialization representation based on https://msgpack.org/ used for storing them on S5 including (optional) signatures and timestamp proofs.

JSON Schemas for all formats are available here: https://github.com/s5-dev/json-schemas

Web App metadata

Metadata format used for web apps stored on S5. This docs website is hosted using it.

Example

Web App Metadata: https://cid.one/#blepzzclchbhwull3is56zvubovg7j3cfmatxx5gyspfx3dowhyutzai

Fields

Full JSON Schema: https://schema.sfive.net/web-app-metadata.json

Web-based viewer: https://json-schema.app/view/%23?url=https%3A%2F%2Fschema.sfive.net%2Fweb-app-metadata.json

Directory metadata

Work-in-progress, will be used to store directory trees in Vup. Supports advanced sharing capabilities and is fully end-to-end-encrypted by default.

Media metadata

Very flexible metadata format used for almost any more advanced content/media structure.

Can be used for videos, images, music, podcasts, profiles, lists and more!

Already being used by Tube5.

Example

Media Metadata: https://cid.one/#z5TTvXtbkQk9PTUN8r5oNSz5Trmf1NjJwkVoNvfawGKDtPCB

Fields

Full JSON Schema: https://schema.sfive.net/media-metadata.json

Web-based viewer: https://json-schema.app/view/%23?url=https%3A%2F%2Fschema.sfive.net%2Fmedia-metadata.json