Introduction

S5 is a decentralized network that puts you in control of your data and identity.

At its core, it is a content-addressed storage network similar to IPFS, but with some new concepts and ideas to make it more efficient and powerful.

This website is hosted on S5.

All relevant code can be found here: https://github.com/s5-dev

You can join the Discord Server for updates and discussion: https://discord.gg/Pdutsp5jqR

Discussion will be moved to a decentralized chat app powered by S5 when it's ready :)

Concepts

This section explains some basic concepts used in S5:

Content-addressed data

Cryptographic hashes

Cryptographic hash functions are algorithms that map files or data of any size to a fixed-length hash value.

  • They are deterministic, meaning the same input always results in the same hash
  • It is infeasible to generate a message that yields a given hash value (i.e. to reverse the process that generated the given hash value)
  • It is infeasible to find two different messages with the same hash value
  • A small change to a message should change the hash value so extensively that it appears uncorrelated with the old hash value

The BLAKE3 hash function

BLAKE3 is a cryptographic hash function that is:

  • Much faster than MD5, SHA-1, SHA-2, SHA-3, and BLAKE2.
  • Secure, unlike MD5 and SHA-1. And secure against length extension, unlike SHA-2.
  • Highly parallelizable across any number of threads and SIMD lanes, because it's a Merkle tree on the inside.
  • Capable of verified streaming and incremental updates, again because it's a Merkle tree.
  • A PRF, MAC, KDF, and XOF, as well as a regular hash.
  • One algorithm with no variants, which is fast on x86-64 and also on smaller architectures.

Content-addressing

Content-addressing means that instead of addressing data by its location (for example with protocols like HTTP/HTTPS), it is referenced by its cryptographic hash. This makes it possible to verify that you actually received the data you were looking for without trusting anyone except the person who gave you the hash. Other benefits include highly efficient caching (because file blobs are immutable by default) and automatic deduplication of data.

Verified streaming

To make verified streaming of large files possible, S5 uses the Bao implementation of BLAKE3 verified streaming. As mentioned earlier, BLAKE3 is a Merkle tree on the inside - this makes it possible to verify the integrity of small parts of a file without having to download and hash the entire file first.

By default, S5 stores some layers of the Bao hash tree next to every stored file larger than 256 KiB (same path, but with an .obao extension). With the default layers, chunks with a minimum size of 256 KiB can be verified individually. So if you're streaming a large video file, for example, your player only needs to download the first 256 KiB of the file before it can show the first frame. The overhead of storing the tree is a bit less than 256 KiB per 1 GiB of stored data.

CIDs (Content identifiers)

See /spec/blobs.html for up-to-date documentation on how S5 calculates CIDs.

Media types

To make deduplication as efficient as possible, raw files on S5 do not contain any additional metadata like filenames or media types. You can append a file extension to your CID to stream/share a single file with the correct content type, for example zHnq5PTzaLbboBEvLzecUQQWSpyzuugykxfmxPv4P3ccDcGwnw.txt.

For other use cases, you should use one of the metadata formats.

Registry

The S5 registry is a decentralized key-value store. A registry entry looks like this:

class SignedRegistryEntry {
  // public key with multicodec prefix
  pk: Uint8Array;

  // revision number of this entry, maximum is (256^8)-1
  revision: int;

  // data stored in this entry, can have a maximum length of 48 bytes
  data: Uint8Array;

  // signature of this registry entry
  signature: Uint8Array;
}

Every registry entry has a 33-byte key. The first byte indicates which type of public key it is, by default 0xed for ed25519 public keys. The other 32 bytes are the ed25519 public key itself.

Every update to a registry entry must contain a signature created by the ed25519 keypair referenced in the key.

Nodes only keep the highest revision number and reject updates with a lower number.

Because the data has a maximum size of 48 bytes, most types of data can't be stored directly in it. For this reason, registry entries usually contain a CID which then contains the data itself. The data bytes for registry entries which refer to a CID look like this:

0x5a  0x261fc4d27f80613c2dfdc4d9d013b43c181576e21cf9c2616295646df00db09fbd95e148
type  link CID bytes

Subscriptions

Nodes can subscribe to specific entries on the peer-to-peer network to get new updates in realtime.

Peer-to-peer

S5 uses a peer-to-peer network to find providers who serve a specific file. Compared to IPFS, S5 does NOT transfer or exchange file data between peers directly. Instead, the p2p network is only meant to find storage locations and download links for a specific file hash. This has some advantages:

  • Because only lightweight queries for hashes (34 bytes) and responses (only a short download link, usually less than 256 bytes) are sent over the p2p network, it's extremely lightweight and very scalable.
  • Existing highly optimized HTTP-based software and infrastructure can be used for the file delivery itself, reducing costs significantly and making downloads more efficient. It also keeps peers lightweight.
  • Because S5 uses the HTTP/HTTPS protocol (support for more is planned), existing download links or file mirrors can be provided on S5 directly without re-uploading them - even if whoever provides them on the network is not the party hosting them.

Peer discovery

Right now S5 uses a configurable list of initial peers with their connection strings (protocol, IP address, port) to connect to the network. After a new peer connects, peers send it a list of all other peers they know about.

Supported P2P Protocols

  • WebSocket (wss://)

Planned P2P Protocols

Node/peer IDs

Every node has a unique (random) ed25519 keypair. This keypair is used to sign specific responses like provide operations, which contain a storage location and download link for a queried hash. Because the message itself contains the signature, all peers can relay queries and responses without having to be trusted not to tamper with them.

Node scores

Every node keeps a local score for every other node/peer it knows of. This score is based on the number of valid and useful responses from a node compared to the number of bad or invalid responses. The score also depends on the total number of responses: for example, a node with 1000 correct and 50 wrong responses has a better score than a node with 5 correct responses out of only 5 total.

The algorithm can be found here: lib5:score.dart

Node scores are used to decide which download links to try first if multiple are available for the same file hash.
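The real scoring algorithm lives in lib5:score.dart and may differ; as an illustration of the idea, a smoothed success ratio already shows the behavior described above:

```rust
// Laplace-smoothed success ratio: with few responses the score stays close
// to the neutral 0.5, with many responses it converges to the true ratio.
fn score(good: u32, bad: u32) -> f64 {
    (good as f64 + 1.0) / ((good + bad) as f64 + 2.0)
}

fn main() {
    let veteran = score(1000, 50); // many responses, mostly good
    let newcomer = score(5, 0);    // few responses, all good
    assert!(veteran > newcomer);
}
```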

Specification

Blobs

As explained in /concepts/content-addressed-data.html, S5 uses content-addressing for all data and files, i.e. any blob of bytes.

IPFS introduced the concept of Content Identifiers (CIDs) to have a standardized and future-proof way to refer to content-addressed data. Unfortunately, "IPFS CIDs are not file hashes", because IPFS splits files into many small chunks to make verified streaming of file slices possible without downloading the entire file first. As a result, these CIDs never match the file's "true" hash, like the one you get from running sha256sum.

Fortunately, there has been some innovation in the space of cryptographic hash functions recently! Namely BLAKE3, which is based on the more well-known BLAKE2 hash function. Apart from being very fast and secure, its most unique feature is that its internal structure is already a Merkle tree. So instead of having to build a Merkle tree yourself (that's what IPFS does; its CIDs point to the hash of a Merkle tree), BLAKE3 already takes care of that. As a result, CIDs using BLAKE3 are always consistent (for example with running b3sum on your local machine) and work with files of pretty much any size, while still supporting verified streaming (at any chunk size, down to 1024 bytes). So there's no longer a need to split files bigger than 1 MiB into multiple chunks.

You can also check out the documentation of Iroh, another content-addressed data system, which explains this in a more in-depth way: https://iroh.computer/docs/layers/blobs

Cool, but why yet another new CID format?

With bigger blobs and no extra metadata (the unaltered input bytes are always the source of a CID hash, so nothing like UnixFS is used anymore), there's a need to know the file size of a Blob CID. So S5 continues to use (and be fully compatible with) BLAKE3 IPFS CIDs (with limited compatibility with other hash functions like sha256) when the blob size doesn't matter, but for use cases where it does, it introduces a new CID format.

Other protocols like the AT Protocol (used in Bluesky) solve this by using JSON maps for referencing blobs which contain both the IPFS CID and the blob size in an extra field. But I feel like there's value in having a compact format for representing an immutable sequence of bytes including its hash, so here we are.

IPFS CIDs can easily be converted to S5 Blob CIDs if you know their file/blob size in bytes. If the IPFS CID uses the "raw binary IPLD codec", this operation is lossless. S5 Blob CIDs can always be converted to IPFS CIDs, but if the blob is bigger than 1 MiB it likely won't work with most IPFS implementations. S5 Blob CIDs can be losslessly converted to Iroh-compatible CIDs and back (assuming you keep the blob size somewhere or do a BLAKE3 size proof using Iroh).

The S5 Blob CID format

S5 Blob CIDs always start with two magic bytes.

The first one is 0x5b and indicates that the CID is a S5 blob CID.

The second one is 0x82 and indicates that it is a plaintext blob. 0x83 is reserved for encrypted blobs. (spec for them is still WIP)

Byte  Meaning
0x5b  S5 Blob CID magic byte
0x82  S5 Blob Type Plaintext (Unencrypted, just a simple blob)

As a nice side effect of picking exactly these two bytes, all S5 Blob CIDs start with the string "blob" when encoded as base32 (multibase). All S5 CID magic bytes are picked carefully to not collide with any existing magic bytes on the https://github.com/multiformats/multicodec table.

After the two magic bytes, a single byte indicates which cryptographic hash function was used to derive a hash from the blob bytes. All S5 implementations should use 0x1e (for BLAKE3), but SHA256 is also supported for compatibility reasons. SHA256 should only be used for small blobs imported from other systems, like IPFS or the AT Protocol.

Byte  Meaning
0x1e  multihash blake3
0x12  multihash sha2-256

After the single multihash indicator byte, the 32 hash bytes follow. (S5 Blob CIDs always use the default hash output length, 32 bytes, for both blake3 and sha2-256. If the need for a different output length emerges in the future, a new possible value for the hash byte could be added)

Finally, the size (in bytes) of the blob is encoded as a little-endian byte array, trailing zero bytes are trimmed, and the remaining bytes appended to the CID bytes. Doing that could look like this in Rust (you can see a full example of calculating a CID in Rust at the bottom of this page):

fn main() {
    let blob_size: u64 = 100_000_000_000; // 100 GB (you would usually just use .len() or something)
    let mut cid_size_bytes = blob_size.to_le_bytes().to_vec();
    // trim trailing zero bytes
    if let Some(pos) = cid_size_bytes.iter().rposition(|&x| x != 0) {
        cid_size_bytes.truncate(pos + 1);
    }
    println!("{:?}", cid_size_bytes);
}

If we put all of this together, this is how the S5 Blob CID of the string Hello, world! in hex representation would look like:

5b 82 1e ede5c0b10f2ec4979c69b52f61e42ff5b413519ce09be0f14d098dcfe5f6f98d 0d
PREFIX   BLAKE3 HASH (from b3sum)                                         SIZE

So the length of a S5 Blob CID depends on the filesize:

  • Files with a size of less than 256 bytes have a 36-byte CID
  • Files with a size of less than 64 KiB have a 37-byte CID
  • Files with a size of less than 16 MiB have a 38-byte CID
  • Files with a size of less than 4 GiB have a 39-byte CID
  • ...
  • Files with a size of less than 16384 PiB have a 43-byte CID
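This length rule follows directly from the fixed 35-byte prefix (2 magic bytes + 1 hash-function byte + 32 hash bytes) plus the trimmed little-endian size. A small sketch of the calculation:

```rust
// Number of bytes left after trimming trailing zeros from the
// little-endian size encoding; always at least one byte.
fn size_byte_len(blob_size: u64) -> usize {
    let bits = 64 - blob_size.leading_zeros() as usize;
    core::cmp::max(1, (bits + 7) / 8)
}

// 2 magic bytes + 1 hash-function byte + 32 hash bytes + size bytes.
fn cid_len(blob_size: u64) -> usize {
    35 + size_byte_len(blob_size)
}

fn main() {
    assert_eq!(cid_len(100), 36);      // < 256 bytes
    assert_eq!(cid_len(50_000), 37);   // < 64 KiB
    assert_eq!(cid_len(10 << 20), 38); // < 16 MiB
    assert_eq!(cid_len(3 << 30), 39);  // < 4 GiB
    assert_eq!(cid_len(u64::MAX), 43); // < 16384 PiB
}
```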

S5 Blob CIDs DO NOT contain a blob or file's media type, encoding or purpose. The reason for this is that it would no longer result in fully deterministic CIDs, because for example the media type could be interpreted differently by different applications or libraries.

Encoding the S5 Blob CID bytes to a human-readable string

S5 uses the multibase standard for encoding CIDs, just like IPFS, Iroh and the AT Protocol.

S5 implementations MUST support the following self-identifying base encodings:

character,  encoding,           description
f,          base16,             Hexadecimal (lowercase)
b,          base32,             RFC4648 case-insensitive - no padding
z,          base58btc,          Base58 Bitcoin
u,          base64url,          RFC4648 no padding

For the string Hello, world!, these would be the S5 Blob CIDs in different encodings:

base16:    f5b821eede5c0b10f2ec4979c69b52f61e42ff5b413519ce09be0f14d098dcfe5f6f98d0d
base32:    blobb53pfycyq6lwes6ogtnjpmhsc75nucnizzye34dyu2cmnz7s7n6i
base64url: uW4Ie7eXAsQ8uxJecabUvYeQv9bQTUZzgm-DxTQmNz-X2-Q

Calculating the S5 Blob CID of a file using standard command line utils

Step 1: Calculate the BLAKE3 hash of your file (might need to install b3sum). You could also use sha256sum instead (and then put 0x12 as the hash prefix in step 3)

b3sum file.mp4

Step 2: Encode the size of your file in little-endian hex encoding

wc -c file.mp4 | cut -d' ' -f1 | tr -d '\n' | xargs -0 printf "%016x" | tac -rs .. | sed --expression='s/\(00\)*$/\n/'

Step 3: Add the multibase prefix and magic bytes

Characters  Purpose
f           multibase prefix for Hexadecimal (lowercase)
5b          S5 Blob CID magic byte
82          S5 Blob Type Plaintext (Unencrypted, just a simple blob)
1e          multihash blake3

Now, put it all together (here the zeros stand in for your hash and the 654321 suffix for your size bytes):

f5b821e + BLAKE3_HASH + SIZE_BYTES = f5b821e0000000000000000000000000000000000000000000000000000000000000000654321

That's it, you can now use that CID to trustlessly stream exactly that file from the S5 Network!

Calculating a S5 Blob CID in Rust (using only top 100 crates)

use data_encoding::BASE32_NOPAD; // 2.5.0;
use sha2::{Digest, Sha256}; // 0.10.8

fn main() {
    let blob = b"Hello, world!";
    
    let cid_prefix_bytes = vec![
        0x5b, // S5 Blob CID magic byte
        0x82, // S5 Blob Type Plaintext (Unencrypted, just a simple blob)
        0x12, // multihash sha2-256
    ];
    
    let sha256_hash_bytes = Sha256::digest(blob).to_vec();
    
    let blob_size = blob.len() as u64;
    let mut cid_size_bytes = blob_size.to_le_bytes().to_vec();
    if let Some(pos) = cid_size_bytes.iter().rposition(|&x| x != 0) {
        cid_size_bytes.truncate(pos + 1);
    }
    
    let cid_bytes = [cid_prefix_bytes, sha256_hash_bytes, cid_size_bytes].concat();
    
    println!("b{}", BASE32_NOPAD.encode(&cid_bytes).to_lowercase());
}

Guides

Deploy a personal S5 Node with renterd storage on Debian

In this guide, you'll learn how to deploy a production-ready S5 Node backed by Sia renterd storage.

You can then use it with the Vup Cloud Storage app or just play with the S5 API directly to upload and manage files of any size!

Requirements

  • Domain Name (just a subdomain works too)
  • Debian VPS (x86 or arm64) with 8+ GB of RAM and 128+ GB of free disk space (16+ GB of RAM are better for performance)
  • Some SC (siacoin) for forming contracts on the network and renting storage

If you're looking for affordable providers with these specs, I found the new Netcup ARM Servers to be a pretty good choice (https://www.netcup.de/vserver/arm-server/)

  • 7 EUR/month for 8 GB of RAM
  • 12 EUR/month for 16 GB of RAM

Install Sia renterd

Check out the official Sia docs for detailed instructions with screenshots: https://docs.sia.tech/renting/setting-up-renterd/linux/debian

Or just connect to your Debian VPS over SSH and copy-paste these commands:

curl -fsSL https://linux.sia.tech/debian/gpg | sudo gpg --dearmor -o /usr/share/keyrings/siafoundation.gpg
echo "deb [signed-by=/usr/share/keyrings/siafoundation.gpg] https://linux.sia.tech/debian $(. /etc/os-release && echo "$VERSION_CODENAME") main" | sudo tee /etc/apt/sources.list.d/siafoundation.list
sudo apt update
sudo apt install renterd
cd /var/lib/renterd

Run renterd version to verify it was installed correctly.

Configure Sia renterd

Run cd /var/lib/renterd, then sudo renterd config and follow the instructions. Please choose a secure password for the renterd admin UI! You can use pwgen -s 42 1 to generate one.

Type yes when you're asked if you want to configure S3 settings.

Keep the S3 Address on the default setting. (You won't need it for this guide, but S3 support is very useful for many other potential use cases for your renterd node.) It might also make sense to write down the generated S3 credentials if you want to use them later.

Finally, start renterd using sudo systemctl start renterd

Then you can re-connect to your VPS using ssh -L localhost:9980:localhost:9980 IP_ADDRESS_OR_DOMAIN to create a secure SSH tunnel to the renterd web UI. After connecting, you can open http://localhost:9980/ in your local web browser and authenticate with the previously set API password.

In the web UI, follow the step-by-step welcome guide and set everything up.

  1. Configure the storage settings
  2. Fund your wallet (https://docs.sia.tech/renting/transferring-siacoins)
  3. Create a new bucket with the name s5 on http://localhost:9980/files (top right button)
  4. Wait for the chain to sync
  5. Wait for storage contracts to form

Install and setup Caddy (reverse proxy)

First, you'll need to decide on two domains you want to use - one for the S5 node (important) and one for the download proxy to your renterd node (less important).

For example if you own the domain example.com, you could run the S5 Node on s5.example.com and the download proxy on dl.example.com.

You should then add DNS A records pointing to the IP address of your VPS for both of these subdomains. Of course AAAA records are nice too if you have IPv6.

Then install Caddy by following these instructions: https://caddyserver.com/docs/install#debian-ubuntu-raspbian

Then edit the Caddy config with nano /etc/caddy/Caddyfile to something like this:

You can generate the TODO_PASTE_HERE part by encoding your Sia renterd API password to base64 (input: :APIPASSWORD) on https://gchq.github.io/CyberChef/#recipe=To_Base64('A-Za-z0-9%2B/%3D')

S5_API_DOMAIN {
  reverse_proxy 127.0.0.1:5050
}

DOWNLOAD_PROXY_DOMAIN {
  uri strip_suffix /
  header {
    Access-Control-Allow-Origin *
  }
  rewrite * /api/worker/objects/s5{path}?bucket=s5
  @download {
    method GET
  }
  reverse_proxy @download {
    to 127.0.0.1:9980
    header_up Authorization "Basic TODO_PASTE_HERE"
  }
}

Then restart Caddy with systemctl restart caddy

Install and set up S5

First, install Podman using this command: sudo apt-get -y install podman

Create some needed directories:

mkdir -p /s5/config
mkdir -p /s5/db
mkdir -p /tmp/s5

If you're not root, you might need to run sudo chown -R $USER /tmp/s5 /s5 to set permissions correctly.

Then start a S5 node with these commands:

podman run -d \
--name s5-node \
--network="host" \
-v /s5/config:/config \
-v /s5/db:/db \
-v /tmp/s5:/cache \
--restart unless-stopped \
ghcr.io/s5-dev/node:latest

Edit nano /s5/config/config.toml

Add these config entries there (the APIPASSWORD is what you used to login to the renterd web UI):

[http.api]
domain = 'S5_API_DOMAIN'

[accounts]
enabled = true
[accounts.database]
path = "/db/accounts"

[store.sia]
workerApiUrl = "http://127.0.0.1:9980/api/worker"
bucket = "s5"
apiPassword = "APIPASSWORD"
downloadUrl = "https://DOWNLOAD_PROXY_DOMAIN"

Then run podman restart s5-node to restart the S5 Node.

You can visit https://S5_API_DOMAIN/s5/admin/app in your web browser to create and manage accounts manually. The API key for your node can be retrieved by running journalctl | grep 'ADMIN API KEY'

Using your new S5 Node for Vup Storage

Edit nano /s5/config/config.toml and add

[accounts]
authTokensForAccountRegistration = ["INSERT_INVITE_CODE_OF_CHOICE_HERE"]
enabled = true

Then run podman restart s5-node to restart the S5 Node.

Now you can use the "Register on S5 Node" button in the Vup "Storage Service" settings, enter the domain of your node and the newly generated invite code, and you should be good to go! You'll likely want to use more than 10 GB of storage, so just use the Admin Web UI to set a higher tier for your newly created account.

Setup With Sia

Please follow this guide instead: deploy-renterd.html

Sia is a decentralized, affordable and secure cloud storage platform. You can use it as a storage backend for your S5 Node.

First, you'll need a fully configured instance of renterd (the new Sia renter software) running somewhere. Here's a great guide which shows you how to set one up easily on the Sia testnet: https://blog.sia.tech/sia-innovate-and-integrate-christmas-2023-hackathon-9b7eb8ad5e0e

Next, you need to set up a S5 Node using the instructions available at /install/index.html

For configuring the S5 Node to use your Sia renter node, you will need to add this section to your config.toml:

[store.s3]
accessKey = "MY_ACCESS_KEY" # Replace this with the access key from your renterd.yml
bucket = "sfive" # Or just "default"
endpointUrl = "YOUR_S3_ENDPOINT_URL" # http://localhost:7070 if you followed the Sia renterd testnet guide
secretKey = "MY_SECRET_KEY" # Replace this with the secret key from your renterd.yml

And then restart the node with docker container restart s5-node

You might also want to enable the accounts system on your node if it's available on the internet or if you want to use it with Vup, see /install/config/index.html for details.

Tools

This section contains some useful tools for working with S5

cid.one

https://cid.one/ is a CID explorer for the S5 network.

It supports raw CIDs, all of the metadata formats, resolver CIDs and (soon) encrypted CIDs.

Here are some examples:

Raw file: https://cid.one/#uJh9dvBupLgWG3p8CGJ1VR8PLnZvJQedolo8ktb027PrlTT5LvAY

Resolver CID: https://cid.one/#zrjD7xwmgP8U6hquPUtSRcZP1J1LvksSwTq4CPZ2ck96FHu

Media Metadata: https://cid.one/#z5TTvXtbkQk9PTUN8r5oNSz5Trmf1NjJwkVoNvfawGKDtPCB

Web App Metadata: https://cid.one/#blepzzclchbhwull3is56zvubovg7j3cfmatxx5gyspfx3dowhyutzai

s5.cx

s5.cx is a web-based tool to securely stream files of any size directly from the S5 network. File data is NOT proxied by the s5.cx server.

It works by using a service worker that intercepts all raw file requests, fetches the file data from a host on the S5 network and verifies the integrity using BLAKE3/bao in Rust compiled to WASM and running directly inside of the service worker.

The service worker code can be used by any web app to easily stream files from S5 without needing any additional code or libraries in your project. A repository with setup instructions will be published soon.

The service worker is already being used by https://tube5.app/.

Here's an example file: https://s5.cx/uJh9dvBupLgWG3p8CGJ1VR8PLnZvJQedolo8ktb027PrlTT5LvAY.mp4

Install the S5 node

Right now the only supported way to run a S5 node is using a container runtime like Docker or Podman.

You can install Docker on most operating systems using the instructions here: https://docs.docker.com/engine/install/

If you are on Linux you can use the convenience script: curl -fsSL https://get.docker.com | sudo sh

Podman is a popular alternative to Docker, but it might be harder to install on non-Linux systems. You can find instructions for it here: https://podman.io/getting-started/installation

Run S5 using Docker

Before running this command, you should change the paths ./s5/config and ./s5/db to a storage location of your choice.

docker run -d \
--name s5-node \
-p 127.0.0.1:5050:5050 \
-v ./s5/config:/config \
-v ./s5/db:/db \
--restart unless-stopped \
ghcr.io/s5-dev/node:latest

This will only bind your node to localhost, so you will need a reverse proxy like Caddy to access it from the internet.

If you instead want to expose the HTTP API port to the entire network, you can set -p 5050:5050

If something seems to not work correctly, you can view the logs with docker logs -f s5-node

config path

This path is used to generate and load the config.toml file; you will need to edit that file to configure stores and other options.

db path

This path is used for storing small key-value databases that hold state relevant for the network and node. Do not use a slow HDD for this.

(optional) cache path

The cache stores large file uploads and different downloads/streams. You can use a custom cache location by adding -v ./s5/cache:/cache to your command.

(optional) data path

If you are planning to store uploaded files on your local disk, you should prepare a directory for that and specify it with -v ./s5/data:/data

Using Sia

If you want to use S5 with an instance of renterd running on the same server, you should add the --network="host" flag to grant S5 access to the renterd API.

Stop the container

docker container stop s5-node

Remove the container

docker container rm s5-node

Alternative: Using docker-compose

Create a file called docker-compose.yml with this content:

version: '3'
services:
  s5-node:
    image: ghcr.io/s5-dev/node:latest
    volumes:
      - ./path/to/config:/config
    ports:
      - "5050:5050"
    restart: unless-stopped

Same configuration options as with normal Docker/Podman, run it with docker-compose up -d

S5 Config

You can edit the config.toml file to configure your S5 node. You can apply changes with docker container restart s5-node

This page describes the available sections in the config.

keypair

The seed is generated on first start, you should keep it private. It's used for signing messages on the network.

http.api

domain: Configure this value to match the domain you are using to access your node. If you for example configured your domain example.com to be reverse-proxied to your S5 Node Docker container using Caddy, nginx or others, you should set this to example.com

port: The port the HTTP API binds to (you should usually keep the default)

store

Check out the Stores documentation for configuring different object stores.

accounts

You can enable the accounts system by adding this part to your config:

[accounts]
enabled = true
[accounts.database]
path = "/db/accounts"

Registrations are disabled by default; you can enable them by adding this part:

[accounts]
alwaysAllowedScopes = [
    'account/login',
    'account/register',
    's5/registry/read',
    's5/metadata',
    's5/debug/storage_locations',
    's5/debug/download_urls',
    's5/blob/redirect',
]

Advanced

cache

Configure a custom cache path with path, you likely don't need this if you are using Docker.

database

Configure a custom database path, you likely don't need this if you are using Docker.

p2p.peers

List of initial peers used for connecting to the p2p network.

Caddy reverse proxy

Caddy is an easy to use reverse proxy with automatic HTTPS.

You can install it by following the instructions over at https://caddyserver.com/docs/install

You'll also need a domain name with A and AAAA records pointed to your server.

You should also make sure that your firewall doesn't block the ports 80 and 443

Configuration

With the default S5 port of 5050, you can configure your /etc/caddy/Caddyfile like this:

YOUR.DOMAIN {
  reverse_proxy localhost:5050
}

On Debian and Ubuntu you can run sudo systemctl restart caddy to restart Caddy after editing the Caddyfile.

Don't forget to configure http.api.domain in your S5 config.toml after setting up a domain and reverse proxy!

Stores

The S5 network and nodes support multiple different storage backends.

S3 is the easiest to set up, Sia is the cheapest option.

The local store keeps all files on your server directly, so it usually only makes sense for a home NAS use case or a small number of files.

Arweave provides permanent storage for a high price.

S3-compatible providers

Any cloud provider supporting the S3 protocol, see https://s3.wiki for the cheapest ones.

Configuration

[store.s3]
accessKey = "YOUR_ACCESS_KEY"
bucket = "YOUR_BUCKET_NAME"
endpointUrl = "YOUR_S3_ENDPOINT_URL"
secretKey = "YOUR_SECRET_KEY"

Local

Stores uploaded files on the local filesystem.

Configuration

[store.local]
path = "/data" # If you are using the Docker container

[store.local.http]
bind = "127.0.0.1"
port = 8989
url = "http://localhost:8989"

By default, files will only be available on your local node. To make them available on the entire network, you have to forward your port so it is reachable from the internet and then update the url to the URL at which your computer can be reached from the internet.

Sia Network

The Sia network provides decentralized and redundant data storage.

This page shows how to use Sia with the native integration; in most cases you should follow this guide for the S3-based integration instead: /guide/setup-with-sia.html

You will need a fully configured local instance of renterd: https://github.com/SiaFoundation/renterd

Warning: Both renterd and this integration are still experimental. Please report any bugs you encounter.

Configuration

[store.sia]
workerApiUrl = "http://localhost:9980/api/worker"
apiPassword = "test"
downloadUrl = "https://dl.YOUR.DOMAIN"

Using Caddy as a reverse proxy for Sia downloads

This configuration requires a version of Caddy with https://github.com/caddyserver/cache-handler, if you don't want to cache Sia downloads you can remove the first 4 lines and the cache directive.

/etc/caddy/Caddyfile:

{
    order cache before rewrite
    cache
}

dl.YOUR.DOMAIN {
  uri strip_suffix /

  header {
    Access-Control-Allow-Origin *
  }

  cache {
    stale 6h
    ttl 24h
    default_cache_control "public, max-age=86400"
    nuts {
      path /tmp/nuts
    }
  }

  rewrite * /api/worker/objects/1{path}

  reverse_proxy {
    to localhost:9980
    header_up Authorization "Basic OnRlc3Q=" # Change this to match your renterd API key
  }
}

Arweave

Arweave is expensive, but provides permanent storage for a one-time payment. Check out https://www.arweave.org/

Disabled right now

Metadata formats

This section contains documentation for all metadata formats used and supported by S5.

All formats have a JSON representation for easy creation, debugging and editing.

All formats also have a highly optimized serialization representation based on https://msgpack.org/, used for storing them on S5 including (optional) signatures and timestamp proofs.

JSON Schemas for all formats are available here: https://github.com/s5-dev/json-schemas

Web App metadata

Metadata format used for web apps stored on S5. This docs website is hosted using it.

Example

Web App Metadata: https://cid.one/#blepzzclchbhwull3is56zvubovg7j3cfmatxx5gyspfx3dowhyutzai

Fields

Full JSON Schema: https://schema.sfive.net/web-app-metadata.json

Web-based viewer: https://json-schema.app/view/%23?url=https%3A%2F%2Fschema.sfive.net%2Fweb-app-metadata.json

Directory metadata

Work-in-progress, will be used to store directory trees in Vup. Supports advanced sharing capabilities and is fully end-to-end-encrypted by default.

Media metadata

Very flexible metadata format used for almost any more advanced content/media structure.

Can be used for videos, images, music, podcasts, profiles, lists and more!

Already being used by Tube5.

Example

Media Metadata: https://cid.one/#z5TTvXtbkQk9PTUN8r5oNSz5Trmf1NjJwkVoNvfawGKDtPCB

Fields

Full JSON Schema: https://schema.sfive.net/media-metadata.json

Web-based viewer: https://json-schema.app/view/%23?url=https%3A%2F%2Fschema.sfive.net%2Fmedia-metadata.json