2024-05-09 17:09:00 +03:00
parent 2b1d0dc54c
commit 2eed0b65b7
211 changed files with 154 additions and 144 deletions

View File

@@ -0,0 +1,97 @@
# Quantum Safe Storage System for NFT
![](img/nft_architecture.jpg)
The owner of the NFT can upload the data using one of our supported interfaces:
- HTTP upload (everything possible on https://nft.storage/ is also possible on our system)
- filesystem
Anyone in the world can retrieve the NFT (if allowed), and the data is verified on retrieval. The data is available everywhere in the world, again over multiple interfaces (IPFS, HTTP(S), ...). Caching happens at a global level. No special software or ThreeFold account is needed to do this.
The NFT system uses a highly reliable storage system underneath, which is sustainable for the planet (green), ultra secure and private. The NFT owner also owns the data.
## Benefits
#### Persistence = owned by the data user (as represented by digital twin)
![](img/nft_storage.jpg)
The system is not based on a shared-all architecture.
Whoever stores the data has full control over
- where the data is stored (specific locations)
- the redundancy policy used
- how long the data is kept
- the CDN policy (where the data is available and for how long)
#### Reliability
- data cannot be corrupted
- data cannot be lost
- each time data is fetched, its hash (fingerprint) is checked; if there is an issue, auto-recovery happens
- all data is encrypted and compressed (unique per storage owner)
- data owner chooses the level of redundancy
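
The fingerprint check in the list above can be pictured as follows. This is only a sketch of the idea, not the storage system's actual code: SHA-256 is an assumption (the text does not name the hash algorithm), and `verify_fingerprint` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch only: SHA-256 and the helper name are assumptions.
# Returns 0 when the file still matches the fingerprint recorded at upload time.
verify_fingerprint() {
    local file=$1 expected=$2
    local actual
    actual=$(sha256sum "$file" | cut -d' ' -f1)
    [ "$actual" = "$expected" ]
}

# Usage: record the fingerprint when storing, check it again when fetching.
tmp=$(mktemp)
printf 'hello' > "$tmp"
fingerprint=$(sha256sum "$tmp" | cut -d' ' -f1)
if verify_fingerprint "$tmp" "$fingerprint"; then
    echo "data intact"
else
    echo "corruption detected, recovery needed"
fi
rm -f "$tmp"
```
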
#### Lookup
- multi URL & storage network support (see further the interfaces section)
- IPFS, HyperDrive URL schema
- unique DNS schema (with long key which is globally unique)
#### CDN support (with caching)
Each file (movie, image) stored is available in many places worldwide.
Each file gets a unique URL pointing to the data, which can be retrieved from all locations.
Caching happens on each endpoint.
#### Self Healing & Auto Correcting Storage Interface
Any corruption e.g. bitrot gets automatically detected and corrected.
In case of a hard disk or storage node crash, the data is automatically expanded again to match the chosen redundancy policy.
#### Storage Algorithm = Uses Quantum Safe Storage System as base
Not even a quantum computer can hack data as stored on our QSSS.
The QSSS is an innovative storage system which works on planetary scale and has many benefits compared to shared and/or replicated storage systems.
It uses forward error correcting codes inside.
#### Green
Storage uses up to 10x less energy compared to a classic replicated system.
#### Multi Interface
The stored data is available over multiple interfaces at once.
| interface | |
| -------------------------- | ----------------------- |
| IPFS | ![](img/ipfs.jpg) |
| HyperDrive / HyperCore | ![](img/hyperdrive.jpg) |
| http(s) on top of FreeFlow | ![](img/http.jpg) |
| syncthing | ![](img/syncthing.jpg) |
| filesystem | ![](img/filesystem.jpg) |
This allows ultimate flexibility from an end-user perspective.
The object (video, image) can easily be embedded in any website or other representation which supports HTTP.
## More Info
* [Zero-OS overview](zos)
* [Quantum Safe Storage System](qsss_home)
* [Quantum Safe Storage Algorithm](qss_algorithm)
* [Smart Contract For IT Layer](smartcontract_it)
!!!def alias:nft_storage,nft_storage_system

View File

@@ -0,0 +1,12 @@
## Quantum Safe Storage use cases
### Backup
A perfect use case for the QSS is backup. The specific capabilities needed for backup are a core part of a proper backup policy. Characteristics of QSS that make backups secure, scalable, efficient and sustainable are:
- Physical storage devices are append-only. The lowest level of the storage devices, the ZDBs, are storage engines that work by design as always-append storage devices.
- Easy provisioning of these ZDBs makes them almost like old-fashioned tape devices that you keep on a rotating schedule. This capability makes it possible to use, store and phase out stored data in a way that is visible, auditable and transparent.
-
### Archiving
###

View File

@@ -0,0 +1,15 @@
# S3 Service
If you want an S3 interface, you can deploy one on top of our eVDC. It works very well together with our [quantumsafe_filesystem](quantumsafe_filesystem).
A good open-source S3 solution is [min.io](https://min.io/).
Thanks to our quantum safe storage layer, you can build fast, robust and reliable storage and archiving solutions.
A typical setup would look like:
![](img/storage_architecture_1.jpg)
> TODO: link to manual on cloud how to deploy minio, using helm (3.0 release)
!!!def alias:s3_storage

View File

@@ -0,0 +1,297 @@
# Getting started with QSFS on Ubuntu
## Get components
The following steps can be followed to set up a qsfs instance on a fresh
Ubuntu instance.
- Install the fuse kernel module (`apt-get update && apt-get install fuse3`)
- Install the individual components by downloading the latest release from the
  respective release pages:
  - 0-db-fs: https://github.com/threefoldtech/0-db-fs/releases
  - 0-db: https://github.com/threefoldtech/0-db, if multiple binaries
    are available in the assets, choose the one ending in `static`
  - 0-stor: https://github.com/threefoldtech/0-stor_v2/releases, if
    multiple binaries are available in the assets, choose the one
    ending in `musl`
- Make sure all binaries are executable (`chmod +x $binary`)
## Setup and run 0-stor
There are instructions below for a local 0-stor configuration. You can also deploy an eVDC and use the [provided 0-stor configuration](evdc_storage) for a simple cloud hosted solution.
We will run 7 0-db instances as backends for 0-stor. 4 are used for the
metadata, 3 are used for the actual data. The metadata always consists
of 4 nodes. The number of data backends can be increased. You can choose to either
run 7 separate 0-db processes, or a single process with 7 namespaces.
For the purpose of this setup, we will start 7 separate processes, as
such:
> This assumes you have moved the downloaded 0-db binary to `/tmp/0-db`
```bash
/tmp/0-db --background --mode user --port 9990 --data /tmp/zdb-meta/zdb0/data --index /tmp/zdb-meta/zdb0/index
/tmp/0-db --background --mode user --port 9991 --data /tmp/zdb-meta/zdb1/data --index /tmp/zdb-meta/zdb1/index
/tmp/0-db --background --mode user --port 9992 --data /tmp/zdb-meta/zdb2/data --index /tmp/zdb-meta/zdb2/index
/tmp/0-db --background --mode user --port 9993 --data /tmp/zdb-meta/zdb3/data --index /tmp/zdb-meta/zdb3/index
/tmp/0-db --background --mode seq --port 9980 --data /tmp/zdb-data/zdb0/data --index /tmp/zdb-data/zdb0/index
/tmp/0-db --background --mode seq --port 9981 --data /tmp/zdb-data/zdb1/data --index /tmp/zdb-data/zdb1/index
/tmp/0-db --background --mode seq --port 9982 --data /tmp/zdb-data/zdb2/data --index /tmp/zdb-data/zdb2/index
```
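
The seven commands above differ only in mode, port and directory, so they can also be generated in a loop. A minimal sketch, assuming the same binary location and flags as above; `zdb_commands` is a name made up for this example, and the function only prints the commands so they can be reviewed before anything is started.

```bash
#!/usr/bin/env bash
# Print (not run) the 0-db start commands used in this guide.
# zdb_commands is a hypothetical helper; the flags are the ones shown above.
zdb_commands() {
    local bin="${1:-/tmp/0-db}"
    local i
    # 4 metadata backends: user mode, ports 9990-9993
    for i in 0 1 2 3; do
        echo "$bin --background --mode user --port $((9990 + i)) --data /tmp/zdb-meta/zdb$i/data --index /tmp/zdb-meta/zdb$i/index"
    done
    # 3 data backends: seq mode, ports 9980-9982
    for i in 0 1 2; do
        echo "$bin --background --mode seq --port $((9980 + i)) --data /tmp/zdb-data/zdb$i/data --index /tmp/zdb-data/zdb$i/index"
    done
}

zdb_commands            # review the output first
# zdb_commands | bash   # then run it, once the output looks right
```
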
Now that the data storage is running, we can create the config file for
0-stor. The (minimal) config for this example setup will look as follows:
```toml
minimal_shards = 2
expected_shards = 3
redundant_groups = 0
redundant_nodes = 0
socket = "/tmp/zstor.sock"
prometheus_port = 9100
zdb_data_dir_path = "/tmp/zdbfs/data/zdbfs-data"
max_zdb_data_dir_size = 25600
[encryption]
algorithm = "AES"
key = "000001200000000001000300000004000a000f00b00000000000000000000000"
[compression]
algorithm = "snappy"
[meta]
type = "zdb"
[meta.config]
prefix = "someprefix"
[meta.config.encryption]
algorithm = "AES"
key = "0101010101010101010101010101010101010101010101010101010101010101"
[[meta.config.backends]]
address = "[::1]:9990"
[[meta.config.backends]]
address = "[::1]:9991"
[[meta.config.backends]]
address = "[::1]:9992"
[[meta.config.backends]]
address = "[::1]:9993"
[[groups]]
[[groups.backends]]
address = "[::1]:9980"
[[groups.backends]]
address = "[::1]:9981"
[[groups.backends]]
address = "[::1]:9982"
```
> A full explanation of all options can be found in the 0-stor readme:
https://github.com/threefoldtech/0-stor_v2/#config-file-explanation
This guide assumes the config file is saved as `/tmp/zstor_config.toml`.
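
To make the two shard settings in the config concrete: assuming, as the option names suggest, that `minimal_shards` is the number of shards needed to recover the data and `expected_shards` the number of shards generated, the difference is the number of shards that may be lost. The sketch below checks a policy against the number of data backends. `check_policy` is an illustrative helper, not part of 0-stor, and it assumes each shard is stored on a different backend.

```bash
#!/usr/bin/env bash
# Illustrative helper, not part of 0-stor.
# Assumption: each shard is stored on a different data backend.
# Arguments: minimal_shards expected_shards number_of_data_backends
check_policy() {
    local minimal=$1 expected=$2 backends=$3
    if (( minimal < 1 || expected < minimal )); then
        echo "invalid: expected_shards must be >= minimal_shards >= 1"
        return 1
    fi
    if (( backends < expected )); then
        echo "invalid: $backends backends cannot hold $expected shards"
        return 1
    fi
    echo "ok: up to $(( expected - minimal )) shard(s) may be lost"
}

# The example config: minimal_shards = 2, expected_shards = 3, 3 data backends
check_policy 2 3 3   # prints: ok: up to 1 shard(s) may be lost
```
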
Now `zstor` can be started. Assuming the downloaded binary was saved as
`/tmp/zstor`:
`/tmp/zstor -c /tmp/zstor_config.toml monitor`. If you don't want the
process to block your terminal, you can start it in the background:
`nohup /tmp/zstor -c /tmp/zstor_config.toml monitor &`.
## Setup and run 0-db
First we will get the hook script. The hook script can be found in the
[quantum-storage repo on GitHub](https://github.com/threefoldtech/quantum-storage).
A slightly modified version is shown here:
```bash
#!/usr/bin/env bash
set -ex

action="$1"
instance="$2"
zstorconf="/tmp/zstor_config.toml"
zstorbin="/tmp/zstor"

if [ "$action" == "ready" ]; then
    ${zstorbin} -c ${zstorconf} test
    exit $?
fi

if [ "$action" == "jump-index" ]; then
    namespace=$(basename "$(dirname "$3")")
    if [ "${namespace}" == "zdbfs-temp" ]; then
        # skipping temporary namespace
        exit 0
    fi

    tmpdir=$(mktemp -p /tmp -d zdb.hook.XXXXXXXX.tmp)
    dirbase=$(dirname "$3")

    # upload dirty index files ($5 is left unquoted on purpose: it is a list)
    for dirty in $5; do
        file=$(printf "i%d" "$dirty")
        cp "${dirbase}/${file}" "${tmpdir}/"
    done

    ${zstorbin} -c ${zstorconf} store -s -d -f "${tmpdir}" -k "${dirbase}" &
    exit 0
fi

if [ "$action" == "jump-data" ]; then
    namespace=$(basename "$(dirname "$3")")
    if [ "${namespace}" == "zdbfs-temp" ]; then
        # skipping temporary namespace
        exit 0
    fi

    # backup data file
    ${zstorbin} -c ${zstorconf} store -s --file "$3"
    exit 0
fi

if [ "$action" == "missing-data" ]; then
    # restore missing data file
    ${zstorbin} -c ${zstorconf} retrieve --file "$3"
    exit $?
fi

# unknown action
exit 1
```
> This guide assumes the file is saved as `/tmp/zdbfs/zdb-hook.sh`. Make sure the
> file is executable, i.e. `chmod +x /tmp/zdbfs/zdb-hook.sh`
The local 0-db which is used by 0-db-fs can be started as follows:
```bash
/tmp/0-db \
--index /tmp/zdbfs/index \
--data /tmp/zdbfs/data \
--datasize 67108864 \
--mode seq \
--hook /tmp/zdbfs/zdb-hook.sh \
--background
```
## Setup and run 0-db-fs
Finally, we will start 0-db-fs. This guide opts to mount the FUSE
filesystem in `/mnt`. Again, assuming the 0-db-fs binary was saved as
`/tmp/0-db-fs`:
```bash
/tmp/0-db-fs /mnt -o autons -o background
```
You should now have the qsfs filesystem mounted at `/mnt`. As you write
data, it is saved in the local 0-db, and its data containers are
periodically encoded and uploaded to the backend data storage 0-dbs.
The data files in the local 0-db are limited to 25GiB of
space (as configured in the 0-stor config file). If a data container is
removed due to space constraints, and data inside of it needs to be
accessed by the filesystem (e.g. a file is being read), then the data
container is recovered from the backend storage 0-dbs by 0-stor, and
0-db can subsequently serve this data to 0-db-fs.
### 0-db-fs limitations
Any workload should be supported on this filesystem, with some exceptions:
- Opening a file in 'always append mode' will not have the expected behavior
- The FUSE layer does not support O_TMPFILE, a feature required by
  overlayfs, so overlayfs is not supported. Overlayfs is used by Docker, for example.
## Docker setup
It is possible to run zstor in a Docker container. First, create a data directory
on your host. Then, save the config file in the data directory as `zstor.toml`. Ensure
the storage 0-dbs are running as described above. Then, run the Docker container
as such:
```
docker run -ti --privileged --rm --network host --name fstest -v /path/to/data:/data -v /mnt:/mnt:shared azmy/qsfs
```
The filesystem is now available in `/mnt`.
## Autorepair
Autorepair automatically repairs objects stored in the backend when one or more shards
are not reachable anymore. It does this by periodically checking if all the backends
are still reachable. If it detects that one or more of the backends used by an encoded
object are not reachable, the healthy shards are downloaded, the object is restored
and encoded again (possibly with a new config, if it has since changed), and uploaded
again.
Autorepair does not validate the integrity of individual shards. Shard corruption is protected
against by having multiple spare (redundant) shards for an object. Corrupt shards
are detected when the object is rebuilt, and removed before attempting to rebuild.
Autorepair also does not repair the metadata of objects.
## Monitoring, alerting and statistics
0-stor collects metrics about the system. It can be configured with a 0-db-fs mountpoint,
which will trigger 0-stor to collect 0-db-fs statistics, next to some 0-db statistics
which are always collected. If the `prometheus_port` config option is set, 0-stor
will serve metrics on this port for scraping by Prometheus. You can then set up
graphs and alerts in Grafana. Some examples include: disk space used vs available
per 0-db backend, total entries in 0-db backends, which backends are tracked, ...
When 0-db-fs monitoring is enabled, statistics are also exported about the filesystem
itself, such as read/write speeds, syscalls, and internal metrics.
For a full overview of all available stats, you can set up a Prometheus scraper against
a running instance, and use the embedded PromQL to see everything available.
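
For a quick look without a full Prometheus setup, the metric names can also be listed straight from the text output. The sketch below is a hypothetical helper that reads the Prometheus text format from stdin. The `/metrics` path is the usual Prometheus convention and an assumption here, and the metric names in the demo are made up.

```bash
#!/usr/bin/env bash
# List the unique metric names in Prometheus text-format output read from stdin.
# Comment lines (# HELP, # TYPE) and empty lines are dropped; labels and values are cut off.
metric_names() {
    grep -v -e '^#' -e '^$' | sed -E 's/[{ ].*$//' | sort -u
}

# Against a running instance (path assumed; 9100 is the prometheus_port
# from the example config):
#   curl -s http://localhost:9100/metrics | metric_names

# Self-contained demo with made-up metric names:
printf '%s\n' \
    '# HELP example_entries Total entries (made-up metric)' \
    '# TYPE example_entries gauge' \
    'example_entries{backend="[::1]:9980"} 12' \
    'example_entries{backend="[::1]:9981"} 9' \
    'example_free_bytes 1024' | metric_names
```
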
## Data safety
As explained in the Autorepair section, data is periodically checked and rebuilt if
0-db backends become unreachable. This ensures that data, once stored, remains available,
as long as the metadata is still present. When needed, the system can be expanded with more
0-db backends, and the encoding config can be changed (e.g. to change encryption keys).
## Performance
Qsfs is not a high speed filesystem, nor is it a distributed filesystem. It is intended to
be used for archive purposes. For this reason, the qsfs stack focuses on data safety first.
Where needed, reliability is chosen over availability (i.e. we won't write data if we can't
guarantee that all the conditions in the required storage profile are met).
With that being said, there are currently 2 limiting factors in the setup:
- speed of the disk on which the local 0-db is running
- network
The first is the speed of the disk for the local 0-db. This imposes a hard limit on
the throughput of the filesystem. Performance testing has shown that write speeds
on the filesystem reach roughly 1/3 of the raw write performance of the
disk, and read speeds roughly 1/2 of the raw read performance. Note that in the case of _very_
fast disks (mostly NVMe SSDs), the CPU might become a bottleneck if it is old and
has a low clock speed, though this should not normally be a problem.
The network is more of a soft cap. All 0-db data files will be encoded and distributed
over the network. This means that the upload speed of the node needs to be able to
handle this data throughput. In the case of random data (which is not compressible),
the required upload speed would be the write speed of the 0-db-fs, increased by the
overhead generated by the storage policy. There is no feedback to 0-db-fs if the upload
of data is lagging behind. This means that in cases where a sustained high speed write
load is applied, the local 0-db might eventually grow bigger than the configured size limit
until the upload manages to catch up. If this happens for prolonged periods of time, it
is technically possible to run out of space on the disk. For this reason, you should
always have some extra space available on the disk to account for temporary cache
excess.
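
As a rough illustration of the network requirement for incompressible data: with the example policy of 2 minimal and 3 expected shards, every byte written results in roughly 1.5 bytes uploaded. The sketch below does that arithmetic with integers. The helper name is hypothetical, and it assumes each shard is 1/`minimal_shards` the size of the data, ignoring metadata and protocol overhead.

```bash
#!/usr/bin/env bash
# Hypothetical helper: upload speed needed to keep up with a sustained write load.
# Assumption: each shard is 1/minimal_shards the size of the (incompressible) data.
# Arguments: write_speed minimal_shards expected_shards
required_upload_speed() {
    local write_speed=$1 minimal=$2 expected=$3
    echo $(( write_speed * expected / minimal ))
}

# 100 MB/s of writes with the 2/3 policy from the example config
required_upload_speed 100 2 3   # prints 150
```
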
When encoded data needs to be recovered from backend nodes (if it is not in cache),
the read speed will be equal to the connection speed of the slowest backend, as all
shards are recovered before the data is built. This means that recovery of historical
data will generally be a slow process. Since we primarily focus on archive storage,
we do not consider this a priority.

View File

@@ -0,0 +1,9 @@
#!/bin/bash
for name in ./*.mmd
do
    output="$(basename "$name" mmd)png"
    echo "$output"
    mmdc -i "$name" -o "$output" -w 4096 -H 2160 -b transparent
    echo "$name"
done

View File

@@ -0,0 +1,13 @@
graph TD
subgraph Data Origin
file[Large chunk of data = part_1part_2part_3part_4]
parta[part_1]
partb[part_2]
partc[part_3]
partd[part_4]
file -.- |split part_1|parta
file -.- |split part_2|partb
file -.- |split part_3|partc
file -.- |split part_4|partd
parta --> partb --> partc --> partd
end

View File

@@ -0,0 +1,20 @@
graph TD
subgraph Data Substitution
parta[part_1]
partb[part_2]
partc[part_3]
partd[part_4]
parta -.-> vara[ A = part_1]
partb -.-> varb[ B = part_2]
partc -.-> varc[ C = part_3]
partd -.-> vard[ D = part_4]
end
subgraph Create equations with the data parts
eq1[A + B + C + D = 6]
eq2[A + B + C - D = 3]
eq3[A + B - C - D = 10]
eq4[ A - B - C - D = -4]
eq5[ A - B + C + D = 0]
eq6[ A - B - C + D = 5]
vara & varb & varc & vard --> eq1 & eq2 & eq3 & eq4 & eq5 & eq6
end

View File

@@ -0,0 +1,44 @@
graph TD
subgraph Data Origin
file[Large chunk of data = part_1part_2part_3part_4]
parta[part_1]
partb[part_2]
partc[part_3]
partd[part_4]
file -.- |split part_1|parta
file -.- |split part_2|partb
file -.- |split part_3|partc
file -.- |split part_4|partd
parta --> partb --> partc --> partd
parta -.-> vara[ A = part_1]
partb -.-> varb[ B = part_2]
partc -.-> varc[ C = part_3]
partd -.-> vard[ D = part_4]
end
subgraph Create equations with the data parts
eq1[A + B + C + D = 6]
eq2[A + B + C - D = 3]
eq3[A + B - C - D = 10]
eq4[ A - B - C - D = -4]
eq5[ A - B + C + D = 0]
eq6[ A - B - C + D = 5]
vara & varb & varc & vard --> eq1 & eq2 & eq3 & eq4 & eq5 & eq6
end
subgraph Disk 1
eq1 --> |store the unique equation, not the parts|zdb1[A + B + C + D = 6]
end
subgraph Disk 2
eq2 --> |store the unique equation, not the parts|zdb2[A + B + C - D = 3]
end
subgraph Disk 3
eq3 --> |store the unique equation, not the parts|zdb3[A + B - C - D = 10]
end
subgraph Disk 4
eq4 --> |store the unique equation, not the parts|zdb4[A - B - C - D = -4]
end
subgraph Disk 5
eq5 --> |store the unique equation, not the parts|zdb5[ A - B + C + D = 0]
end
subgraph Disk 6
eq6 --> |store the unique equation, not the parts|zdb6[A - B - C + D = 5]
end

View File

@@ -0,0 +1,34 @@
graph TD
subgraph Local laptop, computer or server
user[End User]
protocol[Storage protocol]
qsfs[Filesystem on local OS]
0store[Quantum Safe storage engine]
end
subgraph Grid storage - metadata
etcd1[ETCD-1]
etcd2[ETCD-2]
etcd3[ETCD-3]
end
subgraph Grid storage - zero proof data
zdb1[ZDB-1]
zdb2[ZDB-2]
zdb3[ZDB-3]
zdb4[ZDB-4]
zdb5[ZDB-5]
zdb6[ZDB-6]
zdb7[ZDB-7]
user -.- protocol
protocol -.- qsfs
qsfs --- 0store
0store --- etcd1
0store --- etcd2
0store --- etcd3
0store <-.-> zdb1[ZDB-1]
0store <-.-> zdb2[ZDB-2]
0store <-.-> zdb3[ZDB-3]
0store <-.-> zdb4[ZDB-4]
0store <-.-> zdb5[ZDB-5]
0store <-.-> zdb6[ZDB-...]
0store <-.-> zdb7[ZDB-N]
end

View File

@@ -0,0 +1,9 @@
#!/bin/bash
for name in ./*.mmd
do
    output="$(basename "$name" mmd)png"
    echo "$output"
    mmdc -i "$name" -o "$output" -w 4096 -H 2160 -b transparent
    echo "$name"
done

View File

@@ -0,0 +1,39 @@
<!-- ![](img/filesystem_abstract.jpg) -->
![](img/qsss_intro_.jpg)
# Quantum Safe Filesystem
A redundant filesystem that can store petabytes (millions of gigabytes) of information.
Unique features:
- Unlimited scalability (many petabytes)
- Quantum Safe:
  - On the TFGrid, no farmer knows what the data is about
  - Even a quantum computer cannot decrypt the data
- Data can't be lost
- Protection against [datarot](datarot), data will autorepair
- Data is kept forever
- Data is dispersed over multiple sites
- Sites can go down without data being lost
- Up to 10x more efficient than storing on classic storage cloud systems
- Can be mounted as a filesystem on any OS or any deployment system (OSX, Linux, Windows, Docker, Kubernetes, TFGrid, ...)
- Compatible with almost all data workloads (except high-performance, data-driven workloads such as databases)
- Self-healing: when a node or disk is lost, the storage system can get back to the original redundancy level
- Helps with compliance with regulations like GDPR (as the hosting facility has no view on what is stored; information is encrypted and incomplete)
- Hybrid: can be installed onsite, public, private, ...
- Read-write caching on encoding node (the front end)
## Architecture
By using our filesystem inside a Virtual Machine or Kubernetes, the TFGrid user can deploy any storage application on top, e.g. Minio for S3 storage or OwnCloud as an online fileserver.
![](img/qsstorage_architecture.jpg)
Any storage workload can be deployed on top of zstor.
!!!def alias:quantumsafe_filesystem,planetary_fs,planet_fs,quantumsafe_file_system,zstor,qsfs
!!!include:qsss_toc

View File

@@ -0,0 +1,14 @@
graph TD
subgraph Data Ingress and Egress
qss[Quantum Safe Storage Engine]
end
subgraph Physical Data storage
st1[Virtual Storage Device 1]
st2[Virtual Storage Device 2]
st3[Virtual Storage Device 3]
st4[Virtual Storage Device 4]
st5[Virtual Storage Device 5]
st6[Virtual Storage Device 6]
st7[Virtual Storage Device 7]
qss -.-> st1 & st2 & st3 & st4 & st5 & st6 & st7
end

View File

@@ -0,0 +1,9 @@
#!/bin/bash
for name in ./*.mmd
do
    output="$(basename "$name" mmd)png"
    echo "$output"
    mmdc -i "$name" -o "$output" -w 4096 -H 2160 -b transparent
    echo "$name"
done

View File

@@ -0,0 +1,13 @@
graph TD
subgraph Data Origin
file[Large chunk of data = part_1part_2part_3part_4]
parta[part_1]
partb[part_2]
partc[part_3]
partd[part_4]
file -.- |split part_1|parta
file -.- |split part_2|partb
file -.- |split part_3|partc
file -.- |split part_4|partd
parta --> partb --> partc --> partd
end

View File

@@ -0,0 +1,20 @@
graph TD
subgraph Data Substitution
parta[part_1]
partb[part_2]
partc[part_3]
partd[part_4]
parta -.-> vara[ A = part_1]
partb -.-> varb[ B = part_2]
partc -.-> varc[ C = part_3]
partd -.-> vard[ D = part_4]
end
subgraph Create equations with the data parts
eq1[A + B + C + D = 6]
eq2[A + B + C - D = 3]
eq3[A + B - C - D = 10]
eq4[ A - B - C - D = -4]
eq5[ A - B + C + D = 0]
eq6[ A - B - C + D = 5]
vara & varb & varc & vard --> eq1 & eq2 & eq3 & eq4 & eq5 & eq6
end

View File

@@ -0,0 +1,44 @@
graph TD
subgraph Data Origin
file[Large chunk of data = part_1part_2part_3part_4]
parta[part_1]
partb[part_2]
partc[part_3]
partd[part_4]
file -.- |split part_1|parta
file -.- |split part_2|partb
file -.- |split part_3|partc
file -.- |split part_4|partd
parta --> partb --> partc --> partd
parta -.-> vara[ A = part_1]
partb -.-> varb[ B = part_2]
partc -.-> varc[ C = part_3]
partd -.-> vard[ D = part_4]
end
subgraph Create equations with the data parts
eq1[A + B + C + D = 6]
eq2[A + B + C - D = 3]
eq3[A + B - C - D = 10]
eq4[ A - B - C - D = -4]
eq5[ A - B + C + D = 0]
eq6[ A - B - C + D = 5]
vara & varb & varc & vard --> eq1 & eq2 & eq3 & eq4 & eq5 & eq6
end
subgraph Disk 1
eq1 --> |store the unique equation, not the parts|zdb1[A + B + C + D = 6]
end
subgraph Disk 2
eq2 --> |store the unique equation, not the parts|zdb2[A + B + C - D = 3]
end
subgraph Disk 3
eq3 --> |store the unique equation, not the parts|zdb3[A + B - C - D = 10]
end
subgraph Disk 4
eq4 --> |store the unique equation, not the parts|zdb4[A - B - C - D = -4]
end
subgraph Disk 5
eq5 --> |store the unique equation, not the parts|zdb5[ A - B + C + D = 0]
end
subgraph Disk 6
eq6 --> |store the unique equation, not the parts|zdb6[A - B - C + D = 5]
end

View File

@@ -0,0 +1,82 @@
# Quantum Safe Storage Algorithm
![](img/tf_banner_grid_.jpg)
The Quantum Safe Storage Algorithm is the heart of the storage engine. The storage engine takes the original data objects and creates data part descriptions that it stores over many virtual storage devices (ZDBs).
![](../img/.jpg)
Data gets stored over multiple ZDBs in such a way that data can never be lost.
Unique features:
- data is always appended and can never be lost
- even a quantum computer cannot decrypt the data
- data is spread over multiple sites; sites can be lost and data will still be available
- protects against [datarot](datarot)
### Why
Today we produce more data than ever before. We cannot continue to make full copies of data to make sure it is stored reliably. This simply does not scale. We need to move from securing the whole dataset to securing all the objects that make up a dataset.
ThreeFold is using space technology to store data (fragments) over multiple devices (physical storage devices in 3Nodes). The solution does not distribute and store parts of an object (file, photo, movie...) but descriptions of the parts of an object. These descriptions can be visualized as equations.
### Details
Let a, b, c, d, ... be the parts of the original object. You could create endless unique equations using these parts. A simple example: let's assume the original object has 3 parts with the following values:
```
a=1
b=2
c=3
```
(For reference: a part of a real-world object is not a simple number like `1` but a unique digital number describing the part, like its binary code `110101011101011101010111101110111100001010101111011.....`.) With these numbers we could create endless amounts of equations:
```
1: a+b+c=6
2: c-b-a=0
3: b-c+a=0
4: 2b+a-c=2
5: 5c-b-a=12
......
```
Mathematically we only need 3 equations to describe the content (=value) of the fragments, but creating more adds reliability. Now store those equations distributed (one equation per physical storage device) and forget the original object. We no longer have access to the values of a, b and c; we just remember the locations of all the equations created with the original data fragments. Mathematically we need three equations (any 3 of the total, as long as they are independent of each other; equations 2 and 3 above, for instance, are each other's negative and count as one) to recover the original values for a, b and c. So do a request to retrieve 3 of the many equations, and the first 3 to arrive are good enough to recalculate the original values. Three randomly retrieved equations are:
```
5c-b-a=12
b-c+a=0
2b+a-c=2
```
And this is a mathematical system we could solve:
- First: `b-c+a=0 -> b=c-a`
- Second: `2b+a-c=2 -> c=2b+a-2 -> c=2(c-a)+a-2 -> c=2c-2a+a-2 -> c=a+2`
- Third: `5c-b-a=12 -> 5(a+2)-(c-a)-a=12 -> 5a+10-(a+2)+a-a=12 -> 5a-a-2=2 -> 4a=4 -> a=1`
Now that we know `a=1` we can solve the rest: `c=a+2=3` and `b=c-a=2`. From 3 random equations we have regenerated the original fragments and can now recreate the original object.
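
The same recovery can be done mechanically. The sketch below solves the three retrieved equations with Cramer's rule using plain integer arithmetic. It only illustrates the toy example above: the function names are made up, the division is only exact when the solution is made of whole numbers, and a real storage engine works on binary data rather than small integers.

```bash
#!/usr/bin/env bash
# Determinant of a 3x3 integer matrix, given row by row.
det3() {
    local a=$1 b=$2 c=$3 d=$4 e=$5 f=$6 g=$7 h=$8 i=$9
    echo $(( a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g) ))
}

# Solve three equations of the form  a*A + b*B + c*C = r  with Cramer's rule.
# Arguments: A1 B1 C1 r1  A2 B2 C2 r2  A3 B3 C3 r3 (coefficients of a, b, c and the result)
# Integer division: exact only when the solution consists of whole numbers.
solve3() {
    local a1=$1 b1=$2 c1=$3 r1=$4
    local a2=$5 b2=$6 c2=$7 r2=$8
    local a3=$9 b3=${10} c3=${11} r3=${12}
    local det det_a det_b det_c
    det=$(det3 "$a1" "$b1" "$c1" "$a2" "$b2" "$c2" "$a3" "$b3" "$c3")
    if (( det == 0 )); then
        echo "equations are not independent" >&2
        return 1
    fi
    # Replace one column at a time with the right-hand sides
    det_a=$(det3 "$r1" "$b1" "$c1" "$r2" "$b2" "$c2" "$r3" "$b3" "$c3")
    det_b=$(det3 "$a1" "$r1" "$c1" "$a2" "$r2" "$c2" "$a3" "$r3" "$c3")
    det_c=$(det3 "$a1" "$b1" "$r1" "$a2" "$b2" "$r2" "$a3" "$b3" "$r3")
    echo "a=$(( det_a / det )) b=$(( det_b / det )) c=$(( det_c / det ))"
}

# 5c-b-a=12, b-c+a=0, 2b+a-c=2, written as coefficients of a, b, c and the result
solve3  -1 -1  5 12 \
         1  1 -1  0 \
         1  2 -1  2
# prints: a=1 b=2 c=3
```

Passing two equations that describe the same relation (for example `c-b-a=0` and `b-c+a=0`) makes the determinant zero, and the sketch reports that the equations are not independent.
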
The redundancy and reliability in such a system come from creating (more than needed) equations and storing them. As shown, these equations can recreate the original fragments in any random order, and therefore redundancy comes at a much lower overhead.
### Example of 16/4
![](img/quantumsafe_storage_algo.jpg)
Each object is fragmented into 16 parts. So we have 16 original fragments, for which we need 16 equations to mathematically describe them. Now let's make 20 equations and store them dispersed over 20 devices. To recreate the original object we only need 16 equations: the first 16 that we find and collect allow us to recover the fragments and, in the end, the original object. We could lose any 4 of those original 20 equations.
The likelihood of losing 4 independent, dispersed storage devices at the same time is very low. Since we continuously monitor all of the stored equations, we can create additional equations immediately when one of them is missing, which makes this an auto-regenerating, self-repairing storage system. The overhead in this example is 4 out of 20, which is a mere **20%** instead of (up to) **400%.**
### Content distribution Policy (10/50)
This system can be used as backend for content delivery networks.
Imagine a movie being stored on 60 locations, of which we can lose 50 at the same time.
If someone now wants to download the data, the first 10 locations that answer will provide enough of the data parts to allow the data to be rebuilt.
The overhead here is much higher than in the previous example, but still orders of magnitude lower than in other CDN systems.
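
The numbers for both policies can be checked with a few lines of integer arithmetic. A minimal sketch: `policy_summary` is a made-up helper, percentages are rounded down, and it assumes every stored equation is the same size, namely the original object divided by the number of equations needed.

```bash
#!/usr/bin/env bash
# Made-up helper: summarize a dispersal policy.
# Arguments: needed = equations required to rebuild, spare = extra equations stored.
# Assumption: every equation has the size of the original divided by "needed".
policy_summary() {
    local needed=$1 spare=$2
    local stored=$(( needed + spare ))
    echo "stored=$stored can_lose=$spare spare_share=$(( spare * 100 / stored ))% stored_vs_original=$(( stored * 100 / needed ))%"
}

policy_summary 16 4    # stored=20 can_lose=4 spare_share=20% stored_vs_original=125%
policy_summary 10 50   # stored=60 can_lose=50 spare_share=83% stored_vs_original=600%
```
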
!!!def alias:quantumsafe_storage_algo,quantumsafe_storage_algorithm,space_algo,space_algorithm,quantum_safe_storage_algo,qs_algo,qs_codec
!!!include:qsss_toc

View File

@@ -0,0 +1,8 @@
# Datarot Cannot Happen on our Storage System
Datarot is the fact that data storage degrades over time and becomes unreadable, e.g. on a hard disk.
The storage system provided by ThreeFold intercepts this silent data corruption, so that it cannot pass unnoticed.
> see also https://en.wikipedia.org/wiki/Data_degradation
!!!def alias:bitrot,datarot

View File

@@ -0,0 +1,11 @@
# Zero Knowledge Proof Storage System
The quantum safe storage system is zero knowledge proof compliant. The storage system is split into 2 components: the actual storage devices used to store the data (ZDBs) and the Quantum Safe Storage engine.
![](img/qss_system.jpg)
The zero knowledge proof compliance comes from the fact that all the physical storage nodes (3Nodes) can prove that they store a valid part of the data that the quantum safe storage engine (QSSE) has stored on multiple independent devices. The QSSE can validate that all the storage devices hold a valid part of the original information. The storage devices, however, have no idea what the original stored data is, as they only have a part (description) of the original data and have no access to the original data part or the complete original data objects.
!!!def

View File

@@ -0,0 +1,7 @@
![](img/filesystem_abstract.jpg)
# Quantum Safe Storage System benefits
!!wiki.include page:'technology:qss_benefits.md'
!!wiki.include page:'technology:qsss_toc.md'

View File

@@ -0,0 +1,6 @@
- Up to 10x more efficient (power and usage of hardware)
- Ultra reliable, data can not be lost
- Ultra safe & private
- Ultra scalable
- Sovereign, data is close to you in the country of your choice
- Truly peer-to-peer, by everyone for everyone

View File

@@ -0,0 +1,2 @@
!!!include:qsss_home

View File

@@ -0,0 +1,21 @@
![](img/qsstorage_architecture.jpg)
# Quantum Safe Storage System
Imagine a storage system with the following benefits
!!!include:qss_benefits_
> This is not a dream but does exist and is the underpinning of the TFGrid.
Our storage architecture follows the true peer-to-peer design of the TF grid. Any participating node only stores small incomplete parts of objects (files, photos, movies, databases...) by offering a slice of its present (local) storage devices. Managing the storage and retrieval of all of these distributed fragments is done by software that creates development or end-user interfaces for this storage algorithm. We call this '**dispersed storage**'.
Peer-to-peer provides the unique proposition of selecting storage providers that match your application, service or business criteria. For example, you might be looking to store data for your application in a certain geographic area (for governance and compliance reasons). Also, you might want to use different "storage policies" for different types of data, for example live versus archived data. All of these use cases are possible with this storage architecture and can be built using the same building blocks produced by farmers and consumed by developers or end-users.
!!!include:qsss_toc
!!!def alias:qsss,quantum_safe_storage_system

View File

@@ -0,0 +1,34 @@
<!-- ![](img/qsss_intro_.jpg) -->
<h1> Quantum Safe Storage System </h1>
<h2>Table of Contents</h2>
- [Introduction](#introduction)
- [QSS Benefits](#qss-benefits)
- [Peer-to-Peer Design](#peer-to-peer-design)
- [Overview](#overview)
***
## Introduction
ThreeFold offers a quantum safe storage system (QSS). QSS is a decentralized, globally distributed data storage system. It is unbreakable, self-healing, append-only and immutable.
## QSS Benefits
Imagine a storage system with the following benefits:
!!wiki.include page:'technology:qss_benefits.md'
This is not a dream but does exist and is the underpinning of the TFGrid.
## Peer-to-Peer Design
Our storage architecture follows the true peer-to-peer design of the TF Grid. Any participating node stores only small, incomplete parts of objects (files, photos, movies, databases...) by offering a slice of its local storage devices. Storing and retrieving all of these distributed fragments is managed by software that exposes developer and end-user interfaces to this storage algorithm. We call this '**dispersed storage**'.
Peer-to-peer provides the unique proposition of selecting storage providers that match your application, service or business criteria. For example, you might want to store your application's data in a certain geographic area (for governance and compliance reasons). You might also want to use different "storage policies" for different types of data, such as live versus archived data. All of these use cases are possible with this storage architecture, and all can be built from the same building blocks, produced by farmers and consumed by developers or end-users.
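
The provider selection described above can be sketched as a simple filter. This is a minimal illustration only: the `StorageProvider` and `PlacementPolicy` types, their fields, and the node names are hypothetical and are not part of any ThreeFold API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageProvider:
    """Hypothetical description of a node offering storage capacity."""
    node_id: str
    country: str
    disk_type: str  # "ssd" or "hdd"


@dataclass(frozen=True)
class PlacementPolicy:
    """Hypothetical selection criteria for one class of data."""
    name: str
    allowed_countries: frozenset
    disk_type: str


def select_providers(providers, policy):
    """Return the providers that satisfy the policy, in their original order."""
    return [
        p for p in providers
        if p.country in policy.allowed_countries and p.disk_type == policy.disk_type
    ]


providers = [
    StorageProvider("node-1", "BE", "ssd"),
    StorageProvider("node-2", "BE", "hdd"),
    StorageProvider("node-3", "CH", "hdd"),
    StorageProvider("node-4", "US", "ssd"),
]

# Live data on fast disks, archived data on cheaper disks, both kept in BE/CH.
live = PlacementPolicy("live", frozenset({"BE", "CH"}), "ssd")
archive = PlacementPolicy("archive", frozenset({"BE", "CH"}), "hdd")

live_nodes = [p.node_id for p in select_providers(providers, live)]
archive_nodes = [p.node_id for p in select_providers(providers, archive)]
print(live_nodes, archive_nodes)
```

The same list of providers serves both policies; only the selection criteria differ, which is the point of building every use case from the same building blocks.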
## Overview
![](img/qsss_intro_0_.jpg)

View File

@@ -0,0 +1,6 @@
<h1> Quantum Safe Storage More Info </h1>
<h2>Table of Contents</h2>
- [Quantum Safe Storage Overview](qsss_home.md)
- [Quantum Safe Filesystem](qss_filesystem)

Binary file not shown.

After

Width:  |  Height:  |  Size: 38 KiB

View File

@@ -0,0 +1,7 @@
![roadmap](img/roadmap.jpg)
# Roadmap
>TODO: to be filled in
> See Quantum Safe Storage project [kanban](https://github.com/orgs/threefoldtech/projects/152).

View File

@@ -0,0 +1,20 @@
- [**Home**](@threefold_home)
- [**Technology**](@technology)
------------
**Quantum Safe Filesystem**
- [Home](@qsss_home)
- [Filesystem](@qss_filesystem)
- [Algorithm](@qss_algorithm)
<!-- - [Zero knowledge proof](@qss_zero_knowledge_proof)
- [Manual](qsfs_setup) -->
<!-- - [Roadmap](@quantumsafe_roadmap)
- [Manual](@qsfs_setup)
- [Use Cases](@qss_use_cases)
- [Specifications](@qss_specs) -->
<!-- - [Test plan](@testplan) -->
<!-- - [Datarot](@qss_datarot) -->

Binary file not shown.

After

Width:  |  Height:  |  Size: 276 KiB

View File

@@ -0,0 +1,3 @@
# zstor filesystem (zstor) Policy
Describe how it works...

View File

@@ -0,0 +1,68 @@
![specs](img/specs_header.jpg)
# System requirements
A system that makes it easy to provision storage capacity on the TF Grid:
- user can create X storage nodes in random or specific locations
- user can list their storage nodes
- check node status/info in some shape or form in a monitoring solution
- external authentication/payment system using threefold connect app
- user can delete their storage nodes
- user can provision more storage nodes
- user can increase total size of storage solutions
- user can install the quantum safe filesystem on any Linux-based system, physical or virtual
# Non-functional requirements
- How many expected concurrent users: not applicable, since each user will have their own local binary and software install.
- How many users on the system: 10000-100000
- Data store: FUSE filesystem plus local and grid-based ZDBs
- How critical is the system? It needs to be alive all the time.
- What do we know about the external payment system?
Threefold Connect: use a QR code for payments and validate on the blockchain
- Life cycle of the storage nodes? How does the user keep their nodes alive? The local binary / application has a wallet from which it can pay for the existing and new storage devices. This wallet needs to be kept topped up.
- When the user is asked to sign the deployment of 20 storage nodes:
- will the user sign each reservation individually, or should the system sign on the user's behalf and show the QR code only for payments?
- Payments should be made to a specific user wallet. Should a background service extend the user pools, or should the user run /extend in the bot conversation? To be resolved.
- Configuration and all metadata should be stored as a hash / private key. With this information you are able to regain access to your stored data from anywhere.
# Components mapping / SALs
- Entities: User, storage Node
- ReservationBuilder: builds the reservation for the user to sign (note that the QR code data size limit is 3KB)
- we need to define how many nodes we can deploy at a time; the reservation shouldn't exceed 3KB for the QR code. If it exceeds the limit, should we split the reservations?
- UserInfo: user info is loaded from the threefold login system
- Blockchain Node (role, configurations)
- Interface to Threefold connect (authentication+payment) /identify + generate payments
- User notifications / topup
- Monitoring: monitoring plus redeployment of the solutions if they go down. When redeploying, who owns the reservation to delete? (This can be fixed with the delete signers field.) To redeploy we need the user identity, or should we inform the user in Telegram and ask them to /redeploy?
- Logging
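
The open question about splitting reservations to stay under the 3KB QR code limit can be explored with a greedy batching sketch. Everything here is an assumption made for illustration: the JSON payload shape, the node request fields, and the `split_reservations` helper do not come from an existing ThreeFold component.

```python
import json

QR_LIMIT_BYTES = 3 * 1024  # the 3KB QR code limit noted above


def payload_size(nodes):
    """Size in bytes of the (hypothetical) JSON payload for one reservation batch."""
    return len(json.dumps({"nodes": nodes}, separators=(",", ":")).encode("utf-8"))


def split_reservations(nodes, limit=QR_LIMIT_BYTES):
    """Greedily group node requests into batches whose payload fits the limit."""
    batches, current = [], []
    for node in nodes:
        if payload_size([node]) > limit:
            raise ValueError("a single node request exceeds the QR code limit")
        # Close the current batch when adding this node would overflow it.
        if current and payload_size(current + [node]) > limit:
            batches.append(current)
            current = []
        current.append(node)
    if current:
        batches.append(current)
    return batches


# 20 node requests; the padded "details" field stands in for real reservation data.
requests = [
    {"name": f"storage-{i:02d}", "size_gb": 100, "details": "x" * 200}
    for i in range(20)
]
batches = split_reservations(requests)
print(len(batches), [len(b) for b in batches])
```

With a scheme like this the user would scan one QR code per batch, so the trade-off is between payload size and the number of confirmations the user has to give.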
# Tech stack
- [JS-SDK](https://github.com/threefoldtech/js-sdk) (?)
- [0-db](https://github.com/threefoldtech/0-db-s)
- [0-db-fs](https://github.com/threefoldtech/0-db-fs)
- [0-stor_v2](https://github.com/threefoldtech/0-stor_v2)
- [quantum_storage](https://github.com/threefoldtech/quantum-storage)
# Blockers
Idea from the blockchain jukebox brainstorm:
## payments
- QR code contains threebot://signandpay/#https://tf.grid/api/a6254a4a-bdf4-11eb-8529-0242ac130003 (can also be uni link)
- App gets URL
- URL gives data
- { DataToSign : {RESERVATIONDETAILS}, Payment: {PAYMENTDETAILS}, CallbackUrl: {CALLBACKURL} }
- App signs reservation, makes payment, calls callbackurl { SignedData : {SIGNEDRESERVATION}, Payment: {FINISHED_PAYMENTDETAILS}}
Full flow:
- User logs in using normal login flow
- User scans QR
- User confirms reservation and payment in the app
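
The payment idea above can be sketched from the app's point of view. This is a rough illustration under stated assumptions: the helper names are hypothetical, the `sign` and `pay` callables are placeholders for whatever the app actually does, the network calls are left out, and the callback field is spelled `SignedData`.

```python
from urllib.parse import urlsplit


def details_url_from_qr(qr_text):
    """Extract the details URL carried in the fragment of a sign-and-pay link."""
    parts = urlsplit(qr_text)
    if parts.scheme != "threebot" or parts.netloc != "signandpay":
        raise ValueError("not a sign-and-pay link")
    if not parts.fragment.startswith("https://"):
        raise ValueError("details URL must use https")
    return parts.fragment


def build_callback_body(request, sign, pay):
    """Build the body sent to CallbackUrl once the user has confirmed.

    `sign` and `pay` are supplied by the app; here they are placeholders.
    """
    return {
        "SignedData": sign(request["DataToSign"]),
        "Payment": pay(request["Payment"]),
    }


qr = "threebot://signandpay/#https://tf.grid/api/a6254a4a-bdf4-11eb-8529-0242ac130003"
details_url = details_url_from_qr(qr)

# What the details URL might return (placeholder values only).
request = {
    "DataToSign": {"reservation": "demo"},
    "Payment": {"amount": 1},
    "CallbackUrl": "https://tf.grid/api/callback",
}
body = build_callback_body(
    request,
    sign=lambda data: {"data": data, "signature": "demo-signature"},
    pay=lambda payment: {**payment, "status": "finished"},
)
print(details_url)
print(body)
```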

View File

@@ -0,0 +1,3 @@
# Specs zstor filesystem
- [Quantum Safe File System](quantum_safe_filesystem_2_6)

View File

@@ -0,0 +1,15 @@
# zstor filesystem 2.6
## requirements
- redundancy/uptime
  - data can never be lost if older than 20 min (the average will be 7.5 min, because we use a 15 min push)
  - if a datacenter or node goes down and we are within the storage policy, the storage stays available
- reliability
  - data cannot have hidden corruption; when bitrot occurs, the FS recovers automatically
- self healing
  - when the data policy drops below the required level, the system should re-silver (i.e. make sure the policy is intact again)
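
The 7.5 min average follows from the 15 min push interval if writes arrive uniformly in time: on average a write waits half an interval before it is pushed. The sketch below works this out. The 5 min push duration used to reach the 20 min bound is an assumption made here, not a figure from the requirements.

```python
PUSH_INTERVAL_MIN = 15  # push interval from the requirement above


def average_unprotected_age(push_interval):
    """Mean wait before a write is pushed, assuming writes arrive uniformly in time."""
    return push_interval / 2


def worst_case_unprotected_age(push_interval, push_duration):
    """A write just after a push waits a full interval, plus the time the push takes."""
    return push_interval + push_duration


average = average_unprotected_age(PUSH_INTERVAL_MIN)
worst = worst_case_unprotected_age(PUSH_INTERVAL_MIN, push_duration=5)  # 5 min is assumed
print(average, worst)
```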
## NEW
- 100% redundancy

View File

@@ -0,0 +1,37 @@
## zstor Architecture
```mermaid
graph TD
subgraph TFGridLoc2
ZDB5
ZDB6
ZDB7
ZDB8
ETCD3
end
subgraph TFGridLoc1
ZDB1
ZDB2
ZDB3
ZDB4
ETCD1
ETCD2
KubernetesController --> ETCD1
KubernetesController --> ETCD2
KubernetesController --> ETCD3
end
subgraph eVDC
PlanetaryFS --> ETCD1 & ETCD2 & ETCD3
PlanetaryFS --> MetadataStor
PlanetaryFS --> ReadWriteCache
MetadataStor --> LocalZDB
ReadWriteCache --> LocalZDB
LocalZDB & PlanetaryFS --> ZeroStor
ZeroStor --> ZDB1 & ZDB2 & ZDB3 & ZDB4 & ZDB5 & ZDB6 & ZDB7 & ZDB8
end
```

View File

@@ -0,0 +1,40 @@
## zstor Sequence Diagram
```mermaid
sequenceDiagram
participant user as user
participant fs as 0-fs
participant lzdb as local 0-db
participant zstor as 0-stor
participant etcd as ETCD
participant zdbs as backend 0-dbs
participant mon as Monitor
alt Writing data
user->>fs: write data to files
fs->>lzdb: write data blocks
opt Datafile is full
lzdb->>zstor: encode and chunk data file
zstor->>zdbs: write encoded datafile chunks to the different backends
zstor->>etcd: write metadata about encoded file to metadata storage
end
else Reading data
user->>fs: read data from file
fs->>lzdb: read data blocks
opt Datafile is missing
lzdb->>zstor: request retrieval of data file
zstor->>etcd: load file encoding and storage metadata
zstor->>zdbs: read encoded datafile chunks from multiple backends and rebuilds original datafile
zstor->>lzdb: replaces the missing datafile
end
end
loop Monitor action
mon->>lzdb: delete local data files which are full and have been encoded, AND have not been accessed for some time
mon->>zdbs: monitors health of used namespaces
opt Namespace is lost or corrupted
mon->>zstor: checks storage configuration
mon->>zdbs: rebuilds missing shard on new namespace from storage config
end
end
```
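
The first monitor action in the diagram, deleting local data files that are full, encoded, and not recently accessed, can be sketched as a predicate. The `LocalDataFile` type, its fields, and the idle threshold are hypothetical names chosen for illustration; this is not the actual monitor implementation.

```python
from dataclasses import dataclass


@dataclass
class LocalDataFile:
    """Hypothetical bookkeeping record for one data file in the local 0-db."""
    name: str
    is_full: bool
    is_encoded: bool
    last_access: float  # seconds since the epoch


def evictable(files, now, max_idle_seconds):
    """Names of files that are full AND encoded AND idle longer than the threshold."""
    return [
        f.name for f in files
        if f.is_full and f.is_encoded and now - f.last_access > max_idle_seconds
    ]


now = 10_000.0
files = [
    LocalDataFile("d0", is_full=True, is_encoded=True, last_access=now - 7200),
    LocalDataFile("d1", is_full=True, is_encoded=False, last_access=now - 7200),
    LocalDataFile("d2", is_full=True, is_encoded=True, last_access=now - 60),
    LocalDataFile("d3", is_full=False, is_encoded=False, last_access=now - 7200),
]
to_delete = evictable(files, now, max_idle_seconds=3600)
print(to_delete)
```

Only files that already have an encoded copy on the backends are candidates, so a later read of a deleted file falls into the "Datafile is missing" branch of the diagram and is rebuilt from the backends.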

Binary file not shown.

After

Width:  |  Height:  |  Size: 115 KiB

View File

@@ -0,0 +1,34 @@
graph TD
subgraph Local laptop, computer or server
user[End User *11* ]
protocol[Storage Protocol *6*]
qsfs[Filesystem *7*]
0store[Storage Engine *8*]
end
subgraph Grid storage - metadata
etcd1[ETCD-1 *9*]
etcd2[ETCD-2 *9*]
etcd3[ETCD-3 *9*]
end
subgraph Grid storage - zero proof data
zdb1[ZDB-1 *10*]
zdb2[ZDB-2 *10*]
zdb3[ZDB-3 *10*]
zdb4[ZDB-4 *10*]
zdb5[ZDB-5 *10*]
zdb6[ZDB-... *10*]
zdb7[ZDB-N *10*]
user -.- |-1-| protocol
protocol -.- |-2-| qsfs
qsfs --- |-3-| 0store
0store --- |-4-| etcd1
0store --- |-4-| etcd2
0store --- |-4-| etcd3
0store <-.-> |-5-| zdb1
0store <-.-> |-5-| zdb2
0store <-.-> |-5-| zdb3
0store <-.-> |-5-| zdb4
0store <-.-> |-5-| zdb5
0store <-.-> |-5-| zdb6
0store <-.-> |-5-| zdb7
end

View File

@@ -0,0 +1,131 @@
## Quantum Safe Storage Testplan
### Prerequisites
The quantum safe storage system runs on the following platforms:
- bare metal Linux installation.
- Kubernetes cluster with Helm installation scripts.
### Installation
For instructions on installing the QSFS, please see the manual [here](../manual/README.md).
#### Bare metal Linux
The software comes as a single binary which installs all the necessary (local) components to run the quantum safe file system. The server is the storage front end and the TF Grid is the storage backend. The storage backend configuration can be provided in two different ways:
- the user has access to the eVDC facility of Threefold and is able to download the kubernetes configuration file.
- the binary has built-in options to ask for backend storage components to be provisioned and delivered.
### Architecture and failure modes
Quantum Safe Storage is built from a number of components. Components are connected and interacting and therefore there are a number of failure modes that need to be considered for the test plan.
Failure modes for which we have test plans and cases:
- [Enduser](#enduser)
- [Storage protocol](#storage-protocol)
- [Filesystem](#filesystem)
- [Storage engine](#storage-engine)
- [Metadata store](#metadata-store)
- [Physical storage devices](#physical-storage-devices)
- [Interaction Enduser - Storage Protocol](#enduser-to-storage-protocol)
- [Interaction Storage Protocol - Filesystem](#storage-protocol-to-filesystem)
- [Interaction Filesystem - Storage Engine](#filesystem-to-storage-engine)
- [Interaction Storage Engine - Physical Storage Device](#storage-engine-to-physical-storage-device)
- [Interaction Storage Engine - Metadata Store](#storage-engine-to-metadata-store)
![](img/failure_points.jpg)
#### Enduser
Failure scenarios
- End user enters unexpected / wrong data during QSS install
- End user deletes / changes things on the QSS engine host
- End user stops / deletes the storage protocol application or any of its configuration / temp storage facilities
- End user deletes the quantum safe file system and / or its configuration files
- End user deletes the storage engine and / or its configuration and temp storage files.
Tests to conduct
#### Storage protocol
Failure scenarios
The storage protocol can be anything: IPFS, minio, sFTP or any other available protocol. The possible failure modes are too numerous to test exhaustively, so we will do some basic testing for a couple of well-known protocols:
- **minio**
- **sFTP**
- **syncthing**
- **scp**
For all these protocols a number of simple tests can be done:
- stop the protocol binary while data is being pushed in, then restart the binary and see if normal operation resumes (losing the data that was in transfer when the failure happened is acceptable).
- make changes to the config file (policy, parameters, etc.) and see if normal operation resumes.
Tests to conduct
#### Filesystem
Direct access to the filesystem eliminates the dependency on the interface protocol. The filesystem provides a well-known interface to browse, create, store and retrieve data in an easy and structured way.
Tests to conduct. Testing is required to see if the filesystem can deal with the following:
- create a large number of nested directories and see if this causes problems
- create a large number of small files and see if this causes problems
- create a number of very large files (1GB+) and see if this causes problems
- delete a (large) number of files
- delete a (large) number of directories
- move a (large) number of files
- move a (large) number of directories
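
The first two filesystem tests can be scripted along the following lines. This is a minimal sketch, not an agreed test harness: the helper names, depth and file counts are arbitrary, and a temporary directory stands in for the QSFS mount point, which is not specified here.

```python
import hashlib
import tempfile
from pathlib import Path


def create_nested_dirs(root, depth):
    """Create a chain of `depth` nested directories and return the deepest one."""
    path = Path(root)
    for level in range(depth):
        path = path / f"d{level}"
    path.mkdir(parents=True)
    return path


def create_small_files(directory, count, size=64):
    """Write `count` small files and return a name -> SHA-256 digest map."""
    digests = {}
    for i in range(count):
        data = f"file-{i}".encode().ljust(size, b".")
        (Path(directory) / f"f{i}.bin").write_bytes(data)
        digests[f"f{i}.bin"] = hashlib.sha256(data).hexdigest()
    return digests


def verify_files(directory, digests):
    """Return the names of files that are missing or whose content changed."""
    bad = []
    for name, digest in digests.items():
        path = Path(directory) / name
        if not path.is_file() or hashlib.sha256(path.read_bytes()).hexdigest() != digest:
            bad.append(name)
    return bad


# Replace the temporary directory with the real mount point when testing QSFS.
with tempfile.TemporaryDirectory() as mount_point:
    deepest = create_nested_dirs(mount_point, depth=50)
    digests = create_small_files(deepest, count=200)
    mismatches = verify_files(deepest, digests)

print(len(digests), mismatches)
```

Recording digests at write time lets the same `verify_files` check be reused after the move and delete tests, and after files have been evicted from the local cache and rebuilt from the backends.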
#### Storage engine
The storage engine takes files (data) and runs a "forward error correcting algorithm" on the data. The algorithm requires a "storage policy" that specifies how to treat the inbound data and where to store the resulting data descriptions. The engine is non-redundant at this point in time, so we should test how it behaves under certain failure modes.
Tests to conduct:
- storage policy is changed during operation of the storage engine
- physical storage devices are added
- physical storage devices are deleted
- storage policy (example - 16:4) is changed during operation
- other configuration components are changed
- physical storage access passwords
- encryption key
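
When planning the "devices are deleted" tests, it helps to know how many losses a policy is expected to survive. The sketch below reads the example policy 16:4 as 16 data shards plus 4 parity shards. That reading is an assumption, since the notation is not defined here, and the `ErasurePolicy` type is a hypothetical helper rather than part of the storage engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErasurePolicy:
    """Assumed reading of a policy such as "16:4": data shards, then parity shards."""
    data_shards: int
    parity_shards: int

    @classmethod
    def parse(cls, text):
        data, parity = text.split(":")
        return cls(int(data), int(parity))

    @property
    def total_shards(self):
        return self.data_shards + self.parity_shards

    @property
    def overhead(self):
        """Extra storage relative to the original data size."""
        return self.parity_shards / self.data_shards

    def can_rebuild(self, lost_shards):
        """Data is recoverable as long as no more than the parity count is lost."""
        return lost_shards <= self.parity_shards


policy = ErasurePolicy.parse("16:4")
print(policy.total_shards, policy.overhead)
print(policy.can_rebuild(4), policy.can_rebuild(5))
```

Under this reading, deleting up to 4 of the 20 storage devices should leave all data readable, while deleting a fifth is the case where data loss is expected.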
#### Metadata store
The metadata store holds the information needed to retrieve the partial descriptions that make up an original piece of data. These metadata stores are redundant (3x?) and store all data required.
Testing needs to be done on:
- corruption / deleting one out of the three metadata stores
- corruption / deleting two out of three metadata stores
- rebuilding failed metadata stores
- create high workloads of adding new data and retrieving stored data - no longer available in a local cache.
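
If the three metadata stores form a single etcd cluster, as the ETCD nodes in the architecture diagrams suggest, the expected outcome of the first two tests follows from majority quorum. The sketch below states that expectation. The cluster assumption and the helper names are not confirmed by this test plan.

```python
def quorum(members):
    """Smallest majority of a cluster with `members` nodes."""
    return members // 2 + 1


def expected_outcome(members, failed):
    """Expected availability of a majority-quorum store after `failed` members are lost."""
    healthy = members - failed
    return "available" if healthy >= quorum(members) else "unavailable until rebuilt"


one_lost = expected_outcome(3, 1)
two_lost = expected_outcome(3, 2)
print(one_lost, "|", two_lost)
```

Losing one of three stores should therefore be transparent, while losing two is the case where the rebuild procedure is actually exercised.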
#### Physical storage devices
Physical storage devices are ZDBs on one of the TF Grid networks (Mainnet, Testnet, Devnet). ZDBs manage slices of physical disk space on HDDs and SSDs. They have a very simple operational model and API. ZDBs operate in "append only" mode by default.
Testing needs to be done:
- create a high workload writing and reading at the same time
- get
#### Enduser to Storage Protocol
Failure scenarios
Tests to conduct
#### Storage Protocol to Filesystem
Failure scenarios
Tests to conduct
#### Filesystem to Storage Engine
Failure scenarios
Tests to conduct
#### Storage Engine to Physical Storage Device
Failure scenarios
Tests to conduct
#### Storage Engine to Metadata Store
Failure scenarios
Tests to conduct