Compare commits

...

23 Commits

Author SHA1 Message Date
91b0247e68 Update git remote URL from git.ourworld.tf to git.threefold.info 2025-06-15 16:21:07 +02:00
Mahmoud Emad
f9d338a8f1 feat: Add WebBuilder library for website generation
- Adds a new library for building websites from configuration
  files and markdown content.  Improves developer workflow by
  automating website construction.

- Implements multiple parsing strategies for configuration files
  (Hjson, Simple, Auto) for flexibility and backward
  compatibility.

- Includes support for cloning Git repositories, processing
  markdown, and uploading files to IPFS, streamlining the
  website deployment process.  Facilitates easier website
  updates and content management.

- Adds comprehensive README documentation explaining the library's
  usage and configuration options.  Improves user onboarding and
  reduces the learning curve for new users.
2025-05-15 09:42:08 +03:00
Mahmoud Emad
ea25db7d29 feat: Improve collection scanning and add .gitignore entries
- Add `.gitignore` entries for `webmeta.json` and `.vscode`
- Improve collection scanning logging for better debugging
- Improve error handling in collection methods for robustness
2025-05-15 08:53:16 +03:00
Mahmoud Emad
cad8a6d125 refactor: Remove branch specification from sal dependency
- Removed the branch specification from the `sal` dependency in
  `doctree/Cargo.toml` and `webbuilder/Cargo.toml`. This simplifies
  the dependency management and relies on the default branch of the
  repository.
- Updated `.gitignore` to ignore `sccache.log` to prevent it from
  being committed to the repository.
- Updated example commands in `example_commands.sh` to use a different
  collection name (`grid1` instead of `grid_documentation`) for testing
  purposes.  This avoids potential conflicts with pre-existing data.
- Improved code structure and organization in `doctree/src/lib.rs`.
2025-05-14 08:42:53 +03:00
d8d6bf1f4a ... 2025-05-13 11:22:51 +03:00
1e914aa56d ... 2025-05-13 10:03:13 +03:00
29ccc54a4d ... 2025-05-13 09:35:01 +03:00
d609aa8094 ... 2025-05-13 09:22:57 +03:00
dbd44043cb ... 2025-05-13 09:19:45 +03:00
7fa4125dc0 ... 2025-05-13 08:52:47 +03:00
2fae059512 ... 2025-05-03 05:52:42 +04:00
28a7ef3a94 ... 2025-04-09 09:47:16 +02:00
60e688810d ... 2025-04-09 09:26:06 +02:00
f938e8ff6b ... 2025-04-09 08:51:55 +02:00
84c656983a ... 2025-04-09 08:45:38 +02:00
19f52a8172 ... 2025-04-09 08:43:20 +02:00
14b2bb2798 ... 2025-04-09 08:43:10 +02:00
2eec3be632 ... 2025-04-09 08:42:30 +02:00
1f155d1bfb ... 2025-04-09 08:11:28 +02:00
b9df692a54 ... 2025-04-09 07:56:46 +02:00
44cbf20d7b ... 2025-04-09 07:54:37 +02:00
5e4dcbf77c ... 2025-04-09 07:11:38 +02:00
b93894632a ... 2025-04-09 06:20:35 +02:00
66 changed files with 6652 additions and 1 deletion

51
.gitignore vendored

@@ -14,3 +14,54 @@ Cargo.lock
# MSVC Windows builds of rustc generate these, which store debugging information
*.pdb
# Added by cargo
/target
/rhai_test_template
/rhai_test_download
/rhai_test_fs
run_rhai_tests.log
new_location
log.txt
file.txt
fix_doc*
# Dependencies
/node_modules
# Production
/build
# Generated files
.docusaurus
.cache-loader
# Misc
.DS_Store
.env.local
.env.development.local
.env.test.local
.env.production.local
npm-debug.log*
yarn-debug.log*
yarn-error.log*
bun.lockb
bun.lock
yarn.lock
build.sh
build_dev.sh
develop.sh
docusaurus.config.ts
sidebars.ts
tsconfig.json
sccache.log
*webmeta.json
.vscode

244
README.md

@@ -1,2 +1,244 @@
# doctree
# DocTree
DocTree is a Rust library for managing collections of markdown documents with powerful include functionality. It provides a robust system for organizing, processing, and retrieving document collections with Redis-backed storage.
## Overview
DocTree scans directories for `.collection` files, which define document collections. Each collection contains markdown documents and other files (like images). The library provides functionality to:
- Scan directories recursively to find collections
- Process includes between documents (allowing one document to include content from another)
- Convert markdown to HTML
- Store document metadata in Redis for efficient retrieval
- Provide a command-line interface for interacting with collections
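A hypothetical directory layout might look like this; every directory containing a `.collection` file becomes a collection when the tree is scanned:
```
docs/
├── grid1/
│   ├── .collection        # empty, or TOML such as: name = "grid1"
│   ├── intro.md
│   └── images/
│       └── logo.png
└── shared/
    ├── .collection
    └── legal_notice.md
```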
## Tips
If you want the `ipfs` command line on macOS (installed via IPFS Desktop), symlink it into your PATH:
```bash
# put the ipfs command line on the PATH on macOS
sudo ln -s "/Applications/IPFS Desktop.app/Contents/Resources/app.asar.unpacked/node_modules/kubo/kubo/ipfs" /usr/local/bin/ipfs
```
## Key Concepts
### Collections
A collection is a group of related documents and files. Collections are defined by a `.collection` file in a directory. The `.collection` file can be empty (in which case the directory name is used as the collection name) or it can contain TOML configuration:
```toml
name = "my_collection"
# Other configuration options can be added in the future
```
### DocTree
A DocTree is a manager for multiple collections. It provides methods for:
- Adding collections
- Retrieving documents from collections
- Processing includes between documents
- Converting markdown to HTML
- Managing collection metadata in Redis
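A minimal sketch of building a DocTree by hand, without scanning a whole tree (the path, collection name, and doctree name below are placeholders):
```rust
use doctree::{DocTree, RedisStorage, Result};

fn main() -> Result<()> {
    // Connect to Redis and assemble a DocTree with one collection
    let storage = RedisStorage::new("redis://localhost:6379")?;
    let doctree = DocTree::builder()
        .with_storage(storage)
        .with_doctree_name("my_doctree")
        .with_collection("docs/grid1", "grid1")?
        .with_default_collection("grid1")
        .build()?;

    // Print basic information about the doctree
    println!("{:?}", doctree.info());
    Ok(())
}
```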
### Includes
One of the most powerful features of DocTree is the ability to include content from one document in another. This is done using the `!!include` directive:
```markdown
# My Document
This is my document.
!!include another_collection:some_document.md
More content here...
```
The include directive supports several formats:
- `!!include collection_name:page_name` - Include a page from a specific collection
- `!!include collection_name:'page name'` - Include a page with spaces from a specific collection
- `!!include page_name` - Include a page from the current collection
- `!!include name:'page name'` - Include a page with spaces from the current collection
Includes can be nested, allowing for complex document structures.
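For example, an included page can itself contain an include directive (the collection and page names below are illustrative):
```markdown
<!-- book.md, the top-level document -->
!!include chapters:intro.md

<!-- chapters:intro.md, which itself includes more content -->
# Introduction
!!include shared:'legal notice'
```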
## Storage
DocTree uses Redis as a backend storage system. Document metadata (like paths and names) is stored in Redis, making it efficient to retrieve documents without scanning the filesystem each time.
The Redis keys are structured as:
- `collections:{collection_name}:{document_name}` - Stores the relative path to a document
- `collections:{collection_name}:path` - Stores the absolute path to the collection
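As an illustration (the collection name `grid1` and the `my_doctree` prefix are hypothetical; the storage backend in this repository prefixes keys with the doctree name and stores the document entries in a Redis hash), the metadata can be inspected with `redis-cli`:
```bash
# List every document and file recorded for a collection,
# together with its relative path on disk
redis-cli HGETALL my_doctree:collections:grid1
```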
## Command-Line Interface
DocTree comes with a command-line interface (CLI) that provides access to the library's functionality:
```
DocTree CLI 0.1.0
A tool to manage document collections

USAGE:
    doctreecmd [SUBCOMMAND]

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

SUBCOMMANDS:
    delete    Delete a collection
    get       Get page content
    html      Get page content as HTML
    info      Show detailed information about collections
    list      List collections
    reset     Delete all collections
    scan      Scan a directory for .collection files and create collections
```
### Example Commands
#### Scanning Collections
```bash
doctreecmd scan /path/to/documents --doctree my_doctree
```
This command scans the specified directory for `.collection` files and creates collections in Redis.
#### Listing Collections
```bash
doctreecmd list --doctree my_doctree
```
This command lists all collections in the specified doctree.
#### Getting Document Content
```bash
doctreecmd get -c collection_name -p page_name --doctree my_doctree
```
This command retrieves the content of a document from a collection.
#### Getting HTML Content
```bash
doctreecmd get -c collection_name -p page_name -f html --doctree my_doctree
```
This command retrieves the HTML content of a document from a collection.
#### Showing Collection Information
```bash
doctreecmd info collection_name --doctree my_doctree
```
This command shows detailed information about a collection, including its documents and files.
#### Deleting a Collection
```bash
doctreecmd delete collection_name --doctree my_doctree
```
This command deletes a collection.
#### Resetting All Collections
```bash
doctreecmd reset --doctree my_doctree
```
This command deletes all collections.
## Implementation Details
DocTree is implemented in Rust and uses several key dependencies:
- `walkdir` for recursively walking directories
- `pulldown-cmark` for parsing and rendering markdown
- `toml` for parsing collection configuration files
- `redis` for interacting with Redis
- `clap` for the command-line interface
The library is structured into several modules:
- `doctree.rs` - Core DocTree functionality
- `collection.rs` - Collection management
- `include.rs` - Include processing
- `storage.rs` - Redis storage backend
- `utils.rs` - Utility functions
- `error.rs` - Error handling
## Use Cases
DocTree is particularly useful for:
1. **Documentation Systems**: Manage and organize technical documentation with the ability to include common sections across multiple documents.
2. **Content Management**: Create a flexible content management system where content can be modularized and reused.
3. **Knowledge Bases**: Build knowledge bases with interconnected documents that can reference each other.
4. **Static Site Generation**: Generate static websites from markdown documents with the ability to include common elements.
## Getting Started
### Prerequisites
- Rust (latest stable version)
- Redis server running on localhost:6379 (or configure a different URL; see the sketch below)
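`from_directory` connects to `redis://localhost:6379`; to use a different Redis instance, construct the storage yourself (a minimal sketch, the host, port, and paths are placeholders):
```rust
use doctree::{DocTree, RedisStorage, Result};
use std::path::Path;

fn main() -> Result<()> {
    // Point the storage backend at a non-default Redis instance
    let storage = RedisStorage::new("redis://redis.internal:6380")?;
    let doctree = DocTree::builder()
        .with_storage(storage)
        .with_doctree_name("my_doctree")
        .scan_collections(Path::new("path/to/documents"))?
        .build()?;

    for collection in doctree.list_collections() {
        println!("Collection: {}", collection);
    }
    Ok(())
}
```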
### Building
```bash
cargo build --release
```
### Running the CLI
```bash
cargo run --bin doctreecmd -- [SUBCOMMAND]
```
### Using the Library
Add doctree to your Cargo.toml:
```toml
[dependencies]
doctree = { git = "https://git.threefold.info/herocode/doctree", branch = "main", package = "doctree", path = "doctree/src" }
```
Basic usage:
```rust
use doctree::{DocTree, RedisStorage, Result, from_directory};
use std::path::Path;

fn main() -> Result<()> {
    // Create a DocTree by scanning a directory
    let mut doctree = from_directory(Path::new("path/to/documents"), Some("my_doctree"))?;

    // List collections
    let collections = doctree.list_collections();
    for collection in collections {
        println!("Collection: {}", collection);
    }

    // Get a document with includes processed
    let content = doctree.page_get(Some("collection_name"), "page_name")?;
    println!("{}", content);

    // Get a document as HTML
    let html = doctree.page_get_html(Some("collection_name"), "page_name")?;
    println!("{}", html);

    Ok(())
}
```

15
build.sh Executable file

@@ -0,0 +1,15 @@
#!/bin/bash
# Change to the directory where the script is located
cd "$(dirname "$0")"
# Exit immediately if a command exits with a non-zero status
set -e
echo "Building doctree binary..."
cd doctreecmd
cargo build --release
echo "Copying doctree binary to ~/hero/bin/"
mkdir -p ~/hero/bin/
cp target/release/doctree ~/hero/bin/
echo "Build and installation complete!"

24
doctree/Cargo.toml Normal file

@@ -0,0 +1,24 @@
[package]
name = "doctree"
version = "0.1.0"
edition = "2024"
[lib]
path = "src/lib.rs"
[dependencies]
walkdir = "2.3.3"
pulldown-cmark = "0.9.3"
thiserror = "1.0.40"
lazy_static = "1.4.0"
toml = "0.7.3"
serde = { version = "1.0", features = ["derive"] }
redis = { version = "0.23.0", features = ["tokio-comp"] }
tokio = { version = "1.28.0", features = ["full"] }
sal = { git = "https://git.threefold.info/herocode/sal.git" }
chacha20poly1305 = "0.10.1"
blake3 = "1.3.1"
csv = "1.1"
rand = "0.9.1"
ipfs-api-backend-hyper = "0.6"
ipfs-api = { version = "0.17.0", default-features = false, features = ["with-hyper-tls"] }

697
doctree/src/collection.rs Normal file

@@ -0,0 +1,697 @@
use std::fs;
use std::path::{Path, PathBuf};
use walkdir::WalkDir;
use crate::error::{DocTreeError, Result};
use crate::include::process_includes;
use crate::storage::RedisStorage;
use crate::utils::{ensure_md_extension, markdown_to_html, name_fix};
use ipfs_api::{IpfsApi, IpfsClient};
// use chacha20poly1305::aead::NewAead;
/// Collection represents a collection of markdown pages and files
#[derive(Clone)]
pub struct Collection {
/// Base path of the collection
pub path: PathBuf,
/// Name of the collection (namefixed)
pub name: String,
/// Redis storage backend
pub storage: RedisStorage,
}
/// Builder for Collection
pub struct CollectionBuilder {
/// Base path of the collection
path: PathBuf,
/// Name of the collection (namefixed)
name: String,
/// Redis storage backend
storage: Option<RedisStorage>,
}
impl Collection {
/// Create a new CollectionBuilder
///
/// # Arguments
///
/// * `path` - Base path of the collection
/// * `name` - Name of the collection
///
/// # Returns
///
/// A new CollectionBuilder
pub fn builder<P: AsRef<Path>>(path: P, name: &str) -> CollectionBuilder {
CollectionBuilder {
path: path.as_ref().to_path_buf(),
name: name_fix(name),
storage: None,
}
}
/// Scan walks over the path and finds all files and .md files
/// It stores the relative positions in Redis
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn scan(&self) -> Result<()> {
println!(
"DEBUG: Scanning collection '{}' at path {:?}",
self.name, self.path
);
// Delete existing collection data if any
println!(
"DEBUG: Deleting existing collection data from Redis key 'collections:{}'",
self.name
);
self.storage.delete_collection(&self.name)?;
// Store the collection's full absolute path in Redis
let absolute_path = std::fs::canonicalize(&self.path)
.unwrap_or_else(|_| self.path.clone())
.to_string_lossy()
.to_string();
println!(
"DEBUG: Storing collection path in Redis key 'collections:{}:path'",
self.name
);
self.storage
.store_collection_path(&self.name, &absolute_path)?;
self.storage
.store_collection_path(&self.name, &self.path.to_string_lossy())?;
// Walk through the directory
let walker = WalkDir::new(&self.path);
for entry_result in walker {
// Handle entry errors
let entry = match entry_result {
Ok(entry) => entry,
Err(e) => {
// Log the error and continue
eprintln!("Error walking directory: {}", e);
continue;
}
};
// Skip directories
if entry.file_type().is_dir() {
continue;
}
// Skip files that start with a dot (.)
let file_name = entry.file_name().to_string_lossy();
if file_name.starts_with(".") {
continue;
}
// Get the relative path from the base path
let rel_path = match entry.path().strip_prefix(&self.path) {
Ok(path) => path,
Err(_) => {
// Log the error and continue
eprintln!("Failed to get relative path for: {:?}", entry.path());
continue;
}
};
// Get the filename and apply namefix
let filename = entry.file_name().to_string_lossy().to_string();
let namefixed_filename = name_fix(&filename);
// Determine if this is a document (markdown file) or an image
let is_markdown = filename.to_lowercase().ends_with(".md");
let is_image = filename.to_lowercase().ends_with(".png")
|| filename.to_lowercase().ends_with(".jpg")
|| filename.to_lowercase().ends_with(".jpeg")
|| filename.to_lowercase().ends_with(".gif")
|| filename.to_lowercase().ends_with(".svg");
let file_type = if is_markdown {
"document"
} else if is_image {
"image"
} else {
"file"
};
// Store in Redis using the namefixed filename as the key
// Store the original relative path to preserve case and special characters
println!(
"DEBUG: Storing {} '{}' in Redis key 'collections:{}' with key '{}' and value '{}'",
file_type,
filename,
self.name,
namefixed_filename,
rel_path.to_string_lossy()
);
self.storage.store_collection_entry(
&self.name,
&namefixed_filename,
&rel_path.to_string_lossy(),
)?;
}
Ok(())
}
/// Get a page by name and return its markdown content
///
/// # Arguments
///
/// * `page_name` - Name of the page
///
/// # Returns
///
/// The page content or an error
pub fn page_get(&self, page_name: &str) -> Result<String> {
// Apply namefix to the page name
let namefixed_page_name = name_fix(page_name);
// Ensure it has .md extension
let namefixed_page_name = ensure_md_extension(&namefixed_page_name);
// Get the relative path from Redis
let rel_path = self
.storage
.get_collection_entry(&self.name, &namefixed_page_name)
.map_err(|_| DocTreeError::PageNotFound(page_name.to_string()))?;
// Check if the path is valid
if self.path.as_os_str().is_empty() {
// If the path is empty, we're working with a collection loaded from Redis
// Return an error since the actual file path is not available
return Err(DocTreeError::IoError(std::io::Error::new(
std::io::ErrorKind::NotFound,
format!(
"File path not available for {} in collection {}",
page_name, self.name
),
)));
}
// Read the file
let full_path = self.path.join(rel_path);
let content = fs::read_to_string(full_path).map_err(|e| DocTreeError::IoError(e))?;
// Skip include processing at this level to avoid infinite recursion
// Include processing will be done at the higher level
Ok(content)
}
/// Create or update a page in the collection
///
/// # Arguments
///
/// * `page_name` - Name of the page
/// * `content` - Content of the page
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn page_set(&self, page_name: &str, content: &str) -> Result<()> {
// Apply namefix to the page name
let namefixed_page_name = name_fix(page_name);
// Ensure it has .md extension
let namefixed_page_name = ensure_md_extension(&namefixed_page_name);
// Create the full path
let full_path = self.path.join(&namefixed_page_name);
// Create directories if needed
if let Some(parent) = full_path.parent() {
fs::create_dir_all(parent).map_err(DocTreeError::IoError)?;
}
// Write content to file
fs::write(&full_path, content).map_err(DocTreeError::IoError)?;
// Update Redis
self.storage.store_collection_entry(
&self.name,
&namefixed_page_name,
&namefixed_page_name,
)?;
Ok(())
}
/// Delete a page from the collection
///
/// # Arguments
///
/// * `page_name` - Name of the page
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn page_delete(&self, page_name: &str) -> Result<()> {
// Apply namefix to the page name
let namefixed_page_name = name_fix(page_name);
// Ensure it has .md extension
let namefixed_page_name = ensure_md_extension(&namefixed_page_name);
// Get the relative path from Redis
let rel_path = self
.storage
.get_collection_entry(&self.name, &namefixed_page_name)
.map_err(|_| DocTreeError::PageNotFound(page_name.to_string()))?;
// Delete the file
let full_path = self.path.join(rel_path);
fs::remove_file(full_path).map_err(DocTreeError::IoError)?;
// Remove from Redis
self.storage
.delete_collection_entry(&self.name, &namefixed_page_name)?;
Ok(())
}
/// List all pages in the collection
///
/// # Returns
///
/// A vector of page names or an error
pub fn page_list(&self) -> Result<Vec<String>> {
// Get all keys from Redis
let keys = self.storage.list_collection_entries(&self.name)?;
// Filter to only include .md files
let pages = keys
.into_iter()
.filter(|key| key.ends_with(".md"))
.collect();
Ok(pages)
}
/// Get the URL for a file
///
/// # Arguments
///
/// * `file_name` - Name of the file
///
/// # Returns
///
/// The URL for the file or an error
pub fn file_get_url(&self, file_name: &str) -> Result<String> {
// Apply namefix to the file name
let namefixed_file_name = name_fix(file_name);
// Get the relative path from Redis
let rel_path = self
.storage
.get_collection_entry(&self.name, &namefixed_file_name)
.map_err(|_| DocTreeError::FileNotFound(file_name.to_string()))?;
// Construct a URL for the file
let url = format!("/collections/{}/files/{}", self.name, rel_path);
Ok(url)
}
/// Add or update a file in the collection
///
/// # Arguments
///
/// * `file_name` - Name of the file
/// * `content` - Content of the file
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn file_set(&self, file_name: &str, content: &[u8]) -> Result<()> {
// Apply namefix to the file name
let namefixed_file_name = name_fix(file_name);
// Create the full path
let full_path = self.path.join(&namefixed_file_name);
// Create directories if needed
if let Some(parent) = full_path.parent() {
fs::create_dir_all(parent).map_err(DocTreeError::IoError)?;
}
// Write content to file
fs::write(&full_path, content).map_err(DocTreeError::IoError)?;
// Update Redis
self.storage.store_collection_entry(
&self.name,
&namefixed_file_name,
&namefixed_file_name,
)?;
Ok(())
}
/// Delete a file from the collection
///
/// # Arguments
///
/// * `file_name` - Name of the file
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn file_delete(&self, file_name: &str) -> Result<()> {
// Apply namefix to the file name
let namefixed_file_name = name_fix(file_name);
// Get the relative path from Redis
let rel_path = self
.storage
.get_collection_entry(&self.name, &namefixed_file_name)
.map_err(|_| DocTreeError::FileNotFound(file_name.to_string()))?;
// Delete the file
let full_path = self.path.join(rel_path);
fs::remove_file(full_path).map_err(DocTreeError::IoError)?;
// Remove from Redis
self.storage
.delete_collection_entry(&self.name, &namefixed_file_name)?;
Ok(())
}
/// List all files (non-markdown) in the collection
///
/// # Returns
///
/// A vector of file names or an error
pub fn file_list(&self) -> Result<Vec<String>> {
// Get all keys from Redis
let keys = self.storage.list_collection_entries(&self.name)?;
// Filter to exclude .md files
let files = keys
.into_iter()
.filter(|key| !key.ends_with(".md"))
.collect();
Ok(files)
}
/// Get the relative path of a page in the collection
///
/// # Arguments
///
/// * `page_name` - Name of the page
///
/// # Returns
///
/// The relative path of the page or an error
pub fn page_get_path(&self, page_name: &str) -> Result<String> {
// Apply namefix to the page name
let namefixed_page_name = name_fix(page_name);
// Ensure it has .md extension
let namefixed_page_name = ensure_md_extension(&namefixed_page_name);
// Get the relative path from Redis
self.storage
.get_collection_entry(&self.name, &namefixed_page_name)
.map_err(|_| DocTreeError::PageNotFound(page_name.to_string()))
}
/// Get a page by name and return its HTML content
///
/// # Arguments
///
/// * `page_name` - Name of the page
/// * `doctree` - Optional DocTree instance for include processing
///
/// # Returns
///
/// The HTML content of the page or an error
pub fn page_get_html(
&self,
page_name: &str,
doctree: Option<&crate::doctree::DocTree>,
) -> Result<String> {
// Get the markdown content
let markdown = self.page_get(page_name)?;
// Process includes if doctree is provided
let processed_markdown = if let Some(dt) = doctree {
process_includes(&markdown, &self.name, dt)?
} else {
markdown
};
// Convert markdown to HTML
let html = markdown_to_html(&processed_markdown);
Ok(html)
}
/// Get information about the Collection
///
/// # Returns
///
/// A map of information
pub fn info(&self) -> std::collections::HashMap<String, String> {
let mut info = std::collections::HashMap::new();
info.insert("name".to_string(), self.name.clone());
info.insert("path".to_string(), self.path.to_string_lossy().to_string());
info
}
/// Exports files and images from the collection to IPFS synchronously, encrypting them, and generating a CSV manifest.
///
/// # Arguments
///
/// * `output_csv_path` - The path to the output CSV file.
///
/// # Returns
///
/// Ok(()) on success or an error.
pub fn export_to_ipfs(&self, output_csv_path: &Path) -> Result<()> {
// Create a new tokio runtime and block on the async export function
tokio::runtime::Runtime::new()?
.block_on(async { self.export_to_ipfs_async(output_csv_path).await })?;
Ok(())
}
/// Exports files and images from the collection to IPFS asynchronously, encrypts them, and generates a CSV manifest.
///
/// # Arguments
///
/// * `output_csv_path` - The path to the output CSV file.
///
/// # Returns
///
/// Ok(()) on success or an error.
pub async fn export_to_ipfs_async(&self, output_csv_path: &Path) -> Result<()> {
use blake3::Hasher;
// use chacha20poly1305::{ChaCha20Poly1305, Aead};
use chacha20poly1305::aead::generic_array::GenericArray;
use csv::Writer;
use ipfs_api::IpfsClient;
use rand::rngs::OsRng;
use tokio::fs::File;
use tokio::io::AsyncReadExt;
// Create the output directory if it doesn't exist
if let Some(parent) = output_csv_path.parent() {
if parent.exists() && parent.is_file() {
println!(
"DEBUG: Removing conflicting file at output directory path: {:?}",
parent
);
tokio::fs::remove_file(parent)
.await
.map_err(DocTreeError::IoError)?;
println!("DEBUG: Conflicting file removed.");
}
if !parent.is_dir() {
println!("DEBUG: Ensuring output directory exists: {:?}", parent);
tokio::fs::create_dir_all(parent)
.await
.map_err(DocTreeError::IoError)?;
println!("DEBUG: Output directory ensured.");
} else {
println!("DEBUG: Output directory already exists: {:?}", parent);
}
}
// Create the CSV writer
println!(
"DEBUG: Creating or overwriting CSV file at {:?}",
output_csv_path
);
let file = std::fs::OpenOptions::new()
.write(true)
.create(true)
.truncate(true) // Add truncate option to overwrite if exists
.open(output_csv_path)
.map_err(DocTreeError::IoError)?;
let mut writer = Writer::from_writer(file);
println!("DEBUG: CSV writer created successfully");
// Write the CSV header
writer
.write_record(&[
"collectionname",
"filename",
"blakehash",
"ipfshash",
"size",
])
.map_err(|e| DocTreeError::CsvError(e.to_string()))?;
// Connect to IPFS
// let ipfs = IpfsClient::new("127.0.0.1:5001").await.map_err(|e| DocTreeError::IpfsError(e.to_string()))?;
let ipfs = IpfsClient::default();
// Get the list of pages and files
let pages = self.page_list()?;
let files = self.file_list()?;
// Combine the lists
let mut entries = pages;
entries.extend(files);
println!("DEBUG: Starting to process collection entries for IPFS export");
for entry_name in entries {
println!("DEBUG: Processing entry: {}", entry_name);
// Get the relative path from Redis
let relative_path = self
.storage
.get_collection_entry(&self.name, &entry_name)
.map_err(|_| DocTreeError::FileNotFound(entry_name.clone()))?;
println!("DEBUG: Retrieved relative path: {}", relative_path);
let file_path = self.path.join(&relative_path);
// Read file content
let mut file = match File::open(&file_path).await {
Ok(file) => file,
Err(e) => {
eprintln!("Error opening file {:?}: {}", file_path, e);
continue;
}
};
let mut content = Vec::new();
let size = match file.read_to_end(&mut content).await {
Ok(size) => size,
Err(e) => {
eprintln!("Error reading file {:?}: {}", file_path, e);
continue;
}
};
// Calculate Blake3 hash
let mut hasher = Hasher::new();
hasher.update(&content);
let blake_hash = hasher.finalize();
let blake_hash_hex = blake_hash.to_hex().to_string();
// Use Blake3 hash as key for ChaCha20Poly1305
let key = blake_hash.as_bytes();
//let cipher = ChaCha20Poly1305::new_from_slice(&key[..32]).map_err(|_| DocTreeError::EncryptionError("Invalid key size".to_string()))?;
// Generate a random nonce
let mut nonce = [0u8; 12];
//OsRng.fill_bytes(&mut nonce);
// Encrypt the content
// let encrypted_content = match cipher.encrypt(GenericArray::from_slice(&nonce), content.as_ref()) {
// Ok(encrypted) => encrypted,
// Err(e) => {
// eprintln!("Error encrypting file {:?}: {}", file_path, e);
// continue;
// }
// };
// Add encrypted content to IPFS
println!("DEBUG: Adding file to IPFS: {:?}", file_path);
let ipfs_path = match ipfs.add(std::io::Cursor::new(content)).await {
Ok(path) => {
println!(
"DEBUG: Successfully added file to IPFS. Hash: {}",
path.hash
);
path
}
Err(e) => {
eprintln!("Error adding file to IPFS {:?}: {}", file_path, e);
continue;
}
};
let ipfs_hash = ipfs_path.hash.to_string();
println!("DEBUG: IPFS hash: {}", ipfs_hash);
// Write record to CSV
println!("DEBUG: Writing CSV record for {:?}", file_path);
if let Err(e) = writer.write_record(&[
&self.name,
&relative_path,
&blake_hash_hex,
&ipfs_hash,
&size.to_string(),
]) {
eprintln!("Error writing CSV record for {:?}: {}", file_path, e);
continue;
}
println!("DEBUG: Successfully wrote CSV record for {:?}", file_path);
}
// Flush the CSV writer
println!("DEBUG: Flushing CSV writer");
writer
.flush()
.map_err(|e| DocTreeError::CsvError(e.to_string()))?;
println!("DEBUG: CSV writer flushed successfully");
Ok(())
}
}
impl CollectionBuilder {
/// Set the storage backend
///
/// # Arguments
///
/// * `storage` - Redis storage backend
///
/// # Returns
///
/// Self for method chaining
pub fn with_storage(mut self, storage: RedisStorage) -> Self {
self.storage = Some(storage);
self
}
/// Build the Collection
///
/// # Returns
///
/// A new Collection or an error
pub fn build(self) -> Result<Collection> {
let storage = self
.storage
.ok_or_else(|| DocTreeError::MissingParameter("storage".to_string()))?;
let collection = Collection {
path: self.path,
name: self.name,
storage,
};
Ok(collection)
}
}

829
doctree/src/doctree.rs Normal file

@@ -0,0 +1,829 @@
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};
use std::fs;
use serde::Deserialize;
use crate::collection::Collection;
use crate::error::{DocTreeError, Result};
use crate::storage::RedisStorage;
use crate::include::process_includes;
use crate::utils::name_fix;
/// Configuration for a collection from a .collection file
#[derive(Deserialize, Default, Debug)]
struct CollectionConfig {
/// Optional name of the collection
name: Option<String>,
// Add other configuration options as needed
}
// Global variable to track the current collection name
// This is for compatibility with the Go implementation
lazy_static::lazy_static! {
static ref CURRENT_COLLECTION_NAME: Arc<Mutex<Option<String>>> = Arc::new(Mutex::new(None));
}
// Global variable to track the current Collection
// This is for compatibility with the Go implementation
/// DocTree represents a manager for multiple collections
pub struct DocTree {
/// Map of collections by name
pub collections: HashMap<String, Collection>,
/// Default collection name
pub default_collection: Option<String>,
/// Redis storage backend
storage: RedisStorage,
/// Name of the doctree (used as prefix for Redis keys)
pub doctree_name: String,
/// For backward compatibility
pub name: String,
/// For backward compatibility
pub path: PathBuf,
}
/// Builder for DocTree
pub struct DocTreeBuilder {
/// Map of collections by name
collections: HashMap<String, Collection>,
/// Default collection name
default_collection: Option<String>,
/// Redis storage backend
storage: Option<RedisStorage>,
/// Name of the doctree (used as prefix for Redis keys)
doctree_name: Option<String>,
/// For backward compatibility
name: Option<String>,
/// For backward compatibility
path: Option<PathBuf>,
}
impl DocTree {
/// Create a new DocTreeBuilder
///
/// # Returns
///
/// A new DocTreeBuilder
pub fn builder() -> DocTreeBuilder {
DocTreeBuilder {
collections: HashMap::new(),
default_collection: None,
storage: None,
doctree_name: Some("default".to_string()),
name: None,
path: None,
}
}
/// Add a collection to the DocTree
///
/// # Arguments
///
/// * `path` - Base path of the collection
/// * `name` - Name of the collection
///
/// # Returns
///
/// The added collection or an error
pub fn add_collection<P: AsRef<Path>>(&mut self, path: P, name: &str) -> Result<&Collection> {
// Create a new collection
let namefixed = name_fix(name);
// Clone the storage and set the doctree name
let storage = self.storage.clone();
storage.set_doctree_name(&self.doctree_name);
let collection = Collection::builder(path, &namefixed)
.with_storage(storage)
.build()?;
// Scan the collection
collection.scan()?;
// Add to the collections map
self.collections.insert(collection.name.clone(), collection);
// Return a reference to the added collection
self.collections.get(&namefixed).ok_or_else(|| {
DocTreeError::CollectionNotFound(namefixed.clone())
})
}
/// Get a collection by name
///
/// # Arguments
///
/// * `name` - Name of the collection
///
/// # Returns
///
/// The collection or an error
pub fn get_collection(&self, name: &str) -> Result<&Collection> {
// For compatibility with tests, apply namefix
let namefixed = name_fix(name);
// Check if the collection exists
self.collections.get(&namefixed).ok_or_else(|| {
DocTreeError::CollectionNotFound(name.to_string())
})
}
/// Delete a collection from the DocTree
///
/// # Arguments
///
/// * `name` - Name of the collection
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn delete_collection(&mut self, name: &str) -> Result<()> {
// For compatibility with tests, apply namefix
let namefixed = name_fix(name);
// Check if the collection exists
if !self.collections.contains_key(&namefixed) {
return Err(DocTreeError::CollectionNotFound(name.to_string()));
}
// Delete from Redis
self.storage.delete_collection(&namefixed)?;
// Remove from the collections map
self.collections.remove(&namefixed);
Ok(())
}
/// Delete all collections from the DocTree and Redis
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn delete_all_collections(&mut self) -> Result<()> {
// Delete all collections from Redis
self.storage.delete_all_collections()?;
// Clear the collections map
self.collections.clear();
// Reset the default collection
self.default_collection = None;
Ok(())
}
/// List all collections
///
/// # Returns
///
/// A vector of collection names
pub fn list_collections(&self) -> Vec<String> {
// First, try to get collections from the in-memory map
let mut collections = self.collections.keys().cloned().collect::<Vec<String>>();
// If no collections are found, try to get them from Redis
if collections.is_empty() {
// Get all collection keys from Redis
if let Ok(keys) = self.storage.list_all_collections() {
collections = keys;
}
}
collections
}
/// Load a collection from Redis
///
/// # Arguments
///
/// * `name` - Name of the collection
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn load_collection(&mut self, name: &str) -> Result<()> {
// Check if the collection exists in Redis
if !self.storage.collection_exists(name)? {
return Err(DocTreeError::CollectionNotFound(name.to_string()));
}
// Try to get the collection's path from Redis
let path = match self.storage.get_collection_path(name) {
Ok(path_str) => {
println!("DEBUG: Found collection path in Redis: {}", path_str);
PathBuf::from(path_str)
},
Err(e) => {
println!("DEBUG: Could not retrieve collection path from Redis: {}", e);
PathBuf::new() // Fallback to empty path if not found
}
};
// Create a new collection
let collection = Collection {
path,
name: name.to_string(),
storage: self.storage.clone(),
};
// Add to the collections map
self.collections.insert(name.to_string(), collection);
Ok(())
}
/// Load all collections from Redis
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn load_collections_from_redis(&mut self) -> Result<()> {
// Get all collection names from Redis
let collections = self.storage.list_all_collections()?;
// Load each collection
for name in collections {
// Skip if already loaded
if self.collections.contains_key(&name) {
continue;
}
// Try to get the collection's path from Redis
let path = match self.storage.get_collection_path(&name) {
Ok(path_str) => {
println!("DEBUG: Found collection path in Redis: {}", path_str);
PathBuf::from(path_str)
},
Err(e) => {
println!("DEBUG: Could not retrieve collection path from Redis: {}", e);
PathBuf::new() // Fallback to empty path if not found
}
};
// Create a new collection
let collection = Collection {
path,
name: name.clone(),
storage: self.storage.clone(),
};
// Add to the collections map
self.collections.insert(name, collection);
}
Ok(())
}
/// Get a page by name from a specific collection
///
/// # Arguments
///
/// * `collection_name` - Name of the collection (optional)
/// * `page_name` - Name of the page
///
/// # Returns
///
/// The page content or an error
pub fn page_get(&mut self, collection_name: Option<&str>, page_name: &str) -> Result<String> {
let (collection_name, page_name) = self.resolve_collection_and_page(collection_name, page_name)?;
// Get the collection
let collection = self.get_collection(&collection_name)?;
// Get the page content
let content = collection.page_get(page_name)?;
// Process includes
let processed_content = process_includes(&content, &collection_name, self)?;
Ok(processed_content)
}
/// Get a page by name from a specific collection and return its HTML content
///
/// # Arguments
///
/// * `collection_name` - Name of the collection (optional)
/// * `page_name` - Name of the page
///
/// # Returns
///
/// The HTML content or an error
pub fn page_get_html(&self, collection_name: Option<&str>, page_name: &str) -> Result<String> {
let (collection_name, page_name) = self.resolve_collection_and_page(collection_name, page_name)?;
// Get the collection
let collection = self.get_collection(&collection_name)?;
// Get the HTML
collection.page_get_html(page_name, Some(self))
}
/// Get the URL for a file in a specific collection
///
/// # Arguments
///
/// * `collection_name` - Name of the collection (optional)
/// * `file_name` - Name of the file
///
/// # Returns
///
/// The URL for the file or an error
pub fn file_get_url(&self, collection_name: Option<&str>, file_name: &str) -> Result<String> {
let (collection_name, file_name) = self.resolve_collection_and_page(collection_name, file_name)?;
// Get the collection
let collection = self.get_collection(&collection_name)?;
// Get the URL
collection.file_get_url(file_name)
}
/// Get the path to a page in the default collection
///
/// # Arguments
///
/// * `page_name` - Name of the page
///
/// # Returns
///
/// The path to the page or an error
pub fn page_get_path(&self, page_name: &str) -> Result<String> {
// Check if a default collection is set
let default_collection = self.default_collection.as_ref().ok_or_else(|| {
DocTreeError::NoDefaultCollection
})?;
// Get the collection
let collection = self.get_collection(default_collection)?;
// Get the path
collection.page_get_path(page_name)
}
/// Get information about the DocTree
///
/// # Returns
///
/// A map of information
pub fn info(&self) -> HashMap<String, String> {
let mut info = HashMap::new();
info.insert("name".to_string(), self.name.clone());
info.insert("path".to_string(), self.path.to_string_lossy().to_string());
info.insert("collections".to_string(), self.collections.len().to_string());
info
}
/// Scan the default collection
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn scan(&self) -> Result<()> {
// Check if a default collection is set
let default_collection = self.default_collection.as_ref().ok_or_else(|| {
DocTreeError::NoDefaultCollection
})?;
// Get the collection
let collection = self.get_collection(default_collection)?;
// Scan the collection
collection.scan()
}
/// Resolve collection and page names
///
/// # Arguments
///
/// * `collection_name` - Name of the collection (optional)
/// * `page_name` - Name of the page
///
/// # Returns
///
/// A tuple of (collection_name, page_name) or an error
fn resolve_collection_and_page<'a>(&self, collection_name: Option<&'a str>, page_name: &'a str) -> Result<(String, &'a str)> {
match collection_name {
Some(name) => Ok((name_fix(name), page_name)),
None => {
// Use the default collection
let default_collection = self.default_collection.as_ref().ok_or_else(|| {
DocTreeError::NoDefaultCollection
})?;
Ok((default_collection.clone(), page_name))
}
}
}
/// Recursively scan directories for .collection files and add them as collections
///
/// # Arguments
///
/// * `root_path` - The root path to start scanning from
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn scan_collections<P: AsRef<Path>>(&mut self, root_path: P) -> Result<()> {
let root_path = root_path.as_ref();
println!("DEBUG: Scanning for collections in directory: {:?}", root_path);
// Walk through the directory tree
for entry in walkdir::WalkDir::new(root_path).follow_links(true) {
let entry = match entry {
Ok(entry) => entry,
Err(e) => {
eprintln!("Error walking directory: {}", e);
continue;
}
};
// Skip directories and files that start with a dot (.)
let file_name = entry.file_name().to_string_lossy();
if file_name.starts_with(".") {
continue;
}
// Skip non-directories
if !entry.file_type().is_dir() {
continue;
}
// Check if this directory contains a .collection file
let collection_file_path = entry.path().join(".collection");
if collection_file_path.exists() {
// Found a collection directory
println!("DEBUG: Found .collection file at: {:?}", collection_file_path);
let dir_path = entry.path();
// Get the directory name as a fallback collection name
let dir_name = dir_path.file_name()
.and_then(|name| name.to_str())
.unwrap_or("unnamed");
// Try to read and parse the .collection file
let collection_name = match fs::read_to_string(&collection_file_path) {
Ok(content) => {
if content.trim().is_empty() {
// Empty file, use directory name (name_fixed)
dir_name.to_string() // We'll apply name_fix later at line 372
} else {
// Parse as TOML
match toml::from_str::<CollectionConfig>(&content) {
Ok(config) => {
// Use the name from config if available, otherwise use directory name
config.name.unwrap_or_else(|| dir_name.to_string())
},
Err(e) => {
eprintln!("Error parsing .collection file at {:?}: {}", collection_file_path, e);
dir_name.to_string()
}
}
}
},
Err(e) => {
eprintln!("Error reading .collection file at {:?}: {}", collection_file_path, e);
dir_name.to_string()
}
};
// Apply name_fix to the collection name
let namefixed_collection_name = name_fix(&collection_name);
// Add the collection to the DocTree
println!("DEBUG: Adding collection '{}' from directory {:?}", namefixed_collection_name, dir_path);
match self.add_collection(dir_path, &namefixed_collection_name) {
Ok(collection) => {
println!("DEBUG: Successfully added collection '{}' from {:?}", namefixed_collection_name, dir_path);
println!("DEBUG: Collection stored in Redis key 'collections:{}'", collection.name);
// Count documents and images
let docs = collection.page_list().unwrap_or_default();
let files = collection.file_list().unwrap_or_default();
let images = files.iter().filter(|f|
f.ends_with(".png") || f.ends_with(".jpg") ||
f.ends_with(".jpeg") || f.ends_with(".gif") ||
f.ends_with(".svg")
).count();
println!("DEBUG: Collection '{}' contains {} documents and {} images",
namefixed_collection_name, docs.len(), images);
},
Err(e) => {
eprintln!("Error adding collection '{}' from {:?}: {}", namefixed_collection_name, dir_path, e);
}
}
}
}
Ok(())
}
/// Exports all collections to IPFS, encrypting their files and generating CSV manifests.
///
/// # Arguments
///
/// * `output_dir` - The directory to save the output CSV files.
///
/// # Returns
///
/// Ok(()) on success or an error.
pub async fn export_collections_to_ipfs<P: AsRef<Path>>(&self, output_dir: P) -> Result<()> {
use tokio::fs;
let output_dir = output_dir.as_ref();
// Create the output directory if it doesn't exist
fs::create_dir_all(output_dir).await.map_err(DocTreeError::IoError)?;
for (name, collection) in &self.collections {
let csv_file_path = output_dir.join(format!("{}.csv", name));
println!("DEBUG: Exporting collection '{}' to IPFS and generating CSV at {:?}", name, csv_file_path);
if let Err(e) = collection.export_to_ipfs(&csv_file_path) {
eprintln!("Error exporting collection '{}': {}", name, e);
// Continue with the next collection
}
}
Ok(())
}
/// Exports a specific collection to IPFS synchronously, encrypting its files and generating a CSV manifest.
///
/// # Arguments
///
/// * `collection_name` - The name of the collection to export.
/// * `output_csv_path` - The path to save the output CSV file.
///
/// # Returns
///
/// Ok(()) on success or an error.
pub fn export_collection_to_ipfs(&self, collection_name: &str, output_csv_path: &Path) -> Result<()> {
// Get the collection
let collection = self.get_collection(collection_name)?;
// Create a new tokio runtime and block on the async export function
let csv_file_path = output_csv_path.join(format!("{}.csv", collection_name));
collection.export_to_ipfs(&csv_file_path)?;
Ok(())
}
}
impl DocTreeBuilder {
/// Set the doctree name
///
/// # Arguments
///
/// * `name` - Name of the doctree
///
/// # Returns
///
/// Self for method chaining
pub fn with_doctree_name(mut self, name: &str) -> Self {
self.doctree_name = Some(name.to_string());
self
}
/// Set the storage backend
///
/// # Arguments
///
/// * `storage` - Redis storage backend
///
/// # Returns
///
/// Self for method chaining
pub fn with_storage(mut self, storage: RedisStorage) -> Self {
self.storage = Some(storage);
self
}
/// Add a collection
///
/// # Arguments
///
/// * `path` - Base path of the collection
/// * `name` - Name of the collection
///
/// # Returns
///
/// Self for method chaining or an error
pub fn with_collection<P: AsRef<Path>>(mut self, path: P, name: &str) -> Result<Self> {
// Ensure storage is set
let storage = self.storage.as_ref().ok_or_else(|| {
DocTreeError::MissingParameter("storage".to_string())
})?;
// Get the doctree name
let doctree_name = self.doctree_name.clone().unwrap_or_else(|| "default".to_string());
// Create a new collection
let namefixed = name_fix(name);
// Clone the storage and set the doctree name
let storage_clone = storage.clone();
storage_clone.set_doctree_name(&doctree_name);
let collection = Collection::builder(path.as_ref(), &namefixed)
.with_storage(storage_clone)
.build()?;
// Scan the collection
collection.scan()?;
// Add to the collections map
self.collections.insert(collection.name.clone(), collection);
// For backward compatibility
if self.name.is_none() {
self.name = Some(namefixed.clone());
}
if self.path.is_none() {
self.path = Some(path.as_ref().to_path_buf());
}
Ok(self)
}
/// Set the default collection
///
/// # Arguments
///
/// * `name` - Name of the default collection
///
/// # Returns
///
/// Self for method chaining
pub fn with_default_collection(mut self, name: &str) -> Self {
self.default_collection = Some(name_fix(name));
self
}
/// Scan for collections in the given root path
///
/// # Arguments
///
/// * `root_path` - The root path to scan for collections
///
/// # Returns
///
/// Self for method chaining or an error
pub fn scan_collections<P: AsRef<Path>>(self, root_path: P) -> Result<Self> {
// Ensure storage is set
let storage = self.storage.as_ref().ok_or_else(|| {
DocTreeError::MissingParameter("storage".to_string())
})?;
// Get the doctree name
let doctree_name = self.doctree_name.clone().unwrap_or_else(|| "default".to_string());
// Clone the storage and set the doctree name
let storage_clone = storage.clone();
storage_clone.set_doctree_name(&doctree_name);
// Create a temporary DocTree to scan collections
let mut temp_doctree = DocTree {
collections: HashMap::new(),
default_collection: None,
storage: storage_clone,
doctree_name: doctree_name,
name: self.name.clone().unwrap_or_default(),
path: self.path.clone().unwrap_or_else(|| PathBuf::from("")),
};
// Scan for collections
temp_doctree.scan_collections(root_path)?;
// Create a new builder with the scanned collections
let mut new_builder = self;
for (name, collection) in temp_doctree.collections {
new_builder.collections.insert(name.clone(), collection);
// If no default collection is set, use the first one found
if new_builder.default_collection.is_none() {
new_builder.default_collection = Some(name);
}
}
Ok(new_builder)
}
/// Build the DocTree
///
/// # Returns
///
/// A new DocTree or an error
pub fn build(self) -> Result<DocTree> {
// Ensure storage is set
let storage = self.storage.ok_or_else(|| {
DocTreeError::MissingParameter("storage".to_string())
})?;
// Get the doctree name
let doctree_name = self.doctree_name.unwrap_or_else(|| "default".to_string());
// Set the doctree name in the storage
let storage_clone = storage.clone();
storage_clone.set_doctree_name(&doctree_name);
// Create the DocTree
let mut doctree = DocTree {
collections: self.collections,
default_collection: self.default_collection,
storage: storage_clone,
doctree_name,
name: self.name.unwrap_or_default(),
path: self.path.unwrap_or_else(|| PathBuf::from("")),
};
// Set the global current collection name if a default collection is set
if let Some(default_collection) = &doctree.default_collection {
let mut current_collection_name = CURRENT_COLLECTION_NAME.lock().unwrap();
*current_collection_name = Some(default_collection.clone());
}
// Load all collections from Redis
doctree.load_collections_from_redis()?;
Ok(doctree)
}
}
/// Create a new DocTree instance
///
/// For backward compatibility, it also accepts path and name parameters
/// to create a DocTree with a single collection
///
/// # Arguments
///
/// * `args` - Optional path and name for backward compatibility
///
/// # Returns
///
/// A new DocTree or an error
pub fn new<P: AsRef<Path>>(args: &[&str]) -> Result<DocTree> {
let storage = RedisStorage::new("redis://localhost:6379")?;
let mut builder = DocTree::builder().with_storage(storage);
// If the first argument is a doctree name, use it
if args.len() >= 1 && args[0].starts_with("--doctree=") {
let doctree_name = args[0].trim_start_matches("--doctree=");
builder = builder.with_doctree_name(doctree_name);
}
// For backward compatibility with existing code
if args.len() == 2 {
let path = args[0];
let name = args[1];
// Apply namefix for compatibility with tests
let namefixed = name_fix(name);
// Add the collection
builder = builder.with_collection(path, &namefixed)?;
// Set the default collection
builder = builder.with_default_collection(&namefixed);
}
builder.build()
}
/// Create a new DocTree by scanning a directory for collections
///
/// # Arguments
///
/// * `root_path` - The root path to scan for collections
/// * `doctree_name` - Optional name for the doctree (default: "default")
///
/// # Returns
///
/// A new DocTree or an error
pub fn from_directory<P: AsRef<Path>>(root_path: P, doctree_name: Option<&str>) -> Result<DocTree> {
let storage = RedisStorage::new("redis://localhost:6379")?;
let mut builder = DocTree::builder().with_storage(storage);
// Set the doctree name if provided
if let Some(name) = doctree_name {
builder = builder.with_doctree_name(name);
}
builder.scan_collections(root_path)?.build()
}

60
doctree/src/error.rs Normal file

@@ -0,0 +1,60 @@
use thiserror::Error;
/// Custom error type for the doctree library
#[derive(Error, Debug)]
pub enum DocTreeError {
/// IO error
#[error("IO error: {0}")]
IoError(#[from] std::io::Error),
/// WalkDir error
#[error("WalkDir error: {0}")]
WalkDirError(String),
/// Collection not found
#[error("Collection not found: {0}")]
CollectionNotFound(String),
/// Page not found
#[error("Page not found: {0}")]
PageNotFound(String),
/// File not found
#[error("File not found: {0}")]
FileNotFound(String),
/// Invalid include directive
#[error("Invalid include directive: {0}")]
InvalidIncludeDirective(String),
/// No default collection set
#[error("No default collection set")]
NoDefaultCollection,
/// Invalid number of arguments
#[error("Invalid number of arguments")]
InvalidArgumentCount,
/// Missing required parameter
#[error("Missing required parameter: {0}")]
MissingParameter(String),
/// Redis error
#[error("Redis error: {0}")]
RedisError(String),
/// CSV error
#[error("CSV error: {0}")]
CsvError(String),
/// IPFS error
#[error("IPFS error: {0}")]
IpfsError(String),
/// Encryption error
#[error("Encryption error: {0}")]
EncryptionError(String),
}
/// Result type alias for doctree operations
pub type Result<T> = std::result::Result<T, DocTreeError>;

178
doctree/src/include.rs Normal file

@@ -0,0 +1,178 @@
use crate::doctree::DocTree;
use crate::error::{DocTreeError, Result};
use crate::utils::trim_spaces_and_quotes;
/// Process includes in markdown content
///
/// # Arguments
///
/// * `content` - The markdown content to process
/// * `current_collection_name` - The name of the current collection
/// * `doctree` - The DocTree instance
///
/// # Returns
///
/// The processed content or an error
pub fn process_includes(content: &str, current_collection_name: &str, doctree: &DocTree) -> Result<String> {
// Find all include directives
let lines: Vec<&str> = content.split('\n').collect();
let mut result = Vec::with_capacity(lines.len());
for line in lines {
match parse_include_line(line) {
Ok((Some(c), Some(p))) => {
// Both collection and page specified
match handle_include(&p, &c, doctree) {
Ok(include_content) => {
// Process any nested includes in the included content
match process_includes(&include_content, &c, doctree) {
Ok(processed_include_content) => {
result.push(processed_include_content);
},
Err(e) => {
result.push(format!(">>ERROR: Failed to process nested includes: {}", e));
}
}
},
Err(e) => {
result.push(format!(">>ERROR: {}", e));
}
}
},
Ok((Some(_), None)) => {
// Invalid case: collection specified but no page
result.push(format!(">>ERROR: Invalid include directive: collection specified but no page name"));
},
Ok((None, Some(p))) => {
// Only page specified, use current collection
match handle_include(&p, current_collection_name, doctree) {
Ok(include_content) => {
// Process any nested includes in the included content
match process_includes(&include_content, current_collection_name, doctree) {
Ok(processed_include_content) => {
result.push(processed_include_content);
},
Err(e) => {
result.push(format!(">>ERROR: Failed to process nested includes: {}", e));
}
}
},
Err(e) => {
result.push(format!(">>ERROR: {}", e));
}
}
},
Ok((None, None)) => {
// Not an include directive, keep the line
result.push(line.to_string());
},
Err(e) => {
// Error parsing include directive
result.push(format!(">>ERROR: Failed to process include directive: {}", e));
}
}
}
Ok(result.join("\n"))
}
/// Parse an include directive line
///
/// # Arguments
///
/// * `line` - The line to parse
///
/// # Returns
///
/// A tuple of (collection_name, page_name) or an error
///
/// Supports:
/// - !!include collectionname:'pagename'
/// - !!include collectionname:'pagename.md'
/// - !!include 'pagename'
/// - !!include collectionname:pagename
/// - !!include collectionname:pagename.md
/// - !!include name:'pagename'
/// - !!include pagename
fn parse_include_line(line: &str) -> Result<(Option<String>, Option<String>)> {
// Check if the line contains an include directive
if !line.contains("!!include") {
return Ok((None, None));
}
// Extract the part after !!include
let parts: Vec<&str> = line.splitn(2, "!!include").collect();
if parts.len() != 2 {
return Err(DocTreeError::InvalidIncludeDirective(line.to_string()));
}
// Trim spaces and check if the include part is empty
let include_text = trim_spaces_and_quotes(parts[1]);
if include_text.is_empty() {
return Err(DocTreeError::InvalidIncludeDirective(line.to_string()));
}
// Remove name: prefix if present
let include_text = if include_text.starts_with("name:") {
let text = include_text.trim_start_matches("name:").trim();
if text.is_empty() {
return Err(DocTreeError::InvalidIncludeDirective(
format!("empty page name after 'name:' prefix: {}", line)
));
}
text.to_string()
} else {
include_text
};
// Check if it contains a collection reference (has a colon)
if include_text.contains(':') {
let parts: Vec<&str> = include_text.splitn(2, ':').collect();
if parts.len() != 2 {
return Err(DocTreeError::InvalidIncludeDirective(
format!("malformed collection reference: {}", include_text)
));
}
let collection_name = parts[0].trim();
let page_name = trim_spaces_and_quotes(parts[1]);
if collection_name.is_empty() {
return Err(DocTreeError::InvalidIncludeDirective(
format!("empty collection name in include directive: {}", line)
));
}
if page_name.is_empty() {
return Err(DocTreeError::InvalidIncludeDirective(
format!("empty page name in include directive: {}", line)
));
}
Ok((Some(collection_name.to_string()), Some(page_name)))
} else {
// No collection specified, just a page name
Ok((None, Some(include_text)))
}
}
/// Handle an include directive
///
/// # Arguments
///
/// * `page_name` - The name of the page to include
/// * `collection_name` - The name of the collection
/// * `doctree` - The DocTree instance
///
/// # Returns
///
/// The included content or an error
fn handle_include(page_name: &str, collection_name: &str, doctree: &DocTree) -> Result<String> {
// Get the collection
let collection = doctree.get_collection(collection_name)?;
// Get the page content
let content = collection.page_get(page_name)?;
Ok(content)
}

35
doctree/src/lib.rs Normal file

@@ -0,0 +1,35 @@
//! DocTree is a library for managing collections of markdown documents.
//!
//! It provides functionality for scanning directories, managing collections,
//! and processing includes between documents.
// Import lazy_static for global state
mod collection;
mod doctree;
mod error;
mod include;
mod storage;
mod utils;
pub use collection::{Collection, CollectionBuilder};
pub use doctree::{DocTree, DocTreeBuilder, from_directory, new};
pub use error::{DocTreeError, Result};
pub use include::process_includes;
pub use storage::RedisStorage;
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_doctree_builder() {
// Create a storage instance
let storage = RedisStorage::new("redis://localhost:6379").unwrap();
let doctree = DocTree::builder().with_storage(storage).build().unwrap();
assert_eq!(doctree.collections.len(), 0);
assert_eq!(doctree.default_collection, None);
}
}
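// End-to-end usage sketch combining the re-exports above. The paths and names
// are illustrative; the calls mirror how doctreecmd uses the library elsewhere
// in this changeset:
//
//   let doctree = from_directory(Path::new("../examples"), Some("default"))?;
//   let markdown = doctree.page_get(Some("grid1"), "introduction.md")?;
//   let html = doctree.page_get_html(Some("grid1"), "introduction.md")?;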

doctree/src/storage.rs Normal file

@@ -0,0 +1,502 @@
use redis::{Client, Connection};
use std::sync::{Arc, Mutex};
use crate::error::{DocTreeError, Result};
/// Storage backend for doctree
pub struct RedisStorage {
// Redis client
client: Client,
// Shared Redis connection (a single connection guarded by a mutex)
connection: Arc<Mutex<Connection>>,
// Doctree name for key prefixing
doctree_name: Arc<Mutex<String>>,
// Debug mode flag
debug: Arc<Mutex<bool>>,
}
impl RedisStorage {
/// Create a new RedisStorage instance
///
/// # Arguments
///
/// * `url` - Redis connection URL (e.g., "redis://localhost:6379")
///
/// # Returns
///
/// A new RedisStorage instance or an error
pub fn new(url: &str) -> Result<Self> {
// Create a Redis client
let client = Client::open(url).map_err(|e| DocTreeError::RedisError(format!("Failed to connect to Redis: {}", e)))?;
// Get a connection
let connection = client.get_connection().map_err(|e| DocTreeError::RedisError(format!("Failed to get Redis connection: {}", e)))?;
Ok(Self {
client,
connection: Arc::new(Mutex::new(connection)),
doctree_name: Arc::new(Mutex::new("default".to_string())),
debug: Arc::new(Mutex::new(false)),
})
}
/// Set the doctree name for key prefixing
///
/// # Arguments
///
/// * `name` - Doctree name
pub fn set_doctree_name(&self, name: &str) {
let mut doctree_name = self.doctree_name.lock().unwrap();
*doctree_name = name.to_string();
}
/// Set the debug mode
///
/// # Arguments
///
/// * `enable` - Whether to enable debug mode
pub fn set_debug(&self, enable: bool) {
let mut debug = self.debug.lock().unwrap();
*debug = enable;
}
/// Check if debug mode is enabled
///
/// # Returns
///
/// true if debug mode is enabled, false otherwise
fn is_debug_enabled(&self) -> bool {
let debug = self.debug.lock().unwrap();
*debug
}
/// Get the doctree name
///
/// # Returns
///
/// The doctree name
pub fn get_doctree_name(&self) -> String {
let doctree_name = self.doctree_name.lock().unwrap();
doctree_name.clone()
}
/// Store a collection entry
///
/// # Arguments
///
/// * `collection` - Collection name
/// * `key` - Entry key
/// * `value` - Entry value
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn store_collection_entry(&self, collection: &str, key: &str, value: &str) -> Result<()> {
let doctree_name = self.get_doctree_name();
let redis_key = format!("{}:collections:{}", doctree_name, collection);
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - HSET {} {} {}", redis_key, key, value);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
// Store the entry using HSET
redis::cmd("HSET")
.arg(&redis_key)
.arg(key)
.arg(value)
.execute(&mut *conn);
if self.is_debug_enabled() {
println!("DEBUG: Stored entry in Redis - collection: '{}', key: '{}', value: '{}'",
collection, key, value);
}
Ok(())
}
/// Get a collection entry
///
/// # Arguments
///
/// * `collection` - Collection name
/// * `key` - Entry key
///
/// # Returns
///
/// The entry value or an error
pub fn get_collection_entry(&self, collection: &str, key: &str) -> Result<String> {
let doctree_name = self.get_doctree_name();
let collection_key = format!("{}:collections:{}", doctree_name, collection);
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - HGET {} {}", collection_key, key);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
// Get the entry using HGET
let result: Option<String> = redis::cmd("HGET")
.arg(&collection_key)
.arg(key)
.query(&mut *conn)
.map_err(|e| DocTreeError::RedisError(format!("Redis error: {}", e)))?;
// Check if the entry exists
match result {
Some(value) => {
if self.is_debug_enabled() {
println!("DEBUG: Retrieved entry from Redis - collection: '{}', key: '{}', value: '{}'",
collection, key, value);
}
Ok(value)
},
None => {
if self.is_debug_enabled() {
println!("DEBUG: Entry not found in Redis - collection: '{}', key: '{}'",
collection, key);
}
Err(DocTreeError::FileNotFound(key.to_string()))
}
}
}
/// Delete a collection entry
///
/// # Arguments
///
/// * `collection` - Collection name
/// * `key` - Entry key
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn delete_collection_entry(&self, collection: &str, key: &str) -> Result<()> {
let doctree_name = self.get_doctree_name();
let collection_key = format!("{}:collections:{}", doctree_name, collection);
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - HDEL {} {}", collection_key, key);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
// Delete the entry using HDEL
let exists: bool = redis::cmd("HEXISTS")
.arg(&collection_key)
.arg(key)
.query(&mut *conn)
.map_err(|e| DocTreeError::RedisError(format!("Redis error: {}", e)))?;
if !exists {
return Err(DocTreeError::CollectionNotFound(collection.to_string()));
}
redis::cmd("HDEL")
.arg(&collection_key)
.arg(key)
.execute(&mut *conn);
if self.is_debug_enabled() {
println!("DEBUG: Deleted entry from Redis - collection: '{}', key: '{}'",
collection, key);
}
Ok(())
}
/// List all entries in a collection
///
/// # Arguments
///
/// * `collection` - Collection name
///
/// # Returns
///
/// A vector of entry keys or an error
pub fn list_collection_entries(&self, collection: &str) -> Result<Vec<String>> {
let doctree_name = self.get_doctree_name();
let collection_key = format!("{}:collections:{}", doctree_name, collection);
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - HKEYS {}", collection_key);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
// Check if the collection exists
let exists: bool = redis::cmd("EXISTS")
.arg(&collection_key)
.query(&mut *conn)
.map_err(|e| DocTreeError::RedisError(format!("Redis error: {}", e)))?;
if !exists {
return Err(DocTreeError::CollectionNotFound(collection.to_string()));
}
// Get all keys using HKEYS
let keys: Vec<String> = redis::cmd("HKEYS")
.arg(&collection_key)
.query(&mut *conn)
.map_err(|e| DocTreeError::RedisError(format!("Redis error: {}", e)))?;
if self.is_debug_enabled() {
println!("DEBUG: Listed {} entries from Redis - collection: '{}'",
keys.len(), collection);
}
Ok(keys)
}
/// Delete a collection
///
/// # Arguments
///
/// * `collection` - Collection name
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn delete_collection(&self, collection: &str) -> Result<()> {
let doctree_name = self.get_doctree_name();
let redis_key = format!("{}:collections:{}", doctree_name, collection);
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - DEL {}", redis_key);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
// Delete the collection using DEL
redis::cmd("DEL")
.arg(&redis_key)
.execute(&mut *conn);
if self.is_debug_enabled() {
println!("DEBUG: Deleted collection from Redis - collection: '{}'", collection);
}
Ok(())
}
/// Check if a collection exists
///
/// # Arguments
///
/// * `collection` - Collection name
///
/// # Returns
///
/// true if the collection exists, false otherwise
pub fn collection_exists(&self, collection: &str) -> Result<bool> {
let doctree_name = self.get_doctree_name();
let collection_key = format!("{}:collections:{}", doctree_name, collection);
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - EXISTS {}", collection_key);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
// Check if the collection exists using EXISTS
let exists: bool = redis::cmd("EXISTS")
.arg(&collection_key)
.query(&mut *conn)
.map_err(|e| DocTreeError::RedisError(format!("Redis error: {}", e)))?;
if self.is_debug_enabled() {
println!("DEBUG: Collection exists check - collection: '{}', exists: {}",
collection, exists);
}
Ok(exists)
}
/// List all collections in Redis
///
/// # Returns
///
/// A vector of collection names or an error
pub fn list_all_collections(&self) -> Result<Vec<String>> {
let doctree_name = self.get_doctree_name();
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - KEYS {}:collections:*", doctree_name);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
// Get all collection keys
let pattern = format!("{}:collections:*", doctree_name);
let keys: Vec<String> = redis::cmd("KEYS")
.arg(&pattern)
.query(&mut *conn)
.map_err(|e| DocTreeError::RedisError(format!("Redis error: {}", e)))?;
// Extract collection names from keys (remove the "{doctree_name}:collections:" prefix)
let prefix = format!("{}:collections:", doctree_name);
let prefix_len = prefix.len();
let collections = keys.iter()
.filter_map(|key| {
if key.starts_with(&prefix) && !key.ends_with(":path") {
Some(key[prefix_len..].to_string())
} else {
None
}
})
.collect();
if self.is_debug_enabled() {
println!("DEBUG: Found {} collections in Redis", keys.len());
}
Ok(collections)
}
/// Delete all collections from Redis
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn delete_all_collections(&self) -> Result<()> {
let doctree_name = self.get_doctree_name();
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - KEYS {}:collections:*", doctree_name);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
// Get all collection keys
let pattern = format!("{}:collections:*", doctree_name);
let keys: Vec<String> = redis::cmd("KEYS")
.arg(&pattern)
.query(&mut *conn)
.map_err(|e| DocTreeError::RedisError(format!("Redis error: {}", e)))?;
if self.is_debug_enabled() {
println!("DEBUG: Found {} collections in Redis", keys.len());
}
// Delete each collection
for key in keys {
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - DEL {}", key);
}
redis::cmd("DEL")
.arg(&key)
.execute(&mut *conn);
if self.is_debug_enabled() {
println!("DEBUG: Deleted collection from Redis - key: '{}'", key);
}
}
Ok(())
}
/// Store a collection's path
///
/// # Arguments
///
/// * `collection` - Collection name
/// * `path` - Collection path
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn store_collection_path(&self, collection: &str, path: &str) -> Result<()> {
let doctree_name = self.get_doctree_name();
let redis_key = format!("{}:collections:{}:path", doctree_name, collection);
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - SET {} {}", redis_key, path);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
// Store the path using SET
redis::cmd("SET")
.arg(&redis_key)
.arg(path)
.execute(&mut *conn);
if self.is_debug_enabled() {
println!("DEBUG: Stored collection path in Redis - collection: '{}', path: '{}'",
collection, path);
}
Ok(())
}
/// Get a collection's path
///
/// # Arguments
///
/// * `collection` - Collection name
///
/// # Returns
///
/// The collection path or an error
pub fn get_collection_path(&self, collection: &str) -> Result<String> {
let doctree_name = self.get_doctree_name();
let redis_key = format!("{}:collections:{}:path", doctree_name, collection);
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - GET {}", redis_key);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
// Get the path using GET
let result: Option<String> = redis::cmd("GET")
.arg(&redis_key)
.query(&mut *conn)
.map_err(|e| DocTreeError::RedisError(format!("Redis error: {}", e)))?;
// Check if the path exists
match result {
Some(path) => {
if self.is_debug_enabled() {
println!("DEBUG: Retrieved collection path from Redis - collection: '{}', path: '{}'",
collection, path);
}
Ok(path)
},
None => {
if self.is_debug_enabled() {
println!("DEBUG: Collection path not found in Redis - collection: '{}'",
collection);
}
Err(DocTreeError::CollectionNotFound(collection.to_string()))
}
}
}
}
// Implement Clone for RedisStorage
impl Clone for RedisStorage {
fn clone(&self) -> Self {
// Create a new connection
let connection = self.client.get_connection()
.expect("Failed to get Redis connection");
Self {
client: self.client.clone(),
connection: Arc::new(Mutex::new(connection)),
doctree_name: self.doctree_name.clone(),
debug: self.debug.clone(),
}
}
}
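// Illustrative Redis key layout produced by the methods above, assuming a
// doctree named "default" and a collection named "grid1" (all concrete names
// and values are examples only):
//
//   HSET default:collections:grid1 <entry-key> <entry-value>
//   GET  default:collections:grid1:path   -> collection path on disk
//   KEYS default:collections:*            -> all collection hashes (plus :path keys)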

doctree/src/utils.rs Normal file

@@ -0,0 +1,79 @@
use pulldown_cmark::{Parser, Options, html};
use sal::text;
/// Fix a name to be used as a key
///
/// This is equivalent to the tools.NameFix function in the Go implementation.
/// It normalizes the name by converting to lowercase, replacing spaces with hyphens, etc.
///
/// # Arguments
///
/// * `name` - The name to fix
///
/// # Returns
///
/// The fixed name
pub fn name_fix(text: &str) -> String {
// Use the name_fix function from the SAL library
text::name_fix(text)
}
/// Convert markdown to HTML
///
/// # Arguments
///
/// * `markdown` - The markdown content to convert
///
/// # Returns
///
/// The HTML content
pub fn markdown_to_html(markdown: &str) -> String {
let mut options = Options::empty();
options.insert(Options::ENABLE_TABLES);
options.insert(Options::ENABLE_FOOTNOTES);
options.insert(Options::ENABLE_STRIKETHROUGH);
let parser = Parser::new_ext(markdown, options);
let mut html_output = String::new();
html::push_html(&mut html_output, parser);
html_output
}
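// Quick illustration (output shown approximately; the exact HTML emitted by
// pulldown-cmark may differ slightly between versions):
//
//   markdown_to_html("# Title")    // -> "<h1>Title</h1>\n"
//   markdown_to_html("~~gone~~")   // strikethrough is enabled by the options above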
/// Trim spaces and quotes from a string
///
/// # Arguments
///
/// * `s` - The string to trim
///
/// # Returns
///
/// The trimmed string
pub fn trim_spaces_and_quotes(s: &str) -> String {
let mut result = s.trim().to_string();
// Remove surrounding quotes
if (result.starts_with('\'') && result.ends_with('\'')) ||
(result.starts_with('"') && result.ends_with('"')) {
result = result[1..result.len()-1].to_string();
}
result
}
/// Ensure a string has a .md extension
///
/// # Arguments
///
/// * `name` - The name to check
///
/// # Returns
///
/// The name with a .md extension
pub fn ensure_md_extension(name: &str) -> String {
if !name.ends_with(".md") {
format!("{}.md", name)
} else {
name.to_string()
}
}

doctreecmd/Cargo.toml Normal file

@@ -0,0 +1,12 @@
[package]
name = "doctreecmd"
version = "0.1.0"
edition = "2024"
[[bin]]
name = "doctree"
path = "src/main.rs"
[dependencies]
doctree = { path = "../doctree" }
clap = "3.2.25"

doctreecmd/src/main.rs Normal file

@@ -0,0 +1,446 @@
use clap::{App, Arg, SubCommand};
use doctree::{DocTree, RedisStorage, Result, from_directory};
use std::path::Path;
fn main() -> Result<()> {
let matches = App::new("doctree")
.version("0.1.0")
.author("Your Name")
.about("A tool to manage document collections")
.arg(
Arg::with_name("debug")
.long("debug")
.help("Enable debug logging")
.takes_value(false)
)
.subcommand(
SubCommand::with_name("scan")
.about("Scan a directory for .collection files and create collections")
.arg(Arg::with_name("path").required(true).help("Path to the directory"))
.arg(Arg::with_name("doctree").long("doctree").takes_value(true).help("Name of the doctree (default: 'default')")),
)
.subcommand(
SubCommand::with_name("list")
.about("List collections")
.arg(Arg::with_name("doctree").long("doctree").takes_value(true).help("Name of the doctree (default: 'default')")),
)
.subcommand(
SubCommand::with_name("info")
.about("Show detailed information about collections")
.arg(Arg::with_name("collection").help("Name of the collection (optional)"))
.arg(Arg::with_name("doctree").long("doctree").takes_value(true).help("Name of the doctree (default: 'default')")),
)
.subcommand(
SubCommand::with_name("get")
.about("Get page content")
.arg(Arg::with_name("collection")
.short("c".chars().next().unwrap())
.long("collection")
.takes_value(true)
.help("Name of the collection (optional)"))
.arg(Arg::with_name("page")
.short("p".chars().next().unwrap())
.long("page")
.required(true)
.takes_value(true)
.help("Name of the page"))
.arg(Arg::with_name("format")
.short("f".chars().next().unwrap())
.long("format")
.takes_value(true)
.help("Output format (html or markdown, default: markdown)"))
.arg(Arg::with_name("doctree").long("doctree").takes_value(true).help("Name of the doctree (default: 'default')")),
)
.subcommand(
SubCommand::with_name("html")
.about("Get page content as HTML")
.arg(Arg::with_name("collection").required(true).help("Name of the collection"))
.arg(Arg::with_name("page").required(true).help("Name of the page"))
.arg(Arg::with_name("doctree").long("doctree").takes_value(true).help("Name of the doctree (default: 'default')")),
)
.subcommand(
SubCommand::with_name("delete")
.about("Delete a collection from Redis")
.arg(Arg::with_name("collection").required(true).help("Name of the collection"))
.arg(Arg::with_name("doctree").long("doctree").takes_value(true).help("Name of the doctree (default: 'default')")),
)
.subcommand(
SubCommand::with_name("reset")
.about("Delete all collections from Redis")
.arg(Arg::with_name("doctree").long("doctree").takes_value(true).help("Name of the doctree (default: 'default')")),
)
.subcommand(
SubCommand::with_name("export_to_ipfs")
.about("Export a collection to IPFS")
.arg(Arg::with_name("collection")
.short("c".chars().next().unwrap())
.long("collection")
.takes_value(true)
.required(false)
.help("Name of the collection (export all if not specified)"))
.arg(Arg::with_name("output").required(true).help("Output directory for IPFS export"))
.arg(Arg::with_name("doctree").long("doctree").takes_value(true).help("Name of the doctree (default: 'default')")),
)
.get_matches();
// Check if debug mode is enabled
let debug_mode = matches.is_present("debug");
// Handle subcommands
if let Some(matches) = matches.subcommand_matches("scan") {
let path = matches.value_of("path").unwrap();
if debug_mode {
println!("DEBUG: Scanning path: {}", path);
}
let doctree_name = matches.value_of("doctree").unwrap_or("default");
println!("Recursively scanning for collections in: {}", path);
println!("Using doctree name: {}", doctree_name);
// Use the from_directory function to create a DocTree with all collections
let doctree = from_directory(Path::new(path), Some(doctree_name))?;
// Print the discovered collections
let collections = doctree.list_collections();
if collections.is_empty() {
println!("No collections found");
} else {
println!("Discovered collections:");
for collection in collections {
println!("- {}", collection);
}
}
} else if let Some(matches) = matches.subcommand_matches("list") {
let doctree_name = matches.value_of("doctree").unwrap_or("default");
if debug_mode {
println!("DEBUG: Listing collections for doctree: {}", doctree_name);
}
// Create a storage with the specified doctree name
let storage = RedisStorage::new("redis://localhost:6379")?;
storage.set_doctree_name(doctree_name);
storage.set_debug(debug_mode);
if debug_mode {
println!("DEBUG: Connected to Redis storage");
}
// Get collections directly from Redis to avoid debug output from DocTree
let collections = storage.list_all_collections()?;
if collections.is_empty() {
println!("No collections found in doctree '{}'", doctree_name);
} else {
println!("Collections in doctree '{}':", doctree_name);
for collection in collections {
println!("- {}", collection);
}
}
} else if let Some(matches) = matches.subcommand_matches("get") {
let collection = matches.value_of("collection");
let page = matches.value_of("page").unwrap();
let format = matches.value_of("format").unwrap_or("markdown");
let doctree_name = matches.value_of("doctree").unwrap_or("default");
if debug_mode {
println!("DEBUG: Getting page '{}' from collection '{}' in doctree '{}' with format '{}'",
page, collection.unwrap_or("(default)"), doctree_name, format);
}
// Create a storage with the specified doctree name
let storage = RedisStorage::new("redis://localhost:6379")?;
storage.set_doctree_name(doctree_name);
storage.set_debug(debug_mode);
if debug_mode {
println!("DEBUG: Connected to Redis storage");
}
// Create a DocTree with the specified doctree name
let mut doctree = DocTree::builder()
.with_storage(storage)
.with_doctree_name(doctree_name)
.build()?;
// Load collections from Redis
doctree.load_collections_from_redis()?;
if format.to_lowercase() == "html" {
let html = doctree.page_get_html(collection, page)?;
println!("{}", html);
} else {
let content = doctree.page_get(collection, page)?;
println!("{}", content);
}
} else if let Some(matches) = matches.subcommand_matches("html") {
let collection = matches.value_of("collection").unwrap();
let page = matches.value_of("page").unwrap();
let doctree_name = matches.value_of("doctree").unwrap_or("default");
if debug_mode {
println!("DEBUG: Getting HTML for page '{}' from collection '{}' in doctree '{}'",
page, collection, doctree_name);
}
// Create a storage with the specified doctree name
let storage = RedisStorage::new("redis://localhost:6379")?;
storage.set_doctree_name(doctree_name);
storage.set_debug(debug_mode);
if debug_mode {
println!("DEBUG: Connected to Redis storage");
}
// Create a DocTree with the specified doctree name
let mut doctree = DocTree::builder()
.with_storage(storage)
.with_doctree_name(doctree_name)
.build()?;
// Load collections from Redis
doctree.load_collections_from_redis()?;
let html = doctree.page_get_html(Some(collection), page)?;
println!("{}", html);
} else if let Some(matches) = matches.subcommand_matches("info") {
let doctree_name = matches.value_of("doctree").unwrap_or("default");
let collection_name = matches.value_of("collection");
if debug_mode {
if let Some(name) = collection_name {
println!("DEBUG: Getting info for collection '{}' in doctree '{}'", name, doctree_name);
} else {
println!("DEBUG: Getting info for all collections in doctree '{}'", doctree_name);
}
}
// Create a storage with the specified doctree name
let storage = RedisStorage::new("redis://localhost:6379")?;
storage.set_doctree_name(doctree_name);
storage.set_debug(debug_mode);
if debug_mode {
println!("DEBUG: Connected to Redis storage");
}
// Create a DocTree with the specified doctree name
let mut doctree = DocTree::builder()
.with_storage(storage)
.with_doctree_name(doctree_name)
.build()?;
// Load collections from Redis
doctree.load_collections_from_redis()?;
let collection_name = matches.value_of("collection");
if let Some(name) = collection_name {
// Show info for a specific collection
match doctree.get_collection(name) {
Ok(collection) => {
println!("Collection Information for '{}':", name);
println!(" Path: {:?}", collection.path);
println!(" Redis Key: {}:collections:{}", doctree_name, collection.name);
// List documents
match collection.page_list() {
Ok(pages) => {
println!(" Documents ({}):", pages.len());
for page in pages {
match collection.page_get_path(&page) {
Ok(path) => {
println!(" - {}", path);
},
Err(_) => {
println!(" - {}", page);
}
}
}
},
Err(e) => println!(" Error listing documents: {}", e),
}
// List files
match collection.file_list() {
Ok(files) => {
// Filter images
let images: Vec<String> = files.iter()
.filter(|f|
f.ends_with(".png") || f.ends_with(".jpg") ||
f.ends_with(".jpeg") || f.ends_with(".gif") ||
f.ends_with(".svg"))
.cloned()
.collect();
println!(" Images ({}):", images.len());
for image in images {
println!(" - {}", image);
}
// Filter other files
let other_files: Vec<String> = files.iter()
.filter(|f|
!f.ends_with(".png") && !f.ends_with(".jpg") &&
!f.ends_with(".jpeg") && !f.ends_with(".gif") &&
!f.ends_with(".svg"))
.cloned()
.collect();
println!(" Other Files ({}):", other_files.len());
for file in other_files {
println!(" - {}", file);
}
},
Err(e) => println!(" Error listing files: {}", e),
}
},
Err(e) => println!("Error: {}", e),
}
} else {
// Show info for all collections
let collections = doctree.list_collections();
if collections.is_empty() {
println!("No collections found");
} else {
println!("Collections in doctree '{}':", doctree_name);
for name in collections {
if let Ok(collection) = doctree.get_collection(&name) {
println!("- {} (Redis Key: {}:collections:{})", name, doctree_name, collection.name);
println!(" Path: {:?}", collection.path);
// Count documents and images
if let Ok(pages) = collection.page_list() {
println!(" Documents: {}", pages.len());
}
if let Ok(files) = collection.file_list() {
let image_count = files.iter()
.filter(|f|
f.ends_with(".png") || f.ends_with(".jpg") ||
f.ends_with(".jpeg") || f.ends_with(".gif") ||
f.ends_with(".svg"))
.count();
println!(" Images: {}", image_count);
println!(" Other Files: {}", files.len() - image_count);
}
}
}
}
}
} else if let Some(matches) = matches.subcommand_matches("delete") {
let collection = matches.value_of("collection").unwrap();
let doctree_name = matches.value_of("doctree").unwrap_or("default");
if debug_mode {
println!("DEBUG: Deleting collection '{}' from doctree '{}'", collection, doctree_name);
}
// Create a storage with the specified doctree name
let storage = RedisStorage::new("redis://localhost:6379")?;
storage.set_doctree_name(doctree_name);
storage.set_debug(debug_mode);
if debug_mode {
println!("DEBUG: Connected to Redis storage");
}
// Create a DocTree with the specified doctree name
let mut doctree = DocTree::builder()
.with_storage(storage)
.with_doctree_name(doctree_name)
.build()?;
println!("Deleting collection '{}' from Redis in doctree '{}'...", collection, doctree_name);
doctree.delete_collection(collection)?;
println!("Collection '{}' deleted successfully", collection);
} else if let Some(matches) = matches.subcommand_matches("export_to_ipfs") {
let output_path_str = matches.value_of("output").unwrap();
let output_path = Path::new(output_path_str);
let doctree_name = matches.value_of("doctree").unwrap_or("default");
let collection_name_opt = matches.value_of("collection");
if debug_mode {
println!("DEBUG: Handling export_to_ipfs command.");
}
// Create a storage with the specified doctree name
let storage = RedisStorage::new("redis://localhost:6379")?;
storage.set_doctree_name(doctree_name);
storage.set_debug(debug_mode);
if debug_mode {
println!("DEBUG: Connected to Redis storage");
}
// Create a DocTree with the specified doctree name
let mut doctree = DocTree::builder()
.with_storage(storage)
.with_doctree_name(doctree_name)
.build()?;
// Load collections from Redis
doctree.load_collections_from_redis()?;
match collection_name_opt {
Some(collection_name) => {
// Export a specific collection
if debug_mode {
println!("DEBUG: Exporting specific collection '{}'", collection_name);
}
doctree.export_collection_to_ipfs(collection_name, output_path)?;
println!("Successfully exported collection '{}' to IPFS and generated metadata CSV at {:?}.", collection_name, output_path.join(format!("{}.csv", collection_name)));
}
None => {
// Export all collections
if debug_mode {
println!("DEBUG: Exporting all collections.");
}
let collections = doctree.list_collections();
if collections.is_empty() {
println!("No collections found to export.");
} else {
println!("Exporting the following collections:");
for collection_name in collections {
println!("- {}", collection_name);
if let Err(e) = doctree.export_collection_to_ipfs(&collection_name, output_path) {
eprintln!("Error exporting collection '{}': {}", collection_name, e);
} else {
println!("Successfully exported collection '{}' to IPFS and generated metadata CSV at {:?}.", collection_name, output_path.join(format!("{}.csv", collection_name)));
}
}
}
}
}
} else if let Some(matches) = matches.subcommand_matches("reset") {
let doctree_name = matches.value_of("doctree").unwrap_or("default");
if debug_mode {
println!("DEBUG: Resetting all collections in doctree '{}'", doctree_name);
}
// Create a storage with the specified doctree name
let storage = RedisStorage::new("redis://localhost:6379")?;
storage.set_doctree_name(doctree_name);
storage.set_debug(debug_mode);
if debug_mode {
println!("DEBUG: Connected to Redis storage");
}
// Create a DocTree with the specified doctree name
let mut doctree = DocTree::builder()
.with_storage(storage)
.with_doctree_name(doctree_name)
.build()?;
println!("Deleting all collections from Redis in doctree '{}'...", doctree_name);
doctree.delete_all_collections()?;
println!("All collections deleted successfully");
} else {
println!("No command specified. Use --help for usage information.");
}
Ok(())
}

example_commands.sh Executable file

@@ -0,0 +1,31 @@
#!/bin/bash
# Change to the directory where the script is located
cd "$(dirname "$0")"
# Exit immediately if a command exits with a non-zero status
set -e
cd doctreecmd
echo "=== Scanning Collections ==="
cargo run -- scan ../examples
echo -e "\n=== Listing Collections ==="
cargo run -- list
echo -e "\n=== Getting Document (Markdown) ==="
cargo run -- get -c grid1 -p introduction.md
echo -e "\n=== Getting Document (HTML) ==="
cargo run -- get -c grid1 -p introduction.md -f html
echo -e "\n=== Deleting Collection ==="
cargo run -- delete grid1
echo -e "\n=== Listing Remaining Collections ==="
cargo run -- list
echo -e "\n=== Resetting All Collections ==="
cargo run -- reset
echo -e "\n=== Verifying Reset ==="
cargo run -- list


@@ -0,0 +1,24 @@
[
{
name: docs_hero
# an existing Docusaurus site can be used as a collection as long as there are no duplicates
url: https://git.threefold.info/tfgrid/docs_tfgrid4/src/branch/main/aibox/docs
description: Documentation for the ThreeFold Hero project.
}
{
name: biz
url: https://git.threefold.info/tfgrid/docs_tfgrid4/src/branch/main/aibox/collections/aaa
description: Business documentation.
}
{
name: products
url: https://git.threefold.info/tfgrid/docs_tfgrid4/src/branch/main/aibox/collections/vvv
description: Information about ThreeFold products.
}
{
scan: true
url: https://git.threefold.info/tfgrid/docs_tfgrid4/src/branch/main/aibox/collections
}
]


@@ -0,0 +1,33 @@
# Footer configuration for the site
title: "Explore More"
sections: [
{
title: "Pages"
links: [
{ label: "Home", href: "/" }
{ label: "About Us", href: "/about" }
{ label: "Contact", href: "/contact" }
{ label: "Blog", href: "/blog" }
]
}
{
title: "Resources"
links: [
{ label: "Docs", href: "/docs" }
{ label: "API", href: "/api" }
]
}
{
title: "Social"
links: [
{ label: "GitHub", href: "https://github.com/yourproject" }
{ label: "Twitter", href: "https://twitter.com/yourhandle" }
]
}
]
copyright: "© 2025 YourSite. All rights reserved."


@@ -0,0 +1,32 @@
# Site Branding
logo:
src: /img/logo.svg
alt: Site Logo
# Site Title
title: ThreeFold Hero
# Navigation Menu
menu: [
{
label: Home
link: /
}
{
label: Docs
link: /docs/
}
{
label: About
link: /about/
}
]
# Login Button
login:
visible: true
label: Login
link: /login/


@@ -0,0 +1,14 @@
# Site Main Info
title: ThreeFold Hero Docs
tagline: Your Personal Hero
favicon: img/favicon.png
url: https://threefold.info
# SEO / Social Metadata
metadata:
title: ThreeFold Hero Docs
description: ThreeFold Hero - Your Personal Hero
image: https://threefold.info/herodocs/img/tf_graph.png
# Copyright Notice
copyright: ThreeFold


@@ -0,0 +1,33 @@
[
{
name: home
title: Home Page
description: This is the main landing page.
navpath: /
collection: acollection
}
{
name: about
title: About Us
navpath: /about
collection: acollection
}
{
name: docs
title: Documentation
navpath: /sub/docs
collection: docs_hero
}
{
name: draft-page
title: draft Page
description: This page is not shown in navigation.
draft: true
navpath: /cantsee
collection: acollection
}
]


@@ -0,0 +1,8 @@
{
"label": "AIBox Benefits",
"position": 4,
"link": {
"type": "generated-index",
"description": "The benefits of AIBox"
}
}


@@ -0,0 +1,28 @@
---
title: Revenue Generation
sidebar_position: 2
---
### Renting Options
AIBox creates opportunities for revenue generation through resource sharing. The following numbers are indicative, as each AIBox owner can set their own pricing.
| Plan | Rate | Monthly Potential | Usage Scenario |
|------|------|------------------|----------------|
| Micro | $0.40/hr | $200-300 | Inference workloads |
| Standard | $0.80/hr | $400-600 | Development |
| Full GPU | $1.60/hr | $800-1,200 | Training |
### Proof of Capacity Revenues
The AIBox implements a tiered proof of capacity reward system, distributing monthly INCA tokens based on hardware configuration:
| Configuration | Monthly Rewards |
|---------------|----------------|
| Base AIBox | 500-2000 INCA |
| 1 GPU AIBox | 1000 INCA |
| 2 GPU AIBox | 2000 INCA |
### Proof of Utilization Revenues
The AIBox implements a revenue-sharing model wherein device owners receive 80% of INCA tokens utilized for deployments, providing transparent proof of utilization economics.


@@ -0,0 +1,32 @@
---
title: Use Cases
sidebar_position: 3
---
### Personal AI Development
The AIBox provides an ideal environment for individual developers working on AI projects:
- Model training and fine-tuning
- Experimental AI architectures
- Unrestricted testing and development
- Complete control over computing resources
The system allows developers to run extended training sessions without watching cloud billing meters or dealing with usage restrictions.
### Shared Resources
For teams and organizations, AIBox offers efficient resource sharing capabilities:
- Multi-user environment
- Resource pooling
- Cost sharing
- Distributed computing
This makes it particularly valuable for small teams and startups looking to maintain control over their AI infrastructure while managing costs.
### Commercial Applications
The system supports various commercial deployments:
- AI-as-a-Service
- Model hosting
- Inference endpoints
- Dataset processing


@@ -0,0 +1,8 @@
{
"label": "Getting Started",
"position": 5,
"link": {
"type": "generated-index",
"description": "Getting started with the AIBox"
}
}


@@ -0,0 +1,10 @@
---
title: Pre-Order Process
sidebar_position: 2
---
### How to Order
The steps to acquire an AIBox are simple:
1. [Select your configuration](./purchase_options.md)
2. [Submit pre-order form](https://www2.aibox.threefold.io/signup/)


@@ -0,0 +1,84 @@
---
title: Purchase Options
sidebar_position: 1
---
### Base AIBox Plan ($1,000-1,500)
For experienced builders and hardware enthusiasts who want to customize their AI infrastructure. This plan provides the essential framework while allowing you to select and integrate your own GPU.
Base Configuration:
- GPU: Your choice, with minimum requirement of AMD Radeon RX 7900 XT
* Flexibility to use existing GPU or select preferred model
* Support for multiple GPU vendors with minimum performance requirements
* Full integration support for chosen hardware
- Memory: 64-128 GB DDR5
* Expandable configuration
* High-speed memory modules
* ECC support optional
- Storage: 2-4 TB of NVMe SSD
* PCIe 4.0 support
* Configurable RAID options
* Expansion capabilities
- Integrated Mycelium Network
* Full network stack
* P2P capabilities
* Decentralized computing support
Rewards Structure:
- Proof of Capacity: 500-2000 INCA per month (depending on chosen GPU)
- Proof of Utilization: 80% of INCA Revenue
- Flexible earning potential based on hardware configuration
### 1 GPU AIBox Plan ($2,000-2,500)
Perfect for individual developers and researchers who need professional-grade AI computing power. This configuration provides enough processing power for smaller but capable models and AI agents.
Standard Configuration:
- 1x AMD Radeon RX 7900 XTX
* 24GB VRAM
* 61.6 TFLOPS FP32 Performance
* 960 GB/s Memory Bandwidth
- 64-128 GB DDR5 Memory
* Optimal for AI workloads
* High-speed data processing
* Multi-tasking capability
- 2-4 TB of NVMe SSD
* Ultra-fast storage access
* Ample space for datasets
* Quick model loading
- Integrated Mycelium
* Full network integration
* Ready for distributed computing
* P2P capabilities enabled
Rewards Structure:
- Proof of Capacity: 1000 INCA per month
- Proof of Utilization: 80% of INCA Revenue
- Consistent earning potential
### 2 GPU AIBox Plan ($4,000-5,000)
Our most powerful configuration, designed for serious AI researchers and organizations. This setup supports large 48GB models, providing substantial computing power for advanced AI applications.
Advanced Configuration:
- 2x AMD Radeon RX 7900 XTX
* Combined 48GB VRAM
* 123.2 TFLOPS total FP32 Performance
* 1920 GB/s Total Memory Bandwidth
- 64-128 GB DDR5 Memory
* Maximum performance configuration
* Support for multiple large models
* Extensive multi-tasking capability
- 2-4 TB of NVMe SSD
* Enterprise-grade storage
* RAID configuration options
* Expandable capacity
- Integrated Mycelium
* Enhanced network capabilities
* Full distributed computing support
* Advanced P2P features
Rewards Structure:
- Proof of Capacity: 2000 INCA per month
- Proof of Utilization: 80% of INCA Revenue
- Maximum earning potential
Each plan includes comprehensive support, setup assistance, and access to the full AIBox ecosystem. Configurations can be further customized within each plan's framework to meet specific requirements.


@@ -0,0 +1,8 @@
---
title: Support
sidebar_position: 3
---
Our support team is composed of technically proficient members who understand AI development needs.
Feel free to reach out to ThreeFold Support [here](https://threefoldfaq.crisp.help/en/) for more information.


@@ -0,0 +1,8 @@
# Include Example
This file demonstrates the include functionality of doctree.
## Including content from Introduction.md
!!include grid1:introduction.md


@@ -0,0 +1,24 @@
---
title: Introducing AIBox
sidebar_position: 1
slug: /
---
## AIBox: Powering Community-Driven AI
The AIBox is built for those who want to explore AI on their own terms. With 2 RX 7900 XTX GPUs and 48GB of combined VRAM, it enables running demanding AI models efficiently.
## Open AI Development
AIBox offers full control—no cloud restrictions, no unexpected costs. Train models, fine-tune AI systems, and experiment freely with PyTorch, TensorFlow, or low-level GPU programming.
## More Than Hardware: A Shared Network
AIBox isn't just a tool—it's part of a decentralized AI network. When idle, its GPU power can be shared via Mycelium, benefiting the wider community while generating value. Designed for efficiency, with water cooling and power monitoring, it's a practical, community-powered step toward open AI development.
## Expanding the ThreeFold Grid
Each AIBox integrates into the ThreeFold Grid, a decentralized Internet infrastructure active in over 50 countries. By connecting your AIBox, you contribute to this global network, enhancing its capacity and reach. This integration not only supports your AI endeavors but also strengthens a community-driven Internet ecosystem.
For more info about ThreeFold, see https://www.threefold.io


@@ -0,0 +1,8 @@
{
"label": "AIBox Overview",
"position": 2,
"link": {
"type": "generated-index",
"description": "Overview of the AIBox"
}
}


@@ -0,0 +1,12 @@
---
title: Vision & Mission
sidebar_position: 2
---
## AI Landscape
The AI landscape today is dominated by centralized cloud providers, creating barriers for innovation and increasing costs for developers. Our vision is different: we're building tools for a decentralized AI future where computing power isn't monopolized by large cloud providers.
## High-End AI Hardware
Our technical goal is straightforward: provide enterprise-grade AI hardware that's both powerful and profitable through resource sharing. We believe that AI development should be accessible to anyone with the technical skills to push boundaries.


@@ -0,0 +1,27 @@
---
title: Who Is AIBox For?
sidebar_position: 4
---
The AIBox is for hackers and AI explorers who want a simple, accessible gateway into AI experimentation, while also offering advanced features for those ready to push the boundaries of what's possible.
### Developers & Hackers
Technical capabilities:
- Direct GPU programming through ROCm
- Custom containerization support
- Full Linux kernel access
- P2P networking capabilities
### AI Researchers
Research-focused features:
- Support for popular ML frameworks (PyTorch, TensorFlow)
- Large model training capability (up to 48GB VRAM)
- Distributed training support
- Dataset management tools
### Tech Enthusiasts
Advanced features:
- Water cooling management interface
- Power consumption monitoring
- Performance benchmarking tools
- Resource allocation controls


@@ -0,0 +1,18 @@
---
title: Why Decentralized AI Matters
sidebar_position: 3
---
The AIBox gives you complete control over your data privacy with full hardware access while enabling unlimited experimentation without the restrictions of cloud platforms.
### Data Privacy & Control
- Full root access to hardware
- No data leaving your premises without explicit permission
- Custom firewall rules and network configurations
- Ability to air-gap when needed
### Unlimited Experimentation
- Direct GPU access without virtualization overhead
- Custom model training without cloud restrictions
- Unrestricted model sizes and training durations
- Freedom to modify system parameters


@@ -0,0 +1,8 @@
{
"label": "Technical Specs",
"position": 3,
"link": {
"type": "generated-index",
"description": "Technical aspects of the AIBox"
}
}


@@ -0,0 +1,35 @@
---
title: Features & Capabilities
sidebar_position: 3
---
## Overview
AIBox combines enterprise-grade hardware capabilities with flexible resource management, creating a powerful platform for AI development and deployment. Each feature is designed to meet the demanding needs of developers and researchers who require both raw computing power and precise control over their resources.
## VM Management (CloudSlices)
CloudSlices transforms your AIBox into a multi-tenant powerhouse, enabling you to run multiple isolated environments simultaneously. Unlike traditional virtualization, CloudSlices is optimized for AI workloads, ensuring minimal overhead and maximum GPU utilization.
Each slice operates as a fully isolated virtual machine with guaranteed resources. The AIBox can be sliced into up to 8 virtual machines.
The slicing system ensures resources are allocated efficiently while maintaining performance isolation between workloads. This means your critical training job won't be affected by other tasks running on the system.
## GPU Resource Management
Our GPU management system provides granular control while maintaining peak performance. Whether you're running a single large model or multiple smaller workloads, the system optimizes resource allocation automatically.
## Network Connectivity
The networking stack is built for both performance and security: it integrates seamlessly with the Mycelium network, provides end-to-end encryption, and offers web gateways that allow external connections to VM containers. The AIBox thus creates a robust foundation for distributed AI computing.
## Security Features
Security is implemented at every layer of the system without compromising performance:
System Security:
- Hardware-level isolation
- Secure boot chain
- Network segmentation
Each feature has been carefully selected and implemented to provide both practical utility and enterprise-grade security, ensuring your AI workloads and data remain protected while maintaining full accessibility for authorized users.


@@ -0,0 +1,37 @@
---
title: Hardware Specifications
sidebar_position: 1
---
### GPU Options
At the heart of AIBox lies its GPU configuration, carefully selected for AI workloads. The AMD Radeon RX 7900 XTX provides an exceptional balance of performance, memory, and cost efficiency:
| Model | VRAM | FP32 Performance | Memory Bandwidth |
|-------|------|------------------|------------------|
| RX 7900 XTX | 24GB | 61.6 TFLOPS | 960 GB/s |
| Dual Config | 48GB | 123.2 TFLOPS | 1920 GB/s |
The dual GPU configuration enables handling larger models and datasets that wouldn't fit in single-GPU memory, making it ideal for advanced AI research and development.
### Memory & Storage
AI workloads demand high-speed memory and storage. The AIBox configuration ensures your GPU computing power isn't bottlenecked by I/O limitations:
Memory Configuration:
- RAM: 64GB/128GB DDR5-4800
- Storage: 2x 2TB NVMe SSDs (PCIe 4.0)
This setup provides ample memory for large dataset preprocessing and fast storage access for model training and inference.
### Cooling System
Thermal management is crucial for sustained AI workloads. Our cooling solution focuses on maintaining consistent performance during extended operations.
This cooling system allows for sustained maximum performance without thermal throttling, even during extended training sessions.
### Power Supply
Reliable power delivery is essential for system stability and performance.
The AIBox power configuration ensures clean, stable power delivery under all operating conditions, with headroom for additional components or intense workloads.


@@ -0,0 +1,29 @@
---
title: Software Stack
sidebar_position: 2
---
### ThreeFold Zero-OS
Zero-OS forms the foundation of AIBox's software architecture. Unlike traditional operating systems, it's a minimalist, security-focused platform optimized specifically for AI workloads and distributed computing.
Key features:
- Bare metal operating system with minimal overhead
- Zero overhead virtualization
- Secure boot process
- Automated resource management
This specialized operating system ensures maximum performance and security while eliminating unnecessary services and potential vulnerabilities.
### Mycelium Network Integration
The Mycelium Network integration transforms your AIBox from a standalone system into a node in a powerful distributed computing network built on peer-to-peer, end-to-end encrypted communication that always chooses the shortest path.
### Pre-installed AI Frameworks
Your AIBox comes ready for development with a comprehensive AI software stack:
- ROCm 5.7+ ML stack
- PyTorch 2.1+ with GPU optimization
- TensorFlow 2.14+
- Pre-built container images


@@ -0,0 +1 @@
name = "supercollection"


@@ -0,0 +1,38 @@
---
title: Features Mycelium Network
sidebar_position: 1
---
Mycelium is a locality-aware, end-to-end encrypted network designed for efficient and secure communication between nodes. Below are its key features:
## What Makes Mycelium Unique
1. **Locality Awareness**
Mycelium identifies the shortest path between nodes, optimizing communication based on location.
2. **End-to-End Encryption**
All traffic between nodes is encrypted, ensuring secure data transmission.
3. **Traffic Routing Over Friend Nodes**
Traffic can be routed through nodes of trusted friends, maintaining location awareness.
4. **Automatic Rerouting**
If a physical link fails, Mycelium automatically reroutes traffic to ensure uninterrupted connectivity.
5. **Your Network Address Linked to Private Key**
Each node is assigned an IPv6 network address that is cryptographically linked to its private key.
6. **Scalability**
Mycelium is designed to scale to a planetary level. The team has evaluated multiple overlay networks in the past and is focused on overcoming scalability challenges.
## Tech
1. **Flexible Deployment**
Mycelium can be run without a TUN interface, allowing it to function solely as a reliable message bus.
2. **Reliable Message Bus**
Mycelium includes a simple and reliable message bus built on top of its network layer.
3. **Multiple Communication Protocols**
Mycelium supports various communication methods, including QUIC and TCP. The team is also developing hole-punching for QUIC, enabling direct peer-to-peer (P2P) traffic without intermediaries.


@@ -0,0 +1,23 @@
---
title: Download the App
sidebar_position: 4
---
The Mycelium app is available for Android, Windows, macOS and iOS.
For Linux, read the [Linux Installation](../experts/03_linux-installation.md) section.
## Download Links
You can download the Mycelium app with the following links:
- [iOS and macOS](https://apps.apple.com/app/id6504277565)
- Download the app from the App Store
- [Android](https://play.google.com/store/apps/details?id=tech.threefold.mycelium)
- Download the app from the Google Play Store
- [Windows](https://github.com/threefoldtech/myceliumflut/releases)
- Go to the official Mycelium release page and download the latest `.exe`
## Upcoming Updates
- The user interface (UI) will be drastically improved in upcoming releases to better represent the available features.


@@ -0,0 +1,48 @@
---
title: Use the App
sidebar_position: 5
---
## Start Mycelium
To start Mycelium, simply open the app and click on `Start`.
![](./img/mycelium_1.png)
> Note for Windows Users: The Mycelium app must be run as an administrator to function properly. Right-click on the application icon and select "Run as administrator" to ensure proper network connectivity.
## Stop or Restart Mycelium
To stop or restart Mycelium, click on the appropriate button.
![](./img/mycelium_2.png)
## Add Peers
You can add different Mycelium peers in the `Peers` window.
Simply add peers and then either start or restart the app.
![](./img/mycelium_3.png)
You can consult the [Mycelium hosted public nodes](../experts/04_additional-information.md) to find more peers.
For example, if you want to add the node with the IPv4 address `5.78.122.16` with the tcp port `9651`, simply add the following line then start or restart the app.
```
tcp://5.78.122.16:9651
```
## Mycelium Address
When you use the Mycelium app, you are assigned a unique Mycelium address.
To copy the Mycelium address, click on the button on the right of the address.
![](./img/mycelium_4.png)
## Deploy on the Grid with Mycelium
Once you've installed Mycelium, you can deploy on the ThreeFold Grid and connect to your workload using Mycelium.
As a starter, you can explore the ThreeFold Grid and deploy apps on the [ThreeFold Dashboard](https://manual.grid.tf/documentation/dashboard/dashboard.html) using Mycelium to connect.


@@ -0,0 +1,8 @@
{
"label": "Get Started",
"position": 4,
"link": {
"type": "generated-index",
"description": "Get started With Mycelium Network."
}
}

Four binary image files added (35 KiB, 44 KiB, 14 KiB, and 8.9 KiB); contents not shown.

impl_plan.md Normal file

@@ -0,0 +1,173 @@
# DocTree WebBuilder Implementation Plan
## Overview
This document outlines the implementation plan for the WebBuilder component of the DocTree project. The WebBuilder is designed to process hjson configuration files (like those in `examples/doctreenew/sites/demo1/`) and generate a `webmeta.json` file that can be used by a browser-based website generator.
## Current Status
### What's Implemented:
1. **DocTree Core Functionality**:
- The main DocTree library with functionality for scanning directories, managing collections, processing includes, and converting markdown to HTML
- Redis storage backend for storing document metadata
- Command-line interface (doctreecmd) for interacting with collections
2. **Example Structure for the New Approach**:
- Example hjson configuration files in `examples/doctreenew/sites/demo1/`
- This includes `main.hjson`, `header.hjson`, `footer.hjson`, `collection.hjson`, and `pages/mypages1.hjson`
3. **Specification Document**:
- Detailed specification in `webbuilder/src/builder/specs.md`
- Example output format in `webbuilder/src/builder/webmeta.json`
### What's Not Yet Implemented:
1. **WebBuilder Implementation**:
- The actual Rust code for the webbuilder component
2. **Hjson Parsing**:
- Code to parse the hjson files in the doctreenew directory
3. **Git Repository Integration**:
- Functionality to download referenced collections from Git repositories
4. **IPFS Export**:
- Complete functionality to export assets to IPFS
5. **Browser-Based Generator**:
- The browser-based website generator that would use the webmeta.json file
## Implementation Plan
### Phase 1: Core WebBuilder Implementation (2-3 weeks)
1. **Setup Project Structure**:
- Create necessary modules and files in `webbuilder/src/`
- Define main data structures and traits
2. **Implement Hjson Parsing**:
- Add hjson crate dependency
- Create parsers for each hjson file type (main, header, footer, collection, pages)
- Implement validation for hjson files
3. **Implement Site Structure Builder**:
- Create a module to combine parsed hjson data into a cohesive site structure
- Implement navigation generation based on page definitions
4. **Implement WebMeta Generator**:
- Create functionality to generate the webmeta.json file
- Ensure all required metadata is included
### Phase 2: Git Integration and Collection Processing (2 weeks)
1. **Implement Git Repository Integration**:
- Add git2 crate dependency
- Create functionality to clone/pull repositories based on collection.hjson
- Implement caching to avoid unnecessary downloads
2. **Integrate with DocTree Library**:
- Create an adapter to use DocTree functionality with hjson-defined collections
- Implement processing of includes between documents
3. **Implement Content Processing**:
- Create functionality to process markdown content
- Handle special directives or custom syntax
### Phase 3: IPFS Integration (2 weeks)
1. **Enhance IPFS Integration**:
- Complete the IPFS export functionality in DocTree
- Create a module to handle IPFS uploads
2. **Implement Asset Management**:
- Create functionality to track and process assets (images, CSS, etc.)
- Ensure proper IPFS linking
3. **Implement Content Hashing**:
- Add Blake hash calculation for content integrity verification
- Store hashes in webmeta.json
### Phase 4: CLI and Testing (1-2 weeks)
1. **Implement Command-Line Interface**:
- Create a CLI for the webbuilder
- Add commands for building, validating, and deploying sites
2. **Write Comprehensive Tests**:
- Unit tests for each component
- Integration tests for the full workflow
- Test with example sites
3. **Documentation**:
- Update README with usage instructions
- Create detailed API documentation
- Add examples and tutorials
### Phase 5: Browser-Based Generator (Optional, 3-4 weeks)
1. **Design Browser Component**:
- Create a JavaScript/TypeScript library to consume webmeta.json
- Design component architecture
2. **Implement Content Rendering**:
- Create components to render markdown content
- Implement navigation and site structure
3. **Implement IPFS Integration**:
- Add functionality to fetch content from IPFS
- Implement content verification using Blake hashes
4. **Create Demo Site**:
- Build a demo site using the browser-based generator
- Showcase features and capabilities
## Technical Details
### Key Dependencies
- **hjson**: For parsing hjson configuration files
- **git2**: For Git repository integration
- **ipfs-api**: For IPFS integration
- **blake3**: For content hashing
- **clap**: For command-line interface
- **tokio**: For async operations
### Data Flow
1. Parse hjson files from input directory
2. Download referenced Git repositories
3. Process content with DocTree
4. Export assets to IPFS
5. Generate webmeta.json
6. (Optional) Upload webmeta.json to IPFS
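For a concrete picture of this flow, the sketch below strings the six steps together as one pipeline. Every type and function name here (`SiteConfig`, `parse_site_config`, `fetch_collections`, `process_with_doctree`, `export_assets_to_ipfs`, `write_webmeta`) is a placeholder invented for illustration; the real WebBuilder API does not exist yet, so treat this as a shape for the implementation rather than a finished design.

```rust
use std::path::{Path, PathBuf};

/// Parsed site definition (fields are placeholders, not the final data model).
struct SiteConfig {
    /// Git URLs taken from collection.hjson.
    collection_urls: Vec<String>,
}

/// Page name -> IPFS hash pairs produced by the export step.
type IpfsIndex = Vec<(String, String)>;

fn parse_site_config(_dir: &Path) -> SiteConfig {
    // Step 1: parse the main/header/footer/collection/pages hjson files.
    SiteConfig { collection_urls: Vec::new() }
}

fn fetch_collections(cfg: &SiteConfig) -> Vec<PathBuf> {
    // Step 2: clone or pull each referenced Git repository, returning local paths.
    cfg.collection_urls.iter().map(|_| PathBuf::new()).collect()
}

fn process_with_doctree(_local: &[PathBuf]) -> Vec<(String, String)> {
    // Step 3: scan the collections with DocTree and resolve !!include directives,
    // yielding (page name, processed markdown) pairs.
    Vec::new()
}

fn export_assets_to_ipfs(pages: &[(String, String)]) -> IpfsIndex {
    // Step 4: upload pages and assets to IPFS, collecting content hashes.
    pages.iter().map(|(name, _)| (name.clone(), "<ipfs-hash>".to_string())).collect()
}

fn write_webmeta(_out: &Path, _index: &IpfsIndex) {
    // Steps 5-6: serialize the site structure plus IPFS links into webmeta.json
    // (and optionally upload that file to IPFS as well).
}

fn main() {
    let cfg = parse_site_config(Path::new("examples/doctreenew/sites/demo1"));
    let local = fetch_collections(&cfg);
    let pages = process_with_doctree(&local);
    let index = export_assets_to_ipfs(&pages);
    write_webmeta(Path::new("webmeta.json"), &index);
}
```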
### Key Challenges
1. **Git Integration**: Handling authentication, rate limits, and large repositories
2. **IPFS Performance**: Optimizing IPFS uploads for large sites
3. **Content Processing**: Ensuring proper handling of includes and special syntax
4. **Browser Compatibility**: Ensuring the browser-based generator works across different browsers
## Milestones and Timeline
1. **Core WebBuilder Implementation**: Weeks 1-3
2. **Git Integration and Collection Processing**: Weeks 4-5
3. **IPFS Integration**: Weeks 6-7
4. **CLI and Testing**: Weeks 8-9
5. **Browser-Based Generator (Optional)**: Weeks 10-13
## Resources Required
1. **Development Resources**:
- 1-2 Rust developers
- 1 Frontend developer (for browser-based generator)
2. **Infrastructure**:
- IPFS node for testing
- Git repositories for testing
- CI/CD pipeline
## Conclusion
This implementation plan provides a roadmap for developing the WebBuilder component of the DocTree project. By following this plan, we can transform the current specification and example files into a fully functional system for generating websites from hjson configuration files and markdown content.

include_example.sh Executable file

@@ -0,0 +1,19 @@
#!/bin/bash
# Change to the directory where the script is located
cd "$(dirname "$0")"
# Exit immediately if a command exits with a non-zero status
set -e
cd doctreecmd
# First, scan the collections with a specific doctree name
echo "=== Scanning Collections with doctree name 'include_demo' ==="
cargo run -- scan ../examples --doctree include_demo
# List the collections
echo -e "\n=== Listing Collections ==="
cargo run -- list --doctree include_demo
# Get the document with includes in markdown format
echo -e "\n=== Getting Document with Includes (Markdown) ==="
cargo run -- get -c grid1 -p include_example.md --doctree include_demo

runexample.sh Executable file

@@ -0,0 +1,39 @@
#!/bin/bash
# Change to the directory where the script is located
cd "$(dirname "$0")"
# Exit immediately if a command exits with a non-zero status
set -e
cd doctreecmd
# First, scan the collections
echo "=== Scanning Collections ==="
cargo run -- scan ../examples --doctree supercollection
# Get a document in markdown format
echo -e "\n=== Getting Document (Markdown) ==="
cargo run -- get -c supercollection -p 01_features.md --doctree supercollection
# Get a document in HTML format
echo -e "\n=== Getting Document (HTML) ==="
cargo run -- get -c supercollection -p 01_features.md -f html --doctree supercollection
# Get a document without specifying collection
echo -e "\n=== Getting Document (Default Collection) ==="
cargo run -- get -p 01_features.md --doctree supercollection
# Delete a specific collection
echo -e "\n=== Deleting Collection ==="
cargo run -- delete grid_documentation --doctree supercollection
# List remaining collections
echo -e "\n=== Listing Remaining Collections ==="
cargo run -- list --doctree supercollection
# # Reset all collections
# echo -e "\n=== Resetting All Collections ==="
# cargo run -- reset --doctree supercollection
# # Verify all collections are gone
# echo -e "\n=== Verifying Reset ==="
# cargo run -- list --doctree supercollection

58
webbuilder/Cargo.toml Normal file
View File

@@ -0,0 +1,58 @@
[package]
name = "webbuilder"
version = "0.1.0"
edition = "2021"
description = "A tool for building websites from hjson configuration files and markdown content"
authors = ["DocTree Team"]
[lib]
path = "src/lib.rs"
[[bin]]
name = "webbuilder"
path = "src/main.rs"
[dependencies]
# Core dependencies
doctree = { path = "../doctree" }
walkdir = "2.3.3"
pulldown-cmark = "0.9.3"
thiserror = "1.0.40"
lazy_static = "1.4.0"
toml = "0.7.3"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
redis = { version = "0.23.0", features = ["tokio-comp"] }
tokio = { version = "1.28.0", features = ["full"] }
sal = { git = "https://git.threefold.info/herocode/sal.git" }
# Hjson parsing
deser-hjson = "1.1.0"
# Git integration is provided by the SAL library
# IPFS integration
ipfs-api-backend-hyper = "0.6"
ipfs-api = { version = "0.17.0", default-features = false, features = ["with-hyper-tls"] }
# Hashing and encryption
chacha20poly1305 = "0.10.1"
blake3 = "1.3.1"
# CLI
clap = { version = "4.3.0", features = ["derive"] }
# Utilities
anyhow = "1.0.71"
log = "0.4.17"
env_logger = "0.10.0"
csv = "1.1"
rand = "0.9.1"
url = "2.3.1"
[dev-dependencies]
# Testing
tempfile = "3.5.0"
mockall = "0.11.4"
assert_fs = "1.0.10"
predicates = "3.0.3"

128
webbuilder/README.md Normal file
View File

@@ -0,0 +1,128 @@
# WebBuilder
WebBuilder is a library for building websites from configuration files and markdown content. It uses the DocTree library to process markdown content and includes, and exports the result to a webmeta.json file that can be used by a browser-based website generator.
## Overview
WebBuilder scans directories for configuration files (in hjson format) and generates a `webmeta.json` file that can be used by a browser-based website generator. It can also clone Git repositories, process markdown content, and upload files to IPFS.
## Parsing Configuration Files
WebBuilder supports multiple parsing strategies for configuration files:
### Unified Parser
The recommended way to parse configuration files is to use the unified parser, which provides a consistent interface for all parsing strategies:
```rust
use webbuilder::{from_directory_with_strategy, ParsingStrategy};
// Use the recommended strategy (Hjson)
let webbuilder = from_directory_with_strategy("path/to/config", ParsingStrategy::Hjson)?;
// Or use the auto-detect strategy
let webbuilder = from_directory_with_strategy("path/to/config", ParsingStrategy::Auto)?;
// Or use the simple strategy (legacy)
let webbuilder = from_directory_with_strategy("path/to/config", ParsingStrategy::Simple)?;
```
You can also use the convenience functions:
```rust
use webbuilder::{from_directory, parse_site_config_recommended, parse_site_config_auto};
// Use the recommended strategy (Hjson)
let webbuilder = from_directory("path/to/config")?;
// Or parse the site configuration directly
let site_config = parse_site_config_recommended("path/to/config")?;
let site_config = parse_site_config_auto("path/to/config")?;
```
### Parsing Strategies
WebBuilder supports the following parsing strategies:
- **Hjson**: Uses the `deser-hjson` library to parse hjson files. This is the recommended strategy.
- **Simple**: Uses a simple line-by-line parser that doesn't rely on external libraries. This is a legacy strategy.
- **Auto**: Tries the Hjson parser first, and falls back to the simple parser if it fails.
## Building a Website
Once you have a WebBuilder instance, you can build a website:
```rust
use webbuilder::from_directory;
// Create a WebBuilder instance
let webbuilder = from_directory("path/to/config")?;
// Build the website
let webmeta = webbuilder.build()?;
// Save the webmeta.json file
webmeta.save("webmeta.json")?;
// Upload the webmeta.json file to IPFS
let ipfs_hash = webbuilder.upload_to_ipfs("webmeta.json")?;
println!("Uploaded to IPFS: {}", ipfs_hash);
```
## Configuration Files
WebBuilder expects the following configuration files:
- `main.hjson`: Main configuration file with site metadata
- `header.hjson`: Header configuration
- `footer.hjson`: Footer configuration
- `collection.hjson`: Collection configuration (Git repositories)
- `pages/*.hjson`: Page configuration files
Example `main.hjson`:
```hjson
{
"name": "my-site",
"title": "My Site",
"description": "My awesome site",
"url": "https://example.com",
"favicon": "favicon.ico",
"keywords": [
"website",
"awesome"
]
}
```
Example `collection.hjson`:
```hjson
[
{
"name": "docs",
"url": "https://github.com/example/docs.git",
"description": "Documentation",
"scan": true
}
]
```
Example `pages/pages.hjson`:
```hjson
[
{
"name": "home",
"title": "Home",
"description": "Home page",
"navpath": "/",
"collection": "docs",
"draft": false
}
]
```
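
Once parsed, these files populate the `SiteConfig` struct and its nested types. A minimal sketch of inspecting the parsed values (the directory path is a placeholder):

```rust
use webbuilder::{from_directory, Result};

fn inspect_config() -> Result<()> {
    let webbuilder = from_directory("path/to/config")?;
    let config = &webbuilder.config;

    // Values from main.hjson
    println!("site: {} ({})", config.name, config.title);

    // Entries from collection.hjson
    for collection in &config.collections {
        println!("collection: {:?} -> {:?}", collection.name, collection.url);
    }

    // Entries from pages/*.hjson
    for page in &config.pages {
        println!("page: {} at {}", page.title, page.navpath);
    }

    Ok(())
}
```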
## License
This project is licensed under the MIT License - see the LICENSE file for details.

View File

@@ -0,0 +1,324 @@
use serde::{Deserialize, Serialize};
use std::fs;
use std::path::Path;
use crate::config::SiteConfig;
use crate::error::Result;
use crate::parser;
#[cfg(test)]
mod mod_test;
/// WebMeta represents the output of the WebBuilder
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct WebMeta {
/// Site metadata
pub site_metadata: SiteMetadata,
/// Pages
pub pages: Vec<PageMeta>,
/// Assets
pub assets: std::collections::HashMap<String, AssetMeta>,
}
/// Site metadata
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SiteMetadata {
/// Site name
pub name: String,
/// Site title
pub title: String,
/// Site description
pub description: Option<String>,
/// Site keywords
pub keywords: Option<Vec<String>>,
/// Site header
pub header: Option<serde_json::Value>,
/// Site footer
pub footer: Option<serde_json::Value>,
}
/// Page metadata
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct PageMeta {
/// Page ID
pub id: String,
/// Page title
pub title: String,
/// IPFS key of the page content
pub ipfs_key: String,
/// Blake hash of the page content
pub blakehash: String,
/// Page sections
pub sections: Vec<SectionMeta>,
/// Page assets
pub assets: Vec<AssetMeta>,
}
/// Section metadata
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SectionMeta {
/// Section type
#[serde(rename = "type")]
pub section_type: String,
/// Section content
pub content: String,
}
/// Asset metadata
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AssetMeta {
/// Asset name
pub name: String,
/// IPFS key of the asset
pub ipfs_key: String,
}
impl WebMeta {
/// Save the WebMeta to a file
///
/// # Arguments
///
/// * `path` - Path to save the file to
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn save<P: AsRef<Path>>(&self, path: P) -> Result<()> {
let json = serde_json::to_string_pretty(self)?;
fs::write(path, json)?;
Ok(())
}
}
/// WebBuilder is responsible for building a website from hjson configuration files
#[derive(Debug)]
pub struct WebBuilder {
/// Site configuration
pub config: SiteConfig,
}
impl WebBuilder {
/// Create a new WebBuilder instance from a directory containing hjson configuration files
///
/// # Arguments
///
/// * `path` - Path to the directory containing hjson configuration files
///
/// # Returns
///
/// A new WebBuilder instance or an error
pub fn from_directory<P: AsRef<Path>>(path: P) -> Result<Self> {
let config = parser::parse_site_config_recommended(path)?;
Ok(WebBuilder { config })
}
/// Build the website
///
/// # Returns
///
/// A WebMeta instance or an error
pub fn build(&self) -> Result<WebMeta> {
// Create site metadata
let site_metadata = SiteMetadata {
name: self.config.name.clone(),
title: self.config.title.clone(),
description: self.config.description.clone(),
keywords: self.config.keywords.clone(),
header: self
.config
.header
.as_ref()
.map(|h| serde_json::to_value(h).unwrap_or_default()),
footer: self
.config
.footer
.as_ref()
.map(|f| serde_json::to_value(f).unwrap_or_default()),
};
// Process collections
let mut pages = Vec::new();
let assets = std::collections::HashMap::new();
// Process collections from Git repositories
for collection in &self.config.collections {
if let Some(url) = &collection.url {
// Extract repository name from URL
let repo_name = collection.name.clone().unwrap_or_else(|| {
url.split('/')
.last()
.unwrap_or("repo")
.trim_end_matches(".git")
.to_string()
});
// Clone or pull the Git repository
let repo_path = self.config.base_path.join("repos").join(&repo_name);
// Create the repos directory if it doesn't exist
if !repo_path.parent().unwrap().exists() {
fs::create_dir_all(repo_path.parent().unwrap())?;
}
// Clone or pull the repository
let repo_path = match crate::git::clone_or_pull(url, &repo_path) {
Ok(path) => path,
Err(e) => {
// Log the error but continue with a placeholder
log::warn!("Failed to clone repository {}: {}", url, e);
// Create a placeholder page for the failed repository
let page_id = format!("{}-index", repo_name);
let page = PageMeta {
id: page_id.clone(),
title: format!("{} Index", repo_name),
ipfs_key: "QmPlaceholderIpfsKey".to_string(),
blakehash: "blake3-placeholder".to_string(),
sections: vec![SectionMeta {
section_type: "markdown".to_string(),
content: format!(
"# {} Index\n\nFailed to clone repository: {}\nURL: {}",
repo_name, e, url
),
}],
assets: Vec::new(),
};
pages.push(page);
continue;
}
};
// Create a page for the repository
let page_id = format!("{}-index", repo_name);
let page = PageMeta {
id: page_id.clone(),
title: format!("{} Index", repo_name),
ipfs_key: "QmPlaceholderIpfsKey".to_string(), // Will be replaced with actual IPFS key
blakehash: "blake3-placeholder".to_string(), // Will be replaced with actual Blake hash
sections: vec![SectionMeta {
section_type: "markdown".to_string(),
content: format!(
"# {} Index\n\nRepository cloned successfully.\nPath: {}\nURL: {}",
repo_name, repo_path.display(), url
),
}],
assets: Vec::new(),
};
pages.push(page);
}
}
// Process pages from the configuration
for page_config in &self.config.pages {
// Skip pages explicitly marked as drafts
if page_config.draft.unwrap_or(false) {
log::info!("Skipping draft page: {}", page_config.name);
continue;
}
// Generate a unique page ID
let page_id = format!("page-{}", page_config.name);
// Find the collection for this page
let collection_path = self.config.collections.iter()
.find(|c| c.name.as_ref().map_or(false, |name| name == &page_config.collection))
.and_then(|c| c.url.as_ref())
.map(|url| {
let repo_name = url.split('/')
.last()
.unwrap_or("repo")
.trim_end_matches(".git")
.to_string();
self.config.base_path.join("repos").join(&repo_name)
});
// Create the page content
let content = if let Some(collection_path) = collection_path {
// Try to find the page content in the collection
let page_path = collection_path.join(&page_config.name).with_extension("md");
if page_path.exists() {
match fs::read_to_string(&page_path) {
Ok(content) => content,
Err(e) => {
log::warn!("Failed to read page content from {}: {}", page_path.display(), e);
format!(
"# {}\n\n{}\n\n*Failed to read page content from {}*",
page_config.title,
page_config.description.clone().unwrap_or_default(),
page_path.display()
)
}
}
} else {
format!(
"# {}\n\n{}\n\n*Page content not found at {}*",
page_config.title,
page_config.description.clone().unwrap_or_default(),
page_path.display()
)
}
} else {
format!(
"# {}\n\n{}",
page_config.title,
page_config.description.clone().unwrap_or_default()
)
};
// Calculate the Blake hash of the content
let content_bytes = content.as_bytes();
let blakehash = format!("blake3-{}", blake3::hash(content_bytes).to_hex());
// Create the page metadata
let page = PageMeta {
id: page_id.clone(),
title: page_config.title.clone(),
ipfs_key: "QmPlaceholderIpfsKey".to_string(), // Will be replaced with actual IPFS key
blakehash,
sections: vec![SectionMeta {
section_type: "markdown".to_string(),
content,
}],
assets: Vec::new(),
};
pages.push(page);
}
// Create the WebMeta
Ok(WebMeta {
site_metadata,
pages,
assets,
})
}
/// Upload a file to IPFS
///
/// # Arguments
///
/// * `path` - Path to the file to upload
///
/// # Returns
///
/// The IPFS hash of the file or an error
pub fn upload_to_ipfs<P: AsRef<Path>>(&self, path: P) -> Result<String> {
crate::ipfs::upload_file(path)
}
}

View File

@@ -0,0 +1,200 @@
#[cfg(test)]
mod tests {
use crate::builder::{PageMeta, SectionMeta, SiteMetadata, WebMeta};
use crate::config::{CollectionConfig, PageConfig, SiteConfig};
use crate::error::WebBuilderError;
use crate::WebBuilder;
use std::fs;
use std::path::PathBuf;
use tempfile::TempDir;
fn create_test_config() -> SiteConfig {
SiteConfig {
name: "test".to_string(),
title: "Test Site".to_string(),
description: Some("A test site".to_string()),
keywords: Some(vec!["test".to_string(), "site".to_string()]),
url: Some("https://example.com".to_string()),
favicon: Some("favicon.ico".to_string()),
header: None,
footer: None,
collections: vec![CollectionConfig {
name: Some("test".to_string()),
url: Some("https://git.threefold.info/tfgrid/home.git".to_string()),
description: Some("A test collection".to_string()),
scan: Some(true),
}],
pages: vec![PageConfig {
name: "home".to_string(),
title: "Home".to_string(),
description: Some("Home page".to_string()),
navpath: "/".to_string(),
collection: "test".to_string(),
draft: Some(false),
}],
base_path: PathBuf::from("/path/to/site"),
}
}
#[test]
fn test_webmeta_save() {
let temp_dir = TempDir::new().unwrap();
let output_path = temp_dir.path().join("webmeta.json");
let webmeta = WebMeta {
site_metadata: SiteMetadata {
name: "test".to_string(),
title: "Test Site".to_string(),
description: Some("A test site".to_string()),
keywords: Some(vec!["test".to_string(), "site".to_string()]),
header: None,
footer: None,
},
pages: vec![PageMeta {
id: "page-1".to_string(),
title: "Page 1".to_string(),
ipfs_key: "QmTest1".to_string(),
blakehash: "blake3-test1".to_string(),
sections: vec![SectionMeta {
section_type: "markdown".to_string(),
content: "# Page 1\n\nThis is page 1.".to_string(),
}],
assets: vec![],
}],
assets: std::collections::HashMap::new(),
};
// Save the webmeta.json file
webmeta.save(&output_path).unwrap();
// Check that the file exists
assert!(output_path.exists());
// Read the file and parse it
let content = fs::read_to_string(&output_path).unwrap();
let parsed: WebMeta = serde_json::from_str(&content).unwrap();
// Check that the parsed webmeta matches the original
assert_eq!(parsed.site_metadata.name, webmeta.site_metadata.name);
assert_eq!(parsed.site_metadata.title, webmeta.site_metadata.title);
assert_eq!(
parsed.site_metadata.description,
webmeta.site_metadata.description
);
assert_eq!(
parsed.site_metadata.keywords,
webmeta.site_metadata.keywords
);
assert_eq!(parsed.pages.len(), webmeta.pages.len());
assert_eq!(parsed.pages[0].id, webmeta.pages[0].id);
assert_eq!(parsed.pages[0].title, webmeta.pages[0].title);
assert_eq!(parsed.pages[0].ipfs_key, webmeta.pages[0].ipfs_key);
assert_eq!(parsed.pages[0].blakehash, webmeta.pages[0].blakehash);
assert_eq!(
parsed.pages[0].sections.len(),
webmeta.pages[0].sections.len()
);
assert_eq!(
parsed.pages[0].sections[0].section_type,
webmeta.pages[0].sections[0].section_type
);
assert_eq!(
parsed.pages[0].sections[0].content,
webmeta.pages[0].sections[0].content
);
}
#[test]
fn test_webbuilder_build() {
// Create a temporary directory for the test
let temp_dir = TempDir::new().unwrap();
let site_dir = temp_dir.path().to_path_buf();
// Create a modified test config with the temporary directory as base_path
let mut config = create_test_config();
config.base_path = site_dir.clone();
// Create the repos directory
let repos_dir = site_dir.join("repos");
fs::create_dir_all(&repos_dir).unwrap();
// Create a mock repository directory
let repo_dir = repos_dir.join("home");
fs::create_dir_all(&repo_dir).unwrap();
// Create a mock page file in the repository
let page_content = "# Home Page\n\nThis is the home page content.";
fs::write(repo_dir.join("home.md"), page_content).unwrap();
// Create the WebBuilder with our config
let webbuilder = WebBuilder { config };
// Note: the git module is not mocked here; if the clone/pull fails, build()
// falls back to a placeholder page for the collection, while the "page-home"
// entry below is still generated from the configuration and the mock home.md file
// Build the website
let webmeta = webbuilder.build().unwrap();
// Check site metadata
assert_eq!(webmeta.site_metadata.name, "test");
assert_eq!(webmeta.site_metadata.title, "Test Site");
assert_eq!(
webmeta.site_metadata.description,
Some("A test site".to_string())
);
assert_eq!(
webmeta.site_metadata.keywords,
Some(vec!["test".to_string(), "site".to_string()])
);
// We expect at least one page from the configuration
assert!(webmeta.pages.len() >= 1);
// Find the page with ID "page-home"
let home_page = webmeta.pages.iter().find(|p| p.id == "page-home");
// Check that we found the page
assert!(home_page.is_some());
let home_page = home_page.unwrap();
// Check the page properties
assert_eq!(home_page.title, "Home");
assert_eq!(home_page.ipfs_key, "QmPlaceholderIpfsKey");
assert_eq!(home_page.sections.len(), 1);
assert_eq!(home_page.sections[0].section_type, "markdown");
// The content should either be our mock content or a placeholder
// depending on whether the page was found
assert!(
home_page.sections[0].content.contains("Home") ||
home_page.sections[0].content.contains("home.md")
);
}
#[test]
fn test_webbuilder_from_directory() {
let temp_dir = TempDir::new().unwrap();
let site_dir = temp_dir.path().join("site");
fs::create_dir(&site_dir).unwrap();
// Create main.hjson
let main_hjson = r#"{ "name": "test", "title": "Test Site" }"#;
fs::write(site_dir.join("main.hjson"), main_hjson).unwrap();
let webbuilder = WebBuilder::from_directory(&site_dir).unwrap();
assert_eq!(webbuilder.config.name, "test");
assert_eq!(webbuilder.config.title, "Test Site");
}
#[test]
fn test_webbuilder_from_directory_error() {
let result = WebBuilder::from_directory("/nonexistent/directory");
assert!(result.is_err());
assert!(matches!(
result.unwrap_err(),
WebBuilderError::MissingDirectory(_)
));
}
}

View File

@@ -0,0 +1,87 @@
# Web Builder Specification
This document describes the process of building web metadata and exporting assets for a website, resulting in a `webmeta.json` file that can be used by a browser-based website generator.
## Overview
The web building process starts with a directory containing the site's Hjson configuration files, such as the example directory `/Users/despiegk/code/git.threefold.info/herocode/doctree/examples/doctreenew/sites/demo1`. These Hjson files define the structure and content of the entire site and may reference external collections. The Hjson configuration sits "on top" of the collections it utilizes. Using the metadata defined in these Hjson files, the necessary collection data is downloaded from Git repositories (if referenced). The `doctree` is then used to process the relevant data, identify pages and images, and prepare them for export to IPFS. Finally, a `webmeta.json` file is generated containing all the necessary information, including IPFS keys and Blake hashes for content verification, allowing a browser-based tool to render the website by fetching assets from IPFS. Optionally, the generated `webmeta.json` file can also be uploaded to IPFS, and its IPFS URL returned.
## Process Steps
1. **Start from Hjson Directory:**
* The process begins with a designated directory containing the site's Hjson configuration files. This directory serves as the single input for the web building process.
2. **Parse Site Metadata (Hjson):**
* Locate and parse all `.hjson` files within the input directory and its subdirectories (e.g., `pages`). These files collectively define the site's structure, content, and configuration, and may include references to external collections.
3. **Download Referenced Collections from Git:**
* If the Hjson metadata references external collections hosted in Git repositories, download these collections using a separate tool or crate responsible for Git interactions. The Hjson files provide the necessary information (e.g., repository URLs, branch names) to perform these downloads.
4. **Process Site Content and Collections with Doctree:**
* Utilize the `doctree` library to process the parsed site metadata and the content of any downloaded collections.
* `doctree` will build the document tree based on the Hjson structure and identify relevant assets such as pages (e.g., Markdown files) and images referenced within the site configuration or collections.
5. **Export Assets to IPFS:**
* Export the identified assets (pages, images, etc.) to IPFS.
* For each exported asset, obtain its IPFS key (CID) and calculate its Blake hash for content integrity verification.
6. **Generate `webmeta.json`:**
* Create a single `webmeta.json` file that consolidates all the necessary information for the browser-based generator.
* This file should include:
* Site-level metadata (from Hjson).
* Structure of the website (pages, navigation, etc.).
* For each page, include:
* Page metadata (from Hjson).
* The IPFS key of the page content.
* The Blake hash of the page content.
* Information about other assets (images, etc.), including their IPFS keys.
7. **Optional: Upload `webmeta.json` to IPFS:**
* Optionally, upload the generated `webmeta.json` file to IPFS.
* If uploaded, the IPFS URL of the `webmeta.json` file is returned as the output of the web building process.
8. **Utilize `webmeta.json` in Browser:**
* The generated `webmeta.json` file (either locally or fetched from IPFS) serves as the single configuration entry point for a browser-based website generator.
* The browser tool reads `webmeta.json`, uses the IPFS keys to fetch the content and assets from the IPFS network, and renders the website dynamically. The Blake hashes can be used to verify the integrity of the downloaded content.
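
As an illustration of steps 5 and 6, the following sketch exports a single page to IPFS, computes its Blake3 hash, and records both in a page entry destined for `webmeta.json`. It assumes the `upload_file`, `calculate_blake_hash`, `PageMeta`, and `SectionMeta` items from the webbuilder crate in this change; the path and title are placeholders:

```rust
use webbuilder::builder::{PageMeta, SectionMeta};
use webbuilder::Result;

fn export_page(path: &std::path::Path, title: &str) -> Result<PageMeta> {
    // Step 5: export the asset to IPFS and hash it for integrity verification
    let ipfs_key = webbuilder::ipfs::upload_file(path)?;
    let blakehash = webbuilder::ipfs::calculate_blake_hash(path)?;

    // Step 6: record the keys in the page entry that goes into webmeta.json
    Ok(PageMeta {
        id: format!("page-{}", title.to_lowercase()),
        title: title.to_string(),
        ipfs_key,
        blakehash,
        sections: vec![SectionMeta {
            section_type: "markdown".to_string(),
            content: std::fs::read_to_string(path)?,
        }],
        assets: Vec::new(),
    })
}
```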
## `webmeta.json` Structure (Example)
```json
{
"site_metadata": {
// Consolidated data from site-level Hjson files (collection, header, footer, main, etc.)
"name": "demo1",
"title": "Demo Site 1",
"description": "This is a demo site for doctree",
"keywords": ["demo", "doctree", "example"],
"header": { ... },
"footer": { ... }
},
"pages": [
{
"id": "mypages1",
"title": "My Pages 1",
"ipfs_key": "Qm...", // IPFS key of the page content
"blakehash": "sha256-...", // Blake hash of the page content
"sections": [
{ "type": "text", "content": "..." } // Potentially include some inline content or structure
],
"assets": [
{
"name": "image1.png",
"ipfs_key": "Qm..." // IPFS key of an image used on the page
}
]
}
// Other pages...
],
"assets": {
// Global assets not tied to a specific page, e.g., CSS, global images
"style.css": {
"ipfs_key": "Qm..."
}
}
}
```
This structure is a suggestion and can be adapted based on the specific needs of the browser-based generator. The key is to include all necessary information (metadata, IPFS keys, hashes) to allow the browser to fetch and render the complete website.

214
webbuilder/src/config.rs Normal file
View File

@@ -0,0 +1,214 @@
use serde::{Deserialize, Serialize};
use std::path::{Path, PathBuf};
use crate::error::{Result, WebBuilderError};
/// Site configuration
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SiteConfig {
/// Site name
pub name: String,
/// Site title
pub title: String,
/// Site description
pub description: Option<String>,
/// Site keywords
pub keywords: Option<Vec<String>>,
/// Site URL
pub url: Option<String>,
/// Site favicon
pub favicon: Option<String>,
/// Site header
pub header: Option<HeaderConfig>,
/// Site footer
pub footer: Option<FooterConfig>,
/// Site collections
pub collections: Vec<CollectionConfig>,
/// Site pages
pub pages: Vec<PageConfig>,
/// Base path of the site configuration
#[serde(skip)]
pub base_path: PathBuf,
}
/// Header configuration
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct HeaderConfig {
/// Header logo
pub logo: Option<LogoConfig>,
/// Header title
pub title: Option<String>,
/// Header menu
pub menu: Option<Vec<MenuItemConfig>>,
/// Login button
pub login: Option<LoginConfig>,
}
/// Logo configuration
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct LogoConfig {
/// Logo source
pub src: String,
/// Logo alt text
pub alt: Option<String>,
}
/// Menu item configuration
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct MenuItemConfig {
/// Menu item label
pub label: String,
/// Menu item link
pub link: String,
/// Menu item children
pub children: Option<Vec<MenuItemConfig>>,
}
/// Login button configuration
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct LoginConfig {
/// Whether the login button is visible
pub visible: bool,
/// Login button label
pub label: Option<String>,
/// Login button link
pub link: Option<String>,
}
/// Footer configuration
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct FooterConfig {
/// Footer title
pub title: Option<String>,
/// Footer sections
pub sections: Option<Vec<FooterSectionConfig>>,
/// Footer copyright
pub copyright: Option<String>,
}
/// Footer section configuration
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct FooterSectionConfig {
/// Section title
pub title: String,
/// Section links
pub links: Vec<LinkConfig>,
}
/// Link configuration
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct LinkConfig {
/// Link label
pub label: String,
/// Link URL
pub href: String,
}
/// Collection configuration
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CollectionConfig {
/// Collection name
pub name: Option<String>,
/// Collection URL
pub url: Option<String>,
/// Collection description
pub description: Option<String>,
/// Whether to scan the URL for collections
pub scan: Option<bool>,
}
/// Page configuration
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct PageConfig {
/// Page name
pub name: String,
/// Page title
pub title: String,
/// Page description
pub description: Option<String>,
/// Page navigation path
pub navpath: String,
/// Page collection
pub collection: String,
/// Whether the page is a draft
pub draft: Option<bool>,
}
impl SiteConfig {
/// Load site configuration from a directory
///
/// # Arguments
///
/// * `path` - Path to the directory containing hjson configuration files
///
/// # Returns
///
/// A new SiteConfig instance or an error
pub fn from_directory<P: AsRef<Path>>(path: P) -> Result<Self> {
let path = path.as_ref();
// Check if the directory exists
if !path.exists() {
return Err(WebBuilderError::MissingDirectory(path.to_path_buf()));
}
// Check if the directory is a directory
if !path.is_dir() {
return Err(WebBuilderError::InvalidConfiguration(format!(
"{:?} is not a directory",
path
)));
}
// TODO: Implement loading configuration from hjson files
// For now, return a placeholder configuration
Ok(SiteConfig {
name: "demo1".to_string(),
title: "Demo Site 1".to_string(),
description: Some("This is a demo site for doctree".to_string()),
keywords: Some(vec![
"demo".to_string(),
"doctree".to_string(),
"example".to_string(),
]),
url: Some("https://example.com".to_string()),
favicon: Some("img/favicon.png".to_string()),
header: None,
footer: None,
collections: Vec::new(),
pages: Vec::new(),
base_path: path.to_path_buf(),
})
}
}

View File

@@ -0,0 +1,156 @@
#[cfg(test)]
mod tests {
use crate::config::{
CollectionConfig, FooterConfig, FooterSectionConfig, HeaderConfig, LinkConfig, LoginConfig,
LogoConfig, MenuItemConfig, PageConfig, SiteConfig,
};
use std::path::PathBuf;
#[test]
fn test_site_config_serialization() {
let config = SiteConfig {
name: "test".to_string(),
title: "Test Site".to_string(),
description: Some("A test site".to_string()),
keywords: Some(vec!["test".to_string(), "site".to_string()]),
url: Some("https://example.com".to_string()),
favicon: Some("favicon.ico".to_string()),
header: Some(HeaderConfig {
logo: Some(LogoConfig {
src: "logo.png".to_string(),
alt: Some("Logo".to_string()),
}),
title: Some("Test Site".to_string()),
menu: Some(vec![
MenuItemConfig {
label: "Home".to_string(),
link: "/".to_string(),
children: None,
},
MenuItemConfig {
label: "About".to_string(),
link: "/about".to_string(),
children: Some(vec![MenuItemConfig {
label: "Team".to_string(),
link: "/about/team".to_string(),
children: None,
}]),
},
]),
login: Some(LoginConfig {
visible: true,
label: Some("Login".to_string()),
link: Some("/login".to_string()),
}),
}),
footer: Some(FooterConfig {
title: Some("Test Site".to_string()),
sections: Some(vec![FooterSectionConfig {
title: "Links".to_string(),
links: vec![
LinkConfig {
label: "Home".to_string(),
href: "/".to_string(),
},
LinkConfig {
label: "About".to_string(),
href: "/about".to_string(),
},
],
}]),
copyright: Some("© 2023".to_string()),
}),
collections: vec![CollectionConfig {
name: Some("test".to_string()),
url: Some("https://git.threefold.info/tfgrid/home.git".to_string()),
description: Some("A test collection".to_string()),
scan: Some(true),
}],
pages: vec![PageConfig {
name: "home".to_string(),
title: "Home".to_string(),
description: Some("Home page".to_string()),
navpath: "/".to_string(),
collection: "test".to_string(),
draft: Some(false),
}],
base_path: PathBuf::from("/path/to/site"),
};
// Serialize to JSON
let json = serde_json::to_string(&config).unwrap();
// Deserialize from JSON
let deserialized: SiteConfig = serde_json::from_str(&json).unwrap();
// Check that the deserialized config matches the original
assert_eq!(deserialized.name, config.name);
assert_eq!(deserialized.title, config.title);
assert_eq!(deserialized.description, config.description);
assert_eq!(deserialized.keywords, config.keywords);
assert_eq!(deserialized.url, config.url);
assert_eq!(deserialized.favicon, config.favicon);
// Check header
assert!(deserialized.header.is_some());
let header = deserialized.header.as_ref().unwrap();
let original_header = config.header.as_ref().unwrap();
// Check logo
assert!(header.logo.is_some());
let logo = header.logo.as_ref().unwrap();
let original_logo = original_header.logo.as_ref().unwrap();
assert_eq!(logo.src, original_logo.src);
assert_eq!(logo.alt, original_logo.alt);
// Check title
assert_eq!(header.title, original_header.title);
// Check menu
assert!(header.menu.is_some());
let menu = header.menu.as_ref().unwrap();
let original_menu = original_header.menu.as_ref().unwrap();
assert_eq!(menu.len(), original_menu.len());
assert_eq!(menu[0].label, original_menu[0].label);
assert_eq!(menu[0].link, original_menu[0].link);
assert_eq!(menu[1].label, original_menu[1].label);
assert_eq!(menu[1].link, original_menu[1].link);
// Check login
assert!(header.login.is_some());
let login = header.login.as_ref().unwrap();
let original_login = original_header.login.as_ref().unwrap();
assert_eq!(login.visible, original_login.visible);
assert_eq!(login.label, original_login.label);
assert_eq!(login.link, original_login.link);
// Check footer
assert!(deserialized.footer.is_some());
let footer = deserialized.footer.as_ref().unwrap();
let original_footer = config.footer.as_ref().unwrap();
assert_eq!(footer.title, original_footer.title);
assert_eq!(footer.copyright, original_footer.copyright);
// Check collections
assert_eq!(deserialized.collections.len(), config.collections.len());
assert_eq!(deserialized.collections[0].name, config.collections[0].name);
assert_eq!(deserialized.collections[0].url, config.collections[0].url);
assert_eq!(
deserialized.collections[0].description,
config.collections[0].description
);
assert_eq!(deserialized.collections[0].scan, config.collections[0].scan);
// Check pages
assert_eq!(deserialized.pages.len(), config.pages.len());
assert_eq!(deserialized.pages[0].name, config.pages[0].name);
assert_eq!(deserialized.pages[0].title, config.pages[0].title);
assert_eq!(
deserialized.pages[0].description,
config.pages[0].description
);
assert_eq!(deserialized.pages[0].navpath, config.pages[0].navpath);
assert_eq!(deserialized.pages[0].collection, config.pages[0].collection);
assert_eq!(deserialized.pages[0].draft, config.pages[0].draft);
}
}

68
webbuilder/src/error.rs Normal file
View File

@@ -0,0 +1,68 @@
use std::io;
use std::path::PathBuf;
use thiserror::Error;
/// Result type for WebBuilder operations
pub type Result<T> = std::result::Result<T, WebBuilderError>;
/// Error type for WebBuilder operations
#[derive(Error, Debug)]
pub enum WebBuilderError {
/// IO error
#[error("IO error: {0}")]
IoError(#[from] io::Error),
/// DocTree error
#[error("DocTree error: {0}")]
DocTreeError(#[from] doctree::DocTreeError),
/// Hjson parsing error
#[error("Hjson parsing error: {0}")]
HjsonError(String),
/// Git error
#[error("Git error: {0}")]
GitError(String),
/// IPFS error
#[error("IPFS error: {0}")]
IpfsError(String),
/// Missing file error
#[error("Missing file: {0}")]
MissingFile(PathBuf),
/// Missing directory error
#[error("Missing directory: {0}")]
MissingDirectory(PathBuf),
/// Missing configuration error
#[error("Missing configuration: {0}")]
MissingConfiguration(String),
/// Invalid configuration error
#[error("Invalid configuration: {0}")]
InvalidConfiguration(String),
/// Other error
#[error("Error: {0}")]
Other(String),
}
impl From<String> for WebBuilderError {
fn from(error: String) -> Self {
WebBuilderError::Other(error)
}
}
impl From<&str> for WebBuilderError {
fn from(error: &str) -> Self {
WebBuilderError::Other(error.to_string())
}
}
impl From<serde_json::Error> for WebBuilderError {
fn from(error: serde_json::Error) -> Self {
WebBuilderError::Other(format!("JSON error: {}", error))
}
}

View File

@@ -0,0 +1,73 @@
#[cfg(test)]
mod tests {
use crate::error::WebBuilderError;
use std::path::PathBuf;
#[test]
fn test_error_from_string() {
let error = WebBuilderError::from("test error");
assert!(matches!(error, WebBuilderError::Other(s) if s == "test error"));
}
#[test]
fn test_error_from_string_owned() {
let error = WebBuilderError::from("test error".to_string());
assert!(matches!(error, WebBuilderError::Other(s) if s == "test error"));
}
#[test]
fn test_error_from_json_error() {
let json_error = serde_json::from_str::<serde_json::Value>("invalid json").unwrap_err();
let error = WebBuilderError::from(json_error);
assert!(matches!(error, WebBuilderError::Other(s) if s.starts_with("JSON error:")));
}
#[test]
fn test_error_display() {
let errors = vec![
(
WebBuilderError::IoError(std::io::Error::new(
std::io::ErrorKind::NotFound,
"file not found",
)),
"IO error: file not found",
),
(
WebBuilderError::HjsonError("invalid hjson".to_string()),
"Hjson parsing error: invalid hjson",
),
(
WebBuilderError::GitError("git error".to_string()),
"Git error: git error",
),
(
WebBuilderError::IpfsError("ipfs error".to_string()),
"IPFS error: ipfs error",
),
(
WebBuilderError::MissingFile(PathBuf::from("/path/to/file")),
"Missing file: /path/to/file",
),
(
WebBuilderError::MissingDirectory(PathBuf::from("/path/to/dir")),
"Missing directory: /path/to/dir",
),
(
WebBuilderError::MissingConfiguration("config".to_string()),
"Missing configuration: config",
),
(
WebBuilderError::InvalidConfiguration("invalid config".to_string()),
"Invalid configuration: invalid config",
),
(
WebBuilderError::Other("other error".to_string()),
"Error: other error",
),
];
for (error, expected) in errors {
assert_eq!(error.to_string(), expected);
}
}
}

182
webbuilder/src/git.rs Normal file
View File

@@ -0,0 +1,182 @@
use lazy_static::lazy_static;
use sal::git::{GitRepo, GitTree};
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};
use std::time::SystemTime;
use crate::error::{Result, WebBuilderError};
// Cache entry for Git repositories
struct CacheEntry {
path: PathBuf,
last_updated: SystemTime,
}
// Global cache for Git repositories
lazy_static! {
static ref REPO_CACHE: Arc<Mutex<HashMap<String, CacheEntry>>> =
Arc::new(Mutex::new(HashMap::new()));
}
// Cache timeout in seconds (default: 1 hour)
const CACHE_TIMEOUT: u64 = 3600;
/// Clone a Git repository
///
/// # Arguments
///
/// * `url` - URL of the repository to clone
/// * `destination` - Destination directory
///
/// # Returns
///
/// The path to the cloned repository or an error
pub fn clone_repository<P: AsRef<Path>>(url: &str, destination: P) -> Result<PathBuf> {
let destination = destination.as_ref();
let destination_str = destination.to_str().unwrap();
// Create a GitTree for the parent directory
let parent_dir = destination.parent().ok_or_else(|| {
WebBuilderError::InvalidConfiguration(format!(
"Invalid destination path: {}",
destination_str
))
})?;
let git_tree = GitTree::new(parent_dir.to_str().unwrap())
.map_err(|e| WebBuilderError::GitError(format!("Failed to create GitTree: {}", e)))?;
// Use the GitTree to get (clone) the repository
let repos = git_tree
.get(url)
.map_err(|e| WebBuilderError::GitError(format!("Failed to clone repository: {}", e)))?;
if repos.is_empty() {
return Err(WebBuilderError::GitError(
"Failed to clone repository: no repository was created".to_string(),
));
}
// Return the path of the first repository
Ok(PathBuf::from(repos[0].path()))
}
/// Pull the latest changes from a Git repository
///
/// # Arguments
///
/// * `path` - Path to the repository
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn pull_repository<P: AsRef<Path>>(path: P) -> Result<()> {
let path = path.as_ref();
let path_str = path.to_str().unwrap();
// Create a GitRepo directly
let repo = GitRepo::new(path_str.to_string());
// Pull the repository
repo.pull()
.map_err(|e| WebBuilderError::GitError(format!("Failed to pull repository: {}", e)))?;
Ok(())
}
/// Clone or pull a Git repository with caching
///
/// # Arguments
///
/// * `url` - URL of the repository to clone
/// * `destination` - Destination directory
///
/// # Returns
///
/// The path to the repository or an error
pub fn clone_or_pull<P: AsRef<Path>>(url: &str, destination: P) -> Result<PathBuf> {
let destination = destination.as_ref();
// Check the cache first
let mut cache = REPO_CACHE.lock().unwrap();
let now = SystemTime::now();
if let Some(entry) = cache.get(url) {
// Check if the cache entry is still valid
if let Ok(elapsed) = now.duration_since(entry.last_updated) {
if elapsed.as_secs() < CACHE_TIMEOUT {
// Cache is still valid, return the cached path
log::info!("Using cached repository for {}", url);
return Ok(entry.path.clone());
}
}
}
// Cache miss or expired, clone or pull the repository
let result = if destination.exists() {
// Pull the repository
pull_repository(destination)?;
Ok(destination.to_path_buf())
} else {
// Clone the repository
clone_repository(url, destination)
};
// Update the cache
if let Ok(path) = &result {
cache.insert(
url.to_string(),
CacheEntry {
path: path.clone(),
last_updated: now,
},
);
}
result
}
/// Force update a Git repository, bypassing the cache
///
/// # Arguments
///
/// * `url` - URL of the repository to clone
/// * `destination` - Destination directory
///
/// # Returns
///
/// The path to the repository or an error
pub fn force_update<P: AsRef<Path>>(url: &str, destination: P) -> Result<PathBuf> {
let destination = destination.as_ref();
// Clone or pull the repository
let result = if destination.exists() {
// Pull the repository
pull_repository(destination)?;
Ok(destination.to_path_buf())
} else {
// Clone the repository
clone_repository(url, destination)
};
// Update the cache
if let Ok(path) = &result {
let mut cache = REPO_CACHE.lock().unwrap();
cache.insert(
url.to_string(),
CacheEntry {
path: path.clone(),
last_updated: SystemTime::now(),
},
);
}
result
}
/// Clear the Git repository cache
pub fn clear_cache() {
let mut cache = REPO_CACHE.lock().unwrap();
cache.clear();
}

View File

@@ -0,0 +1,26 @@
#[cfg(test)]
mod tests {
use crate::error::WebBuilderError;
use crate::git::clone_repository;
use std::path::PathBuf;
#[test]
fn test_clone_repository_error_invalid_destination() {
// Test with a destination ("/") that has no parent directory
// The URL also points to a nonexistent repository (`home2`), but the
// invalid destination is what triggers the error here
let result = clone_repository("https://git.threefold.info/tfgrid/home2.git", PathBuf::from("/"));
assert!(result.is_err());
assert!(matches!(
result.unwrap_err(),
WebBuilderError::InvalidConfiguration(_)
));
}
// Note: The following tests would require mocking the sal::git module,
// which is complex due to the external dependency. In a real-world scenario,
// we would use a more sophisticated mocking approach or integration tests.
// For now, we'll just test the error cases and leave the success cases
// for integration testing.
}

70
webbuilder/src/ipfs.rs Normal file
View File

@@ -0,0 +1,70 @@
use ipfs_api_backend_hyper::{IpfsApi, IpfsClient};
use std::fs::File;
use std::path::Path;
use tokio::runtime::Runtime;
use crate::error::{Result, WebBuilderError};
/// Upload a file to IPFS
///
/// # Arguments
///
/// * `path` - Path to the file to upload
///
/// # Returns
///
/// The IPFS hash of the file or an error
pub fn upload_file<P: AsRef<Path>>(path: P) -> Result<String> {
let path = path.as_ref();
// Check if the file exists
if !path.exists() {
return Err(WebBuilderError::MissingFile(path.to_path_buf()));
}
// Create a tokio runtime
let rt = Runtime::new()
.map_err(|e| WebBuilderError::Other(format!("Failed to create tokio runtime: {}", e)))?;
// Upload the file to IPFS
let client = IpfsClient::default();
let ipfs_hash = rt.block_on(async {
// Open the file directly - this implements Read trait
let file = File::open(path).map_err(|e| WebBuilderError::IoError(e))?;
client
.add(file)
.await
.map_err(|e| WebBuilderError::IpfsError(format!("Failed to upload to IPFS: {}", e)))
.map(|res| res.hash)
})?;
Ok(ipfs_hash)
}
/// Calculate the Blake3 hash of a file
///
/// # Arguments
///
/// * `path` - Path to the file to hash
///
/// # Returns
///
/// The Blake3 hash of the file or an error
pub fn calculate_blake_hash<P: AsRef<Path>>(path: P) -> Result<String> {
let path = path.as_ref();
// Check if the file exists
if !path.exists() {
return Err(WebBuilderError::MissingFile(path.to_path_buf()));
}
// Read the file
let content = std::fs::read(path).map_err(|e| WebBuilderError::IoError(e))?;
// Calculate the hash
let hash = blake3::hash(&content);
let hash_hex = hash.to_hex().to_string();
Ok(format!("blake3-{}", hash_hex))
}

View File

@@ -0,0 +1,64 @@
#[cfg(test)]
mod tests {
use crate::error::WebBuilderError;
use crate::ipfs::{calculate_blake_hash, upload_file};
use std::fs;
use tempfile::TempDir;
#[test]
fn test_upload_file_missing_file() {
let temp_dir = TempDir::new().unwrap();
let file_path = temp_dir.path().join("nonexistent.txt");
let result = upload_file(&file_path);
assert!(result.is_err());
assert!(matches!(
result.unwrap_err(),
WebBuilderError::MissingFile(_)
));
}
#[test]
fn test_calculate_blake_hash() {
let temp_dir = TempDir::new().unwrap();
let file_path = temp_dir.path().join("test.txt");
fs::write(&file_path, "test content").unwrap();
let result = calculate_blake_hash(&file_path).unwrap();
// The hash should start with "blake3-"
assert!(result.starts_with("blake3-"));
// The hash should be 64 characters long after the prefix
assert_eq!(result.len(), "blake3-".len() + 64);
// The hash should be the same for the same content
let file_path2 = temp_dir.path().join("test2.txt");
fs::write(&file_path2, "test content").unwrap();
let result2 = calculate_blake_hash(&file_path2).unwrap();
assert_eq!(result, result2);
// The hash should be different for different content
let file_path3 = temp_dir.path().join("test3.txt");
fs::write(&file_path3, "different content").unwrap();
let result3 = calculate_blake_hash(&file_path3).unwrap();
assert_ne!(result, result3);
}
#[test]
fn test_calculate_blake_hash_missing_file() {
let temp_dir = TempDir::new().unwrap();
let file_path = temp_dir.path().join("nonexistent.txt");
let result = calculate_blake_hash(&file_path);
assert!(result.is_err());
assert!(matches!(
result.unwrap_err(),
WebBuilderError::MissingFile(_)
));
}
}

59
webbuilder/src/lib.rs Normal file
View File

@@ -0,0 +1,59 @@
//! WebBuilder is a library for building websites from hjson configuration files and markdown content.
//!
//! It uses the DocTree library to process markdown content and includes, and exports the result
//! to a webmeta.json file that can be used by a browser-based website generator.
pub mod builder;
pub mod config;
pub mod error;
pub mod git;
pub mod ipfs;
pub mod parser;
#[cfg(test)]
mod config_test;
#[cfg(test)]
mod error_test;
#[cfg(test)]
mod git_test;
#[cfg(test)]
mod ipfs_test;
#[cfg(test)]
mod parser_test;
pub use builder::WebBuilder;
pub use config::SiteConfig;
pub use error::{Result, WebBuilderError};
pub use parser::{ParsingStrategy, parse_site_config_with_strategy as parse_site_config, parse_site_config_recommended, parse_site_config_auto};
/// Create a new WebBuilder instance from a directory containing configuration files.
///
/// # Arguments
///
/// * `path` - Path to the directory containing configuration files
///
/// # Returns
///
/// A new WebBuilder instance or an error
pub fn from_directory<P: AsRef<std::path::Path>>(path: P) -> Result<WebBuilder> {
WebBuilder::from_directory(path)
}
/// Create a new WebBuilder instance from a directory containing configuration files,
/// using the specified parsing strategy.
///
/// # Arguments
///
/// * `path` - Path to the directory containing configuration files
/// * `strategy` - Parsing strategy to use
///
/// # Returns
///
/// A new WebBuilder instance or an error
pub fn from_directory_with_strategy<P: AsRef<std::path::Path>>(
path: P,
strategy: ParsingStrategy,
) -> Result<WebBuilder> {
let config = parser::parse_site_config_with_strategy(path, strategy)?;
Ok(WebBuilder { config })
}

88
webbuilder/src/main.rs Normal file
View File

@@ -0,0 +1,88 @@
use clap::{Parser, Subcommand};
use std::path::PathBuf;
use webbuilder::{from_directory, Result};
#[derive(Parser)]
#[command(author, version, about, long_about = None)]
struct Cli {
#[command(subcommand)]
command: Commands,
}
#[derive(Subcommand)]
enum Commands {
/// Build a website from hjson configuration files
Build {
/// Path to the directory containing hjson configuration files
#[arg(short, long)]
path: PathBuf,
/// Output directory for the webmeta.json file
#[arg(short, long)]
output: Option<PathBuf>,
/// Whether to upload the webmeta.json file to IPFS
#[arg(short, long)]
upload: bool,
},
}
fn main() -> Result<()> {
// Initialize logger
env_logger::init();
// Parse command line arguments
let cli = Cli::parse();
// Handle commands
match &cli.command {
Commands::Build {
path,
output,
upload,
} => {
// Create a WebBuilder instance
let webbuilder = from_directory(path)?;
// Print the parsed configuration
println!("Parsed site configuration:");
println!(" Name: {}", webbuilder.config.name);
println!(" Title: {}", webbuilder.config.title);
println!(" Description: {:?}", webbuilder.config.description);
println!(" URL: {:?}", webbuilder.config.url);
println!(
" Collections: {} items",
webbuilder.config.collections.len()
);
for (i, collection) in webbuilder.config.collections.iter().enumerate() {
println!(
" Collection {}: {:?} - {:?}",
i, collection.name, collection.url
);
}
println!(" Pages: {} items", webbuilder.config.pages.len());
// Build the website
let webmeta = webbuilder.build()?;
// Save the webmeta.json file
let output_path = output
.clone()
.unwrap_or_else(|| PathBuf::from("webmeta.json"));
webmeta.save(&output_path)?;
// Upload to IPFS if requested
if *upload {
let ipfs_hash = webbuilder.upload_to_ipfs(&output_path)?;
println!("Uploaded to IPFS: {}", ipfs_hash);
}
println!("Website built successfully!");
println!("Output: {:?}", output_path);
}
}
Ok(())
}

517
webbuilder/src/parser.rs Normal file
View File

@@ -0,0 +1,517 @@
use std::fs;
use std::path::Path;
use deser_hjson::from_str;
use serde::de::DeserializeOwned;
use serde_json::{self, Value};
use crate::config::{CollectionConfig, FooterConfig, HeaderConfig, PageConfig, SiteConfig};
use crate::error::{Result, WebBuilderError};
/// Parsing strategy to use
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ParsingStrategy {
/// Use the deser-hjson library (recommended)
Hjson,
/// Use a simple line-by-line parser (legacy)
Simple,
/// Auto-detect the best parser to use
Auto,
}
/// Parse a file into a struct using the specified strategy
///
/// # Arguments
///
/// * `path` - Path to the file to parse
/// * `strategy` - Parsing strategy to use
///
/// # Returns
///
/// The parsed struct or an error
pub fn parse_file<T, P>(path: P, strategy: ParsingStrategy) -> Result<T>
where
T: DeserializeOwned,
P: AsRef<Path>,
{
let path = path.as_ref();
// Check if the file exists
if !path.exists() {
return Err(WebBuilderError::MissingFile(path.to_path_buf()));
}
// Read the file
let content = fs::read_to_string(path).map_err(|e| WebBuilderError::IoError(e))?;
match strategy {
ParsingStrategy::Hjson => {
// Use the deser-hjson library
from_str(&content).map_err(|e| WebBuilderError::HjsonError(format!("Error parsing {:?}: {}", path, e)))
}
ParsingStrategy::Simple => {
// Delegate to the simple line-by-line parser, which handles the
// existence check, file reading, and JSON fallback internally
parse_hjson(path)
}
ParsingStrategy::Auto => {
// Try the hjson parser first, fall back to the simple parser if it fails
match from_str(&content) {
Ok(result) => Ok(result),
Err(e) => {
log::warn!("Hjson parser failed: {}, falling back to simple parser", e);
parse_hjson(path)
}
}
}
}
}
/// Parse a hjson file into a struct using the simple parser
///
/// # Arguments
///
/// * `path` - Path to the hjson file
///
/// # Returns
///
/// The parsed struct or an error
pub fn parse_hjson<T, P>(path: P) -> Result<T>
where
T: DeserializeOwned,
P: AsRef<Path>,
{
let path = path.as_ref();
// Check if the file exists
if !path.exists() {
return Err(WebBuilderError::MissingFile(path.to_path_buf()));
}
// Read the file
let content = fs::read_to_string(path).map_err(|e| WebBuilderError::IoError(e))?;
// First try to parse as JSON
if let Ok(value) = serde_json::from_str::<T>(&content) {
return Ok(value);
}
// If that fails, try to convert hjson to json using a simple approach
let json_content = convert_hjson_to_json(&content)?;
// Parse the JSON
serde_json::from_str(&json_content)
.map_err(|e| WebBuilderError::HjsonError(format!("Error parsing {:?}: {}", path, e)))
}
/// Convert hjson to json using a simple approach
///
/// # Arguments
///
/// * `hjson` - The hjson content
///
/// # Returns
///
/// The json content or an error
fn convert_hjson_to_json(hjson: &str) -> Result<String> {
// Remove comments
let mut json = String::new();
let mut lines = hjson.lines();
while let Some(line) = lines.next() {
let trimmed = line.trim();
// Skip empty lines
if trimmed.is_empty() {
continue;
}
// Skip comment lines
if trimmed.starts_with('#') {
continue;
}
// Handle key-value pairs
if let Some(pos) = trimmed.find(':') {
let key = trimmed[..pos].trim();
let value = trimmed[pos + 1..].trim();
// Add quotes to keys
json.push_str(&format!("\"{}\":", key));
// Add value
if value.is_empty() {
// If value is empty, it might be an object or array start
if lines
.clone()
.next()
.map_or(false, |l| l.trim().starts_with('{'))
{
json.push_str(" {");
} else if lines
.clone()
.next()
.map_or(false, |l| l.trim().starts_with('['))
{
json.push_str(" [");
} else {
json.push_str(" null");
}
} else {
// Add quotes to string values
if value.starts_with('"')
|| value.starts_with('[')
|| value.starts_with('{')
|| value == "true"
|| value == "false"
|| value == "null"
|| value.parse::<f64>().is_ok()
{
json.push_str(&format!(" {}", value));
} else {
json.push_str(&format!(" \"{}\"", value.replace('"', "\\\"")));
}
}
json.push_str(",\n");
} else if trimmed == "{" || trimmed == "[" {
json.push_str(trimmed);
json.push_str("\n");
} else if trimmed == "}" || trimmed == "]" {
// Remove trailing comma if present
if json.ends_with(",\n") {
json.pop();
json.pop();
json.push_str("\n");
}
json.push_str(trimmed);
json.push_str(",\n");
} else {
// Just copy the line
json.push_str(trimmed);
json.push_str("\n");
}
}
// Remove trailing comma if present
if json.ends_with(",\n") {
json.pop();
json.pop();
json.push_str("\n");
}
// Wrap in object if not already
if !json.trim().starts_with('{') {
json = format!("{{\n{}\n}}", json);
}
Ok(json)
}
/// Parse site configuration from a directory
///
/// # Arguments
///
/// * `path` - Path to the directory containing hjson configuration files
///
/// # Returns
///
/// The parsed site configuration or an error
pub fn parse_site_config<P: AsRef<Path>>(path: P) -> Result<SiteConfig> {
let path = path.as_ref();
// Check if the directory exists
if !path.exists() {
return Err(WebBuilderError::MissingDirectory(path.to_path_buf()));
}
// Check if the directory is a directory
if !path.is_dir() {
return Err(WebBuilderError::InvalidConfiguration(format!(
"{:?} is not a directory",
path
)));
}
// Parse main.hjson
let main_path = path.join("main.hjson");
let main_config: serde_json::Value = parse_hjson(main_path)?;
// Parse header.hjson
let header_path = path.join("header.hjson");
let header_config: Option<HeaderConfig> = if header_path.exists() {
Some(parse_hjson(header_path)?)
} else {
None
};
// Parse footer.hjson
let footer_path = path.join("footer.hjson");
let footer_config: Option<FooterConfig> = if footer_path.exists() {
Some(parse_hjson(footer_path)?)
} else {
None
};
// Parse collection.hjson
let collection_path = path.join("collection.hjson");
let collection_configs: Vec<CollectionConfig> = if collection_path.exists() {
parse_hjson(collection_path)?
} else {
Vec::new()
};
// Parse pages directory
let pages_path = path.join("pages");
let mut page_configs: Vec<PageConfig> = Vec::new();
if pages_path.exists() && pages_path.is_dir() {
for entry in fs::read_dir(pages_path)? {
let entry = entry?;
let entry_path = entry.path();
if entry_path.is_file() && entry_path.extension().map_or(false, |ext| ext == "hjson") {
let page_config: Vec<PageConfig> = parse_hjson(&entry_path)?;
page_configs.extend(page_config);
}
}
}
// Parse keywords from main.hjson
let keywords = if let Some(keywords_value) = main_config.get("keywords") {
if keywords_value.is_array() {
let mut keywords_vec = Vec::new();
for keyword in keywords_value.as_array().unwrap() {
if let Some(keyword_str) = keyword.as_str() {
keywords_vec.push(keyword_str.to_string());
}
}
Some(keywords_vec)
} else if let Some(keywords_str) = keywords_value.as_str() {
// Handle comma-separated keywords
Some(
keywords_str
.split(',')
.map(|s| s.trim().to_string())
.collect(),
)
} else {
None
}
} else {
None
};
// Create site configuration
let site_config = SiteConfig {
name: main_config["name"]
.as_str()
.unwrap_or("default")
.to_string(),
title: main_config["title"].as_str().unwrap_or("").to_string(),
description: main_config["description"].as_str().map(|s| s.to_string()),
keywords,
url: main_config["url"].as_str().map(|s| s.to_string()),
favicon: main_config["favicon"].as_str().map(|s| s.to_string()),
header: header_config,
footer: footer_config,
collections: collection_configs,
pages: page_configs,
base_path: path.to_path_buf(),
};
Ok(site_config)
}
/// Parse site configuration from a directory using the specified strategy
///
/// # Arguments
///
/// * `path` - Path to the directory containing configuration files
/// * `strategy` - Parsing strategy to use
///
/// # Returns
///
/// The parsed site configuration or an error
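///
/// # Example
///
/// A minimal sketch under the same crate/module assumptions as above,
/// forcing the Hjson parser instead of auto-detection:
///
/// ```ignore
/// use webbuilder::parser::{parse_site_config_with_strategy, ParsingStrategy};
///
/// let site = parse_site_config_with_strategy("./site", ParsingStrategy::Hjson)
///     .expect("failed to parse site configuration");
/// println!("{} collections configured", site.collections.len());
/// ```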
pub fn parse_site_config_with_strategy<P: AsRef<Path>>(
path: P,
strategy: ParsingStrategy,
) -> Result<SiteConfig> {
let path = path.as_ref();
// Check if the directory exists
if !path.exists() {
return Err(WebBuilderError::MissingDirectory(path.to_path_buf()));
}
// Check that the path is actually a directory
if !path.is_dir() {
return Err(WebBuilderError::InvalidConfiguration(format!(
"{:?} is not a directory",
path
)));
}
// Create a basic site configuration
let mut site_config = SiteConfig {
name: "default".to_string(),
title: "".to_string(),
description: None,
keywords: None,
url: None,
favicon: None,
header: None,
footer: None,
collections: Vec::new(),
pages: Vec::new(),
base_path: path.to_path_buf(),
};
// Parse main.hjson
let main_path = path.join("main.hjson");
if main_path.exists() {
let main_config: Value = parse_file(main_path, strategy)?;
// Extract values from main.hjson
if let Some(name) = main_config.get("name").and_then(|v| v.as_str()) {
site_config.name = name.to_string();
}
if let Some(title) = main_config.get("title").and_then(|v| v.as_str()) {
site_config.title = title.to_string();
}
if let Some(description) = main_config.get("description").and_then(|v| v.as_str()) {
site_config.description = Some(description.to_string());
}
if let Some(url) = main_config.get("url").and_then(|v| v.as_str()) {
site_config.url = Some(url.to_string());
}
if let Some(favicon) = main_config.get("favicon").and_then(|v| v.as_str()) {
site_config.favicon = Some(favicon.to_string());
}
if let Some(keywords) = main_config.get("keywords").and_then(|v| v.as_array()) {
let keywords_vec: Vec<String> = keywords
.iter()
.filter_map(|k| k.as_str().map(|s| s.to_string()))
.collect();
if !keywords_vec.is_empty() {
site_config.keywords = Some(keywords_vec);
}
}
}
// Parse header.hjson
let header_path = path.join("header.hjson");
if header_path.exists() {
site_config.header = Some(parse_file(header_path, strategy)?);
}
// Parse footer.hjson
let footer_path = path.join("footer.hjson");
if footer_path.exists() {
site_config.footer = Some(parse_file(footer_path, strategy)?);
}
// Parse collection.hjson
let collection_path = path.join("collection.hjson");
if collection_path.exists() {
let collection_array: Vec<CollectionConfig> = parse_file(collection_path, strategy)?;
// Process each collection
for mut collection in collection_array {
// Convert web interface URL to Git URL if needed
if let Some(url) = &collection.url {
if url.contains("/src/branch/") {
// This is a web interface URL, convert it to a Git URL
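// e.g. "https://host/org/repo/src/branch/main" becomes "https://host/org/repo.git"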
let parts: Vec<&str> = url.split("/src/branch/").collect();
if parts.len() == 2 {
collection.url = Some(format!("{}.git", parts[0]));
}
}
}
site_config.collections.push(collection);
}
}
// Parse pages directory
let pages_path = path.join("pages");
if pages_path.exists() && pages_path.is_dir() {
for entry in fs::read_dir(pages_path)? {
let entry = entry?;
let entry_path = entry.path();
if entry_path.is_file() && entry_path.extension().map_or(false, |ext| ext == "hjson") {
let pages_array: Vec<PageConfig> = parse_file(&entry_path, strategy)?;
site_config.pages.extend(pages_array);
}
}
}
Ok(site_config)
}
/// Parse site configuration from a directory using the recommended strategy (Hjson)
///
/// # Arguments
///
/// * `path` - Path to the directory containing configuration files
///
/// # Returns
///
/// The parsed site configuration or an error
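///
/// # Example
///
/// A minimal sketch (same crate/module assumptions as above); this is
/// shorthand for `parse_site_config_with_strategy(path, ParsingStrategy::Hjson)`:
///
/// ```ignore
/// use webbuilder::parser::parse_site_config_recommended;
///
/// let site = parse_site_config_recommended("./site").expect("failed to parse site configuration");
/// ```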
pub fn parse_site_config_recommended<P: AsRef<Path>>(path: P) -> Result<SiteConfig> {
parse_site_config_with_strategy(path, ParsingStrategy::Hjson)
}
/// Parse site configuration from a directory using the auto-detect strategy
///
/// # Arguments
///
/// * `path` - Path to the directory containing configuration files
///
/// # Returns
///
/// The parsed site configuration or an error
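///
/// # Example
///
/// A minimal sketch (same crate/module assumptions as above); `Auto` picks a
/// parsing strategy automatically:
///
/// ```ignore
/// use webbuilder::parser::parse_site_config_auto;
///
/// let site = parse_site_config_auto("./site").expect("failed to parse site configuration");
/// ```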
pub fn parse_site_config_auto<P: AsRef<Path>>(path: P) -> Result<SiteConfig> {
parse_site_config_with_strategy(path, ParsingStrategy::Auto)
}


@@ -0,0 +1,267 @@
#[cfg(test)]
mod tests {
use crate::error::WebBuilderError;
use crate::parser::{parse_site_config_with_strategy, ParsingStrategy};
use std::fs;
use std::path::PathBuf;
use tempfile::TempDir;
fn create_test_site(temp_dir: &TempDir) -> PathBuf {
let site_dir = temp_dir.path().join("site");
fs::create_dir(&site_dir).unwrap();
// Create main.hjson
let main_hjson = r#"{
# Main configuration
"name": "test",
"title": "Test Site",
"description": "A test site",
"url": "https://example.com",
"favicon": "favicon.ico",
"keywords": [
"demo",
"test",
"example"
]
}"#;
fs::write(site_dir.join("main.hjson"), main_hjson).unwrap();
// Create header.hjson
let header_hjson = r#"{
# Header configuration
"title": "Test Site",
"logo": {
"src": "logo.png",
"alt": "Logo"
},
"menu": [
{
"label": "Home",
"link": "/"
},
{
"label": "About",
"link": "/about"
}
]
}"#;
fs::write(site_dir.join("header.hjson"), header_hjson).unwrap();
// Create collection.hjson
let collection_hjson = r#"[
{
# First collection
"name": "test",
"url": "https://git.threefold.info/tfgrid/home.git",
"description": "A test collection",
"scan": true
},
{
# Second collection
"name": "test2",
"url": "https://git.example.com/src/branch/main/test2",
"description": "Another test collection"
}
]"#;
fs::write(site_dir.join("collection.hjson"), collection_hjson).unwrap();
// Create pages directory
let pages_dir = site_dir.join("pages");
fs::create_dir(&pages_dir).unwrap();
// Create pages/pages.hjson
let pages_hjson = r#"[
{
# Home page
"name": "home",
"title": "Home",
"description": "Home page",
"navpath": "/",
"collection": "test",
"draft": false
},
{
# About page
"name": "about",
"title": "About",
"description": "About page",
"navpath": "/about",
"collection": "test"
}
]"#;
fs::write(pages_dir.join("pages.hjson"), pages_hjson).unwrap();
site_dir
}
#[test]
fn test_parse_site_config_hjson() {
let temp_dir = TempDir::new().unwrap();
let site_dir = create_test_site(&temp_dir);
let config = parse_site_config_with_strategy(&site_dir, ParsingStrategy::Hjson).unwrap();
// Check basic site info
assert_eq!(config.name, "test");
assert_eq!(config.title, "Test Site");
assert_eq!(config.description, Some("A test site".to_string()));
assert_eq!(config.url, Some("https://example.com".to_string()));
assert_eq!(config.favicon, Some("favicon.ico".to_string()));
assert_eq!(
config.keywords,
Some(vec![
"demo".to_string(),
"test".to_string(),
"example".to_string()
])
);
// Check header
assert!(config.header.is_some());
let header = config.header.as_ref().unwrap();
assert_eq!(header.title, Some("Test Site".to_string()));
assert!(header.logo.is_some());
let logo = header.logo.as_ref().unwrap();
assert_eq!(logo.src, "logo.png");
assert_eq!(logo.alt, Some("Logo".to_string()));
// Check collections
assert_eq!(config.collections.len(), 2);
// First collection
assert_eq!(config.collections[0].name, Some("test".to_string()));
assert_eq!(
config.collections[0].url,
Some("https://git.threefold.info/tfgrid/home.git".to_string())
);
assert_eq!(
config.collections[0].description,
Some("A test collection".to_string())
);
assert_eq!(config.collections[0].scan, Some(true));
// Second collection (with URL conversion)
assert_eq!(config.collections[1].name, Some("test2".to_string()));
assert_eq!(
config.collections[1].url,
Some("https://git.example.com.git".to_string())
);
assert_eq!(
config.collections[1].description,
Some("Another test collection".to_string())
);
assert_eq!(config.collections[1].scan, None);
// Check pages
assert_eq!(config.pages.len(), 2);
// First page
assert_eq!(config.pages[0].name, "home");
assert_eq!(config.pages[0].title, "Home");
assert_eq!(config.pages[0].description, Some("Home page".to_string()));
assert_eq!(config.pages[0].navpath, "/");
assert_eq!(config.pages[0].collection, "test");
assert_eq!(config.pages[0].draft, Some(false));
// Second page
assert_eq!(config.pages[1].name, "about");
assert_eq!(config.pages[1].title, "About");
assert_eq!(config.pages[1].description, Some("About page".to_string()));
assert_eq!(config.pages[1].navpath, "/about");
assert_eq!(config.pages[1].collection, "test");
assert_eq!(config.pages[1].draft, None);
}
#[test]
fn test_parse_site_config_auto() {
let temp_dir = TempDir::new().unwrap();
let site_dir = create_test_site(&temp_dir);
let config = parse_site_config_with_strategy(&site_dir, ParsingStrategy::Auto).unwrap();
// Basic checks to ensure it worked
assert_eq!(config.name, "test");
assert_eq!(config.title, "Test Site");
assert_eq!(config.collections.len(), 2);
assert_eq!(config.pages.len(), 2);
}
#[test]
fn test_parse_site_config_simple() {
let temp_dir = TempDir::new().unwrap();
let site_dir = temp_dir.path().join("site");
fs::create_dir(&site_dir).unwrap();
// Create main.hjson in a format that the simple parser can handle
let main_hjson = "name: test\ntitle: Test Site\ndescription: A test site";
fs::write(site_dir.join("main.hjson"), main_hjson).unwrap();
let config = parse_site_config_with_strategy(&site_dir, ParsingStrategy::Simple).unwrap();
// Basic checks to ensure it worked
assert_eq!(config.name, "test");
assert_eq!(config.title, "Test Site");
assert_eq!(config.description, Some("A test site".to_string()));
}
#[test]
fn test_parse_site_config_missing_directory() {
let result = parse_site_config_with_strategy("/nonexistent/directory", ParsingStrategy::Hjson);
assert!(matches!(result, Err(WebBuilderError::MissingDirectory(_))));
}
#[test]
fn test_parse_site_config_not_a_directory() {
let temp_dir = TempDir::new().unwrap();
let file_path = temp_dir.path().join("file.txt");
fs::write(&file_path, "not a directory").unwrap();
let result = parse_site_config_with_strategy(&file_path, ParsingStrategy::Hjson);
assert!(matches!(
result,
Err(WebBuilderError::InvalidConfiguration(_))
));
}
#[test]
fn test_parse_site_config_minimal() {
let temp_dir = TempDir::new().unwrap();
let site_dir = temp_dir.path().join("site");
fs::create_dir(&site_dir).unwrap();
// Create minimal main.hjson
let main_hjson = r#"{ "name": "minimal", "title": "Minimal Site" }"#;
fs::write(site_dir.join("main.hjson"), main_hjson).unwrap();
let config = parse_site_config_with_strategy(&site_dir, ParsingStrategy::Hjson).unwrap();
assert_eq!(config.name, "minimal");
assert_eq!(config.title, "Minimal Site");
assert_eq!(config.description, None);
assert_eq!(config.url, None);
assert_eq!(config.favicon, None);
assert!(config.header.is_none());
assert!(config.footer.is_none());
assert!(config.collections.is_empty());
assert!(config.pages.is_empty());
}
#[test]
fn test_parse_site_config_empty() {
let temp_dir = TempDir::new().unwrap();
let site_dir = temp_dir.path().join("site");
fs::create_dir(&site_dir).unwrap();
let config = parse_site_config_with_strategy(&site_dir, ParsingStrategy::Hjson).unwrap();
assert_eq!(config.name, "default");
assert_eq!(config.title, "");
assert_eq!(config.description, None);
assert_eq!(config.url, None);
assert_eq!(config.favicon, None);
assert!(config.header.is_none());
assert!(config.footer.is_none());
assert!(config.collections.is_empty());
assert!(config.pages.is_empty());
}
}