Compare commits

...

16 Commits

Author SHA1 Message Date
thijs
750d9b17a3 update sal references 2025-09-11 11:50:22 +02:00
Mahmoud Emad
cad8a6d125 refactor: Remove branch specification from sal dependency
- Removed the branch specification from the `sal` dependency in
  `doctree/Cargo.toml` and `webbuilder/Cargo.toml`. This simplifies
  the dependency management and relies on the default branch of the
  repository.
- Updated `.gitignore` to ignore `sccache.log` to prevent it from
  being committed to the repository.
- Updated example commands in `example_commands.sh` to use a different
  collection name (`grid1` instead of `grid_documentation`) for testing
  purposes.  This avoids potential conflicts with pre-existing data.
- Improved code structure and organization in `doctree/src/lib.rs`.
2025-05-14 08:42:53 +03:00
d8d6bf1f4a ... 2025-05-13 11:22:51 +03:00
1e914aa56d ... 2025-05-13 10:03:13 +03:00
29ccc54a4d ... 2025-05-13 09:35:01 +03:00
d609aa8094 ... 2025-05-13 09:22:57 +03:00
dbd44043cb ... 2025-05-13 09:19:45 +03:00
7fa4125dc0 ... 2025-05-13 08:52:47 +03:00
2fae059512 ... 2025-05-03 05:52:42 +04:00
28a7ef3a94 ... 2025-04-09 09:47:16 +02:00
60e688810d ... 2025-04-09 09:26:06 +02:00
f938e8ff6b ... 2025-04-09 08:51:55 +02:00
84c656983a ... 2025-04-09 08:45:38 +02:00
19f52a8172 ... 2025-04-09 08:43:20 +02:00
14b2bb2798 ... 2025-04-09 08:43:10 +02:00
2eec3be632 ... 2025-04-09 08:42:30 +02:00
26 changed files with 1493 additions and 617 deletions

50
.gitignore vendored

@@ -14,4 +14,52 @@ Cargo.lock
# MSVC Windows builds of rustc generate these, which store debugging information
*.pdb
doctreegolang/
# Added by cargo
/target
/rhai_test_template
/rhai_test_download
/rhai_test_fs
run_rhai_tests.log
new_location
log.txt
file.txt
fix_doc*
# Dependencies
/node_modules
# Production
/build
# Generated files
.docusaurus
.cache-loader
# Misc
.DS_Store
.env.local
.env.development.local
.env.test.local
.env.production.local
npm-debug.log*
yarn-debug.log*
yarn-error.log*
bun.lockb
bun.lock
yarn.lock
build.sh
build_dev.sh
develop.sh
docusaurus.config.ts
sidebars.ts
tsconfig.json
sccache.log

244
README.md

@@ -1,2 +1,244 @@
# doctree
# DocTree
DocTree is a Rust library for managing collections of markdown documents with powerful include functionality. It provides a robust system for organizing, processing, and retrieving document collections with Redis-backed storage.
## Overview
DocTree scans directories for `.collection` files, which define document collections. Each collection contains markdown documents and other files (like images). The library provides functionality to:
- Scan directories recursively to find collections
- Process includes between documents (allowing one document to include content from another)
- Convert markdown to HTML
- Store document metadata in Redis for efficient retrieval
- Provide a command-line interface for interacting with collections
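
As a rough illustration of the first item, the recursive scan can be sketched with the standard library alone. The helper name `find_collection_dirs` is hypothetical and is not part of DocTree's API; the library itself lists `walkdir` among its dependencies, so treat this only as a minimal sketch of the idea:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect every directory under `root` that contains a
/// `.collection` marker file.
/// Hypothetical helper for illustration; not DocTree's actual scanner.
fn find_collection_dirs(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    // Explicit stack instead of recursion, so deep trees cannot overflow.
    let mut pending = vec![root.to_path_buf()];

    while let Some(dir) = pending.pop() {
        if dir.join(".collection").is_file() {
            found.push(dir.clone());
        }
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            // `file_type` does not follow symlinks, which avoids link loops.
            if entry.file_type()?.is_dir() {
                pending.push(entry.path());
            }
        }
    }

    // Sort for a stable, predictable result order.
    found.sort();
    Ok(found)
}
```

Calling `find_collection_dirs(Path::new("path/to/documents"))` would return every directory that holds a `.collection` file, including nested ones.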
## Tips
To use the `ipfs` command line on macOS with IPFS Desktop installed, symlink the bundled binary into a directory on your `PATH`:
```bash
# Put the ipfs command-line binary on the PATH (macOS)
sudo ln -s "/Applications/IPFS Desktop.app/Contents/Resources/app.asar.unpacked/node_modules/kubo/kubo/ipfs" /usr/local/bin/ipfs
```
## Key Concepts
### Collections
A collection is a group of related documents and files. Collections are defined by a `.collection` file in a directory. The `.collection` file can be empty (in which case the directory name is used as the collection name) or it can contain TOML configuration:
```toml
name = "my_collection"
# Other configuration options can be added in the future
```
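
The fallback rule described above (an empty `.collection` file means the directory name is used) might look roughly like this. `collection_name` is a hypothetical helper that only understands a plain `name = "..."` line; the library lists the `toml` crate for real parsing, and this sketch does not apply DocTree's name fixing:

```rust
use std::path::Path;

/// Resolve a collection name from the text of its `.collection` file,
/// falling back to the directory name when no `name` key is present.
/// Hypothetical helper: handles only a simple `name = "..."` line.
fn collection_name(dir: &Path, config: &str) -> Option<String> {
    for line in config.lines() {
        let line = line.trim();
        // Skip TOML comments.
        if line.starts_with('#') {
            continue;
        }
        if let Some((key, value)) = line.split_once('=') {
            if key.trim() == "name" {
                let value = value.trim().trim_matches('"');
                if !value.is_empty() {
                    return Some(value.to_string());
                }
            }
        }
    }
    // Empty or name-less config: use the directory name instead.
    dir.file_name().map(|n| n.to_string_lossy().into_owned())
}
```

With an empty file, `collection_name(Path::new("/docs/guides"), "")` falls back to `guides`; with the TOML shown above it yields `my_collection`.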
### DocTree
A DocTree is a manager for multiple collections. It provides methods for:
- Adding collections
- Retrieving documents from collections
- Processing includes between documents
- Converting markdown to HTML
- Managing collection metadata in Redis
### Includes
One of the most powerful features of DocTree is the ability to include content from one document in another. This is done using the `!!include` directive:
```markdown
# My Document
This is my document.
!!include another_collection:some_document.md
More content here...
```
The include directive supports several formats:
- `!!include collection_name:page_name` - Include a page from a specific collection
- `!!include collection_name:'page name'` - Include a page with spaces from a specific collection
- `!!include page_name` - Include a page from the current collection
- `!!include name:'page name'` - Include a page with spaces from the current collection
Includes can be nested, allowing for complex document structures.
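
To make the directive formats concrete, here is a minimal sketch of how a single line could be split into an optional collection and a page name. `parse_include` is a hypothetical function written for this README, not the parser in `include.rs`, and it assumes that single quotes are only used to wrap names containing spaces:

```rust
/// Parse one line into (optional collection, page name) if it is an
/// `!!include` directive; return None for any other line.
/// Hypothetical helper for illustration; not DocTree's actual parser.
fn parse_include(line: &str) -> Option<(Option<String>, String)> {
    let rest = line.trim().strip_prefix("!!include")?;
    // Require whitespace after the keyword so "!!includes" does not match.
    if !rest.starts_with(char::is_whitespace) {
        return None;
    }
    let target = rest.trim();

    // Strip surrounding whitespace and single quotes from a name.
    let unquote = |s: &str| s.trim().trim_matches('\'').to_string();

    let (collection, page) = match target.split_once(':') {
        Some((collection, page)) => (Some(unquote(collection)), unquote(page)),
        None => (None, unquote(target)),
    };
    // A directive without a page name is not valid.
    if page.is_empty() {
        None
    } else {
        Some((collection, page))
    }
}
```

A caller would resolve a `None` collection to the collection of the document being processed, and would run the same parsing again on the included content to support nesting.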
## Storage
DocTree uses Redis as a backend storage system. Document metadata (like paths and names) is stored in Redis, making it efficient to retrieve documents without scanning the filesystem each time.
The Redis keys are structured as:
- `collections:{collection_name}:{document_name}` - Stores the relative path to a document
- `collections:{collection_name}:path` - Stores the absolute path to the collection
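
The two key formats above can be written down as small helpers. The function names are hypothetical and only mirror the formats listed here; the actual storage layer may build its keys differently, for example by adding a doctree prefix:

```rust
/// Key that stores the relative path of one document
/// (format taken from the list above).
fn document_key(collection: &str, document: &str) -> String {
    format!("collections:{collection}:{document}")
}

/// Key that stores the absolute path of a collection
/// (format taken from the list above).
fn collection_path_key(collection: &str) -> String {
    format!("collections:{collection}:path")
}
```

For a collection named `grid1`, these produce `collections:grid1:intro.md` for the document `intro.md` and `collections:grid1:path` for the collection path.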
## Command-Line Interface
DocTree comes with a command-line interface (CLI) that provides access to the library's functionality:
```
DocTree CLI 0.1.0
A tool to manage document collections

USAGE:
    doctreecmd [SUBCOMMAND]

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

SUBCOMMANDS:
    delete    Delete a collection
    get       Get page content
    html      Get page content as HTML
    info      Show detailed information about collections
    list      List collections
    reset     Delete all collections
    scan      Scan a directory for .collection files and create collections
```
### Example Commands
#### Scanning Collections
```bash
doctreecmd scan /path/to/documents --doctree my_doctree
```
This command scans the specified directory for `.collection` files and creates collections in Redis.
#### Listing Collections
```bash
doctreecmd list --doctree my_doctree
```
This command lists all collections in the specified doctree.
#### Getting Document Content
```bash
doctreecmd get -c collection_name -p page_name --doctree my_doctree
```
This command retrieves the content of a document from a collection.
#### Getting HTML Content
```bash
doctreecmd get -c collection_name -p page_name -f html --doctree my_doctree
```
This command retrieves the HTML content of a document from a collection.
#### Showing Collection Information
```bash
doctreecmd info collection_name --doctree my_doctree
```
This command shows detailed information about a collection, including its documents and files.
#### Deleting a Collection
```bash
doctreecmd delete collection_name --doctree my_doctree
```
This command deletes a collection.
#### Resetting All Collections
```bash
doctreecmd reset --doctree my_doctree
```
This command deletes all collections.
## Implementation Details
DocTree is implemented in Rust and uses several key dependencies:
- `walkdir` for recursively walking directories
- `pulldown-cmark` for parsing and rendering markdown
- `toml` for parsing collection configuration files
- `redis` for interacting with Redis
- `clap` for the command-line interface
The library is structured into several modules:
- `doctree.rs` - Core DocTree functionality
- `collection.rs` - Collection management
- `include.rs` - Include processing
- `storage.rs` - Redis storage backend
- `utils.rs` - Utility functions
- `error.rs` - Error handling
## Use Cases
DocTree is particularly useful for:
1. **Documentation Systems**: Manage and organize technical documentation with the ability to include common sections across multiple documents.
2. **Content Management**: Create a flexible content management system where content can be modularized and reused.
3. **Knowledge Bases**: Build knowledge bases with interconnected documents that can reference each other.
4. **Static Site Generation**: Generate static websites from markdown documents with the ability to include common elements.
## Getting Started
### Prerequisites
- Rust (latest stable version)
- Redis server running on localhost:6379 (or configure a different URL)
### Building
```bash
cargo build --release
```
### Running the CLI
```bash
cargo run --bin doctreecmd -- [SUBCOMMAND]
```
### Using the Library
Add doctree to your Cargo.toml:
```toml
[dependencies]
doctree = { git = "https://git.ourworld.tf/herocode/doctree", branch = "main", package = "doctree" }
```
Basic usage:
```rust
use doctree::{DocTree, RedisStorage, Result, from_directory};
use std::path::Path;

fn main() -> Result<()> {
    // Create a DocTree by scanning a directory
    let doctree = from_directory(Path::new("path/to/documents"), Some("my_doctree"))?;

    // List collections
    let collections = doctree.list_collections();
    for collection in collections {
        println!("Collection: {}", collection);
    }

    // Get a document with includes processed
    let content = doctree.page_get(Some("collection_name"), "page_name")?;
    println!("{}", content);

    // Get a document as HTML
    let html = doctree.page_get_html(Some("collection_name"), "page_name")?;
    println!("{}", html);

    Ok(())
}
```

15
build.sh Executable file

@@ -0,0 +1,15 @@
#!/bin/bash
# Change to the directory where the script is located
cd "$(dirname "$0")"
# Exit immediately if a command exits with a non-zero status
set -e
echo "Building doctree binary..."
cd doctreecmd
cargo build --release
echo "Copying doctree binary to ~/hero/bin/"
mkdir -p ~/hero/bin/
cp target/release/doctree ~/hero/bin/
echo "Build and installation complete!"

View File

@@ -15,4 +15,10 @@ toml = "0.7.3"
serde = { version = "1.0", features = ["derive"] }
redis = { version = "0.23.0", features = ["tokio-comp"] }
tokio = { version = "1.28.0", features = ["full"] }
sal = { git = "https://git.ourworld.tf/herocode/sal.git", branch = "main" }
sal-text = "0.1.0"
chacha20poly1305 = "0.10.1"
blake3 = "1.3.1"
csv = "1.1"
rand = "0.9.1"
ipfs-api-backend-hyper = "0.6"
ipfs-api = { version = "0.17.0", default-features = false, features = ["with-hyper-tls"] }

View File

@@ -6,16 +6,19 @@ use crate::error::{DocTreeError, Result};
use crate::storage::RedisStorage;
use crate::utils::{name_fix, markdown_to_html, ensure_md_extension};
use crate::include::process_includes;
use rand::Rng;
use ipfs_api::{IpfsApi, IpfsClient};
// use chacha20poly1305::aead::NewAead;
/// Collection represents a collection of markdown pages and files
#[derive(Clone)]
pub struct Collection {
/// Base path of the collection
pub path: PathBuf,
/// Name of the collection (namefixed)
pub name: String,
/// Redis storage backend
pub storage: RedisStorage,
}
@@ -24,10 +27,10 @@ pub struct Collection {
pub struct CollectionBuilder {
/// Base path of the collection
path: PathBuf,
/// Name of the collection (namefixed)
name: String,
/// Redis storage backend
storage: Option<RedisStorage>,
}
@@ -50,7 +53,7 @@ impl Collection {
storage: None,
}
}
/// Scan walks over the path and finds all files and .md files
/// It stores the relative positions in Redis
///
@@ -59,15 +62,20 @@ impl Collection {
/// Ok(()) on success or an error
pub fn scan(&self) -> Result<()> {
println!("DEBUG: Scanning collection '{}' at path {:?}", self.name, self.path);
// Delete existing collection data if any
println!("DEBUG: Deleting existing collection data from Redis key 'collections:{}'", self.name);
self.storage.delete_collection(&self.name)?;
// Store the collection's path in Redis
// Store the collection's full absolute path in Redis
let absolute_path = std::fs::canonicalize(&self.path)
.unwrap_or_else(|_| self.path.clone())
.to_string_lossy()
.to_string();
println!("DEBUG: Storing collection path in Redis key 'collections:{}:path'", self.name);
self.storage.store_collection_path(&self.name, &absolute_path)?;
self.storage.store_collection_path(&self.name, &self.path.to_string_lossy())?;
// Walk through the directory
let walker = WalkDir::new(&self.path);
for entry_result in walker {
@@ -80,18 +88,18 @@ impl Collection {
continue;
}
};
// Skip directories
if entry.file_type().is_dir() {
continue;
}
// Skip files that start with a dot (.)
let file_name = entry.file_name().to_string_lossy();
if file_name.starts_with(".") {
continue;
}
// Get the relative path from the base path
let rel_path = match entry.path().strip_prefix(&self.path) {
Ok(path) => path,
@@ -101,11 +109,11 @@ impl Collection {
continue;
}
};
// Get the filename and apply namefix
let filename = entry.file_name().to_string_lossy().to_string();
let namefixed_filename = name_fix(&filename);
// Determine if this is a document (markdown file) or an image
let is_markdown = filename.to_lowercase().ends_with(".md");
let is_image = filename.to_lowercase().ends_with(".png") ||
@@ -113,7 +121,7 @@ impl Collection {
filename.to_lowercase().ends_with(".jpeg") ||
filename.to_lowercase().ends_with(".gif") ||
filename.to_lowercase().ends_with(".svg");
let file_type = if is_markdown {
"document"
} else if is_image {
@@ -121,22 +129,22 @@ impl Collection {
} else {
"file"
};
// Store in Redis using the namefixed filename as the key
// Store the original relative path to preserve case and special characters
println!("DEBUG: Storing {} '{}' in Redis key 'collections:{}' with key '{}' and value '{}'",
file_type, filename, self.name, namefixed_filename, rel_path.to_string_lossy());
self.storage.store_collection_entry(
&self.name,
&namefixed_filename,
&rel_path.to_string_lossy()
)?;
}
Ok(())
}
/// Get a page by name and return its markdown content
///
/// # Arguments
@@ -149,14 +157,14 @@ impl Collection {
pub fn page_get(&self, page_name: &str) -> Result<String> {
// Apply namefix to the page name
let namefixed_page_name = name_fix(page_name);
// Ensure it has .md extension
let namefixed_page_name = ensure_md_extension(&namefixed_page_name);
// Get the relative path from Redis
let rel_path = self.storage.get_collection_entry(&self.name, &namefixed_page_name)
.map_err(|_| DocTreeError::PageNotFound(page_name.to_string()))?;
// Check if the path is valid
if self.path.as_os_str().is_empty() {
// If the path is empty, we're working with a collection loaded from Redis
@@ -166,18 +174,18 @@ impl Collection {
format!("File path not available for {} in collection {}", page_name, self.name)
)));
}
// Read the file
let full_path = self.path.join(rel_path);
let content = fs::read_to_string(full_path)
.map_err(|e| DocTreeError::IoError(e))?;
// Skip include processing at this level to avoid infinite recursion
// Include processing will be done at the higher level
Ok(content)
}
/// Create or update a page in the collection
///
/// # Arguments
@@ -191,27 +199,27 @@ impl Collection {
pub fn page_set(&self, page_name: &str, content: &str) -> Result<()> {
// Apply namefix to the page name
let namefixed_page_name = name_fix(page_name);
// Ensure it has .md extension
let namefixed_page_name = ensure_md_extension(&namefixed_page_name);
// Create the full path
let full_path = self.path.join(&namefixed_page_name);
// Create directories if needed
if let Some(parent) = full_path.parent() {
fs::create_dir_all(parent).map_err(DocTreeError::IoError)?;
}
// Write content to file
fs::write(&full_path, content).map_err(DocTreeError::IoError)?;
// Update Redis
self.storage.store_collection_entry(&self.name, &namefixed_page_name, &namefixed_page_name)?;
Ok(())
}
/// Delete a page from the collection
///
/// # Arguments
@@ -224,24 +232,24 @@ impl Collection {
pub fn page_delete(&self, page_name: &str) -> Result<()> {
// Apply namefix to the page name
let namefixed_page_name = name_fix(page_name);
// Ensure it has .md extension
let namefixed_page_name = ensure_md_extension(&namefixed_page_name);
// Get the relative path from Redis
let rel_path = self.storage.get_collection_entry(&self.name, &namefixed_page_name)
.map_err(|_| DocTreeError::PageNotFound(page_name.to_string()))?;
// Delete the file
let full_path = self.path.join(rel_path);
fs::remove_file(full_path).map_err(DocTreeError::IoError)?;
// Remove from Redis
self.storage.delete_collection_entry(&self.name, &namefixed_page_name)?;
Ok(())
}
/// List all pages in the collection
///
/// # Returns
@@ -250,15 +258,15 @@ impl Collection {
pub fn page_list(&self) -> Result<Vec<String>> {
// Get all keys from Redis
let keys = self.storage.list_collection_entries(&self.name)?;
// Filter to only include .md files
let pages = keys.into_iter()
.filter(|key| key.ends_with(".md"))
.collect();
Ok(pages)
}
/// Get the URL for a file
///
/// # Arguments
@@ -271,17 +279,17 @@ impl Collection {
pub fn file_get_url(&self, file_name: &str) -> Result<String> {
// Apply namefix to the file name
let namefixed_file_name = name_fix(file_name);
// Get the relative path from Redis
let rel_path = self.storage.get_collection_entry(&self.name, &namefixed_file_name)
.map_err(|_| DocTreeError::FileNotFound(file_name.to_string()))?;
// Construct a URL for the file
let url = format!("/collections/{}/files/{}", self.name, rel_path);
Ok(url)
}
/// Add or update a file in the collection
///
/// # Arguments
@@ -295,24 +303,24 @@ impl Collection {
pub fn file_set(&self, file_name: &str, content: &[u8]) -> Result<()> {
// Apply namefix to the file name
let namefixed_file_name = name_fix(file_name);
// Create the full path
let full_path = self.path.join(&namefixed_file_name);
// Create directories if needed
if let Some(parent) = full_path.parent() {
fs::create_dir_all(parent).map_err(DocTreeError::IoError)?;
}
// Write content to file
fs::write(&full_path, content).map_err(DocTreeError::IoError)?;
// Update Redis
self.storage.store_collection_entry(&self.name, &namefixed_file_name, &namefixed_file_name)?;
Ok(())
}
/// Delete a file from the collection
///
/// # Arguments
@@ -325,21 +333,21 @@ impl Collection {
pub fn file_delete(&self, file_name: &str) -> Result<()> {
// Apply namefix to the file name
let namefixed_file_name = name_fix(file_name);
// Get the relative path from Redis
let rel_path = self.storage.get_collection_entry(&self.name, &namefixed_file_name)
.map_err(|_| DocTreeError::FileNotFound(file_name.to_string()))?;
// Delete the file
let full_path = self.path.join(rel_path);
fs::remove_file(full_path).map_err(DocTreeError::IoError)?;
// Remove from Redis
self.storage.delete_collection_entry(&self.name, &namefixed_file_name)?;
Ok(())
}
/// List all files (non-markdown) in the collection
///
/// # Returns
@@ -348,15 +356,15 @@ impl Collection {
pub fn file_list(&self) -> Result<Vec<String>> {
// Get all keys from Redis
let keys = self.storage.list_collection_entries(&self.name)?;
// Filter to exclude .md files
let files = keys.into_iter()
.filter(|key| !key.ends_with(".md"))
.collect();
Ok(files)
}
/// Get the relative path of a page in the collection
///
/// # Arguments
@@ -369,15 +377,15 @@ impl Collection {
pub fn page_get_path(&self, page_name: &str) -> Result<String> {
// Apply namefix to the page name
let namefixed_page_name = name_fix(page_name);
// Ensure it has .md extension
let namefixed_page_name = ensure_md_extension(&namefixed_page_name);
// Get the relative path from Redis
self.storage.get_collection_entry(&self.name, &namefixed_page_name)
.map_err(|_| DocTreeError::PageNotFound(page_name.to_string()))
}
/// Get a page by name and return its HTML content
///
/// # Arguments
@@ -391,20 +399,20 @@ impl Collection {
pub fn page_get_html(&self, page_name: &str, doctree: Option<&crate::doctree::DocTree>) -> Result<String> {
// Get the markdown content
let markdown = self.page_get(page_name)?;
// Process includes if doctree is provided
let processed_markdown = if let Some(dt) = doctree {
process_includes(&markdown, &self.name, dt)?
} else {
markdown
};
// Convert markdown to HTML
let html = markdown_to_html(&processed_markdown);
Ok(html)
}
/// Get information about the Collection
///
/// # Returns
@@ -416,6 +424,175 @@ impl Collection {
info.insert("path".to_string(), self.path.to_string_lossy().to_string());
info
}
/// Exports pages and files from the collection to IPFS synchronously and generates a CSV manifest.
/// Note: the encryption step is currently commented out, so content is uploaded unencrypted.
///
/// # Arguments
///
/// * `output_csv_path` - The path to the output CSV file.
///
/// # Returns
///
/// Ok(()) on success or an error.
pub fn export_to_ipfs(&self, output_csv_path: &Path) -> Result<()> {
// Create a new tokio runtime and block on the async export function
tokio::runtime::Runtime::new()?.block_on(async {
self.export_to_ipfs_async(output_csv_path).await
})?;
Ok(())
}
/// Exports pages and files from the collection to IPFS asynchronously and generates a CSV manifest.
/// Note: the encryption step is currently commented out, so content is uploaded unencrypted.
///
/// # Arguments
///
/// * `output_csv_path` - The path to the output CSV file.
///
/// # Returns
///
/// Ok(()) on success or an error.
pub async fn export_to_ipfs_async(&self, output_csv_path: &Path) -> Result<()> {
use blake3::Hasher;
// use chacha20poly1305::{ChaCha20Poly1305, Aead};
use ipfs_api::IpfsClient;
use tokio::fs::File;
use tokio::io::AsyncReadExt;
use csv::Writer;
use rand::rngs::OsRng;
use chacha20poly1305::aead::generic_array::GenericArray;
// Ensure the parent directory of the CSV file exists
if let Some(parent) = output_csv_path.parent() {
if parent.exists() && parent.is_file() {
println!("DEBUG: Removing conflicting file at output directory path: {:?}", parent);
tokio::fs::remove_file(parent).await.map_err(DocTreeError::IoError)?;
println!("DEBUG: Conflicting file removed.");
}
if !parent.is_dir() {
println!("DEBUG: Ensuring output directory exists: {:?}", parent);
tokio::fs::create_dir_all(parent).await.map_err(DocTreeError::IoError)?;
println!("DEBUG: Output directory ensured.");
} else {
println!("DEBUG: Output directory already exists: {:?}", parent);
}
}
// Create the CSV writer
println!("DEBUG: Creating or overwriting CSV file at {:?}", output_csv_path);
let file = std::fs::OpenOptions::new()
.write(true)
.create(true)
.truncate(true) // Add truncate option to overwrite if exists
.open(output_csv_path)
.map_err(DocTreeError::IoError)?;
let mut writer = Writer::from_writer(file);
println!("DEBUG: CSV writer created successfully");
// Write the CSV header
writer.write_record(&["collectionname", "filename", "blakehash", "ipfshash", "size"]).map_err(|e| DocTreeError::CsvError(e.to_string()))?;
// Connect to IPFS
// let ipfs = IpfsClient::new("127.0.0.1:5001").await.map_err(|e| DocTreeError::IpfsError(e.to_string()))?;
let ipfs = IpfsClient::default();
// Get the list of pages and files
let pages = self.page_list()?;
let files = self.file_list()?;
// Combine the lists
let mut entries = pages;
entries.extend(files);
println!("DEBUG: Starting to process collection entries for IPFS export");
for entry_name in entries {
println!("DEBUG: Processing entry: {}", entry_name);
// Get the relative path from Redis
let relative_path = self.storage.get_collection_entry(&self.name, &entry_name)
.map_err(|_| DocTreeError::FileNotFound(entry_name.clone()))?;
println!("DEBUG: Retrieved relative path: {}", relative_path);
let file_path = self.path.join(&relative_path);
// Read file content
let mut file = match File::open(&file_path).await {
Ok(file) => file,
Err(e) => {
eprintln!("Error opening file {:?}: {}", file_path, e);
continue;
}
};
let mut content = Vec::new();
let size = match file.read_to_end(&mut content).await {
Ok(size) => size,
Err(e) => {
eprintln!("Error reading file {:?}: {}", file_path, e);
continue;
}
};
// Calculate Blake3 hash
let mut hasher = Hasher::new();
hasher.update(&content);
let blake_hash = hasher.finalize();
let blake_hash_hex = blake_hash.to_hex().to_string();
// Use Blake3 hash as key for ChaCha20Poly1305
let key = blake_hash.as_bytes();
//let cipher = ChaCha20Poly1305::new_from_slice(&key[..32]).map_err(|_| DocTreeError::EncryptionError("Invalid key size".to_string()))?;
// Generate a random nonce
let mut nonce = [0u8; 12];
//OsRng.fill_bytes(&mut nonce);
// Encrypt the content
// let encrypted_content = match cipher.encrypt(GenericArray::from_slice(&nonce), content.as_ref()) {
// Ok(encrypted) => encrypted,
// Err(e) => {
// eprintln!("Error encrypting file {:?}: {}", file_path, e);
// continue;
// }
// };
// Add the content to IPFS (unencrypted while the encryption step above is commented out)
println!("DEBUG: Adding file to IPFS: {:?}", file_path);
let ipfs_path = match ipfs.add(std::io::Cursor::new(content)).await {
Ok(path) => {
println!("DEBUG: Successfully added file to IPFS. Hash: {}", path.hash);
path
},
Err(e) => {
eprintln!("Error adding file to IPFS {:?}: {}", file_path, e);
continue;
}
};
let ipfs_hash = ipfs_path.hash.to_string();
println!("DEBUG: IPFS hash: {}", ipfs_hash);
// Write record to CSV
println!("DEBUG: Writing CSV record for {:?}", file_path);
if let Err(e) = writer.write_record(&[
&self.name,
&relative_path,
&blake_hash_hex,
&ipfs_hash,
&size.to_string(),
]) {
eprintln!("Error writing CSV record for {:?}: {}", file_path, e);
continue;
}
println!("DEBUG: Successfully wrote CSV record for {:?}", file_path);
}
// Flush the CSV writer
println!("DEBUG: Flushing CSV writer");
writer.flush().map_err(|e| DocTreeError::CsvError(e.to_string()))?;
println!("DEBUG: CSV writer flushed successfully");
Ok(())
}
}
impl CollectionBuilder {
@@ -432,7 +609,7 @@ impl CollectionBuilder {
self.storage = Some(storage);
self
}
/// Build the Collection
///
/// # Returns
@@ -442,13 +619,13 @@ impl CollectionBuilder {
let storage = self.storage.ok_or_else(|| {
DocTreeError::MissingParameter("storage".to_string())
})?;
let collection = Collection {
path: self.path,
name: self.name,
storage,
};
Ok(collection)
}
}

View File

@@ -1,3 +1,4 @@
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};
@@ -38,6 +39,9 @@ pub struct DocTree {
/// Redis storage backend
storage: RedisStorage,
/// Name of the doctree (used as prefix for Redis keys)
pub doctree_name: String,
/// For backward compatibility
pub name: String,
@@ -56,6 +60,9 @@ pub struct DocTreeBuilder {
/// Redis storage backend
storage: Option<RedisStorage>,
/// Name of the doctree (used as prefix for Redis keys)
doctree_name: Option<String>,
/// For backward compatibility
name: Option<String>,
@@ -74,6 +81,7 @@ impl DocTree {
collections: HashMap::new(),
default_collection: None,
storage: None,
doctree_name: Some("default".to_string()),
name: None,
path: None,
}
@@ -92,8 +100,12 @@ impl DocTree {
pub fn add_collection<P: AsRef<Path>>(&mut self, path: P, name: &str) -> Result<&Collection> {
// Create a new collection
let namefixed = name_fix(name);
// Clone the storage and set the doctree name
let storage = self.storage.clone();
storage.set_doctree_name(&self.doctree_name);
let collection = Collection::builder(path, &namefixed)
.with_storage(self.storage.clone())
.with_storage(storage)
.build()?;
// Scan the collection
@@ -519,6 +531,56 @@ impl DocTree {
Ok(())
}
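/// Exports all collections to IPFS and generates one CSV manifest per collection.
/// File encryption is not applied yet; the cipher calls in `Collection::export_to_ipfs_async` are commented out.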
///
/// # Arguments
///
/// * `output_dir` - The directory to save the output CSV files.
///
/// # Returns
///
/// Ok(()) on success or an error.
pub async fn export_collections_to_ipfs<P: AsRef<Path>>(&self, output_dir: P) -> Result<()> {
use tokio::fs;
let output_dir = output_dir.as_ref();
// Create the output directory if it doesn't exist
fs::create_dir_all(output_dir).await.map_err(DocTreeError::IoError)?;
for (name, collection) in &self.collections {
let csv_file_path = output_dir.join(format!("{}.csv", name));
println!("DEBUG: Exporting collection '{}' to IPFS and generating CSV at {:?}", name, csv_file_path);
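// Await the async export directly: the blocking `export_to_ipfs` creates its own
// tokio runtime, which panics when called from inside a running runtime
if let Err(e) = collection.export_to_ipfs_async(&csv_file_path).await {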
eprintln!("Error exporting collection '{}': {}", name, e);
// Continue with the next collection
}
}
Ok(())
}
/// Exports a specific collection to IPFS synchronously and generates a CSV manifest.
/// File encryption is not applied yet; the cipher calls in `Collection::export_to_ipfs_async` are commented out.
///
/// # Arguments
///
/// * `collection_name` - The name of the collection to export.
/// * `output_csv_path` - The directory in which `<collection_name>.csv` is written.
///
/// # Returns
///
/// Ok(()) on success or an error.
pub fn export_collection_to_ipfs(&self, collection_name: &str, output_csv_path: &Path) -> Result<()> {
// Get the collection
let collection = self.get_collection(collection_name)?;
// Build the CSV path inside the given directory; `export_to_ipfs` creates its own
// tokio runtime and blocks on the async export
let csv_file_path = output_csv_path.join(format!("{}.csv", collection_name));
collection.export_to_ipfs(&csv_file_path)?;
Ok(())
}
}
impl DocTreeBuilder {
@@ -531,6 +593,20 @@ impl DocTreeBuilder {
/// # Returns
///
/// Self for method chaining
/// Set the doctree name
///
/// # Arguments
///
/// * `name` - Name of the doctree
///
/// # Returns
///
/// Self for method chaining
pub fn with_doctree_name(mut self, name: &str) -> Self {
self.doctree_name = Some(name.to_string());
self
}
pub fn with_storage(mut self, storage: RedisStorage) -> Self {
self.storage = Some(storage);
self
@@ -552,10 +628,18 @@ impl DocTreeBuilder {
DocTreeError::MissingParameter("storage".to_string())
})?;
// Get the doctree name
let doctree_name = self.doctree_name.clone().unwrap_or_else(|| "default".to_string());
// Create a new collection
let namefixed = name_fix(name);
// Clone the storage and set the doctree name
let storage_clone = storage.clone();
storage_clone.set_doctree_name(&doctree_name);
let collection = Collection::builder(path.as_ref(), &namefixed)
.with_storage(storage.clone())
.with_storage(storage_clone)
.build()?;
// Scan the collection
@@ -604,11 +688,19 @@ impl DocTreeBuilder {
DocTreeError::MissingParameter("storage".to_string())
})?;
// Get the doctree name
let doctree_name = self.doctree_name.clone().unwrap_or_else(|| "default".to_string());
// Clone the storage and set the doctree name
let storage_clone = storage.clone();
storage_clone.set_doctree_name(&doctree_name);
// Create a temporary DocTree to scan collections
let mut temp_doctree = DocTree {
collections: HashMap::new(),
default_collection: None,
storage: storage.clone(),
storage: storage_clone,
doctree_name: doctree_name,
name: self.name.clone().unwrap_or_default(),
path: self.path.clone().unwrap_or_else(|| PathBuf::from("")),
};
@@ -641,11 +733,19 @@ impl DocTreeBuilder {
DocTreeError::MissingParameter("storage".to_string())
})?;
// Get the doctree name
let doctree_name = self.doctree_name.unwrap_or_else(|| "default".to_string());
// Set the doctree name in the storage
let storage_clone = storage.clone();
storage_clone.set_doctree_name(&doctree_name);
// Create the DocTree
let mut doctree = DocTree {
collections: self.collections,
default_collection: self.default_collection,
storage: storage.clone(),
storage: storage_clone,
doctree_name,
name: self.name.unwrap_or_default(),
path: self.path.unwrap_or_else(|| PathBuf::from("")),
};
@@ -664,9 +764,6 @@ impl DocTreeBuilder {
}
/// Create a new DocTree instance
///
/// For backward compatibility, it also accepts path and name parameters
@@ -684,6 +781,12 @@ pub fn new<P: AsRef<Path>>(args: &[&str]) -> Result<DocTree> {
let mut builder = DocTree::builder().with_storage(storage);
// If the first argument is a doctree name, use it
if args.len() >= 1 && args[0].starts_with("--doctree=") {
let doctree_name = args[0].trim_start_matches("--doctree=");
builder = builder.with_doctree_name(doctree_name);
}
// For backward compatibility with existing code
if args.len() == 2 {
let path = args[0];
@@ -707,15 +810,20 @@ pub fn new<P: AsRef<Path>>(args: &[&str]) -> Result<DocTree> {
/// # Arguments
///
/// * `root_path` - The root path to scan for collections
/// * `doctree_name` - Optional name for the doctree (default: "default")
///
/// # Returns
///
/// A new DocTree or an error
pub fn from_directory<P: AsRef<Path>>(root_path: P) -> Result<DocTree> {
pub fn from_directory<P: AsRef<Path>>(root_path: P, doctree_name: Option<&str>) -> Result<DocTree> {
let storage = RedisStorage::new("redis://localhost:6379")?;
DocTree::builder()
.with_storage(storage)
.scan_collections(root_path)?
.build()
let mut builder = DocTree::builder().with_storage(storage);
// Set the doctree name if provided
if let Some(name) = doctree_name {
builder = builder.with_doctree_name(name);
}
builder.scan_collections(root_path)?.build()
}

View File

@@ -42,6 +42,18 @@ pub enum DocTreeError {
/// Redis error
#[error("Redis error: {0}")]
RedisError(String),
/// CSV error
#[error("CSV error: {0}")]
CsvError(String),
/// IPFS error
#[error("IPFS error: {0}")]
IpfsError(String),
/// Encryption error
#[error("Encryption error: {0}")]
EncryptionError(String),
}
/// Result type alias for doctree operations


@@ -4,35 +4,30 @@
//! and processing includes between documents.
// Import lazy_static for global state
extern crate lazy_static;
mod error;
mod storage;
mod utils;
mod collection;
mod doctree;
mod error;
mod include;
mod storage;
mod utils;
pub use error::{DocTreeError, Result};
pub use storage::RedisStorage;
pub use collection::{Collection, CollectionBuilder};
pub use doctree::{DocTree, DocTreeBuilder, new, from_directory};
pub use doctree::{DocTree, DocTreeBuilder, from_directory, new};
pub use error::{DocTreeError, Result};
pub use include::process_includes;
pub use storage::RedisStorage;
#[cfg(test)]
mod tests {
use super::*;
use std::path::Path;
#[test]
fn test_doctree_builder() {
// Create a storage instance
let storage = RedisStorage::new("dummy_url").unwrap();
let storage = RedisStorage::new("redis://localhost:6379").unwrap();
let doctree = DocTree::builder()
.with_storage(storage)
.build()
.unwrap();
let doctree = DocTree::builder().with_storage(storage).build().unwrap();
assert_eq!(doctree.collections.len(), 0);
assert_eq!(doctree.default_collection, None);


@@ -8,6 +8,10 @@ pub struct RedisStorage {
client: Client,
// Connection pool
connection: Arc<Mutex<Connection>>,
// Doctree name for key prefixing
doctree_name: Arc<Mutex<String>>,
// Debug mode flag
debug: Arc<Mutex<bool>>,
}
impl RedisStorage {
@@ -31,9 +35,51 @@ impl RedisStorage {
Ok(Self {
client,
connection: Arc::new(Mutex::new(connection)),
doctree_name: Arc::new(Mutex::new("default".to_string())),
debug: Arc::new(Mutex::new(false)),
})
}
/// Set the doctree name for key prefixing
///
/// # Arguments
///
/// * `name` - Doctree name
pub fn set_doctree_name(&self, name: &str) {
let mut doctree_name = self.doctree_name.lock().unwrap();
*doctree_name = name.to_string();
}
/// Set the debug mode
///
/// # Arguments
///
/// * `enable` - Whether to enable debug mode
pub fn set_debug(&self, enable: bool) {
let mut debug = self.debug.lock().unwrap();
*debug = enable;
}
/// Check if debug mode is enabled
///
/// # Returns
///
/// true if debug mode is enabled, false otherwise
fn is_debug_enabled(&self) -> bool {
let debug = self.debug.lock().unwrap();
*debug
}
/// Get the doctree name
///
/// # Returns
///
/// The doctree name
pub fn get_doctree_name(&self) -> String {
let doctree_name = self.doctree_name.lock().unwrap();
doctree_name.clone()
}
/// Store a collection entry
///
/// # Arguments
@@ -46,8 +92,12 @@ impl RedisStorage {
///
/// Ok(()) on success or an error
pub fn store_collection_entry(&self, collection: &str, key: &str, value: &str) -> Result<()> {
let redis_key = format!("collections:{}", collection);
println!("DEBUG: Redis operation - HSET {} {} {}", redis_key, key, value);
let doctree_name = self.get_doctree_name();
let redis_key = format!("{}:collections:{}", doctree_name, collection);
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - HSET {} {} {}", redis_key, key, value);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
@@ -59,8 +109,10 @@ impl RedisStorage {
.arg(value)
.execute(&mut *conn);
println!("DEBUG: Stored entry in Redis - collection: '{}', key: '{}', value: '{}'",
collection, key, value);
if self.is_debug_enabled() {
println!("DEBUG: Stored entry in Redis - collection: '{}', key: '{}', value: '{}'",
collection, key, value);
}
Ok(())
}
@@ -76,8 +128,12 @@ impl RedisStorage {
///
/// The entry value or an error
pub fn get_collection_entry(&self, collection: &str, key: &str) -> Result<String> {
let collection_key = format!("collections:{}", collection);
println!("DEBUG: Redis operation - HGET {} {}", collection_key, key);
let doctree_name = self.get_doctree_name();
let collection_key = format!("{}:collections:{}", doctree_name, collection);
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - HGET {} {}", collection_key, key);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
@@ -92,13 +148,17 @@ impl RedisStorage {
// Check if the entry exists
match result {
Some(value) => {
println!("DEBUG: Retrieved entry from Redis - collection: '{}', key: '{}', value: '{}'",
collection, key, value);
if self.is_debug_enabled() {
println!("DEBUG: Retrieved entry from Redis - collection: '{}', key: '{}', value: '{}'",
collection, key, value);
}
Ok(value)
},
None => {
println!("DEBUG: Entry not found in Redis - collection: '{}', key: '{}'",
collection, key);
if self.is_debug_enabled() {
println!("DEBUG: Entry not found in Redis - collection: '{}', key: '{}'",
collection, key);
}
Err(DocTreeError::FileNotFound(key.to_string()))
}
}
@@ -115,8 +175,12 @@ impl RedisStorage {
///
/// Ok(()) on success or an error
pub fn delete_collection_entry(&self, collection: &str, key: &str) -> Result<()> {
let collection_key = format!("collections:{}", collection);
println!("DEBUG: Redis operation - HDEL {} {}", collection_key, key);
let doctree_name = self.get_doctree_name();
let collection_key = format!("{}:collections:{}", doctree_name, collection);
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - HDEL {} {}", collection_key, key);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
@@ -137,8 +201,10 @@ impl RedisStorage {
.arg(key)
.execute(&mut *conn);
println!("DEBUG: Deleted entry from Redis - collection: '{}', key: '{}'",
collection, key);
if self.is_debug_enabled() {
println!("DEBUG: Deleted entry from Redis - collection: '{}', key: '{}'",
collection, key);
}
Ok(())
}
@@ -153,8 +219,12 @@ impl RedisStorage {
///
/// A vector of entry keys or an error
pub fn list_collection_entries(&self, collection: &str) -> Result<Vec<String>> {
let collection_key = format!("collections:{}", collection);
println!("DEBUG: Redis operation - HKEYS {}", collection_key);
let doctree_name = self.get_doctree_name();
let collection_key = format!("{}:collections:{}", doctree_name, collection);
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - HKEYS {}", collection_key);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
@@ -174,9 +244,11 @@ impl RedisStorage {
.arg(&collection_key)
.query(&mut *conn)
.map_err(|e| DocTreeError::RedisError(format!("Redis error: {}", e)))?;
if self.is_debug_enabled() {
println!("DEBUG: Listed {} entries from Redis - collection: '{}'",
keys.len(), collection);
}
println!("DEBUG: Listed {} entries from Redis - collection: '{}'",
keys.len(), collection);
Ok(keys)
}
@@ -191,8 +263,12 @@ impl RedisStorage {
///
/// Ok(()) on success or an error
pub fn delete_collection(&self, collection: &str) -> Result<()> {
let redis_key = format!("collections:{}", collection);
println!("DEBUG: Redis operation - DEL {}", redis_key);
let doctree_name = self.get_doctree_name();
let redis_key = format!("{}:collections:{}", doctree_name, collection);
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - DEL {}", redis_key);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
@@ -202,7 +278,9 @@ impl RedisStorage {
.arg(&redis_key)
.execute(&mut *conn);
println!("DEBUG: Deleted collection from Redis - collection: '{}'", collection);
if self.is_debug_enabled() {
println!("DEBUG: Deleted collection from Redis - collection: '{}'", collection);
}
Ok(())
}
@@ -217,8 +295,12 @@ impl RedisStorage {
///
/// true if the collection exists, false otherwise
pub fn collection_exists(&self, collection: &str) -> Result<bool> {
let collection_key = format!("collections:{}", collection);
println!("DEBUG: Redis operation - EXISTS {}", collection_key);
let doctree_name = self.get_doctree_name();
let collection_key = format!("{}:collections:{}", doctree_name, collection);
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - EXISTS {}", collection_key);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
@@ -229,8 +311,10 @@ impl RedisStorage {
.query(&mut *conn)
.map_err(|e| DocTreeError::RedisError(format!("Redis error: {}", e)))?;
println!("DEBUG: Collection exists check - collection: '{}', exists: {}",
collection, exists);
if self.is_debug_enabled() {
println!("DEBUG: Collection exists check - collection: '{}', exists: {}",
collection, exists);
}
Ok(exists)
}
@@ -241,29 +325,39 @@ impl RedisStorage {
///
/// A vector of collection names or an error
pub fn list_all_collections(&self) -> Result<Vec<String>> {
println!("DEBUG: Redis operation - KEYS collections:*");
let doctree_name = self.get_doctree_name();
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - KEYS {}:collections:*", doctree_name);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
// Get all collection keys
let pattern = format!("{}:collections:*", doctree_name);
let keys: Vec<String> = redis::cmd("KEYS")
.arg("collections:*")
.arg(&pattern)
.query(&mut *conn)
.map_err(|e| DocTreeError::RedisError(format!("Redis error: {}", e)))?;
// Extract collection names from keys (remove the "collections:" prefix)
// Extract collection names from keys (remove the "{doctree_name}:collections:" prefix)
let prefix = format!("{}:collections:", doctree_name);
let prefix_len = prefix.len();
let collections = keys.iter()
.filter_map(|key| {
if key.starts_with("collections:") {
Some(key[12..].to_string())
if key.starts_with(&prefix) && !key.ends_with(":path") {
Some(key[prefix_len..].to_string())
} else {
None
}
})
.collect();
println!("DEBUG: Found {} collections in Redis", keys.len());
if self.is_debug_enabled() {
println!("DEBUG: Found {} collections in Redis", keys.len());
}
Ok(collections)
}
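The key layout introduced by these hunks can be exercised without a Redis server. The sketch below is an illustration under assumptions: `collection_key` and `collection_names` are hypothetical free functions that only mirror the `format!` calls and the `:path` filter shown in the diff, and they operate on an in-memory list of keys instead of the result of a `KEYS` query.

```rust
/// Build the hash key for a collection, mirroring the `format!` calls in the diff.
fn collection_key(doctree: &str, collection: &str) -> String {
    format!("{}:collections:{}", doctree, collection)
}

/// Recover collection names from raw keys. Keys that end in `:path` hold the
/// collection's filesystem path and are skipped, as in `list_all_collections`.
fn collection_names(doctree: &str, keys: &[String]) -> Vec<String> {
    let prefix = format!("{}:collections:", doctree);
    keys.iter()
        .filter(|key| !key.ends_with(":path"))
        .filter_map(|key| key.strip_prefix(prefix.as_str()))
        .map(str::to_string)
        .collect()
}

fn main() {
    let keys = vec![
        collection_key("grid1", "manual"),
        format!("{}:path", collection_key("grid1", "manual")),
        collection_key("other", "notes"),
    ];
    println!("{:?}", collection_names("grid1", &keys));
}
```

One consequence of this scheme is that a collection whose own name ends in `:path` would be filtered out of the listing.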
@@ -274,30 +368,42 @@ impl RedisStorage {
///
/// Ok(()) on success or an error
pub fn delete_all_collections(&self) -> Result<()> {
println!("DEBUG: Redis operation - KEYS collections:*");
let doctree_name = self.get_doctree_name();
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - KEYS {}:collections:*", doctree_name);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
// Get all collection keys
let pattern = format!("{}:collections:*", doctree_name);
let keys: Vec<String> = redis::cmd("KEYS")
.arg("collections:*")
.arg(&pattern)
.query(&mut *conn)
.map_err(|e| DocTreeError::RedisError(format!("Redis error: {}", e)))?;
if self.is_debug_enabled() {
println!("DEBUG: Found {} collections in Redis", keys.len());
}
// Delete each collection
for key in keys {
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - DEL {}", key);
}
redis::cmd("DEL")
.arg(&key)
.execute(&mut *conn);
if self.is_debug_enabled() {
println!("DEBUG: Deleted collection from Redis - key: '{}'", key);
}
}
println!("DEBUG: Found {} collections in Redis", keys.len());
// Delete each collection
for key in keys {
println!("DEBUG: Redis operation - DEL {}", key);
redis::cmd("DEL")
.arg(&key)
.execute(&mut *conn);
println!("DEBUG: Deleted collection from Redis - key: '{}'", key);
}
Ok(())
}
/// Store a collection's path
///
/// # Arguments
@@ -309,8 +415,12 @@ impl RedisStorage {
///
/// Ok(()) on success or an error
pub fn store_collection_path(&self, collection: &str, path: &str) -> Result<()> {
let redis_key = format!("collections:{}:path", collection);
println!("DEBUG: Redis operation - SET {} {}", redis_key, path);
let doctree_name = self.get_doctree_name();
let redis_key = format!("{}:collections:{}:path", doctree_name, collection);
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - SET {} {}", redis_key, path);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
@@ -321,8 +431,10 @@ impl RedisStorage {
.arg(path)
.execute(&mut *conn);
println!("DEBUG: Stored collection path in Redis - collection: '{}', path: '{}'",
collection, path);
if self.is_debug_enabled() {
println!("DEBUG: Stored collection path in Redis - collection: '{}', path: '{}'",
collection, path);
}
Ok(())
}
@@ -337,8 +449,12 @@ impl RedisStorage {
///
/// The collection path or an error
pub fn get_collection_path(&self, collection: &str) -> Result<String> {
let redis_key = format!("collections:{}:path", collection);
println!("DEBUG: Redis operation - GET {}", redis_key);
let doctree_name = self.get_doctree_name();
let redis_key = format!("{}:collections:{}:path", doctree_name, collection);
if self.is_debug_enabled() {
println!("DEBUG: Redis operation - GET {}", redis_key);
}
// Get a connection from the pool
let mut conn = self.connection.lock().unwrap();
@@ -352,13 +468,17 @@ impl RedisStorage {
// Check if the path exists
match result {
Some(path) => {
println!("DEBUG: Retrieved collection path from Redis - collection: '{}', path: '{}'",
collection, path);
if self.is_debug_enabled() {
println!("DEBUG: Retrieved collection path from Redis - collection: '{}', path: '{}'",
collection, path);
}
Ok(path)
},
None => {
println!("DEBUG: Collection path not found in Redis - collection: '{}'",
collection);
if self.is_debug_enabled() {
println!("DEBUG: Collection path not found in Redis - collection: '{}'",
collection);
}
Err(DocTreeError::CollectionNotFound(collection.to_string()))
}
}
@@ -375,6 +495,8 @@ impl Clone for RedisStorage {
Self {
client: self.client.clone(),
connection: Arc::new(Mutex::new(connection)),
doctree_name: self.doctree_name.clone(),
debug: self.debug.clone(),
}
}
}
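Because `Clone` copies the `Arc` handles for `doctree_name` and `debug` rather than the values behind them, every clone of a storage observes later changes to either setting. The sketch below illustrates that sharing with a stand-in type; `NameHolder` is hypothetical and carries only the name field, with no Redis connection.

```rust
use std::sync::{Arc, Mutex};

/// Stand-in for RedisStorage holding only the shared name field.
/// Deriving Clone clones the Arc, so all clones point at the same String.
#[derive(Clone)]
struct NameHolder {
    doctree_name: Arc<Mutex<String>>,
}

impl NameHolder {
    fn new() -> Self {
        Self { doctree_name: Arc::new(Mutex::new("default".to_string())) }
    }

    fn set_doctree_name(&self, name: &str) {
        *self.doctree_name.lock().unwrap() = name.to_string();
    }

    fn get_doctree_name(&self) -> String {
        self.doctree_name.lock().unwrap().clone()
    }
}

fn main() {
    let original = NameHolder::new();
    let copy = original.clone();
    copy.set_doctree_name("grid1");
    // Both handles report the new name because the Arc is shared.
    println!("{} {}", original.get_doctree_name(), copy.get_doctree_name());
}
```

This is why calling `set_doctree_name` on `storage_clone` in the builder also changes the name seen through the original `storage` handle.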


@@ -1,5 +1,5 @@
use pulldown_cmark::{Parser, Options, html};
use sal::text;
use sal_text;
/// Fix a name to be used as a key
///
@@ -15,7 +15,7 @@ use sal::text;
/// The fixed name
pub fn name_fix(text: &str) -> String {
// Use the name_fix function from the SAL library
text::name_fix(text)
sal_text::name_fix(text)
}
/// Convert markdown to HTML


@@ -1,258 +0,0 @@
# Implementation Plan: DocTree Collection Scanner
## Overview
We need to expand the doctree library to:
1. Add a recursive scan function to the DocTree struct
2. Detect directories containing `.collection` files
3. Parse `.collection` files as TOML to extract collection names
4. Replace the current `name_fix` function with the one from the sal library
5. Populate collections with all files found under the collection directories
## Detailed Implementation Plan
### 1. Update Dependencies
First, we need to add the necessary dependencies to the Cargo.toml file:
```toml
[dependencies]
walkdir = "2.3.3"
pulldown-cmark = "0.9.3"
thiserror = "1.0.40"
lazy_static = "1.4.0"
toml = "0.7.3" # Add TOML parsing support
```
### 2. Replace the name_fix Function
Replace the current `name_fix` function in `utils.rs` with the one from the sal library:
```rust
pub fn name_fix(text: &str) -> String {
let mut result = String::with_capacity(text.len());
let mut last_was_underscore = false;
for c in text.chars() {
// Keep only ASCII characters
if c.is_ascii() {
// Replace specific characters with underscore
if c.is_whitespace() || c == ',' || c == '-' || c == '"' || c == '\'' ||
c == '#' || c == '!' || c == '(' || c == ')' || c == '[' || c == ']' ||
c == '=' || c == '+' || c == '<' || c == '>' || c == '@' || c == '$' ||
c == '%' || c == '^' || c == '&' || c == '*' {
// Only add underscore if the last character wasn't an underscore
if !last_was_underscore {
result.push('_');
last_was_underscore = true;
}
} else {
// Add the character as is (will be converted to lowercase later)
result.push(c);
last_was_underscore = false;
}
}
// Non-ASCII characters are simply skipped
}
// Convert to lowercase
result.to_lowercase()
}
```
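To see what the function does to typical file names, the following sketch restates it in condensed form and runs it on a few inputs. This is an illustration only: the `REPLACED` constant is introduced here for brevity and is an assumption, not how the sal library writes it. The character set and the underscore-collapsing logic are copied from the listing above.

```rust
/// Condensed restatement of the plan's `name_fix`, for experimentation only.
fn name_fix(text: &str) -> String {
    // Same punctuation set as the listing above, written as one string.
    const REPLACED: &str = ",-\"'#!()[]=+<>@$%^&*";
    let mut result = String::with_capacity(text.len());
    let mut last_was_underscore = false;
    // Non-ASCII characters are dropped before any other processing.
    for c in text.chars().filter(char::is_ascii) {
        if c.is_whitespace() || REPLACED.contains(c) {
            // Collapse runs of replaced characters into one underscore.
            if !last_was_underscore {
                result.push('_');
                last_was_underscore = true;
            }
        } else {
            result.push(c);
            last_was_underscore = false;
        }
    }
    result.to_lowercase()
}

fn main() {
    for input in ["Hello World", "My-File (v2).md", "Café #1"] {
        println!("{:?} -> {:?}", input, name_fix(input));
    }
}
```

Note that a closing parenthesis directly before the extension leaves a trailing underscore in the stem, so `My-File (v2).md` becomes `my_file_v2_.md`.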
### 3. Add Collection Configuration Struct
Create a new struct to represent the configuration found in `.collection` files:
```rust
use serde::Deserialize;

#[derive(Deserialize, Default)]
struct CollectionConfig {
name: Option<String>,
// Add other configuration options as needed
}
```
### 4. Add Scan Collections Method to DocTree
Add a new method to the DocTree struct to recursively scan directories for `.collection` files:
```rust
impl DocTree {
/// Recursively scan directories for .collection files and add them as collections
///
/// # Arguments
///
/// * `root_path` - The root path to start scanning from
///
/// # Returns
///
/// Ok(()) on success or an error
pub fn scan_collections<P: AsRef<Path>>(&mut self, root_path: P) -> Result<()> {
let root_path = root_path.as_ref();
// Walk through the directory tree
for entry in WalkDir::new(root_path).follow_links(true) {
let entry = match entry {
Ok(entry) => entry,
Err(e) => {
eprintln!("Error walking directory: {}", e);
continue;
}
};
// Skip non-directories
if !entry.file_type().is_dir() {
continue;
}
// Check if this directory contains a .collection file
let collection_file_path = entry.path().join(".collection");
if collection_file_path.exists() {
// Found a collection directory
let dir_path = entry.path();
// Get the directory name as a fallback collection name
let dir_name = dir_path.file_name()
.and_then(|name| name.to_str())
.unwrap_or("unnamed");
// Try to read and parse the .collection file
let collection_name = match fs::read_to_string(&collection_file_path) {
Ok(content) => {
// Parse as TOML
match toml::from_str::<CollectionConfig>(&content) {
Ok(config) => {
// Use the name from config if available, otherwise use directory name
config.name.unwrap_or_else(|| dir_name.to_string())
},
Err(e) => {
eprintln!("Error parsing .collection file at {:?}: {}", collection_file_path, e);
dir_name.to_string()
}
}
},
Err(e) => {
eprintln!("Error reading .collection file at {:?}: {}", collection_file_path, e);
dir_name.to_string()
}
};
// Add the collection to the DocTree
match self.add_collection(dir_path, &collection_name) {
Ok(_) => {
println!("Added collection '{}' from {:?}", collection_name, dir_path);
},
Err(e) => {
eprintln!("Error adding collection '{}' from {:?}: {}", collection_name, dir_path, e);
}
}
}
}
Ok(())
}
}
```
### 5. Update the DocTreeBuilder
Update the DocTreeBuilder to include a method for scanning collections:
```rust
impl DocTreeBuilder {
/// Scan for collections in the given root path
///
/// # Arguments
///
/// * `root_path` - The root path to scan for collections
///
/// # Returns
///
/// Self for method chaining or an error
pub fn scan_collections<P: AsRef<Path>>(self, root_path: P) -> Result<Self> {
// Ensure storage is set
let storage = self.storage.as_ref().ok_or_else(|| {
DocTreeError::MissingParameter("storage".to_string())
})?;
// Create a temporary DocTree to scan collections
let mut temp_doctree = DocTree {
collections: HashMap::new(),
default_collection: None,
storage: storage.clone(),
name: self.name.clone().unwrap_or_default(),
path: self.path.clone().unwrap_or_else(|| PathBuf::from("")),
};
// Scan for collections
temp_doctree.scan_collections(root_path)?;
// Create a new builder with the scanned collections
let mut new_builder = self;
for (name, collection) in temp_doctree.collections {
new_builder.collections.insert(name, collection);
}
Ok(new_builder)
}
}
```
### 6. Add a Convenience Function to the Library
Add a convenience function to the library for creating a DocTree by scanning a directory:
```rust
/// Create a new DocTree by scanning a directory for collections
///
/// # Arguments
///
/// * `root_path` - The root path to scan for collections
///
/// # Returns
///
/// A new DocTree or an error
pub fn from_directory<P: AsRef<Path>>(root_path: P) -> Result<DocTree> {
let storage = RedisStorage::new("redis://localhost:6379")?;
DocTree::builder()
.with_storage(storage)
.scan_collections(root_path)?
.build()
}
```
## Implementation Flow Diagram
```mermaid
flowchart TD
A[Start] --> B[Update Dependencies]
B --> C[Replace name_fix function]
C --> D[Add CollectionConfig struct]
D --> E[Add scan_collections method to DocTree]
E --> F[Update DocTreeBuilder]
F --> G[Add convenience function]
G --> H[End]
```
## Component Interaction Diagram
```mermaid
graph TD
A[DocTree] -->|manages| B[Collections]
C[scan_collections] -->|finds| D[.collection files]
D -->|parsed as| E[TOML]
E -->|extracts| F[Collection Name]
C -->|creates| B
G[name_fix] -->|processes| F
G -->|processes| H[File Names]
B -->|contains| H
```
## Testing Plan
1. Create test directories with `.collection` files in various formats
2. Test the scan_collections method with these directories
3. Verify that collections are created correctly with the expected names
4. Verify that all files under the collection directories are included in the collections
5. Test edge cases such as empty `.collection` files, invalid TOML, etc.
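
The first steps of this plan can be sketched with the standard library alone. The example below is a rough illustration, not the crate's test suite: `find_collection_dirs` and `scan_fixture` are hypothetical names, the scan uses `std::fs::read_dir` as a stand-in for the `WalkDir`-based scan, and it only detects `.collection` files without parsing them as TOML.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect directories that contain a `.collection` file.
/// std-only stand-in for the WalkDir-based scan described in the plan.
fn find_collection_dirs(root: &Path, found: &mut Vec<PathBuf>) -> io::Result<()> {
    if root.join(".collection").is_file() {
        found.push(root.to_path_buf());
    }
    for entry in fs::read_dir(root)? {
        let path = entry?.path();
        if path.is_dir() {
            find_collection_dirs(&path, found)?;
        }
    }
    Ok(())
}

/// Build a throwaway fixture tree, scan it, and return the sorted names of
/// the directories detected as collections.
fn scan_fixture() -> io::Result<Vec<String>> {
    let root = std::env::temp_dir()
        .join(format!("doctree_scan_sketch_{}", std::process::id()));
    let _ = fs::remove_dir_all(&root); // ignore leftovers from earlier runs
    fs::create_dir_all(root.join("docs/guide"))?;
    fs::create_dir_all(root.join("docs/empty"))?;
    fs::create_dir_all(root.join("misc"))?;
    fs::write(root.join("docs/guide/.collection"), "name = \"guide\"\n")?;
    fs::write(root.join("docs/empty/.collection"), "")?; // edge case: empty file
    fs::write(root.join("misc/readme.md"), "# no collection here\n")?;

    let mut found = Vec::new();
    find_collection_dirs(&root, &mut found)?;
    let mut names: Vec<String> = found
        .iter()
        .filter_map(|p| p.file_name().and_then(|n| n.to_str()).map(str::to_string))
        .collect();
    names.sort();
    fs::remove_dir_all(&root)?;
    Ok(names)
}

fn main() -> io::Result<()> {
    println!("{:?}", scan_fixture()?);
    Ok(())
}
```

A fuller test would additionally feed invalid TOML into a `.collection` file and check that the directory name is used as the fallback collection name, as step 5 requires.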


@@ -3,6 +3,10 @@ name = "doctreecmd"
version = "0.1.0"
edition = "2024"
[[bin]]
name = "doctree"
path = "src/main.rs"
[dependencies]
doctree = { path = "../doctree" }
clap = "3.2.25"


@@ -3,35 +3,32 @@ use doctree::{DocTree, RedisStorage, Result, from_directory};
use std::path::Path;
fn main() -> Result<()> {
let matches = App::new("DocTree CLI")
let matches = App::new("doctree")
.version("0.1.0")
.author("Your Name")
.about("A tool to manage document collections")
.arg(
Arg::with_name("debug")
.long("debug")
.help("Enable debug logging")
.takes_value(false)
)
.subcommand(
SubCommand::with_name("scan")
.about("Scan a directory and create a collection")
.about("Scan a directory for .collection files and create collections")
.arg(Arg::with_name("path").required(true).help("Path to the directory"))
.arg(Arg::with_name("name").required(true).help("Name of the collection")),
.arg(Arg::with_name("doctree").long("doctree").takes_value(true).help("Name of the doctree (default: 'default')")),
)
.subcommand(
SubCommand::with_name("list")
.about("List collections"),
)
.subcommand(
SubCommand::with_name("scan-collections")
.about("Recursively scan directories for .collection files")
.arg(Arg::with_name("path").required(true).help("Root path to scan for collections")),
)
.subcommand(
SubCommand::with_name("scan-and-info")
.about("Scan collections and show detailed information")
.arg(Arg::with_name("path").required(true).help("Root path to scan for collections"))
.arg(Arg::with_name("collection").help("Name of the collection (optional)")),
.about("List collections")
.arg(Arg::with_name("doctree").long("doctree").takes_value(true).help("Name of the doctree (default: 'default')")),
)
.subcommand(
SubCommand::with_name("info")
.about("Show detailed information about collections")
.arg(Arg::with_name("collection").help("Name of the collection (optional)")),
.arg(Arg::with_name("collection").help("Name of the collection (optional)"))
.arg(Arg::with_name("doctree").long("doctree").takes_value(true).help("Name of the doctree (default: 'default')")),
)
.subcommand(
SubCommand::with_name("get")
@@ -51,87 +48,58 @@ fn main() -> Result<()> {
.short('f')
.long("format")
.takes_value(true)
.help("Output format (html or markdown, default: markdown)")),
.help("Output format (html or markdown, default: markdown)"))
.arg(Arg::with_name("doctree").long("doctree").takes_value(true).help("Name of the doctree (default: 'default')")),
)
.subcommand(
SubCommand::with_name("html")
.about("Get page content as HTML")
.arg(Arg::with_name("collection").required(true).help("Name of the collection"))
.arg(Arg::with_name("page").required(true).help("Name of the page")),
.arg(Arg::with_name("page").required(true).help("Name of the page"))
.arg(Arg::with_name("doctree").long("doctree").takes_value(true).help("Name of the doctree (default: 'default')")),
)
.subcommand(
SubCommand::with_name("delete-collection")
SubCommand::with_name("delete")
.about("Delete a collection from Redis")
.arg(Arg::with_name("collection").required(true).help("Name of the collection")),
.arg(Arg::with_name("collection").required(true).help("Name of the collection"))
.arg(Arg::with_name("doctree").long("doctree").takes_value(true).help("Name of the doctree (default: 'default')")),
)
.subcommand(
SubCommand::with_name("reset")
.about("Delete all collections from Redis"),
.about("Delete all collections from Redis")
.arg(Arg::with_name("doctree").long("doctree").takes_value(true).help("Name of the doctree (default: 'default')")),
)
.subcommand(
SubCommand::with_name("export_to_ipfs")
.about("Export a collection to IPFS")
.arg(Arg::with_name("collection")
.short('c')
.long("collection")
.takes_value(true)
.required(false)
.help("Name of the collection (export all if not specified)"))
.arg(Arg::with_name("output").required(true).help("Output directory for IPFS export"))
.arg(Arg::with_name("doctree").long("doctree").takes_value(true).help("Name of the doctree (default: 'default')")),
)
.get_matches();
// Create a Redis storage instance
let storage = RedisStorage::new("redis://localhost:6379")?;
// Create a DocTree instance
let mut doctree = DocTree::builder()
.with_storage(storage)
.build()?;
// Check if debug mode is enabled
let debug_mode = matches.is_present("debug");
// Handle subcommands
if let Some(matches) = matches.subcommand_matches("scan") {
let path = matches.value_of("path").unwrap();
let name = matches.value_of("name").unwrap();
println!("Scanning directory: {}", path);
doctree.add_collection(Path::new(path), name)?;
println!("Collection '{}' created successfully", name);
} else if let Some(_) = matches.subcommand_matches("list") {
let collections = doctree.list_collections();
if collections.is_empty() {
println!("No collections found");
} else {
println!("Collections:");
for collection in collections {
println!("- {}", collection);
}
if debug_mode {
println!("DEBUG: Scanning path: {}", path);
}
} else if let Some(matches) = matches.subcommand_matches("get") {
let collection = matches.value_of("collection");
let page = matches.value_of("page").unwrap();
let format = matches.value_of("format").unwrap_or("markdown");
if format.to_lowercase() == "html" {
let html = doctree.page_get_html(collection, page)?;
println!("{}", html);
} else {
let content = doctree.page_get(collection, page)?;
println!("{}", content);
}
} else if let Some(matches) = matches.subcommand_matches("html") {
let collection = matches.value_of("collection").unwrap();
let page = matches.value_of("page").unwrap();
let html = doctree.page_get_html(Some(collection), page)?;
println!("{}", html);
} else if let Some(matches) = matches.subcommand_matches("delete-collection") {
let collection = matches.value_of("collection").unwrap();
println!("Deleting collection '{}' from Redis...", collection);
doctree.delete_collection(collection)?;
println!("Collection '{}' deleted successfully", collection);
} else if let Some(_) = matches.subcommand_matches("reset") {
println!("Deleting all collections from Redis...");
doctree.delete_all_collections()?;
println!("All collections deleted successfully");
} else if let Some(matches) = matches.subcommand_matches("scan-collections") {
let path = matches.value_of("path").unwrap();
let doctree_name = matches.value_of("doctree").unwrap_or("default");
println!("Recursively scanning for collections in: {}", path);
println!("Using doctree name: {}", doctree_name);
// Use the from_directory function to create a DocTree with all collections
let doctree = from_directory(Path::new(path))?;
let doctree = from_directory(Path::new(path), Some(doctree_name))?;
// Print the discovered collections
let collections = doctree.list_collections();
@@ -143,28 +111,130 @@ fn main() -> Result<()> {
println!("- {}", collection);
}
}
} else if let Some(matches) = matches.subcommand_matches("scan-and-info") {
let path = matches.value_of("path").unwrap();
} else if let Some(matches) = matches.subcommand_matches("list") {
let doctree_name = matches.value_of("doctree").unwrap_or("default");
if debug_mode {
println!("DEBUG: Listing collections for doctree: {}", doctree_name);
}
// Create a storage with the specified doctree name
let storage = RedisStorage::new("redis://localhost:6379")?;
storage.set_doctree_name(doctree_name);
storage.set_debug(debug_mode);
if debug_mode {
println!("DEBUG: Connected to Redis storage");
}
// Get collections directly from Redis to avoid debug output from DocTree
let collections = storage.list_all_collections()?;
if collections.is_empty() {
println!("No collections found in doctree '{}'", doctree_name);
} else {
println!("Collections in doctree '{}':", doctree_name);
for collection in collections {
println!("- {}", collection);
}
}
} else if let Some(matches) = matches.subcommand_matches("get") {
let collection = matches.value_of("collection");
let page = matches.value_of("page").unwrap();
let format = matches.value_of("format").unwrap_or("markdown");
let doctree_name = matches.value_of("doctree").unwrap_or("default");
if debug_mode {
println!("DEBUG: Getting page '{}' from collection '{}' in doctree '{}' with format '{}'",
page, collection.unwrap_or("(default)"), doctree_name, format);
}
// Create a storage with the specified doctree name
let storage = RedisStorage::new("redis://localhost:6379")?;
storage.set_doctree_name(doctree_name);
storage.set_debug(debug_mode);
if debug_mode {
println!("DEBUG: Connected to Redis storage");
}
// Create a DocTree with the specified doctree name
let mut doctree = DocTree::builder()
.with_storage(storage)
.with_doctree_name(doctree_name)
.build()?;
// Load collections from Redis
doctree.load_collections_from_redis()?;
if format.to_lowercase() == "html" {
let html = doctree.page_get_html(collection, page)?;
println!("{}", html);
} else {
let content = doctree.page_get(collection, page)?;
println!("{}", content);
}
} else if let Some(matches) = matches.subcommand_matches("html") {
let collection = matches.value_of("collection").unwrap();
let page = matches.value_of("page").unwrap();
let doctree_name = matches.value_of("doctree").unwrap_or("default");
if debug_mode {
println!("DEBUG: Getting HTML for page '{}' from collection '{}' in doctree '{}'",
page, collection, doctree_name);
}
// Create a storage with the specified doctree name
let storage = RedisStorage::new("redis://localhost:6379")?;
storage.set_doctree_name(doctree_name);
storage.set_debug(debug_mode);
if debug_mode {
println!("DEBUG: Connected to Redis storage");
}
// Create a DocTree with the specified doctree name
let mut doctree = DocTree::builder()
.with_storage(storage)
.with_doctree_name(doctree_name)
.build()?;
// Load collections from Redis
doctree.load_collections_from_redis()?;
let html = doctree.page_get_html(Some(collection), page)?;
println!("{}", html);
} else if let Some(matches) = matches.subcommand_matches("info") {
let doctree_name = matches.value_of("doctree").unwrap_or("default");
let collection_name = matches.value_of("collection");
println!("Recursively scanning for collections in: {}", path);
if debug_mode {
if let Some(name) = collection_name {
println!("DEBUG: Getting info for collection '{}' in doctree '{}'", name, doctree_name);
} else {
println!("DEBUG: Getting info for all collections in doctree '{}'", doctree_name);
}
}
// Create a storage with the specified doctree name
let storage = RedisStorage::new("redis://localhost:6379")?;
storage.set_doctree_name(doctree_name);
storage.set_debug(debug_mode);
// Use the from_directory function to create a DocTree with all collections
let doctree = from_directory(Path::new(path))?;
// Print the discovered collections
let collections = doctree.list_collections();
if collections.is_empty() {
println!("No collections found");
return Ok(());
if debug_mode {
println!("DEBUG: Connected to Redis storage");
}
println!("Discovered collections:");
for collection in &collections {
println!("- {}", collection);
}
println!("\nDetailed Collection Information:");
// Create a DocTree with the specified doctree name
let mut doctree = DocTree::builder()
.with_storage(storage)
.with_doctree_name(doctree_name)
.build()?;
// Load collections from Redis
doctree.load_collections_from_redis()?;
let collection_name = matches.value_of("collection");
if let Some(name) = collection_name {
// Show info for a specific collection
@@ -172,7 +242,7 @@ fn main() -> Result<()> {
Ok(collection) => {
println!("Collection Information for '{}':", name);
println!(" Path: {:?}", collection.path);
println!(" Redis Key: collections:{}", collection.name);
println!(" Redis Key: {}:collections:{}", doctree_name, collection.name);
// List documents
match collection.page_list() {
@@ -181,7 +251,7 @@ fn main() -> Result<()> {
for page in pages {
match collection.page_get_path(&page) {
Ok(path) => {
println!(" - {} => Redis: collections:{} / {}", path, collection.name, page);
println!(" - {}", path);
},
Err(_) => {
println!(" - {}", page);
@@ -206,7 +276,7 @@ fn main() -> Result<()> {
println!(" Images ({}):", images.len());
for image in images {
println!(" - {} => Redis: collections:{} / {}", image, collection.name, image);
println!(" - {}", image);
}
// Filter other files
@@ -220,97 +290,7 @@ fn main() -> Result<()> {
println!(" Other Files ({}):", other_files.len());
for file in other_files {
println!(" - {} => Redis: collections:{} / {}", file, collection.name, file);
}
},
Err(e) => println!(" Error listing files: {}", e),
}
},
Err(e) => println!("Error: {}", e),
}
} else {
// Show info for all collections
for name in collections {
if let Ok(collection) = doctree.get_collection(&name) {
println!("- {} (Redis Key: collections:{})", name, collection.name);
println!(" Path: {:?}", collection.path);
// Count documents and images
if let Ok(pages) = collection.page_list() {
println!(" Documents: {}", pages.len());
}
if let Ok(files) = collection.file_list() {
let image_count = files.iter()
.filter(|f|
f.ends_with(".png") || f.ends_with(".jpg") ||
f.ends_with(".jpeg") || f.ends_with(".gif") ||
f.ends_with(".svg"))
.count();
println!(" Images: {}", image_count);
println!(" Other Files: {}", files.len() - image_count);
}
}
}
}
} else if let Some(matches) = matches.subcommand_matches("info") {
let collection_name = matches.value_of("collection");
if let Some(name) = collection_name {
// Show info for a specific collection
match doctree.get_collection(name) {
Ok(collection) => {
println!("Collection Information for '{}':", name);
println!(" Path: {:?}", collection.path);
println!(" Redis Key: collections:{}", collection.name);
// List documents
match collection.page_list() {
Ok(pages) => {
println!(" Documents ({}):", pages.len());
for page in pages {
match collection.page_get_path(&page) {
Ok(path) => {
println!(" - {} => Redis: collections:{} / {}", path, collection.name, page);
},
Err(_) => {
println!(" - {}", page);
}
}
}
},
Err(e) => println!(" Error listing documents: {}", e),
}
// List files
match collection.file_list() {
Ok(files) => {
// Filter images
let images: Vec<String> = files.iter()
.filter(|f|
f.ends_with(".png") || f.ends_with(".jpg") ||
f.ends_with(".jpeg") || f.ends_with(".gif") ||
f.ends_with(".svg"))
.cloned()
.collect();
println!(" Images ({}):", images.len());
for image in images {
println!(" - {} => Redis: collections:{} / {}", image, collection.name, image);
}
// Filter other files
let other_files: Vec<String> = files.iter()
.filter(|f|
!f.ends_with(".png") && !f.ends_with(".jpg") &&
!f.ends_with(".jpeg") && !f.ends_with(".gif") &&
!f.ends_with(".svg"))
.cloned()
.collect();
println!(" Other Files ({}):", other_files.len());
for file in other_files {
println!(" - {} => Redis: collections:{} / {}", file, collection.name, file);
println!(" - {}", file);
}
},
Err(e) => println!(" Error listing files: {}", e),
@@ -324,10 +304,10 @@ fn main() -> Result<()> {
if collections.is_empty() {
println!("No collections found");
} else {
println!("Collections:");
println!("Collections in doctree '{}':", doctree_name);
for name in collections {
if let Ok(collection) = doctree.get_collection(&name) {
println!("- {} (Redis Key: collections:{})", name, collection.name);
println!("- {} (Redis Key: {}:collections:{})", name, doctree_name, collection.name);
println!(" Path: {:?}", collection.path);
// Count documents and images
@@ -337,9 +317,9 @@ fn main() -> Result<()> {
if let Ok(files) = collection.file_list() {
let image_count = files.iter()
.filter(|f|
f.ends_with(".png") || f.ends_with(".jpg") ||
f.ends_with(".jpeg") || f.ends_with(".gif") ||
.filter(|f|
f.ends_with(".png") || f.ends_with(".jpg") ||
f.ends_with(".jpeg") || f.ends_with(".gif") ||
f.ends_with(".svg"))
.count();
println!(" Images: {}", image_count);
@@ -349,9 +329,118 @@ fn main() -> Result<()> {
}
}
}
} else if let Some(matches) = matches.subcommand_matches("delete") {
let collection = matches.value_of("collection").unwrap();
let doctree_name = matches.value_of("doctree").unwrap_or("default");
if debug_mode {
println!("DEBUG: Deleting collection '{}' from doctree '{}'", collection, doctree_name);
}
// Create a storage with the specified doctree name
let storage = RedisStorage::new("redis://localhost:6379")?;
storage.set_doctree_name(doctree_name);
storage.set_debug(debug_mode);
if debug_mode {
println!("DEBUG: Connected to Redis storage");
}
// Create a DocTree with the specified doctree name
let mut doctree = DocTree::builder()
.with_storage(storage)
.with_doctree_name(doctree_name)
.build()?;
println!("Deleting collection '{}' from Redis in doctree '{}'...", collection, doctree_name);
doctree.delete_collection(collection)?;
println!("Collection '{}' deleted successfully", collection);
} else if let Some(matches) = matches.subcommand_matches("export_to_ipfs") {
let output_path_str = matches.value_of("output").unwrap();
let output_path = Path::new(output_path_str);
let doctree_name = matches.value_of("doctree").unwrap_or("default");
let collection_name_opt = matches.value_of("collection");
if debug_mode {
println!("DEBUG: Handling export_to_ipfs command.");
}
// Create a storage with the specified doctree name
let storage = RedisStorage::new("redis://localhost:6379")?;
storage.set_doctree_name(doctree_name);
storage.set_debug(debug_mode);
if debug_mode {
println!("DEBUG: Connected to Redis storage");
}
// Create a DocTree with the specified doctree name
let mut doctree = DocTree::builder()
.with_storage(storage)
.with_doctree_name(doctree_name)
.build()?;
// Load collections from Redis
doctree.load_collections_from_redis()?;
match collection_name_opt {
Some(collection_name) => {
// Export a specific collection
if debug_mode {
println!("DEBUG: Exporting specific collection '{}'", collection_name);
}
doctree.export_collection_to_ipfs(collection_name, output_path)?;
println!("Successfully exported collection '{}' to IPFS and generated metadata CSV at {:?}.", collection_name, output_path.join(format!("{}.csv", collection_name)));
}
None => {
// Export all collections
if debug_mode {
println!("DEBUG: Exporting all collections.");
}
let collections = doctree.list_collections();
if collections.is_empty() {
println!("No collections found to export.");
} else {
println!("Exporting the following collections:");
for collection_name in collections {
println!("- {}", collection_name);
if let Err(e) = doctree.export_collection_to_ipfs(&collection_name, output_path) {
eprintln!("Error exporting collection '{}': {}", collection_name, e);
} else {
println!("Successfully exported collection '{}' to IPFS and generated metadata CSV at {:?}.", collection_name, output_path.join(format!("{}.csv", collection_name)));
}
}
}
}
}
} else if let Some(matches) = matches.subcommand_matches("reset") {
let doctree_name = matches.value_of("doctree").unwrap_or("default");
if debug_mode {
println!("DEBUG: Resetting all collections in doctree '{}'", doctree_name);
}
// Create a storage with the specified doctree name
let storage = RedisStorage::new("redis://localhost:6379")?;
storage.set_doctree_name(doctree_name);
storage.set_debug(debug_mode);
if debug_mode {
println!("DEBUG: Connected to Redis storage");
}
// Create a DocTree with the specified doctree name
let mut doctree = DocTree::builder()
.with_storage(storage)
.with_doctree_name(doctree_name)
.build()?;
println!("Deleting all collections from Redis in doctree '{}'...", doctree_name);
doctree.delete_all_collections()?;
println!("All collections deleted successfully");
} else {
println!("No command specified. Use --help for usage information.");
}
Ok(())
}


@@ -7,19 +7,19 @@ set -e
cd doctreecmd
echo "=== Scanning Collections ==="
cargo run -- scan-collections ../examples
cargo run -- scan ../examples
echo -e "\n=== Listing Collections ==="
cargo run -- list
echo -e "\n=== Getting Document (Markdown) ==="
cargo run -- get -c grid_documentation -p introduction.md
cargo run -- get -c grid1 -p introduction.md
echo -e "\n=== Getting Document (HTML) ==="
cargo run -- get -c grid_documentation -p introduction.md -f html
cargo run -- get -c grid1 -p introduction.md -f html
echo -e "\n=== Deleting Collection ==="
cargo run -- delete-collection grid_documentation
cargo run -- delete grid1
echo -e "\n=== Listing Remaining Collections ==="
cargo run -- list


@@ -0,0 +1,24 @@
[
{
name: docs_hero
# An existing Docusaurus site can be used as a collection as long as there are no duplicates
url: https://git.ourworld.tf/tfgrid/docs_tfgrid4/src/branch/main/aibox/docs
description: Documentation for the ThreeFold Hero project.
}
{
name: biz
url: https://git.ourworld.tf/tfgrid/docs_tfgrid4/src/branch/main/aibox/collections/aaa
description: Business documentation.
}
{
name: products
url: https://git.ourworld.tf/tfgrid/docs_tfgrid4/src/branch/main/aibox/collections/vvv
description: Information about ThreeFold products.
}
{
scan: true
url: https://git.ourworld.tf/tfgrid/docs_tfgrid4/src/branch/main/aibox/collections
}
]


@@ -0,0 +1,33 @@
# Footer configuration for the site
title: "Explore More"
sections: [
{
title: "Pages"
links: [
{ label: "Home", href: "/" }
{ label: "About Us", href: "/about" }
{ label: "Contact", href: "/contact" }
{ label: "Blog", href: "/blog" }
]
}
{
title: "Resources"
links: [
{ label: "Docs", href: "/docs" }
{ label: "API", href: "/api" }
]
}
{
title: "Social"
links: [
{ label: "GitHub", href: "https://github.com/yourproject" }
{ label: "Twitter", href: "https://twitter.com/yourhandle" }
]
}
]
copyright: "© 2025 YourSite. All rights reserved."


@@ -0,0 +1,32 @@
# Site Branding
logo:
src: /img/logo.svg
alt: Site Logo
# Site Title
title: ThreeFold Hero
# Navigation Menu
menu: [
{
label: Home
link: /
}
{
label: Docs
link: /docs/
}
{
label: About
link: /about/
}
]
# Login Button
login:
visible: true
label: Login
link: /login/


@@ -0,0 +1,14 @@
# Site Main Info
title: ThreeFold Hero Docs
tagline: Your Personal Hero
favicon: img/favicon.png
url: https://threefold.info
# SEO / Social Metadata
metadata:
title: ThreeFold Hero Docs
description: ThreeFold Hero - Your Personal Hero
image: https://threefold.info/herodocs/img/tf_graph.png
# Copyright Notice
copyright: ThreeFold


@@ -0,0 +1,33 @@
[
{
name: home
title: Home Page
description: This is the main landing page.
navpath: /
collection: acollection
}
{
name: about
title: About Us
navpath: /about
collection: acollection
}
{
name: docs
title: Documentation
navpath: /sub/docs
collection: docs_hero
}
{
name: draft-page
title: Draft Page
description: This page is not shown in navigation.
draft: true
navpath: /cantsee
collection: acollection
}
]


@@ -1 +0,0 @@
name = "Grid Documentation"


@@ -0,0 +1,8 @@
# Include Example
This file demonstrates the include functionality of doctree.
## Including content from Introduction.md
!!include grid1:introduction.md

19
include_example.sh Executable file

@@ -0,0 +1,19 @@
#!/bin/bash
# Change to the directory where the script is located
cd "$(dirname "$0")"
# Exit immediately if a command exits with a non-zero status
set -e
cd doctreecmd
# First, scan the collections with a specific doctree name
echo "=== Scanning Collections with doctree name 'include_demo' ==="
cargo run -- scan ../examples --doctree include_demo
# List the collections
echo -e "\n=== Listing Collections ==="
cargo run -- list --doctree include_demo
# Get the document with includes in markdown format
echo -e "\n=== Getting Document with Includes (Markdown) ==="
cargo run -- get -c grid1 -p include_example.md --doctree include_demo


@@ -8,32 +8,32 @@ cd doctreecmd
# First, scan the collections
echo "=== Scanning Collections ==="
cargo run -- scan-and-info ../examples supercollection
cargo run -- scan ../examples --doctree supercollection
# Get a document in markdown format
echo -e "\n=== Getting Document (Markdown) ==="
cargo run -- get -c supercollection -p 01_features.md
cargo run -- get -c supercollection -p 01_features.md --doctree supercollection
# Get a document in HTML format
echo -e "\n=== Getting Document (HTML) ==="
cargo run -- get -c supercollection -p 01_features.md -f html
cargo run -- get -c supercollection -p 01_features.md -f html --doctree supercollection
# Get a document without specifying collection
echo -e "\n=== Getting Document (Default Collection) ==="
cargo run -- get -p 01_features.md
cargo run -- get -p 01_features.md --doctree supercollection
# Delete a specific collection
echo -e "\n=== Deleting Collection ==="
cargo run -- delete-collection grid_documentation
cargo run -- delete grid_documentation --doctree supercollection
# List remaining collections
echo -e "\n=== Listing Remaining Collections ==="
cargo run -- list
cargo run -- list --doctree supercollection
# Reset all collections
echo -e "\n=== Resetting All Collections ==="
cargo run -- reset
# # Reset all collections
# echo -e "\n=== Resetting All Collections ==="
# cargo run -- reset --doctree supercollection
# Verify all collections are gone
echo -e "\n=== Verifying Reset ==="
cargo run -- list
# # Verify all collections are gone
# echo -e "\n=== Verifying Reset ==="
# cargo run -- list --doctree supercollection

24
webbuilder/Cargo.toml Normal file

@@ -0,0 +1,24 @@
[package]
name = "doctree"
version = "0.1.0"
edition = "2024"
[lib]
path = "src/lib.rs"
[dependencies]
walkdir = "2.3.3"
pulldown-cmark = "0.9.3"
thiserror = "1.0.40"
lazy_static = "1.4.0"
toml = "0.7.3"
serde = { version = "1.0", features = ["derive"] }
redis = { version = "0.23.0", features = ["tokio-comp"] }
tokio = { version = "1.28.0", features = ["full"] }
sal = { git = "https://git.ourworld.tf/herocode/sal.git" }
chacha20poly1305 = "0.10.1"
blake3 = "1.3.1"
csv = "1.1"
rand = "0.9.1"
ipfs-api-backend-hyper = "0.6"
ipfs-api = { version = "0.17.0", default-features = false, features = ["with-hyper-tls"] }


@@ -0,0 +1,87 @@
# Web Builder Specification
This document describes the process of building web metadata and exporting assets for a website, resulting in a `webmeta.json` file that can be used by a browser-based website generator.
## Overview
The web building process starts with a directory containing the site's Hjson configuration files, such as the example directory `/Users/despiegk/code/git.ourworld.tf/herocode/doctree/examples/doctreenew/sites/demo1`. These Hjson files define the structure and content of the entire site and may reference external collections. The Hjson configuration sits "on top" of the collections it utilizes. Using the metadata defined in these Hjson files, the necessary collection data is downloaded from Git repositories (if referenced). The `doctree` is then used to process the relevant data, identify pages and images, and prepare them for export to IPFS. Finally, a `webmeta.json` file is generated containing all the necessary information, including IPFS keys and Blake hashes for content verification, allowing a browser-based tool to render the website by fetching assets from IPFS. Optionally, the generated `webmeta.json` file can also be uploaded to IPFS, and its IPFS URL returned.
## Process Steps
1. **Start from Hjson Directory:**
* The process begins with a designated directory containing the site's Hjson configuration files. This directory serves as the single input for the web building process.
2. **Parse Site Metadata (Hjson):**
* Locate and parse all `.hjson` files within the input directory and its subdirectories (e.g., `pages`). These files collectively define the site's structure, content, and configuration, and may include references to external collections.
3. **Download Referenced Collections from Git:**
* If the Hjson metadata references external collections hosted in Git repositories, download these collections using a separate tool or crate responsible for Git interactions. The Hjson files provide the necessary information (e.g., repository URLs, branch names) to perform these downloads.
4. **Process Site Content and Collections with Doctree:**
* Utilize the `doctree` library to process the parsed site metadata and the content of any downloaded collections.
* `doctree` will build the document tree based on the Hjson structure and identify relevant assets such as pages (e.g., Markdown files) and images referenced within the site configuration or collections.
5. **Export Assets to IPFS:**
* Export the identified assets (pages, images, etc.) to IPFS.
* For each exported asset, obtain its IPFS key (CID) and calculate its Blake hash for content integrity verification.
6. **Generate `webmeta.json`:**
* Create a single `webmeta.json` file that consolidates all the necessary information for the browser-based generator.
* This file should include:
* Site-level metadata (from Hjson).
* Structure of the website (pages, navigation, etc.).
* For each page, include:
* Page metadata (from Hjson).
* The IPFS key of the page content.
* The Blake hash of the page content.
* Information about other assets (images, etc.), including their IPFS keys.
7. **Optional: Upload `webmeta.json` to IPFS:**
* Optionally, upload the generated `webmeta.json` file to IPFS.
* If uploaded, the IPFS URL of the `webmeta.json` file is returned as the output of the web building process.
8. **Utilize `webmeta.json` in Browser:**
* The generated `webmeta.json` file (either read from local disk or fetched from IPFS) serves as the single configuration entry point for a browser-based website generator.
* The browser tool reads `webmeta.json`, uses the IPFS keys to fetch the content and assets from the IPFS network, and renders the website dynamically. The Blake hashes can be used to verify the integrity of the downloaded content.
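
Step 2 above (locating every `.hjson` file in the input directory and its subdirectories) can be sketched as follows. This is a minimal illustration that uses only the Rust standard library; the function name `find_hjson_files` is hypothetical and not part of `doctree`, and the sketch only discovers the files, it does not parse Hjson.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: collects every file under `root` whose extension is
/// `hjson`, descending into subdirectories such as `pages`.
/// The result is sorted so callers see a deterministic order.
pub fn find_hjson_files(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];

    // Iterative walk using an explicit stack instead of recursion.
    // Note: `is_dir` follows symlinks, so symlinked directories are entered too.
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                pending.push(path);
            } else if path.extension().and_then(|ext| ext.to_str()) == Some("hjson") {
                found.push(path);
            }
        }
    }

    found.sort();
    Ok(found)
}
```

A caller would pass the site directory, for example `find_hjson_files(Path::new("sites/demo1"))`, and hand each returned path to whatever Hjson parser the implementation uses.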
## `webmeta.json` Structure (Example)
```json
{
"site_metadata": {
// Consolidated data from site-level Hjson files (collection, header, footer, main, etc.)
"name": "demo1",
"title": "Demo Site 1",
"description": "This is a demo site for doctree",
"keywords": ["demo", "doctree", "example"],
"header": { ... },
"footer": { ... }
},
"pages": [
{
"id": "mypages1",
"title": "My Pages 1",
"ipfs_key": "Qm...", // IPFS key of the page content
"blakehash": "...", // Blake hash of the page content
"sections": [
{ "type": "text", "content": "..." } // Potentially include some inline content or structure
],
"assets": [
{
"name": "image1.png",
"ipfs_key": "Qm..." // IPFS key of an image used on the page
}
]
}
// Other pages...
],
"assets": {
// Global assets not tied to a specific page, e.g., CSS, global images
"style.css": {
"ipfs_key": "Qm..."
}
}
}
```
This structure is a suggestion and can be adapted based on the specific needs of the browser-based generator. The key is to include all necessary information (metadata, IPFS keys, hashes) to allow the browser to fetch and render the complete website.
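
As one possible in-memory counterpart to the suggested structure, the sketch below models the `pages` and `assets` parts with plain Rust types and shows how a consumer could list every IPFS key it needs to fetch. The type and method names (`WebMeta`, `Page`, `PageAsset`, `ipfs_keys`) are assumptions made for illustration, `site_metadata` and `sections` are omitted, and global assets are simplified to a name-to-key map. Serialization (for example with `serde`) is intentionally left out.

```rust
use std::collections::BTreeMap;

/// An asset referenced by a single page, such as an image.
#[derive(Debug, Clone, PartialEq)]
pub struct PageAsset {
    pub name: String,
    pub ipfs_key: String,
}

/// One entry of the `pages` array.
#[derive(Debug, Clone, PartialEq)]
pub struct Page {
    pub id: String,
    pub title: String,
    pub ipfs_key: String,
    pub blakehash: String,
    pub assets: Vec<PageAsset>,
}

/// Simplified top-level document (assumed shape, not the author's types).
#[derive(Debug, Clone, Default)]
pub struct WebMeta {
    pub pages: Vec<Page>,
    /// Global assets: file name mapped to its IPFS key.
    pub assets: BTreeMap<String, String>,
}

impl WebMeta {
    /// Returns every IPFS key referenced anywhere in the document,
    /// sorted and with duplicates removed.
    pub fn ipfs_keys(&self) -> Vec<&str> {
        let mut keys: Vec<&str> = Vec::new();
        for page in &self.pages {
            keys.push(page.ipfs_key.as_str());
            keys.extend(page.assets.iter().map(|a| a.ipfs_key.as_str()));
        }
        keys.extend(self.assets.values().map(|k| k.as_str()));
        keys.sort_unstable();
        keys.dedup();
        keys
    }
}
```

Deduplicating the keys means an asset shared by several pages is fetched from IPFS only once.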


@@ -0,0 +1,43 @@
{
"site_metadata": {
"name": "demo1",
"title": "Demo Site 1",
"description": "This is a demo site for doctree",
"keywords": ["demo", "doctree", "example"],
"header": {
"logo": "/images/logo.png",
"nav": [
{ "text": "Home", "url": "/" },
{ "text": "About", "url": "/about" }
]
},
"footer": {
"copyright": "© 2023 My Company",
"links": [
{ "text": "Privacy Policy", "url": "/privacy" }
]
}
},
"pages": [
{
"id": "mypages1",
"title": "My Pages 1",
"ipfs_key": "QmPlaceholderIpfsKey1",
"blakehash": "PlaceholderBlakeHash1",
"sections": [
{ "type": "text", "content": "This is example content for My Pages 1." }
],
"assets": [
{
"name": "image1.png",
"ipfs_key": "QmPlaceholderImageIpfsKey1"
}
]
}
],
"assets": {
"style.css": {
"ipfs_key": "QmPlaceholderCssIpfsKey1"
}
}
}