doctree/doctree_implementation_plan.md
2025-04-09 07:54:37 +02:00

Implementation Plan: DocTree Collection Scanner

Overview

We need to expand the doctree library to:

  1. Add a recursive scan function to the DocTree struct
  2. Detect directories containing .collection files
  3. Parse .collection files as TOML to extract collection names
  4. Replace the current name_fix function with the one from the sal library
  5. Populate collections with all files found under the collection directories

Detailed Implementation Plan

1. Update Dependencies

First, we need to add the necessary dependencies to the Cargo.toml file:

[dependencies]
walkdir = "2.3.3"
pulldown-cmark = "0.9.3"
thiserror = "1.0.40"
lazy_static = "1.4.0"
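toml = "0.7.3"  # Add TOML parsing support
serde = { version = "1.0", features = ["derive"] }  # Required for #[derive(Deserialize)] on CollectionConfig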

2. Replace the name_fix Function

Replace the current name_fix function in utils.rs with the one from the sal library:

pub fn name_fix(text: &str) -> String {
    let mut result = String::with_capacity(text.len());
    
    let mut last_was_underscore = false;
    for c in text.chars() {
        // Keep only ASCII characters
        if c.is_ascii() {
            // Replace specific characters with underscore
            if c.is_whitespace() || c == ',' || c == '-' || c == '"' || c == '\'' ||
               c == '#' || c == '!' || c == '(' || c == ')' || c == '[' || c == ']' ||
               c == '=' || c == '+' || c == '<' || c == '>' || c == '@' || c == '$' ||
               c == '%' || c == '^' || c == '&' || c == '*' {
                // Only add underscore if the last character wasn't an underscore
                if !last_was_underscore {
                    result.push('_');
                    last_was_underscore = true;
                }
            } else {
                // Add the character as is (will be converted to lowercase later)
                result.push(c);
                last_was_underscore = false;
            }
        }
        // Non-ASCII characters are simply skipped
    }
    
    // Convert to lowercase and return
    result.to_lowercase()
}
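
To make the normalization rules concrete, here is a condensed restatement of the function above together with a few inputs and the results that follow from those rules. The `matches!` form is an editorial choice and is assumed to be behavior-equivalent; it is not the sal source.

```rust
/// Condensed restatement of the `name_fix` rules shown above.
/// Assumed behavior-equivalent; the `matches!` form is not the sal source.
pub fn name_fix(text: &str) -> String {
    let mut result = String::with_capacity(text.len());
    let mut last_was_underscore = false;

    // Non-ASCII characters are dropped before any other rule applies.
    for c in text.chars().filter(|c| c.is_ascii()) {
        let replace = c.is_whitespace()
            || matches!(
                c,
                ',' | '-' | '"' | '\'' | '#' | '!' | '(' | ')' | '[' | ']'
                    | '=' | '+' | '<' | '>' | '@' | '$' | '%' | '^' | '&' | '*'
            );
        if replace {
            // Collapse a run of replaced characters into one underscore.
            if !last_was_underscore {
                result.push('_');
                last_was_underscore = true;
            }
        } else {
            result.push(c);
            last_was_underscore = false;
        }
    }
    result.to_lowercase()
}
```

With these rules, `"Hello World"` becomes `hello_world`, `"My-Doc (v2).md"` becomes `my_doc_v2_.md`, and `"Caf\u{e9}"` becomes `caf`. A literal underscore in the input is not in the replacement set, so it does not set the flag: `"a_ b"` becomes `a__b`.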

3. Add Collection Configuration Struct

Create a new struct to represent the configuration found in .collection files:

use serde::Deserialize;

#[derive(Deserialize, Default)]
struct CollectionConfig {
    name: Option<String>,
    // Add other configuration options as needed
}
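
Because `name` is optional, the scanner needs a fallback rule. Step 4 uses the directory name when the key is missing or when the file cannot be read or parsed. The sketch below isolates that decision using only the standard library. The `ParsedName` alias and the function name are hypothetical and exist only for illustration.

```rust
/// Outcome of reading and parsing a `.collection` file, reduced to the one
/// field the plan uses. `Err` carries a message for a read or parse failure.
/// (Hypothetical alias for illustration; not part of doctree.)
pub type ParsedName = Result<Option<String>, String>;

/// Pick the collection name: the configured name when present, otherwise
/// the directory name.
pub fn resolve_collection_name(parsed: ParsedName, dir_name: &str) -> String {
    match parsed {
        // The file parsed and contained a `name` key.
        Ok(Some(name)) => name,
        // The file parsed but had no `name` key (for example, an empty file).
        Ok(None) => dir_name.to_string(),
        // The file could not be read or was not valid TOML.
        Err(msg) => {
            eprintln!("falling back to directory name: {msg}");
            dir_name.to_string()
        }
    }
}
```

Keeping the rule in one small function would make the three cases easy to unit test without touching the file system.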

4. Add Scan Collections Method to DocTree

Add a new method to the DocTree struct to recursively scan directories for .collection files:

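// Imports needed by this method, if not already present in the module
use std::fs;
use std::path::Path;
use walkdir::WalkDir;

impl DocTree {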
    /// Recursively scan directories for .collection files and add them as collections
    ///
    /// # Arguments
    ///
    /// * `root_path` - The root path to start scanning from
    ///
    /// # Returns
    ///
    /// Ok(()) on success or an error
    pub fn scan_collections<P: AsRef<Path>>(&mut self, root_path: P) -> Result<()> {
        let root_path = root_path.as_ref();
        
        // Walk through the directory tree
        for entry in WalkDir::new(root_path).follow_links(true) {
            let entry = match entry {
                Ok(entry) => entry,
                Err(e) => {
                    eprintln!("Error walking directory: {}", e);
                    continue;
                }
            };
            
            // Skip non-directories
            if !entry.file_type().is_dir() {
                continue;
            }
            
            // Check if this directory contains a .collection file
            let collection_file_path = entry.path().join(".collection");
            // `is_file` (rather than `exists`) ignores a directory that happens to be named `.collection`
            if collection_file_path.is_file() {
                // Found a collection directory
                let dir_path = entry.path();
                
                // Get the directory name as a fallback collection name
                let dir_name = dir_path.file_name()
                    .and_then(|name| name.to_str())
                    .unwrap_or("unnamed");
                
                // Try to read and parse the .collection file
                let collection_name = match fs::read_to_string(&collection_file_path) {
                    Ok(content) => {
                        // Parse as TOML
                        match toml::from_str::<CollectionConfig>(&content) {
                            Ok(config) => {
                                // Use the name from config if available, otherwise use directory name
                                config.name.unwrap_or_else(|| dir_name.to_string())
                            },
                            Err(e) => {
                                eprintln!("Error parsing .collection file at {:?}: {}", collection_file_path, e);
                                dir_name.to_string()
                            }
                        }
                    },
                    Err(e) => {
                        eprintln!("Error reading .collection file at {:?}: {}", collection_file_path, e);
                        dir_name.to_string()
                    }
                };
                
                // Add the collection to the DocTree
                match self.add_collection(dir_path, &collection_name) {
                    Ok(_) => {
                        println!("Added collection '{}' from {:?}", collection_name, dir_path);
                    },
                    Err(e) => {
                        eprintln!("Error adding collection '{}' from {:?}: {}", collection_name, dir_path, e);
                    }
                }
            }
        }
        
        Ok(())
    }
}
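
The detection step can also be exercised on its own. The following is a minimal sketch that uses only `std::fs::read_dir` in place of `walkdir`, so it does not follow symlinked directories the way `follow_links(true)` does. The function name `find_collection_dirs` is an assumption for illustration.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect every directory under `root` (including `root`
/// itself) that contains a `.collection` marker file.
/// Uses `std::fs::read_dir` instead of `walkdir`, and does not descend
/// into symlinked directories.
pub fn find_collection_dirs(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];

    while let Some(dir) = pending.pop() {
        if dir.join(".collection").is_file() {
            found.push(dir.clone());
        }
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            // `DirEntry::file_type` does not traverse symlinks, so a
            // symlink to a directory is not queued.
            if entry.file_type()?.is_dir() {
                pending.push(entry.path());
            }
        }
    }

    // Sort for a deterministic order; `read_dir` order is unspecified.
    found.sort();
    Ok(found)
}
```

Like the loop in the plan, this sketch keeps descending into a directory after it has been recognized as a collection. A nested `.collection` file therefore yields a second collection whose files overlap with the parent's, which may be worth deciding on explicitly.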

5. Update the DocTreeBuilder

Update the DocTreeBuilder to include a method for scanning collections:

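// Imports needed by this method, if not already present in the module
use std::collections::HashMap;
use std::path::{Path, PathBuf};

impl DocTreeBuilder {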
    /// Scan for collections in the given root path
    ///
    /// # Arguments
    ///
    /// * `root_path` - The root path to scan for collections
    ///
    /// # Returns
    ///
    /// Self for method chaining or an error
    pub fn scan_collections<P: AsRef<Path>>(self, root_path: P) -> Result<Self> {
        // Ensure storage is set
        let storage = self.storage.as_ref().ok_or_else(|| {
            DocTreeError::MissingParameter("storage".to_string())
        })?;
        
        // Create a temporary DocTree to scan collections
        let mut temp_doctree = DocTree {
            collections: HashMap::new(),
            default_collection: None,
            storage: storage.clone(),
            name: self.name.clone().unwrap_or_default(),
            path: self.path.clone().unwrap_or_else(|| PathBuf::from("")),
        };
        
        // Scan for collections
        temp_doctree.scan_collections(root_path)?;
        
        // Create a new builder with the scanned collections
        let mut new_builder = self;
        for (name, collection) in temp_doctree.collections {
            new_builder.collections.insert(name, collection);
        }
        
        Ok(new_builder)
    }
}

6. Add a Convenience Function to the Library

Add a convenience function to the library for creating a DocTree by scanning a directory:

/// Create a new DocTree by scanning a directory for collections
///
/// # Arguments
///
/// * `root_path` - The root path to scan for collections
///
/// # Returns
///
/// A new DocTree or an error
pub fn from_directory<P: AsRef<Path>>(root_path: P) -> Result<DocTree> {
    // Connects to a Redis instance on localhost at the default port
    let storage = RedisStorage::new("redis://localhost:6379")?;
    
    DocTree::builder()
        .with_storage(storage)
        .scan_collections(root_path)?
        .build()
}

Implementation Flow Diagram

flowchart TD
    A[Start] --> B[Update Dependencies]
    B --> C[Replace name_fix function]
    C --> D[Add CollectionConfig struct]
    D --> E[Add scan_collections method to DocTree]
    E --> F[Update DocTreeBuilder]
    F --> G[Add convenience function]
    G --> H[End]

Component Interaction Diagram

graph TD
    A[DocTree] -->|manages| B[Collections]
    C[scan_collections] -->|finds| D[.collection files]
    D -->|parsed as| E[TOML]
    E -->|extracts| F[Collection Name]
    C -->|creates| B
    G[name_fix] -->|processes| F
    G -->|processes| H[File Names]
    B -->|contains| H

Testing Plan

  1. Create test directories with .collection files in various formats
  2. Test the scan_collections method with these directories
  3. Verify that collections are created correctly with the expected names
  4. Verify that all files under the collection directories are included in the collections
  5. Test edge cases: empty .collection files, invalid TOML, unreadable .collection files, and a .collection file nested inside another collection directory
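
Step 1 of the testing plan could start from a fixture helper along these lines. This is a sketch using only the standard library. All directory names, file names, and the `write_fixture` function itself are illustrative assumptions, not existing doctree test code.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Build a small fixture tree under `root` covering the cases in the list
/// above. All directory and file names here are illustrative.
pub fn write_fixture(root: &Path) -> io::Result<()> {
    // Collection with an explicit name and a file in a subdirectory.
    let named = root.join("named");
    fs::create_dir_all(named.join("sub"))?;
    fs::write(named.join(".collection"), "name = \"handbook\"\n")?;
    fs::write(named.join("intro.md"), "# Intro\n")?;
    fs::write(named.join("sub").join("deep.md"), "# Deep\n")?;

    // Empty `.collection`: expected to fall back to the directory name.
    let empty = root.join("empty_config");
    fs::create_dir_all(&empty)?;
    fs::write(empty.join(".collection"), "")?;

    // Invalid TOML: expected to log an error and fall back as well.
    let broken = root.join("broken_config");
    fs::create_dir_all(&broken)?;
    fs::write(broken.join(".collection"), "name = = oops")?;

    // Plain directory without a marker: must not become a collection.
    fs::create_dir_all(root.join("not_a_collection"))?;
    Ok(())
}
```

A test would call `write_fixture` on a fresh temporary directory, run `scan_collections` on it, and then check the resulting collection names and file lists against the layout above.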