
@have/content: AI-Powered Content Processing

The @have/content module provides content processing with built-in AI analysis. It is built as a SMRT module, so it plugs directly into the framework's database persistence and type safety.

Overview

@have/content handles structured content management with AI-powered analysis:

  • 📄 Content Management: Unified interface for documents and text content
  • 🤖 AI Integration: Built-in content analysis with do() and is() methods
  • 🔗 Reference System: Link and organize related content pieces
  • 📊 Collection Operations: Batch processing and content organization
  • 🗄️ SMRT Integration: Automatic database persistence and type safety

Core Classes

Content Class

The main content representation with AI capabilities:

import { Content } from '@have/content';
import { getAIClient } from '@have/ai';

// Create content with AI capabilities
const content = new Content({
title: 'Research Paper on AI Trends',
body: 'Large language models have revolutionized...',
type: 'research',
source: 'web_scraping',
status: 'published',
ai: await getAIClient({ provider: 'openai' })
});

// Use built-in AI methods
const summary = await content.do('Create a 100-word summary');
const isAcademic = await content.is('written in academic style');
const keyTopics = await content.do('Extract the 5 main topics as JSON array');

// Save to database (inherited from SMRT BaseObject)
await content.save();
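
Prompts such as "Extract the 5 main topics as JSON array" ask the model for JSON, but the reply may arrive with extra prose around it. The sketch below shows one way to recover the JSON defensively. It assumes do() resolves to a plain string; parseJsonReply is a hypothetical helper, not part of @have/content.

```typescript
// Hypothetical helper (not part of @have/content).
// Assumes the AI reply is a string that contains one JSON array or object,
// possibly surrounded by extra prose.
function parseJsonReply<T>(raw: string, fallback: T): T {
  // Locate the outermost JSON-looking span in the reply.
  const start = raw.search(/[\[{]/);
  const end = Math.max(raw.lastIndexOf(']'), raw.lastIndexOf('}'));
  if (start === -1 || end <= start) return fallback;
  try {
    return JSON.parse(raw.slice(start, end + 1)) as T;
  } catch {
    // Malformed JSON: fall back instead of throwing.
    return fallback;
  }
}
```

With this in place, the topics example could read `parseJsonReply<string[]>(keyTopics, [])`, which yields an empty array when the reply cannot be parsed.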

Contents Collection

Manage collections of content with batch operations:

import { Contents } from '@have/content';

const contents = new Contents({});

// Add content to collection
await contents.add(content);

// Query and filter
const researchContent = await contents.list({
where: {
type: 'research',
status: 'published',
'created_at >': new Date('2024-01-01')
},
limit: 50
});

// Batch operations
await contents.updateWhere(
{ status: 'draft' },
{ status: 'review' }
);
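
The where keys above combine a field name with an optional operator, as in 'created_at >'. To make that convention concrete, here is an illustrative sketch of how such keys could be split into field, operator, and value. It is not known to be how @have/content parses filters; splitWhere and Condition are hypothetical names.

```typescript
// Illustration only: not the library's actual filter parser.
interface Condition {
  field: string;
  operator: string;
  value: unknown;
}

function splitWhere(where: Record<string, unknown>): Condition[] {
  return Object.entries(where).map(([key, value]) => {
    const trimmed = key.trim();
    const space = trimmed.indexOf(' ');
    // A bare key such as "status" is treated as an equality test.
    if (space === -1) return { field: trimmed, operator: '=', value };
    return {
      field: trimmed.slice(0, space),
      operator: trimmed.slice(space + 1).trim(),
      value,
    };
  });
}
```

Under this reading, `{ status: 'published', 'created_at >': date }` becomes an equality condition on status and a greater-than condition on created_at.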

Content Properties

Core Metadata

interface ContentOptions {
type?: string | null; // Content classification
fileKey?: string | null; // File storage reference
author?: string | null; // Content author
title?: string; // Content title
description?: string | null; // Short summary
body?: string; // Main content text
publish_date?: Date | null; // Publication date
url?: string | null; // Source URL
source?: string | null; // Origin identifier
status?: 'published' | 'draft' | 'archived' | 'deleted';
state?: 'deprecated' | 'active' | 'highlighted';
}

AI-Powered Methods

Content Analysis:

// Analyze content characteristics
const isFactual = await content.is('presenting factual information');
const isBiased = await content.is('showing clear political bias');
const isRecent = await content.is('discussing events from 2024');

// Extract structured information
const entities = await content.do('Extract all named entities as JSON');
const sentiment = await content.do('Analyze sentiment: positive, negative, or neutral');
const readingLevel = await content.do('Assess reading level: elementary, middle, high, college');

Content Transformation:

// Generate derivatives
const abstract = await content.do('Create a 150-word academic abstract');
const bullets = await content.do('Convert to bullet points highlighting key facts');
const outline = await content.do('Create a hierarchical outline structure');

// Format conversion
const markdown = await content.do('Convert to clean Markdown format');
const slides = await content.do('Create 5 presentation slides from this content');

Quality Assessment:

// Content quality checks
const hasEvidence = await content.is('supported by evidence and citations');
const isComplete = await content.is('providing complete information on the topic');
const needsUpdate = await content.is('containing outdated information');

// Credibility analysis
const credibilityScore = await content.do(
'Rate credibility 1-10 based on sources, citations, and objectivity'
);
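
A rating prompt returns text rather than a number, so the reply usually needs validating before it is stored or compared. The following is a minimal sketch, assuming do() resolves to a string; parseScore is a hypothetical helper.

```typescript
// Hypothetical helper (not part of @have/content).
// Reads the first number in the reply and checks it against the expected range.
function parseScore(reply: string, min = 1, max = 10): number | null {
  const match = reply.match(/\d+(?:\.\d+)?/);
  if (!match) return null;
  const value = Number(match[0]);
  return value >= min && value <= max ? value : null;
}
```

Because only the first number is read, a reply that restates the scale ("on a scale of 1-10 ...") would be misread. Asking the model to answer with the number alone keeps this reliable.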

Reference System

Link and organize related content:

// Add references to other content
await content.addReference(relatedContent);
await content.addReference('https://example.com/source');

// Load reference network
await content.loadReferences();
const references = await content.getReferences();

// Reference analysis
const citationNetwork = await content.do(
'Analyze how this content relates to its references'
);

Real-World Examples

Web Content Analysis

import { Content } from '@have/content';
import { getAIClient } from '@have/ai';
import { WebScraperTool } from '@have/spider';

async function analyzeWebContent(url: string): Promise<Content> {
const scraper = new WebScraperTool();
const rawContent = await scraper.extractContent(url);

const content = new Content({
url: url,
title: rawContent.title,
body: rawContent.text,
source: 'web_scraping',
status: 'published',
ai: await getAIClient({ provider: 'openai' })
});

// AI-powered categorization
const category = await content.do(
'Categorize as: news, research, opinion, marketing, documentation'
);
content.type = category;

// Quality assessment
const quality = await content.do(
'Rate content quality 1-10 considering accuracy, completeness, and clarity'
);

// Extract metadata
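const keyPointsJson = await content.do(
'Extract 5-7 key points as JSON array'
);
// do() returns text, so parse the JSON before treating it as an array
const keyPoints: string[] = JSON.parse(keyPointsJson);

content.description = `Quality: ${quality}/10. Key insights: ${keyPoints.slice(0, 2).join(', ')}`;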

await content.save();
return content;
}
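
The categorization step assigns the model's reply directly to content.type, but a reply such as "Category: Research." would not match any of the requested labels exactly. One possible safeguard is to map the reply back onto the allowed list. This is a sketch assuming do() resolves to a string; pickLabel is a hypothetical helper.

```typescript
// Hypothetical helper (not part of @have/content).
// Returns the allowed label that appears earliest in the reply, or null.
function pickLabel<T extends string>(reply: string, labels: readonly T[]): T | null {
  const normalized = reply.toLowerCase();
  let best: T | null = null;
  let bestIndex = Infinity;
  for (const label of labels) {
    const index = normalized.indexOf(label.toLowerCase());
    if (index !== -1 && index < bestIndex) {
      best = label;
      bestIndex = index;
    }
  }
  return best;
}

const CATEGORIES = ['news', 'research', 'opinion', 'marketing', 'documentation'] as const;
```

For example, `content.type = pickLabel(category, CATEGORIES) ?? 'unknown'` keeps free-form replies out of the type field; the 'unknown' fallback is likewise an assumption.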

Document Processing Pipeline

import path from 'node:path';
import { Content, Contents } from '@have/content';
import { getAIClient } from '@have/ai';
import { PDFProcessor } from '@have/pdf';

class DocumentProcessor {
private contents: Contents;

constructor() {
this.contents = new Contents({});
}

async processDocument(filePath: string): Promise<Content> {
// Extract text from PDF
const pdf = new PDFProcessor();
const extractedText = await pdf.extractText(filePath);

const content = new Content({
title: path.basename(filePath, '.pdf'),
body: extractedText,
type: 'document',
fileKey: filePath,
source: 'pdf_upload',
status: 'published',
ai: await getAIClient({ provider: 'openai' })
});

// Document classification
const docType = await content.do(
'Classify document type: contract, report, manual, research, legal, financial'
);
content.type = docType;

// Extract structured data
const metadata = await content.do(`
Extract document metadata as JSON:
{
"documentDate": "YYYY-MM-DD or null",
"authors": ["author1", "author2"],
"keywords": ["keyword1", "keyword2"],
"summary": "brief summary",
"pageCount": number
}
`);

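// do() returns text; JSON.parse throws if the model returns malformed JSON
const parsedMetadata = JSON.parse(metadata);
content.description = parsedMetadata.summary;
// Tolerate a missing authors field
content.author = (parsedMetadata.authors ?? []).join(', ');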

// Content analysis
const hasPersonalInfo = await content.is('containing personal or sensitive information');
if (hasPersonalInfo) {
content.state = 'deprecated'; // Flag for review
}

await content.save();
return content;
}

async generateReport(query: string): Promise<string> {
// Find relevant content
const relevantContent = await this.contents.list({
where: {
'body LIKE': `%${query}%`,
status: 'published',
state: 'active'
}
});

// Synthesize findings
const combinedText = relevantContent.map(c =>
`Title: ${c.title}\nContent: ${c.body.substring(0, 1000)}`
).join('\n\n---\n\n');

const synthesis = new Content({
title: `Analysis Report: ${query}`,
type: 'synthesis',
source: 'analysis',
status: 'draft',
ai: await getAIClient({ provider: 'openai' })
});

const report = await synthesis.do(`
Create a comprehensive analysis report based on these documents about "${query}":

${combinedText}

Structure the report with:
1. Executive Summary
2. Key Findings
3. Supporting Evidence
4. Conclusions
5. Recommendations
`);

synthesis.body = report;
await synthesis.save();

return report;
}
}
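
generateReport interpolates the user's query straight into a LIKE pattern, so a query containing % or _ would act as a wildcard. A small escaping step is one way to avoid that. This is a sketch under assumptions: escapeLike is a hypothetical helper, and whether backslash is honored as the escape character depends on the database (some require an explicit ESCAPE clause).

```typescript
// Hypothetical helper (not part of @have/content).
// Prefixes the LIKE wildcards (% and _) and the backslash itself with a backslash.
function escapeLike(term: string): string {
  return term.replace(/[\\%_]/g, (ch) => `\\${ch}`);
}
```

The filter would then be written as `'body LIKE': '%' + escapeLike(query) + '%'`, so only the surrounding percent signs act as wildcards.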

Content Synthesis and Research

import { Content, Contents } from '@have/content';
import { getAIClient } from '@have/ai';

class ResearchSynthesizer {
private contents: Contents;

constructor() {
this.contents = new Contents({});
}

async synthesizeResearch(topic: string): Promise<Content> {
// Gather related content
const relatedContent = await this.contents.list({
where: {
'body LIKE': `%${topic}%`,
status: 'published',
state: 'active'
},
orderBy: 'created_at DESC',
limit: 20
});

// Create synthesis content
const synthesis = new Content({
title: `Research Synthesis: ${topic}`,
type: 'synthesis',
source: 'ai_analysis',
status: 'draft',
ai: await getAIClient({ provider: 'openai' })
});

// Combine content for analysis
const sourceText = relatedContent.map(c => ({
title: c.title,
content: c.body.substring(0, 2000),
source: c.source,
date: c.created_at
}));

const analysis = await synthesis.do(`
Analyze these research sources about "${topic}" and create a comprehensive synthesis:

${JSON.stringify(sourceText, null, 2)}

Provide:
1. Thematic analysis of common patterns
2. Identification of conflicting viewpoints
3. Assessment of evidence quality
4. Gap analysis for further research
5. Actionable insights and implications
`);

synthesis.body = analysis;

// Add references to source content
for (const source of relatedContent) {
await synthesis.addReference(source);
}

// Generate quality metadata
const confidence = await synthesis.do(
'Rate confidence in this synthesis 1-10 based on source quality and coverage'
);

synthesis.description = `Synthesis of ${relatedContent.length} sources. Confidence: ${confidence}/10`;

await synthesis.save();
return synthesis;
}

// do() resolves to text, so the timeline is returned as a single string
async trackContentEvolution(contentId: string): Promise<string> {
const content = await this.contents.get(contentId);
if (!content) return '';

return await content.do(`
Trace how understanding of this topic has evolved by analyzing:
1. Publication dates and chronological progression
2. Changes in methodology or perspective
3. Evolution of terminology or concepts
4. Shifts in consensus or debate

Return as timeline of key developments.
`);
}
}

Integration with SMRT Framework

SMRT BaseObject Features

Content inherits all SMRT capabilities:

// Database persistence
await content.save();
await content.delete();

// Relationships
const relatedContent = await content.getRelated('type');

// Validation
const isValid = await content.validate();

// Lifecycle hooks
content.beforeSave = async () => {
// Auto-generate slug from title.
// Arrow functions do not bind `this`, so refer to the instance by name.
if (!content.slug && content.title) {
content.slug = await content.getSlug();
}
};

Collection Features

const contents = new Contents({});

// CRUD operations
await contents.create({ title: 'New Content' });
const content = await contents.get(contentId);
await contents.update(contentId, { status: 'published' });
await contents.delete(contentId);

// Querying
const results = await contents.list({
where: { type: 'research' },
orderBy: 'created_at DESC',
limit: 10
});

// Aggregation
const stats = await contents.aggregate([
{ $group: { _id: '$type', count: { $sum: 1 } } }
]);

Performance and Best Practices

Optimization Strategies

Content Chunking:

// For large content, process in chunks
async function processLargeContent(content: Content): Promise<void> {
// [\s\S] also matches newlines, which `.` would skip and silently drop
const chunks = content.body.match(/[\s\S]{1,2000}/g) || [];

const analyses = await Promise.all(
chunks.map(chunk => content.do(`Analyze this section: ${chunk}`))
);

const summary = await content.do(
`Synthesize these section analyses: ${analyses.join('\n\n')}`
);

content.description = summary;
}
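
Fixed-width slicing can cut a sentence or paragraph in half, which may hurt per-section analysis. As an alternative, the sketch below packs whole paragraphs into chunks and only hard-splits a paragraph that exceeds the limit on its own. splitIntoChunks is a hypothetical helper, and the blank-line paragraph convention is an assumption about the content.

```typescript
// Hypothetical helper (not part of @have/content).
// Packs paragraphs (separated by blank lines) into chunks of at most maxLength characters.
function splitIntoChunks(text: string, maxLength = 2000): string[] {
  if (!Number.isInteger(maxLength) || maxLength < 1) {
    throw new RangeError('maxLength must be a positive integer');
  }
  const hardSplit = new RegExp(`[\\s\\S]{1,${maxLength}}`, 'g');
  const chunks: string[] = [];
  let current = '';
  for (const paragraph of text.split(/\n{2,}/)) {
    // A paragraph longer than the limit is cut into fixed-size pieces.
    const pieces = paragraph.match(hardSplit) ?? [];
    for (const piece of pieces) {
      const candidate = current ? `${current}\n\n${piece}` : piece;
      if (candidate.length <= maxLength) {
        current = candidate;
      } else {
        chunks.push(current);
        current = piece;
      }
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

In processLargeContent, `splitIntoChunks(content.body)` could stand in for the regular-expression match while keeping the same 2000-character ceiling.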

Batch Processing:

// Process multiple content pieces efficiently
async function batchAnalyze(contents: Content[]): Promise<void> {
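// chunk() is an array-splitting helper that is not defined in this snippet (lodash's chunk would work)
const batches = chunk(contents, 10); // Process 10 at a time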

for (const batch of batches) {
await Promise.all(
batch.map(async content => {
const category = await content.do('Categorize this content');
content.type = category;
return content.save();
})
);
}
}
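
The batch example calls a chunk helper that the snippet does not define. If you would rather not pull in a utility library, a minimal version could look like the following; this is a sketch, and the name simply mirrors the call in the example.

```typescript
// Minimal stand-in for the chunk() call used in batchAnalyze.
// Splits an array into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Batches are processed one after another, so the batch size caps how many AI requests are in flight at once.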

Caching Strategies

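// Cache expensive AI operations.
// hashString is a string-hashing helper that is not defined in this snippet.
// In TypeScript, cachedDo also needs a matching declaration on the Content type.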
const analysisCache = new Map<string, string>();

Content.prototype.cachedDo = async function(prompt: string): Promise<string> {
const cacheKey = `${this.id}-${hashString(prompt)}`;

if (analysisCache.has(cacheKey)) {
return analysisCache.get(cacheKey)!;
}

const result = await this.do(prompt);
analysisCache.set(cacheKey, result);
return result;
};
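
The cache key relies on a hashString function that the snippet leaves undefined. One possible implementation, assuming a Node.js runtime, uses the built-in crypto module; the choice of SHA-256 is an assumption and not something the Content module prescribes.

```typescript
import { createHash } from 'node:crypto';

// One possible hashString for the cache key above (an assumption, not the library's helper).
// Returns the SHA-256 digest of the prompt as a 64-character hex string.
function hashString(input: string): string {
  return createHash('sha256').update(input, 'utf8').digest('hex');
}
```

Hashing keeps keys short and uniform even for long prompts. Note that the Map in the example grows without bound, so long-running processes may want an eviction policy on top.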

Next Steps

  • 🧠 Research Agent Tutorial: Build a sophisticated research agent using the Content module
  • 🕷️ Web Scraping Integration: Combine Content with Spider for web content analysis
  • 📄 PDF Processing: Extract and analyze content from PDF documents

💡 Pro Tip: The Content module shines when combined with other SMRT libraries. Use Spider for web content, PDF for documents, and Files for local content to create comprehensive content processing pipelines.