This repository demonstrates how to use GroupDocs.Search for .NET in Python applications using pythonnet. It provides two distinct implementation approaches to overcome the challenges of loading .NET assemblies with embedded dependencies in Python:
1. Wrapper-Based Approach (run_search_wrapper.py)
- Uses a custom C# wrapper library that encapsulates common search operations
- Provides simplified static methods for building indexes and performing searches
- Ideal for straightforward search tasks with minimal Python/.NET interop complexity
- Best for: Quick prototyping, simple search workflows, and users who prefer high-level APIs
2. Manual Type Resolution Approach (run_search_manual.py)
- Uses the wrapper only as a dependency resolver for embedded assemblies
- Provides direct access to GroupDocs.Search types and methods
- Offers full control over index creation and search customization
- Best for: Complex search scenarios, advanced customization, and developers who need fine-grained control
Both approaches solve the core challenge of loading GroupDocs.Search's obfuscated and embedded dependencies in Python environments.
GroupDocs.Search for .NET uses obfuscation and embedded dependencies to protect intellectual property. This creates a fundamental challenge when trying to use it directly with pythonnet:
# ❌ This approach WILL NOT work
import os
import sys
# Load coreclr first
from pythonnet import load
load("coreclr")
import clr
# Add folder with the library and dependencies to the system path
dll_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), "dlls"))
sys.path.append(dll_dir)
# Add reference to the library
clr.AddReference("GroupDocs.Search")
# Import the Index class
from GroupDocs.Search import Index
index = Index("search_index")
index.Add("documents_folder")

The Problem: GroupDocs.Search embeds referenced assemblies (like Aspose.* libraries) directly into the main DLL with obfuscation. When pythonnet tries to load the assembly:
- Type Enumeration Phase: pythonnet attempts to enumerate all public types to build Python module proxies
- Dependency Resolution: During enumeration, the CLR tries to resolve embedded dependencies
- Failure Point: The default .NET assembly resolver cannot extract obfuscated, embedded DLLs from resources
- Result: ReflectionTypeLoadException is thrown, causing pythonnet to fail when creating the Python module
Why This Happens:
- Most obfuscators rely on a bootstrap/resolver that runs in your entry assembly
- Since Python is the host (not a .NET executable), the bootstrap never executes
- The embedded dependencies remain inaccessible to the standard .NET assembly resolver
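To see which embedded dependencies fail to resolve, one option is to enumerate the assembly's types yourself and read the exception's `LoaderExceptions` property. The sketch below assumes pythonnet is already loaded and that `assembly` is a `System.Reflection.Assembly` (for example, one returned by `Assembly.LoadFrom`). The helper names are illustrative and are not part of this repository.

```python
from collections import Counter


def loader_failure_messages(assembly):
    """Collect the loader error messages raised while enumerating types.

    `assembly` is expected to be a System.Reflection.Assembly obtained
    through pythonnet; this function only works once the CLR is loaded.
    """
    from System.Reflection import ReflectionTypeLoadException

    try:
        assembly.GetTypes()  # same kind of enumeration that fails during import
    except ReflectionTypeLoadException as exc:
        # LoaderExceptions holds one exception per type that could not load.
        return [str(inner.Message) for inner in exc.LoaderExceptions if inner is not None]
    return []


def summarize(messages):
    """Group identical messages so a missing dependency shows up once with a count."""
    return Counter(messages).most_common()
```

Calling `summarize(loader_failure_messages(assembly))` should list each unresolved dependency together with the number of types it blocked, which helps confirm that the failure comes from embedded assemblies and not from a wrong path.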
This repository provides two approaches to solve this challenge:
- Wrapper Library: A C# wrapper that handles dependency resolution and exposes simplified APIs
- Manual Resolution: Direct type resolution using reflection to bypass import issues
Both methods ensure the embedded dependencies are properly resolved before attempting to use GroupDocs.Search types.
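As a rough illustration of the manual resolution idea, the sketch below loads the wrapper assembly first and then looks up `GroupDocs.Search.Index` by name through reflection instead of a Python `import`. It assumes that loading `GroupDocs.Search.Wrapper` first is enough to make the embedded dependencies resolvable; the repository's scripts may take additional steps. The function names are hypothetical.

```python
import os
import sys


def dll_directory(script_dir, folder="dlls"):
    """Absolute path of the folder holding the published assemblies."""
    return os.path.abspath(os.path.join(script_dir, folder))


def load_index_type(dll_dir):
    """Load the wrapper first, then resolve GroupDocs.Search.Index by name."""
    from pythonnet import load
    load("coreclr")  # select CoreCLR before clr is imported
    import clr

    if dll_dir not in sys.path:
        sys.path.append(dll_dir)

    # Assumption: referencing the wrapper first makes the embedded
    # dependencies resolvable; this is not confirmed by the text above.
    clr.AddReference("GroupDocs.Search.Wrapper")
    assembly = clr.AddReference("GroupDocs.Search")

    # Assembly.GetType resolves a single type instead of enumerating all types.
    index_type = assembly.GetType("GroupDocs.Search.Index")
    if index_type is None:
        raise RuntimeError("GroupDocs.Search.Index was not found in the assembly")
    return index_type


def create_index(index_type, index_folder):
    """Instantiate Index(index_folder) through System.Activator."""
    import System

    args = System.Array[System.Object]([index_folder])
    return System.Activator.CreateInstance(index_type, args)
```

With the DLLs published, `create_index(load_index_type(dll_directory(os.path.dirname(__file__))), "search_index")` would return an object on which `Add("documents_folder")` can be called, mirroring the failing snippet shown earlier.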
GroupDocs.Search for .NET is a comprehensive document search library that allows you to:
- Create search indexes for 50+ document formats (PDF, Word, Excel, PowerPoint, images, etc.)
- Perform full-text search across multiple document types simultaneously
- Search with various query types: simple text, boolean queries, regular expressions, and fuzzy search
- Get detailed search results with relevance scores, occurrence counts, and term highlighting
- Work with multiple indexes for large-scale document collections
- Customize search behavior with synonyms, stop words, and character replacements
Key Features:
- Support for 50+ document formats
- Multiple search query types (text, boolean, regex, fuzzy)
- Cross-platform .NET support
- High-performance indexing and searching
- Flexible licensing options
PythonNet is a package that provides near-seamless integration between Python and the .NET Common Language Runtime (CLR). It allows you to:
- Call .NET assemblies directly from Python code
- Use .NET types and methods as if they were native Python objects
- Access the full .NET ecosystem from Python applications
- Maintain performance with minimal overhead
Key Benefits:
- Direct access to .NET libraries from Python
- No need for separate .NET applications or services
- Maintains Python's simplicity while leveraging .NET's power
- Cross-platform support (Windows, Linux, macOS)
Official Repository: pythonnet/pythonnet
- Operating System: Windows 10/11 (x64), Linux, or macOS
- Python: 3.8+ (recommended: 3.11 or 3.12)
- .NET Runtime: .NET 6.0 or later
- Memory: Minimum 4GB RAM (8GB+ recommended for large documents)
- Disk Space: 500MB+ for dependencies and temporary files
| Python Version | pythonnet Version | .NET Runtime | Supported Target Frameworks | Notes |
|---|---|---|---|---|
| 3.7 – 3.10 | 2.5.x | .NET Framework 4.6.2 – 4.8 | net40, net45, net462, net48 | ✅ Best for legacy .NET Framework DLLs (e.g., GroupDocs.Annotation net462). Requires 64-bit Python + .NET Framework runtime |
| 3.7 – 3.10 | 2.5.x | Limited .NET Core 3.1 / .NET 5 | Some .NET Standard 2.0 DLLs | Use only if DLL explicitly supports .NET Standard |
| 3.8 – 3.12 | 3.x (≥3.0.0) | .NET 6 / .NET 7 / .NET 8 | net6.0, net7.0, net8.0, netstandard2.0/2.1 | ✅ Best for modern .NET builds. Requires .NET Desktop Runtime 6+ |
| 3.13+ | 3.x (≥3.0.3) | .NET 6 / .NET 7 / .NET 8 | Same as above | ✅ Supported. Recommended for latest Python versions |
For this repository, we recommend:
- Python 3.11 with pythonnet 3.0.5
- .NET 6.0 Desktop Runtime
- Windows x64 environment
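Before installing anything, it can help to check the basics listed above from Python itself. The following is a minimal sketch using only the standard library. The 64-bit check is an assumption based on the recommended x64 environment, and `find_problems` is an illustrative name, not something shipped in this repository.

```python
import shutil
import struct
import sys

MIN_PYTHON = (3, 8)  # minimum Python version listed above


def find_problems(py_version, pointer_bits, dotnet_path):
    """Return human-readable problems; an empty list means the basics look fine."""
    problems = []
    if tuple(py_version[:2]) < MIN_PYTHON:
        problems.append("Python %d.%d is older than 3.8" % tuple(py_version[:2]))
    if pointer_bits != 64:
        problems.append("Python is not a 64-bit build")
    if dotnet_path is None:
        problems.append("'dotnet' was not found on PATH")
    return problems


if __name__ == "__main__":
    issues = find_problems(
        sys.version_info,
        struct.calcsize("P") * 8,  # pointer size in bits
        shutil.which("dotnet"),
    )
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Environment looks OK")
```

Finding `dotnet` on PATH does not confirm that a 6.0 or later runtime is installed; running `dotnet --list-runtimes` shows the installed runtime versions.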
# Create Python 3.11 virtual environment
py -3.11 -m venv venv311
# Activate virtual environment (Windows)
venv311\Scripts\activate
# Verify Python version
python --version

# Upgrade pip and essential tools
python -m ensurepip --upgrade
python -m pip install --upgrade pip setuptools wheel
# Install pythonnet 3.0.5
python -m pip install pythonnet==3.0.5
# Install project requirements
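pip install -r requirements.txt

# Test pythonnet and .NET integration
# Load CoreCLR before importing clr; otherwise pythonnet uses its default
# runtime (.NET Framework on Windows, Mono elsewhere)
from pythonnet import load
load("coreclr")

import sys, clr
print("Python:", sys.version)
print("pythonnet imported OK:", clr.__version__)
clr.AddReference("System")
import System
print("CLR OK, .NET version:", System.Environment.Version)

# Navigate to wrapper directory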
cd wrapper
# Build and publish the wrapper
dotnet publish -c Release -r win-x64 --self-contained false -o ./../dlls
# Return to root directory
cd ..

# Activate virtual environment (if not already active)
venv311\Scripts\activate
# Run wrapper-based approach
python run_search_wrapper.py
# Run manual type resolution approach
python run_search_manual.py

GroupDocs.Search-for-PythonNet/
├── lics/                               # Place the GroupDocs.Search.lic license file here
├── dlls/                               # Compiled .NET assemblies and dependencies
│   ├── [GroupDocs.Search.dll]          # Main GroupDocs.Search library (not delivered in repository)
│   ├── GroupDocs.Search.Wrapper.dll    # Custom wrapper library (not delivered in repository)
│   └── [other dependencies]            # Additional .NET dependencies
├── files/                              # Sample documents for testing
│   ├── invoice.01.txt                  # Input document for search indexing
│   └── sample.api.01.json              # Additional sample document
├── index/                              # Search index files (generated)
│   ├── index.info                      # Index metadata
│   └── [index files]                   # Index data files
├── wrapper/                            # C# wrapper library source code
│   ├── SearchWrapper.cs                # Main wrapper implementation
│   ├── GroupDocs.Search.Wrapper.csproj # Project file
│   └── bin/                            # Build output directory
├── run_search_wrapper.py               # Example: Wrapper-based approach
├── run_search_manual.py                # Example: Manual type resolution
├── requirements.txt                    # Python dependencies
└── README.md                           # This documentation
| Folder/File | Purpose | Contents |
|---|---|---|
| lics/ | Licenses folder | This repository does not contain any license. |
| dlls/ | Compiled assemblies | Contains all .NET DLLs required for runtime, including GroupDocs.Search and the custom wrapper |
| files/ | Sample documents | Test documents for search indexing examples (input documents) |
| index/ | Search index | Generated search index files for document searching |
| wrapper/ | C# source code | Custom wrapper library that simplifies GroupDocs.Search usage |
| run_search_wrapper.py | Wrapper example | Demonstrates simplified search using the wrapper library |
| run_search_manual.py | Manual example | Shows direct type resolution and advanced search control |
| requirements.txt | Dependencies | Python package requirements (pythonnet) |
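Because the DLLs are not delivered with the repository, a quick layout check before running the examples can save a confusing load error. This is a small sketch using `pathlib`; the list of expected paths is taken from the layout described above, and `missing_items` is an illustrative helper name.

```python
from pathlib import Path

# Paths taken from the layout above, relative to the repository root.
EXPECTED = [
    "dlls/GroupDocs.Search.dll",
    "dlls/GroupDocs.Search.Wrapper.dll",
    "files",
]


def missing_items(root):
    """Return the expected paths that do not exist under `root`."""
    base = Path(root)
    return [item for item in EXPECTED if not (base / item).exists()]


if __name__ == "__main__":
    for item in missing_items("."):
        print("Missing:", item)
```

If either DLL is reported missing, publishing the wrapper into the `dlls` folder with the `dotnet publish` step is the likely fix.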
Wrapper Library (wrapper/SearchWrapper.cs)
- Provides simplified static methods for common search tasks
- Handles dependency resolution internally
- Exposes high-level APIs for Python consumption
Python Examples
- Wrapper approach: Simple, high-level API for basic search needs
- Manual approach: Full control over index creation and search customization
Document Discovery & Knowledge Management
- Legal firms: Search through contracts, agreements, and legal documents for specific clauses
- Healthcare: Find patient records and medical documents using keywords and terms
- Education: Search through course materials, research papers, and educational content
- Real Estate: Locate property documents, contracts, and specifications using search terms
Enterprise Content Search
- Manufacturing: Search technical documentation, specifications, and quality control documents
- Financial Services: Find compliance documents, audit reports, and financial records
- Government: Search policy documents, regulations, and administrative materials
- Insurance: Locate claim documents, policy information, and risk assessments
Content Management & Publishing
- Publishing houses: Search through manuscripts, research materials, and editorial content
- Marketing agencies: Find campaign materials, brand guidelines, and creative assets
- Technical writing: Search technical documentation and knowledge bases
- Translation services: Find reference materials and translation glossaries
Automated Document Processing
- Batch indexing: Process hundreds of documents and create searchable indexes
- API integration: Add search capabilities as part of document processing workflows
- Cloud services: Integrate search functionality into cloud-based applications
- Microservices: Deploy search services as part of larger document processing systems
Custom Search Workflows
- Form processing: Search through form submissions and responses
- Report analysis: Find specific data and patterns in generated reports
- Document comparison: Search for differences between document versions
- Template matching: Find documents matching specific criteria or templates
The Core Problem: Python developers need to implement document search functionality but face challenges with:
- Loading .NET libraries with embedded dependencies
- Complex type resolution in pythonnet environments
- Maintaining compatibility across different Python/.NET versions
Our Solution Provides:
- ✅ Simplified Integration: Easy-to-use wrapper for common search tasks
- ✅ Full Control: Direct access to all GroupDocs.Search features
- ✅ Dependency Resolution: Automatic handling of embedded .NET dependencies
- ✅ Cross-Platform: Works on Windows, Linux, and macOS
- ✅ Production Ready: Tested approaches for real-world applications
This solution represents an early implementation for using GroupDocs.Search with pythonnet. While it successfully demonstrates both wrapper-based and manual type resolution approaches, please note:
Current Status:
- ✅ Functional: Both implementation approaches work as demonstrated
- ✅ Tested: Examples have been validated with basic search scenarios
- ⚠️ Limited Testing: Not yet extensively tested across all GroupDocs.Search features
- ⚠️ Production Readiness: Requires additional testing for production environments
Before Production Use:
- Comprehensive Testing: Test with your specific document types and search requirements
- Performance Validation: Evaluate performance with large document collections and complex queries
- Error Handling: Implement robust error handling for edge cases
- Security Review: Ensure compliance with your security and data protection requirements
For Development:
- Use the wrapper approach for quick prototyping and simple search tasks
- Use the manual approach when you need full control over index properties and search customization
- Consider extending the wrapper with additional methods for your specific use cases
We welcome your feedback, test results, and suggestions for improvements! Your input will help us:
- Refine the implementation approaches
- Add more comprehensive examples
- Improve error handling and edge cases
- Explore additional GroupDocs.Search features
How to Contribute:
- Test the examples with your documents and use cases
- Report any issues or unexpected behavior
- Suggest additional wrapper methods or examples
- Share your successful integration stories
Core Technologies:
pythonnet, GroupDocs.Search, .NET, Python, document search, CLR integration, assembly loading, dependency resolution
Document Processing:
document search, full-text search, search indexing, document indexing, search queries, search results, document discovery, content search, enterprise search
Technical Implementation:
wrapper library, type resolution, reflection, embedded dependencies, obfuscated assemblies, pythonnet integration, .NET interop, cross-platform
Business Applications:
document discovery, knowledge management, enterprise search, content management, legal document search, healthcare documentation, educational search, technical documentation search
Development & Integration:
API integration, microservices, cloud services, batch processing, automated workflows, search automation, enterprise solution, production deployment