Merged
5 changes: 4 additions & 1 deletion CHANGELOG.md
@@ -12,13 +12,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Reworked the extraction of images by adding `ImageHandlingMode` to the `ParserConfig`. With this, users can decide to manually extract images and handle the logic [(#19)](https://github.com/nilskruthoff/pptx-parser/issues/19)
- New [example](https://github.com/nilskruthoff/pptx-parser/tree/master/examples) `manual_image_extraction.rs` to show how to handle images manually
- `ManualImage` struct to encapsulate data and metadata of images
- `ImageHandlingMode::Save` to save images to a given output path and reference them in the Markdown file [(#20)](https://github.com/nilskruthoff/pptx-parser/issues/20)

### Removed

- `image_extraction` from [examples](https://github.com/nilskruthoff/pptx-parser/tree/master/examples) directory (replaced by `manual_image_extraction.rs`)

### Changed

- Updated [README.md](https://github.com/nilskruthoff/pptx-parser/blob/master/README.md) to document new `ParserConfig` parameters

---

## [0.2.0] - 2025-06-15
@@ -39,4 +42,4 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Changed

- [README.md](https://github.com/nilskruthoff/pptx-parser/blob/master/README.md) updated to show the latest working examples and features
- Updated [README.md](https://github.com/nilskruthoff/pptx-parser/blob/master/README.md) to show the latest working examples and features
8 changes: 6 additions & 2 deletions Cargo.toml
@@ -18,8 +18,12 @@ name = "memory_efficient_streaming"
path = "examples/memory_efficient_streaming.rs"

[[example]]
name = "image_extraction"
path = "examples/image_extraction.rs"
name = "manual_image_extraction"
path = "examples/manual_image_extraction.rs"

[[example]]
name = "save_images"
path = "examples/save_images.rs"

[[example]]
name = "slide_elements"
33 changes: 19 additions & 14 deletions README.md
@@ -62,19 +62,23 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {

## Config Parameters

| Parameter | Type | Default | Description |
|--------------------------|-----------------------|---------------|-----------------------------------------------------------------------------------------------------------|
| `extract_images` | `bool` | `true` | Whether images are extracted from slides or not. If false, images can not be extracted manually either. |
| `compress_images` | `bool` | `true` | Whether images are compressed before encoding or not. Effects manually extracted images too. |
| `image_quality` | `u8` | `80` | Defines the image compression quality `(0-100)`. Higher values mean better quality but larger file sizes. |
| `image_handling_mode` | `ImageHandlingMode` | `InMarkdown` | Determines how images are handled during content export |
| Parameter | Type | Default | Description |
|------------------------|-----------------------|---------------|-----------------------------------------------------------------------------------------------------------|
| `extract_images` | `bool` | `true` | Whether images are extracted from slides or not. If `false`, images cannot be extracted manually either. |
| `compress_images` | `bool` | `true` | Whether images are compressed before encoding or not. Affects manually extracted images too. |
| `image_quality` | `u8` | `80` | Defines the image compression quality `(0-100)`. Higher values mean better quality but larger file sizes. |
| `image_handling_mode` | `ImageHandlingMode` | `InMarkdown` | Determines how images are handled during content export |
| `image_output_path` | `Option<PathBuf>` | `None` | Output directory path for `ImageHandlingMode::Save` (mandatory for saving mode) |

<br/>

#### Members of `ImageHandlingMode`
| Member | Description |
|-----------------|-------------------------------------------------------------------------------------------------------|
| `InMarkdown` | Images are embedded directly in the Markdown output using standard syntax as `base64` data (`![]()`) |
| `Manually` | Image handling is delegated to the user, requiring manual copying or referencing (as `base64`) |
| Member | Description |
|---------------|-------------------------------------------------------------------------------------------------------|
| `InMarkdown` | Images are embedded directly in the Markdown output using standard syntax as `base64` data (`![]()`) |
| `Manually` | Image handling is delegated to the user, requiring manual copying or referencing (as `base64`) |
| `Save` | Images will be saved in a provided output directory and integrated using standard syntax (`![]()`) |
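
The three modes in the table above can be pictured with a small standalone sketch. It does not use the crate: `Mode` is a local mirror of the enum, `image_markdown` is a hypothetical helper, and both the `data:image/jpeg;base64,` prefix and the choice to emit nothing for `Manually` are assumptions made for illustration only.

```rust
use std::path::{Path, PathBuf};

/// Local mirror of the three modes from the table (not the crate's own type).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Mode {
    InMarkdown,
    Manually,
    Save,
}

/// Hypothetical helper: returns the Markdown emitted for one image, if any.
/// `base64_data` is the already encoded image, `file_name` the name used in `Save` mode.
fn image_markdown(
    mode: &Mode,
    base64_data: &str,
    file_name: &str,
    out_dir: Option<&Path>,
) -> Option<String> {
    match mode {
        // Assumed data-URI form for inline embedding.
        Mode::InMarkdown => Some(format!("![](data:image/jpeg;base64,{})", base64_data)),
        // Assumption: the caller handles the image, so nothing goes into the Markdown.
        Mode::Manually => None,
        // Reference the saved file; without an output directory there is nothing to link to.
        Mode::Save => out_dir.map(|dir| format!("![]({})", dir.join(file_name).display())),
    }
}

fn main() {
    let dir = PathBuf::from("images");
    println!("{:?}", image_markdown(&Mode::InMarkdown, "AAAA", "slide1.jpg", None));
    println!("{:?}", image_markdown(&Mode::Manually, "AAAA", "slide1.jpg", None));
    println!("{:?}", image_markdown(&Mode::Save, "AAAA", "slide1.jpg", Some(dir.as_path())));
}
```

The `Save` arm returns `None` when no directory is given, which mirrors the note that `image_output_path` is mandatory for that mode.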

---

## 🏗 Project Structure
@@ -87,8 +91,10 @@ pptx-to-md/
├── LICENSE-APACHE
├── examples/ # Simple examples to present the usage of this crate
│ ├── basic_usage.rs
│ ├── image_extractions.rs
│ ├── manual_image_extraction.rs
│ ├── memory_efficient_streaming.rs
│ ├── performance_tests.rs
│ ├── save_images.rs
│ └── slide_elements.rs
├── src/
│ ├── lib.rs # Public API
@@ -111,7 +117,7 @@ Include the following line in your Cargo.toml dependencies section:

```toml
[dependencies]
pptx-to-md = "0.1.2" # replace with the current version
pptx-to-md = "0.3.0" # replace with the current version
```

---
@@ -122,5 +128,4 @@ and [Apache 2.0-Licence](https://github.com/nilskruthoff/pptx-parser/blob/master

Feel free to contribute or suggest improvements!

---

---
87 changes: 0 additions & 87 deletions examples/image_extraction.rs

This file was deleted.

55 changes: 55 additions & 0 deletions examples/save_images.rs
@@ -0,0 +1,55 @@
//! Image-saving example for the pptx-to-md crate
//!
//! This example demonstrates how to open a PPTX file, convert all slides to Markdown, and save the extracted images to an output directory.
//!
//! Run with: cargo run --example save_images <path/to/your/presentation.pptx>

use pptx_to_md::{ImageHandlingMode, ParserConfig, PptxContainer, Result};
use std::fs::File;
use std::io::Write;
use std::path::{Path, PathBuf};
use std::env;

fn main() -> Result<()> {
    // Get the PPTX file path from command line arguments and provide the mandatory output path
    let args: Vec<String> = env::args().collect();
    let pptx_path = if args.len() > 1 {
        &args[1]
    } else {
        eprintln!("Usage: cargo run --example save_images <path/to/presentation.pptx>");
        return Ok(());
    };

    println!("Processing PPTX file: {}", pptx_path);

    // Use the config builder to build your config
    let config = ParserConfig::builder()
        .extract_images(true)
        .compress_images(true)
        .quality(75)
        .image_handling_mode(ImageHandlingMode::Save)
        .image_output_path(PathBuf::from("C:/Users/nilsk/Downloads/extracted_images"))
        .build();

    // Open the PPTX file
    let mut container = PptxContainer::open(Path::new(pptx_path), config)?;

    // Parse all slides
    let slides = container.parse_all()?;

    println!("Found {} slides", slides.len());

    // Create a new Markdown file
    let mut md_file = File::create("output.md")?;

    // Convert each slide to Markdown and save the images automatically
    for slide in slides {
        if let Some(md_content) = slide.convert_to_md() {
            writeln!(md_file, "{}", md_content).expect("Couldn't write to file");
        }
    }

    println!("All slides converted successfully!");

    Ok(())
}
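
The example above passes a fixed, machine-specific output directory. A portable variant could take the directory from the command line and create it up front. The sketch below uses only the standard library. `resolve_output_dir`, `prepare_output_dir`, the use of the second argument, and the `extracted_images` fallback are all hypothetical. Whether the crate creates a missing directory itself is not shown in this diff, so the sketch creates it explicitly.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical helper: picks the image directory from the second CLI argument,
/// falling back to `extracted_images` relative to the working directory.
fn resolve_output_dir(args: &[String]) -> PathBuf {
    args.get(2)
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from("extracted_images"))
}

/// Creates the directory (and any missing parents) before anything is written to it.
fn prepare_output_dir(args: &[String]) -> io::Result<PathBuf> {
    let dir = resolve_output_dir(args);
    fs::create_dir_all(&dir)?;
    Ok(dir)
}

fn main() -> io::Result<()> {
    let args: Vec<String> = std::env::args().collect();
    let dir = prepare_output_dir(&args)?;
    println!("Images would be saved to: {}", dir.display());
    Ok(())
}
```

The resulting `PathBuf` could then be handed to `.image_output_path(...)` in place of the hard-coded path.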
46 changes: 34 additions & 12 deletions src/parser_config.rs
@@ -1,15 +1,19 @@
/// Determines how images are handled during content export.
use std::path::PathBuf;

/// Determines how images are handled during content export.
///
/// # Members
///
/// | Member | Description |
/// |-----------------------|-----------------------------------------------------------------------------------------------------------------------|
/// | `InMarkdown` | Images are embedded directly in the Markdown output using standard syntax as `base64` data (`![]()`) |
/// | `Manually` | Image handling is delegated to the user, requiring manual copying or referencing (as `base64` encoded string) |
/// | `Save` | Images will be saved in a provided output directory and integrated using standard syntax (`![]()`) |
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ImageHandlingMode {
InMarkdown,
Manually
Manually,
Save,
}

/// Configuration options for the PPTX parser.
@@ -18,21 +22,27 @@ pub enum ImageHandlingMode {
/// This allows you to customize only the desired fields while falling back to sensible defaults for the rest.
///
/// # Configuration Options
///
///
/// | Parameter | Type | Default | Description |
/// |---------------------------|-----------------------|---------------|-----------------------------------------------------------------------------------------------------------|
/// | `extract_images` | `bool` | `true` | Whether images are extracted from slides or not. If false, images can not be extracted manually either. |
/// | `compress_images` | `bool` | `true` | Whether images are compressed before encoding or not. Effects manually extracted images too. |
/// | `extract_images` | `bool` | `true` | Whether images are extracted from slides or not. If `false`, images cannot be extracted manually either |
/// | `compress_images` | `bool` | `true` | Whether images are compressed before encoding or not. Affects manually extracted images too |
/// | `image_quality` | `u8` | `80` | Compression level (0-100);<br/> higher values retain more detail but increase file size |
/// | `image_handling_mode` | `ImageHandlingMode` | `InMarkdown` | Determines how images are handled during content export. |
/// | `image_handling_mode` | `ImageHandlingMode` | `InMarkdown` | Determines how images are handled during content export |
/// | `image_output_path` | `Option<PathBuf>` | `None` | Output directory path for `ImageHandlingMode::Save` (mandatory for the saving mode) |
///
/// # Example
///
/// ```
/// use pptx_to_md::ParserConfig;
/// use std::path::PathBuf;
/// use pptx_to_md::{ImageHandlingMode, ParserConfig};
///
/// let config = ParserConfig::builder()
/// .extract_images(true)
/// .compress_images(true)
/// .quality(75)
/// .image_handling_mode(ImageHandlingMode::Save)
/// .image_output_path(PathBuf::from("/path/to/output/dir/"))
/// .build();
/// ```
#[derive(Debug, Clone)]
@@ -41,15 +51,17 @@ pub struct ParserConfig {
pub compress_images: bool,
pub quality: u8,
pub image_handling_mode: ImageHandlingMode,
pub image_output_path: Option<PathBuf>,
}

impl Default for ParserConfig {
fn default() -> Self {
Self {
Self {
extract_images: true,
compress_images: true,
quality: 80,
image_handling_mode: ImageHandlingMode::InMarkdown,
image_output_path: None,
}
}
}
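
Because `ParserConfig` has public fields and a `Default` implementation, struct update syntax is a possible alternative to the builder. The sketch below reproduces the struct and enum from this diff as local copies so that it compiles without the crate. `save_config` is a hypothetical helper, not part of the library.

```rust
use std::path::PathBuf;

// Local copies of the types shown in the diff, reproduced so the sketch compiles on its own.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
enum ImageHandlingMode {
    InMarkdown,
    Manually,
    Save,
}

#[allow(dead_code)]
#[derive(Debug, Clone)]
struct ParserConfig {
    extract_images: bool,
    compress_images: bool,
    quality: u8,
    image_handling_mode: ImageHandlingMode,
    image_output_path: Option<PathBuf>,
}

impl Default for ParserConfig {
    fn default() -> Self {
        Self {
            extract_images: true,
            compress_images: true,
            quality: 80,
            image_handling_mode: ImageHandlingMode::InMarkdown,
            image_output_path: None,
        }
    }
}

/// Hypothetical helper: overrides only the two fields needed for saving images
/// and takes every other value from `Default`.
fn save_config(dir: impl Into<PathBuf>) -> ParserConfig {
    ParserConfig {
        image_handling_mode: ImageHandlingMode::Save,
        image_output_path: Some(dir.into()),
        ..ParserConfig::default()
    }
}

fn main() {
    let config = save_config("extracted_images");
    println!("{:?}", config);
}
```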
@@ -69,6 +81,7 @@ pub struct ParserConfigBuilder {
compress_images: Option<bool>,
image_quality: Option<u8>,
image_handling_mode: Option<ImageHandlingMode>,
image_output_path: Option<PathBuf>,
}

impl ParserConfigBuilder {
@@ -77,26 +90,34 @@ impl ParserConfigBuilder {
self.extract_images = Some(value);
self
}

/// Sets whether images should be compressed before being encoded to base64 or not
pub fn compress_images(mut self, value: bool) -> Self {
self.compress_images = Some(value);
self
}

/// Specifies the desired image quality where `100` is the original quality and `50` means half the quality
/// The lower the quality, the smaller the file size of the output image will be
pub fn quality(mut self, value: u8) -> Self {
self.image_quality = Some(value);
self
}

/// Specifies the mode for processing the image after it is extracted
pub fn image_handling_mode(mut self, value: ImageHandlingMode) -> Self {
self.image_handling_mode = Some(value);
self
}


/// Specifies the output directory for [`ImageHandlingMode::Save`]
pub fn image_output_path<P>(mut self, path: P) -> Self
where
P: Into<PathBuf>,
{
self.image_output_path = Some(path.into());
self
}

/// Builds the final [`ParserConfig`] instance, applying default values for any fields that were not set.
pub fn build(self) -> ParserConfig {
@@ -105,6 +126,7 @@ impl ParserConfigBuilder {
compress_images: self.compress_images.unwrap_or(true),
quality: self.image_quality.unwrap_or(80),
image_handling_mode: self.image_handling_mode.unwrap_or(ImageHandlingMode::InMarkdown),
image_output_path: self.image_output_path,
}
}
}
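
The documentation calls `image_output_path` mandatory for `ImageHandlingMode::Save`, while `build()` as shown accepts any combination. One way to surface a missing path early is a fallible build step. The following is a minimal standalone sketch of that idea: `Mode`, `Builder`, `Config`, and `try_build` are hypothetical local types, not the crate's API. It also shows that an `Into<PathBuf>` bound accepts `&str`, `String`, and `PathBuf` alike.

```rust
use std::path::PathBuf;

/// Local mirror of the handling modes (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Mode {
    InMarkdown,
    Manually,
    Save,
}

#[derive(Debug, PartialEq)]
struct Config {
    mode: Mode,
    output_path: Option<PathBuf>,
}

#[derive(Debug, Default)]
struct Builder {
    mode: Option<Mode>,
    output_path: Option<PathBuf>,
}

impl Builder {
    fn mode(mut self, value: Mode) -> Self {
        self.mode = Some(value);
        self
    }

    /// Accepts anything convertible into a `PathBuf`, as in the diff above.
    fn output_path<P: Into<PathBuf>>(mut self, path: P) -> Self {
        self.output_path = Some(path.into());
        self
    }

    /// Hypothetical fallible build: rejects `Save` without an output directory.
    fn try_build(self) -> Result<Config, String> {
        let mode = self.mode.unwrap_or(Mode::InMarkdown);
        if mode == Mode::Save && self.output_path.is_none() {
            return Err("image_output_path is required for Save mode".to_string());
        }
        Ok(Config { mode, output_path: self.output_path })
    }
}

fn main() {
    // &str, String and PathBuf all satisfy the `Into<PathBuf>` bound.
    let a = Builder::default().mode(Mode::Save).output_path("out").try_build();
    let b = Builder::default().mode(Mode::Save).output_path(String::from("out")).try_build();
    let c = Builder::default().mode(Mode::Manually).output_path(PathBuf::from("out")).try_build();
    let missing = Builder::default().mode(Mode::Save).try_build();
    println!("{:?}\n{:?}\n{:?}\n{:?}", a, b, c, missing);
}
```

Returning a `Result` moves the failure to configuration time instead of the first attempt to write an image. The trade-off is that callers have to handle an error that the infallible `build()` never raises.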