Add support for image response content in MCP tool calls

Background

The Model Context Protocol (MCP) specification supports multiple content types in tool responses, including:

Text content: Content::text(string) - for textual responses
Image content: Content::image(base64_data, mime_type) - for image responses
Audio content: Content::audio(base64_data, mime_type) - for audio responses
Resource content: for embedded resources
Resource links: for linking to external resources

Currently, rmcp-openapi returns only text content (Content::text()) for all API responses, even when the OpenAPI endpoint returns image data. As a result, image-returning endpoints cannot deliver their visual content to AI assistants.

Current Behavior

When an OpenAPI endpoint returns an image (e.g., GET /images/{id} with Content-Type: image/png):

The HTTP client calls response.text().await, which tries to decode the binary body as UTF-8 text
The tool returns Content::text(response.to_mcp_content()) with malformed text
The AI assistant receives corrupted text instead of a displayable image

Location: crates/rmcp-openapi/src/tool/mod.rs:124
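
To make the corruption concrete, here is a minimal std-only sketch. It assumes the client's text decoding behaves like a lossy UTF-8 decode (invalid sequences replaced with U+FFFD); the helper name survives_text_decoding is hypothetical and not part of rmcp-openapi.

```rust
/// Every PNG file begins with this 8-byte signature.
const PNG_SIGNATURE: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];

/// Returns true if `bytes` come back unchanged after a lossy UTF-8 decode.
/// (Hypothetical helper, used only to illustrate the problem.)
fn survives_text_decoding(bytes: &[u8]) -> bool {
    // Invalid sequences are replaced with U+FFFD, which alters the data.
    let decoded = String::from_utf8_lossy(bytes);
    decoded.as_bytes() == bytes
}
```

The very first PNG byte (0x89) is not valid UTF-8 on its own, so the original bytes cannot be recovered once the body has been turned into a String.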

Desired Behavior

When an OpenAPI endpoint returns an image:

The HTTP client detects the image MIME type from the Content-Type header
The response bytes are retrieved and base64-encoded
The tool returns Content::image(base64_data, mime_type)
The AI assistant receives properly formatted image content it can display

Technical Approach

Phase 1: Detect Content Type

File: crates/rmcp-openapi/src/http_client.rs

Add a content_type: Option<String> field to the HttpResponse struct (lines 633-644)

In process_response_with_request() (lines 547-624), extract the Content-Type header:

let content_type = response.headers()
    .get(header::CONTENT_TYPE)
    .and_then(|v| v.to_str().ok())
    .map(|s| s.to_string());

Add a helper method to detect image content:

impl HttpResponse {
    pub fn is_image(&self) -> bool {
        self.content_type.as_ref()
            .map(|ct| ct.starts_with("image/"))
            .unwrap_or(false)
    }
}
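
Content-Type values can differ in case and may carry parameters (for example "Image/PNG; charset=binary"), which a plain starts_with("image/") check does not account for. The following is a sketch of a more tolerant check; media_type and is_image_media_type are hypothetical names, not existing rmcp-openapi functions.

```rust
/// Extracts the bare media type from a Content-Type header value,
/// dropping any parameters and normalizing to lowercase.
fn media_type(content_type: &str) -> String {
    content_type
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase()
}

/// Hypothetical variant of `is_image` that tolerates case and parameters.
fn is_image_media_type(content_type: Option<&str>) -> bool {
    content_type
        .map(|ct| media_type(ct).starts_with("image/"))
        .unwrap_or(false)
}
```

If this approach were adopted, HttpResponse::is_image() could delegate to such a helper with self.content_type.as_deref().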

Phase 2: Handle Binary Response Data

File: crates/rmcp-openapi/src/http_client.rs

Currently, at lines 567-570:

let body = response
    .text()
    .await
    .map_err(|e| Error::Http(format!("Failed to read response body: {e}")))?;

Replace with conditional handling:

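// Detect if response is binary based on Content-Type.
// Note: only image/* is consumed in Phase 3; audio/* and video/* are read as
// bytes here so they are not corrupted, but they still need their own handling.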
let is_binary = content_type.as_ref()
    .map(|ct| ct.starts_with("image/") || ct.starts_with("audio/") || ct.starts_with("video/"))
    .unwrap_or(false);

let (body, body_bytes) = if is_binary {
    let bytes = response.bytes().await
        .map_err(|e| Error::Http(format!("Failed to read response bytes: {e}")))?;
    (String::new(), Some(bytes.to_vec()))
} else {
    let text = response.text().await
        .map_err(|e| Error::Http(format!("Failed to read response body: {e}")))?;
    (text, None)
};

Add fields to HttpResponse:

pub struct HttpResponse {
    pub status_code: u16,
    pub status_text: String,
    pub headers: HashMap<String, String>,
    pub content_type: Option<String>,  // NEW
    pub body: String,
    pub body_bytes: Option<Vec<u8>>,   // NEW
    pub is_success: bool,
    pub request_method: String,
    pub request_url: String,
    pub request_body: String,
}
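
One possible alternative to pairing an empty body string with an optional body_bytes is a single enum, so that a response holds exactly one of the two representations. This is only a sketch of that design option; ResponseBody, is_binary_media_type, and classify_body are hypothetical names, and the lossy UTF-8 decode stands in for whatever text decoding the HTTP client performs.

```rust
/// Hypothetical body representation: exactly one of text or raw bytes.
#[derive(Debug, PartialEq)]
enum ResponseBody {
    Text(String),
    Binary(Vec<u8>),
}

/// Same prefix rule as the plan above, made case-insensitive.
fn is_binary_media_type(content_type: Option<&str>) -> bool {
    content_type
        .map(|ct| {
            let ct = ct.trim().to_ascii_lowercase();
            ct.starts_with("image/") || ct.starts_with("audio/") || ct.starts_with("video/")
        })
        .unwrap_or(false)
}

/// Chooses the representation from the Content-Type header.
fn classify_body(content_type: Option<&str>, bytes: Vec<u8>) -> ResponseBody {
    if is_binary_media_type(content_type) {
        // Keep binary payloads byte-for-byte.
        ResponseBody::Binary(bytes)
    } else {
        // Everything else is treated as text, as it is today.
        ResponseBody::Text(String::from_utf8_lossy(&bytes).into_owned())
    }
}
```

The trade-off is that existing code reading response.body as a String would need to match on the enum, whereas the two-field layout above keeps those call sites unchanged.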

Phase 3: Return Image Content

File: crates/rmcp-openapi/src/tool/mod.rs

In the call() method (around lines 86-126), after getting the HTTP response, check content type:

match client.execute_tool_call(&self.metadata, arguments).await {
    Ok(response) => {
        // Check if response is an image
        if response.is_image() {
            if let Some(bytes) = &response.body_bytes {
                // Base64 encode the image data
                use base64::{Engine as _, engine::general_purpose::STANDARD};
                let base64_data = STANDARD.encode(bytes);
                
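                // Strip any parameters (e.g. "; charset=...") so only the bare MIME type is sent
                let mime_type = response.content_type
                    .as_deref()
                    .and_then(|ct| ct.split(';').next())
                    .map(|ct| ct.trim().to_string())
                    .unwrap_or_else(|| "image/png".to_string());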
                
                return Ok(CallToolResult {
                    content: vec![Content::image(base64_data, mime_type)],
                    structured_content: None,
                    is_error: Some(!response.is_success),
                    meta: None,
                });
            }
        }
        
        // Existing text content handling for non-image responses
        // ... rest of existing code ...
    }
    Err(e) => Err(e)
}
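
To illustrate what ends up in the data field of the image content, here is a std-only sketch of standard padded base64 (RFC 4648 alphabet). It is for illustration only: the actual change should keep using the base64 crate's STANDARD engine as shown above, and base64_encode is a hypothetical name.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Illustrative padded base64 encoder using the standard alphabet.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into a 24-bit group (missing bytes are zero).
        let b0 = chunk[0] as u32;
        let b1 = chunk.get(1).copied().unwrap_or(0) as u32;
        let b2 = chunk.get(2).copied().unwrap_or(0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;

        // Emit four 6-bit symbols, padding with '=' where input was missing.
        out.push(ALPHABET[((n >> 18) & 0x3F) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 0x3F) as usize] as char);
        if chunk.len() > 1 {
            out.push(ALPHABET[((n >> 6) & 0x3F) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[(n & 0x3F) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}
```

Every PNG payload encoded this way starts with "iVBORw0KGgo", which is a quick sanity check when inspecting tool results by eye.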

Phase 4: Dependencies

Add base64 encoding support to Cargo.toml:

[dependencies]
base64 = "0.22"  # or current stable version

Phase 5: Testing

File: crates/rmcp-openapi/tests/test_image_responses.rs (new file)

Create comprehensive tests:

Mock server returning PNG image
Mock server returning JPEG image
Verify base64 encoding correctness
Test with OpenAPI spec defining image response
Verify Content::image() is returned with correct MIME type
Test error handling for malformed image data

Example test structure:

#[tokio::test]
async fn test_image_response_returns_image_content() {
    // Setup mock server returning image/png
    let mock_server = MockServer::start().await;
    Mock::given(method("GET"))
        .and(path("/images/123"))
        .respond_with(
            ResponseTemplate::new(200)
                .set_body_bytes(include_bytes!("fixtures/test.png"))
                .insert_header("content-type", "image/png")
        )
        .mount(&mock_server)
        .await;
    
    // Create tool and execute (setup of `tool` and `args` against mock_server.uri() omitted)
    let result = tool.call(&args, Authorization::None).await.unwrap();
    
    // Verify image content is returned
    assert_eq!(result.content.len(), 1);
    if let RawContent::Image(img) = &result.content[0].raw {
        assert_eq!(img.mime_type, "image/png");
        assert!(!img.data.is_empty());
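        // Verify it's valid base64 (Engine API, matching the base64 0.22 usage above)
        use base64::{Engine as _, engine::general_purpose::STANDARD};
        assert!(STANDARD.decode(&img.data).is_ok());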
    } else {
        panic!("Expected image content");
    }
}
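
A stronger check than "decodes as base64" is to confirm that the decoded bytes really look like the declared image type. The sketch below sniffs well-known magic numbers using only std; sniff_image_mime is a hypothetical test helper and covers only the formats whose signatures are fixed byte sequences.

```rust
/// Guesses an image MIME type from leading magic bytes, if recognized.
fn sniff_image_mime(bytes: &[u8]) -> Option<&'static str> {
    if bytes.starts_with(&[0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A]) {
        Some("image/png")
    } else if bytes.starts_with(&[0xFF, 0xD8, 0xFF]) {
        Some("image/jpeg")
    } else if bytes.starts_with(b"GIF87a") || bytes.starts_with(b"GIF89a") {
        Some("image/gif")
    } else if bytes.len() >= 12 && &bytes[0..4] == b"RIFF" && &bytes[8..12] == b"WEBP" {
        // WebP is a RIFF container whose form type is "WEBP".
        Some("image/webp")
    } else if bytes.starts_with(b"BM") {
        Some("image/bmp")
    } else {
        None
    }
}
```

In the test above, the decoded img.data could then be compared against img.mime_type, and a None result would be a natural case for the malformed-image test.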

Supported MIME Types (Initial Implementation)

image/png
image/jpeg
image/gif
image/webp
image/svg+xml
image/bmp

Can be extended later to support audio and other binary formats.

Implementation Steps

✅ Research MCP specification for image content support
📝 Create this issue with detailed plan
🔨 Update HttpResponse struct to store content type and binary data
🔨 Modify HTTP client to handle binary responses conditionally
🔨 Update tool call() method to return Content::image() for images
🧪 Add comprehensive tests for image responses
📚 Update documentation and examples
🚀 Create merge request

Benefits

AI assistants can display images from API endpoints
Closer alignment with the content types defined by the MCP specification
Better user experience when working with visual data
Foundation for supporting other binary content types (audio, video, PDFs)

Related Files

crates/rmcp-openapi/src/http_client.rs - HTTP response handling
crates/rmcp-openapi/src/tool/mod.rs - Tool execution and MCP result creation
~/.cargo/registry/.../rmcp-0.8.1/src/model/content.rs - MCP Content types (from dependency)

References

MCP Specification - Tools
rmcp crate Content types
The rmcp crate provides Content::image(data, mime_type), where data is a base64-encoded string, matching the image content type in the MCP spec