Add support for image response content in MCP tool calls

Background

The Model Context Protocol (MCP) specification supports multiple content types in tool responses, including:

  • Text content: Content::text(string) - for textual responses
  • Image content: Content::image(base64_data, mime_type) - for image responses
  • Audio content: Content::audio(base64_data, mime_type) - for audio responses
  • Resource content: for embedded resources
  • Resource links: for linking to external resources

Currently, rmcp-openapi only returns text content (Content::text()) for all API responses, even when the OpenAPI endpoint returns image data. This means image-returning endpoints cannot properly display their visual content to AI assistants.

Current Behavior

When an OpenAPI endpoint returns an image (e.g., GET /images/{id} with Content-Type: image/png):

  1. The HTTP client calls response.text().await which attempts to decode bytes as UTF-8
  2. The tool returns Content::text(response.to_mcp_content()) with malformed text
  3. The AI assistant receives corrupted text instead of a displayable image

Location: crates/rmcp-openapi/src/tool/mod.rs:124
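
The corruption described above can be reproduced with the standard library alone. This is a minimal sketch, assuming the HTTP client decodes the body lossily in the same way as String::from_utf8_lossy; decode_lossy and PNG_SIGNATURE are illustrative names, not part of rmcp-openapi.

```rust
/// The fixed 8-byte signature that starts every PNG file.
const PNG_SIGNATURE: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];

/// Decode bytes the way a lossy UTF-8 reader would (assumption: this mirrors
/// what the client does). Invalid sequences become U+FFFD, so the original
/// bytes cannot be recovered from the resulting string.
fn decode_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

The first byte of a PNG (0x89) is not valid UTF-8 on its own, so even the file signature is altered before the image data begins.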

Desired Behavior

When an OpenAPI endpoint returns an image:

  1. The HTTP client detects the image MIME type from the Content-Type header
  2. The response bytes are retrieved and base64-encoded
  3. The tool returns Content::image(base64_data, mime_type)
  4. The AI assistant receives properly formatted image content it can display

Technical Approach

Phase 1: Detect Content Type

File: crates/rmcp-openapi/src/http_client.rs

  1. Add a content_type: Option<String> field to the HttpResponse struct (lines 633-644)

  2. In process_response_with_request() (lines 547-624), extract the Content-Type header:

    let content_type = response.headers()
        .get(header::CONTENT_TYPE)
        .and_then(|v| v.to_str().ok())
        .map(|s| s.to_string());

  3. Add a helper method to detect image content:

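    impl HttpResponse {
        /// Returns true when the Content-Type names an image media type.
        /// Media types are case-insensitive, so compare in lowercase.
        pub fn is_image(&self) -> bool {
            self.content_type
                .as_deref()
                .is_some_and(|ct| ct.trim().to_ascii_lowercase().starts_with("image/"))
        }
    }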
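
Content-Type values often carry parameters and mixed case (for example "Image/PNG; charset=binary"), which a plain prefix check on the raw header does not normalize. The following is a sketch of one way to handle that; mime_essence and is_image_mime are hypothetical helper names and are not part of the existing code.

```rust
/// Return the media type without parameters, trimmed and lowercased.
/// Hypothetical helper: "Image/PNG; charset=binary" becomes "image/png".
fn mime_essence(content_type: &str) -> String {
    content_type
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase()
}

/// Free-function variant of the is_image() check, operating on the
/// optional header value instead of an HttpResponse.
fn is_image_mime(content_type: Option<&str>) -> bool {
    content_type
        .map(|ct| mime_essence(ct).starts_with("image/"))
        .unwrap_or(false)
}
```

Normalizing once, when the header is extracted, would let both the detection step and the later MIME type passed to Content::image() use the same cleaned value.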

Phase 2: Handle Binary Response Data

File: crates/rmcp-openapi/src/http_client.rs

Currently, at lines 567-570:

let body = response
    .text()
    .await
    .map_err(|e| Error::Http(format!("Failed to read response body: {e}")))?;

Replace with conditional handling:

// Detect if response is binary based on Content-Type
let is_binary = content_type.as_ref()
    .map(|ct| ct.starts_with("image/") || ct.starts_with("audio/") || ct.starts_with("video/"))
    .unwrap_or(false);

let (body, body_bytes) = if is_binary {
    let bytes = response.bytes().await
        .map_err(|e| Error::Http(format!("Failed to read response bytes: {e}")))?;
    (String::new(), Some(bytes.to_vec()))
} else {
    let text = response.text().await
        .map_err(|e| Error::Http(format!("Failed to read response body: {e}")))?;
    (text, None)
};

Add fields to HttpResponse:

pub struct HttpResponse {
    pub status_code: u16,
    pub status_text: String,
    pub headers: HashMap<String, String>,
    pub content_type: Option<String>,  // NEW
    pub body: String,
    pub body_bytes: Option<Vec<u8>>,   // NEW
    pub is_success: bool,
    pub request_method: String,
    pub request_url: String,
    pub request_body: String,
}
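
With two separate fields, nothing prevents a response from carrying both a non-empty body and body_bytes, or neither. As a possible alternative (not what this issue proposes), the two could be modeled as a single enum. ResponseBody and from_bytes below are hypothetical names used only for illustration.

```rust
/// Hypothetical alternative to the `body` / `body_bytes` pair:
/// a response body is either text or binary, never both.
#[derive(Debug, PartialEq)]
enum ResponseBody {
    Text(String),
    Binary(Vec<u8>),
}

impl ResponseBody {
    /// Choose the variant from the Content-Type, using the same prefixes
    /// as the is_binary check in Phase 2.
    fn from_bytes(content_type: Option<&str>, bytes: Vec<u8>) -> Self {
        let is_binary = content_type
            .map(|ct| {
                let ct = ct.trim().to_ascii_lowercase();
                ct.starts_with("image/") || ct.starts_with("audio/") || ct.starts_with("video/")
            })
            .unwrap_or(false);

        if is_binary {
            ResponseBody::Binary(bytes)
        } else {
            // Text bodies are decoded lossily; invalid UTF-8 becomes U+FFFD.
            ResponseBody::Text(String::from_utf8_lossy(&bytes).into_owned())
        }
    }
}
```

The trade-off is that existing code reading response.body as a String would need to match on the enum, so the two-field approach is the less invasive change.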

Phase 3: Return Image Content

File: crates/rmcp-openapi/src/tool/mod.rs

In the call() method (around lines 86-126), after getting the HTTP response, check the content type:

match client.execute_tool_call(&self.metadata, arguments).await {
    Ok(response) => {
        // Check if response is an image
        if response.is_image() {
            if let Some(bytes) = &response.body_bytes {
                // Base64 encode the image data
                use base64::{Engine as _, engine::general_purpose::STANDARD};
                let base64_data = STANDARD.encode(bytes);
                
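                // Strip any parameters (e.g. "; charset=...") from the header value
                let mime_type = response.content_type
                    .as_deref()
                    .and_then(|ct| ct.split(';').next())
                    .map(|ct| ct.trim().to_string())
                    .unwrap_or_else(|| "image/png".to_string());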
                
                return Ok(CallToolResult {
                    content: vec![Content::image(base64_data, mime_type)],
                    structured_content: None,
                    is_error: Some(!response.is_success),
                    meta: None,
                });
            }
        }
        
        // Existing text content handling for non-image responses
        // ... rest of existing code ...
    }
    Err(e) => Err(e)
}
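
To show what the base64 step produces, here is a dependency-free sketch of standard (RFC 4648) base64 encoding with padding. It is for illustration only; the actual change would use the base64 crate as shown above, and base64_encode is a hypothetical name.

```rust
/// Standard base64 alphabet from RFC 4648.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encode bytes as padded standard base64 (illustrative, not optimized).
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into a 24-bit group, filling with zeros.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let group = (b0 << 16) | (b1 << 8) | b2;

        // Emit four 6-bit symbols, replacing missing input with '='.
        out.push(ALPHABET[((group >> 18) & 0x3F) as usize] as char);
        out.push(ALPHABET[((group >> 12) & 0x3F) as usize] as char);
        if chunk.len() > 1 {
            out.push(ALPHABET[((group >> 6) & 0x3F) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[(group & 0x3F) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}
```

Every three input bytes become four output characters, so the encoded image is roughly a third larger than the raw response body.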

Phase 4: Dependencies

Add base64 encoding support to Cargo.toml:

[dependencies]
base64 = "0.22"  # or current stable version

Phase 5: Testing

File: crates/rmcp-openapi/tests/test_image_responses.rs (new file)

Create comprehensive tests:

  1. Mock server returning PNG image
  2. Mock server returning JPEG image
  3. Verify base64 encoding correctness
  4. Test with OpenAPI spec defining image response
  5. Verify Content::image() is returned with correct MIME type
  6. Test error handling for malformed image data

Example test structure:

#[tokio::test]
async fn test_image_response_returns_image_content() {
    // Setup mock server returning image/png
    let mock_server = MockServer::start().await;
    Mock::given(method("GET"))
        .and(path("/images/123"))
        .respond_with(
            ResponseTemplate::new(200)
                .set_body_bytes(include_bytes!("fixtures/test.png"))
                .insert_header("content-type", "image/png")
        )
        .mount(&mock_server)
        .await;
    
    // Create tool and execute (construction of `tool` and `args` is omitted in this outline)
    let result = tool.call(&args, Authorization::None).await.unwrap();
    
    // Verify image content is returned
    assert_eq!(result.content.len(), 1);
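    // Assumes Content is Annotated<RawContent>, whose `raw` field holds the variant
    if let RawContent::Image(img) = &result.content[0].raw {
        assert_eq!(img.mime_type, "image/png");
        assert!(!img.data.is_empty());
        // Verify it's valid base64, using the same Engine API as Phase 3
        use base64::{Engine as _, engine::general_purpose::STANDARD};
        assert!(STANDARD.decode(&img.data).is_ok());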
    } else {
        panic!("Expected image content");
    }
}
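
Where a test only needs a quick structural check on the encoded payload, a dependency-free helper can be enough. This is a sketch with a hypothetical name (looks_like_standard_base64); it checks length, alphabet, and padding only, and is not a substitute for decoding with the base64 crate.

```rust
/// Structural check for padded standard base64: non-empty, length a
/// multiple of four, at most two trailing '=' characters, and only
/// alphabet characters before the padding.
fn looks_like_standard_base64(s: &str) -> bool {
    if s.is_empty() || s.len() % 4 != 0 {
        return false;
    }
    let body = s.trim_end_matches('=');
    let padding = s.len() - body.len();
    padding <= 2
        && body
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'+' || b == b'/')
}
```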

Supported MIME Types (Initial Implementation)

  • image/png
  • image/jpeg
  • image/gif
  • image/webp
  • image/svg+xml
  • image/bmp

This list can be extended later to cover audio and other binary formats.
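
The is_image() helper in Phase 1 accepts any image/* type, while the list above names six. If the initial implementation should be restricted to exactly these types, an allowlist check could look like the sketch below. SUPPORTED_IMAGE_TYPES and is_supported_image are hypothetical names.

```rust
/// The six MIME types listed for the initial implementation.
const SUPPORTED_IMAGE_TYPES: [&str; 6] = [
    "image/png",
    "image/jpeg",
    "image/gif",
    "image/webp",
    "image/svg+xml",
    "image/bmp",
];

/// Check a Content-Type header value against the allowlist, ignoring
/// parameters (such as "; charset=utf-8") and letter case.
fn is_supported_image(content_type: &str) -> bool {
    let essence = content_type
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();
    SUPPORTED_IMAGE_TYPES.iter().any(|&t| t == essence)
}
```

Whether to use the allowlist or the broader image/ prefix is a design choice: the prefix check forwards formats a client may not render, while the allowlist silently falls back to text for anything unlisted.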

Implementation Steps

  1. Research MCP specification for image content support
  2. 📝 Create this issue with detailed plan
  3. 🔨 Update HttpResponse struct to store content type and binary data
  4. 🔨 Modify HTTP client to handle binary responses conditionally
  5. 🔨 Update tool call() method to return Content::image() for images
  6. 🧪 Add comprehensive tests for image responses
  7. 📚 Update documentation and examples
  8. 🚀 Create merge request

Benefits

  • AI assistants can display images from API endpoints
  • Closer alignment with the content types defined in the MCP specification
  • Better user experience when working with visual data
  • Foundation for supporting other binary content types (audio, video, PDFs)

Related Files

  • crates/rmcp-openapi/src/http_client.rs - HTTP response handling
  • crates/rmcp-openapi/src/tool/mod.rs - Tool execution and MCP result creation
  • ~/.cargo/registry/.../rmcp-0.8.1/src/model/content.rs - MCP Content types (from dependency)

References