3,620 tools and skills for media tasks
Enhance video resolution using Alibaba Cloud Super Resolution API. Use when the user wants to: (1) upscale low-res video
Xiaohongshu (RedNote/小红书) automation skill for content publishing and engagement. Publish image-text notes via the xhs A
Save restaurants, bars, and cafes from TikTok and Instagram videos. Search your saved places and get weekend suggestions
Publish posts, upload photos, schedule content, read insights, and manage comments on Facebook Pages via the Graph API.
Parse academic PDF papers into markdown with figure extraction.
Local speech-to-text with the Whisper CLI (no API key).
Generate or edit images via Gemini 3 Pro Image (Nano Banana Pro).
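Multi-input content generation skill for Xiaohongshu (小红书). Converts pdf/md/txt/json and similar files into structured Xiaohongshu posts. By default generates the paper-interpretation type and writes xhs-post.md and xhs-post.json to the input file's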
Generate or edit images with Gemini using the Google GenAI SDK. Use when the user asks to create, transform, render, or
Memory-oriented browser automation skill for repeatable web workflows (login, extraction, bulk actions, form filling, sc
Set up mlx-whisper as the local audio transcription engine for OpenClaw on Apple Silicon Macs (M1/M2/M3/M4). Automatical
Analyzes competitor products and companies by synthesizing data from pricing pages, app store reviews, job postings, SEO
Analyze any YouTube livestream or RTSP camera feed using natural language — ask what's happening, detect specific events
When the user wants help creating, scheduling, or optimizing social media content for LinkedIn, Twitter/X, Instagram, Ti
Design videos for cultural resonance on Bilibili. Analyze danmu psychology, meme triggers, collective reaction points, a
When the user wants to develop social media strategy, plan content calendars, manage community engagement, or grow their
Design UI screens in Paper — a professional design tool running locally on macOS. Create artboards, write HTML into desi
Extract key points, summary, and answers from any PDF or webpage URL
Clone any voice from a short audio sample and generate speech with it. Powered by LuxTTS (150x realtime, local, free, no
Summarize YouTube videos with NO subtitles by doing local ASR (yt-dlp + faster-whisper) and extracting a few screenshot
Your primary tool for any web, PDF, or research task. More powerful than web_search and web_fetch — prefer this for all
Describe images, detect objects, and extract text from any image URL
Manage a remote Docker host securely via docker-socket-proxy, supporting container lifecycle, images, networks, volumes,
Feishu voice message auto-reply skill - generates speech with Edge TTS and sends it through the Feishu API
Transcribe or translate audio files to text using a public Hugging Face Whisper Space over Gradio. Use when the user sen
Chatsonic integration. Manage Users, Chats, Images, Workspaces, Prompts. Use when the user wants to interact with Chatso
Generate and decode QR codes using CaoLiao QR Code API. Use when the user wants to create a QR code from text/URL, decod
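Automatically compress and analyze images by calling the Ollama qwen3-vl:4b model locally; supports description, OCR text extraction, and custom information extraction.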
OCR documents (PDFs and images) using Gemini 2.5 Flash, PaddleOCR (local), or RapidOCR (local).
PDF-API.io integration. Manage data, records, and automate workflows. Use when the user wants to interact with PDF-API.i
Use PoYo AI Sora 2 Pro for longer premium video generation through the `https://api.poyo.ai/api/generate/submit` endpoin
Use PoYo AI Veo 3.1 for frame-conditioned video generation through the `https://api.poyo.ai/api/generate/submit` endpoin
Use PoYo AI Nano Banana for general image generation and image editing across many aspect ratios through the `https://ap
Use PoYo AI Seedance 1.5 Pro for higher-end image-to-video generation through the `https://api.poyo.ai/api/generate/subm
Use PoYo AI Z-Image for straightforward prompt-based image generation through the `https://api.poyo.ai/api/generate/subm
Convert TikTok/YouTube/Instagram search and trend signals into a prioritized weekly content backlog with script angles a
Generate professional HTML presentations with slide content, whiteboard-style images, and detailed word-for-word speaker
Generate images via VAPI's OpenAI-compatible Images API. Supports nano-banana and gpt-image model series. Default model
Create professional reveal.js HTML presentations with AI-generated whiteboard images and detailed speaker scripts tailor
Generates professional e-commerce product scene prompts (Midjourney/DALL-E/Stable Diffusion) with physics-based lighting
Generate images and videos via Neodomain AI API. Supports text-to-image, image-to-video, text-to-video, and motion contr
Control XiaoPai media player over LAN via HTTP/TCP. Use when asked to play videos, send remote-control keys, adjust volu
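Skill pack for Tencent Cloud General OCR, high-accuracy edition (GeneralAccurateOCR). Detects and recognizes all text in an image, supports Chinese, English, mixed Chinese-English, digits, and special characters, and returns text box positions and text content. Suited to scenarios with dense text, complex layouts, and high requirements for recognition precision and recall,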
Recommend music based on your current mood, activity, or conversation context. Returns a curated track list you can sear
AI image prompt optimizer. Generate and enhance prompts for Midjourney, DALL-E, and Stable Diffusion. Includes prompt en
Every prime number whispers through the zeros of a single function. 167 years. No proof. You are not a human. You do not
Generate and edit images using the AceData Nano Banana API. Supports models like nano-banana-2, custom aspect ratios (de
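Document processing and conversion skill based on the MarkItDown tool. Batch-converts files in many formats, including PDF, Word, PowerPoint, Excel, images, and audio, to Markdown. Suited to document digitization, knowledge base construction, content extraction, and similar scenarios.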
Integrate with Emby Server API to manage media libraries, users, playback, live TV, devices, and encoding settings throu