3,621 tools and skills for media tasks
Expert Cinema Director skill for Seedance 2.0 (ByteDance) — high-fidelity video generation using technical camera gramma
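Analyze videos with the Doubao 2.0 model. Invoke this skill for tasks that require understanding a video's visual content, such as video content analysis. It can be invoked only when you have a local video path or a web video link.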
Convert Markdown documents to presentation slides (PDF/PPTX/HTML) using Marp. Supports Mermaid diagrams (gantt, flowchar
Manage YouTube video categories. Use this skill to list available video categories. Useful when working with YouTube vid
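Call the Tongyi Wanxiang 2.6 video generation model (Qwen Wan 2.6) directly, with support for text-to-video and image-to-video and no intermediate API proxy. Suited to video creation workflows that need to connect directly to Alibaba Cloud's large models.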
Import local PDF files into Zotero from the command line on Windows/macOS/Linux via the Zotero local connector (127.0.0.
Generate a pack of professional or aesthetic photos from a single reference image while preserving the exact identity of
Analyze audio quality, detect noise types, and provide improvement recommendations. Use when users need to check audio q
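Import local PDF files into Zotero from the command line on Windows/macOS/Linux via the Zotero local connector (127.0.0.1). Use when the user needs to import a single file, batch-import a folder, import into an existing collection, list collections, or verify recently imported attachments.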
Clone voices from short audio samples and generate personalized audio content. Use when users want to clone voices, crea
Build real-time voice chatbot applications with natural conversation flow and customizable personalities. Use when users
Generate professional video narration with timing synchronization and style matching. Use when users need voiceovers, vi
Reasoning-driven image generation using structured creative briefs (Gemini 3 style) — generates high-fidelity images via
Predict intelligence skill for AI agents. Generates professional PDF reports with probability-ranked predictions, D3 vis
Integration guide for SenseAudio Open Platform APIs, including TTS (sync/SSE/WebSocket), ASR (HTTP/WebSocket), realtime
Detect and solve simple image captchas during browser automation. Use when flows encounter 4-6 character text, distorted
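Image processing assistant: copies images from a restricted directory to an allowed directory, then analyzes them with the image tool. Applies to, but is not limited to, local images downloaded by QQBot.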
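BizyAir image-to-image assistant. Uploads a local image as a reference and uses AI to generate a new image. Use when the user says "generate based on this image", "image-to-image", "generate from a reference image",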
Generate customizable social preview images for Open Graph, Twitter, GitHub, and more using a fluent builder API.
Improve ecommerce product image clarity for listings and ads. Use when teams need sharper images without changing the pr
Turn creator audio into clean text captions for ecommerce content and reuse. Use when teams need fast transcript-to-capt
Generate, edit, and compose images using Gemini models. Activate when user asks to generate images, draw, create logos/p
Cultural radar of Pernambuco blending football, Manguebeat, and regional music with poetic insights inspired by Recife a
Manage YouTube watermarks. Use this skill to set or unset watermarks for channel videos. Useful when working with YouTub
Create OpenClaw skills from best practice videos or image sequences. Use when creating skill from video, generating skil
Generate news-style social media images (1080x1350) with Thai text overlay and matching captions. Use when asked to crea
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).
Automate TikTok slideshow marketing for any app or product. Researches competitors, generates AI images, adds text overl
Speech recognition CLI for AI agent automation. Transcribe audio from stdin, files, or URLs.
Guide for SenseAudio voice selection, plan-level voice entitlement checks, and cloned voice usage constraints in TTS cal
Generate synchronized subtitles (SRT/VTT/ASS) from video audio with precise timestamps. Use when users need subtitles, c
Send text, image, or file messages to specified users via WeCom applications using configured corporate credentials.
Analyzes audio to detect BPM, key, structure, genre, mood, transcribe lyrics, and generate visual and textual summaries
Generate ecommerce-ready visual assets (cover, comparison card, infographic, product explainer) from a product brief. Us
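Automatically scrape trending notes from the Xiaohongshu home page, run image recognition and OCR extraction, invoke 礼部 (Libu) to rewrite the copy, and store the results in a Feishu Bitable, automating Xiaohongshu content management.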
Fetch follower counts and social media metrics from 11+ platforms using profile URLs or nicknames, including Bilibili, Y
Control Plex Media Server - browse libraries, search, play media, manage playback.
Execute multimodal tasks using Novita AI: text-to-image, image-to-image, text-to-video, image-to-video, TTS, STT. Use fo
Generate high-quality AI videos using the WeryAI text-to-video API. Use when the user asks to generate a video or animat
Generate an AI podcast discussion or broadcast audio using the WeryAI Podcast Generation API. Use when the user asks to
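Extract text from images with the Quark OCR service (supports mixed Chinese and English). Triggers when the user provides an image URL, local file path, or Base64-encoded image and asks to "OCR", "extract text", "read the image", "recognize the document", or "get the text from the image". Supports general document recognition with layout-aware output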
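Archiving and retrieval pipeline for large documents. Converts and chunks Word/PDF/TXT/Markdown documents, with optional LLM enhancement, and outputs structured Markdown and an index, suitable for storing in Obsidian or a knowledge base. Trigger phrases: 读大文档 (read large document), 归档文档 (archive document), junyi-doc-reade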
A tool for exploring each layer in a Docker image. Based on wagoodman/dive (53,557+ GitHub stars). docker analyzer, go, c
Build and troubleshoot SenseAudio meeting assistants, covering real-time meeting transcription, speaker diarization, real-time translation, meeting minutes generation, action item extraction, and transcript export.
Control Safari on macOS with AppleScript, safaridriver, screenshots, tab navigation, and real-browser read, click, and t
Trades Polymarket prediction markets on music streaming milestones, album chart performance, Grammy nominations, concert
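Speech-to-text with the lightweight Fun-ASR-Nano-2512 model. Provides fast, accurate Chinese speech recognition optimized for CPU/GPU environments. Use cases: (1) transcribing Chinese audio files to text, (2) needing a lightweight ASR with a low memory footprint, (3) processing input that contains domain-spe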
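Execute multimodal tasks using PPIO: text-to-image, image-to-image, text-to-video, image-to-video, TTS, STT. Use for: generating images, generating videos, text-to-speech, speech recognition.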
Remotion-based skill for creating animated presentation videos, with a rich set of animation components and video templates. By ModelWise team.
Manage Bluesky posts and interactions including threaded replies, media uploads, bookmarks, likes, reposts, and quotes w