
github.com/ssssssss-team/spider-flow @v0.5.0 sqlite

2,909 symbols · 10,325 edges · 155 files · 323 documented (11%)
README
<img src="https://www.spiderflow.org/images/logo.svg" width="600">






<a target="_blank" href="https://www.oracle.com/technetwork/java/javase/downloads/index.html"><img src="https://img.shields.io/badge/JDK-1.8+-green.svg" /></a>
<a target="_blank" href="https://www.spiderflow.org"><img src="https://img.shields.io/badge/Docs-latest-blue.svg"/></a>
<a target="_blank" href="https://github.com/javamxd/spider-flow/releases"><img src="https://img.shields.io/github/v/release/javamxd/spider-flow?logo=github"></a>
<a target="_blank" href='https://gitee.com/jmxd/spider-flow'><img src="https://gitee.com/jmxd/spider-flow/badge/star.svg?theme=white" /></a>
<a target="_blank" href='https://github.com/javamxd/spider-flow'><img src="https://img.shields.io/github/stars/javamxd/spider-flow.svg?style=social"/></a>
<a target="_blank" href="https://github.com/ssssssss-team/spider-flow/raw/v0.5.0/LICENSE"><img src="https://img.shields.io/:license-MIT-blue.svg"></a>
<a target="_blank" href="https://shang.qq.com/wpa/qunwpa?idkey=10faa4cf9743e0aa379a72f2ad12a9e576c81462742143c8f3391b52e8c3ed8d"><img src="https://img.shields.io/badge/Join-QQGroup-blue"></a>

Introduction | Features | Plugins | Demo site | Documentation | Changelog | Screenshots | Disclaimer

Introduction

The platform defines crawlers as flowcharts; it is a highly flexible and configurable crawler platform.

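Features

  • [x] Supports XPath, JsonPath, CSS selectors, regular-expression extraction, and mixed extraction
  • [x] Supports JSON, XML, and binary formats
  • [x] Supports multiple data sources and SQL select/selectInt/selectOne/insert/update/delete
  • [x] Supports crawling pages rendered dynamically by JS (or ajax)
  • [x] Supports proxies
  • [x] Supports automatic saving to a database or file
  • [x] Common string, date, file, and encryption/decryption functions
  • [x] Supports plugin extensions (custom executors, custom methods)
  • [x] Task monitoring and task logs
  • [x] Supports HTTP interfaces
  • [x] Supports automatic cookie management
  • [x] Supports custom functions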
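
To make the extraction features concrete, here is a minimal sketch of regex and XPath extraction using only the JDK. It does not use spider-flow's own API; the class name `ExtractionSketch` and both helper methods are hypothetical and exist only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of spider-flow.
public class ExtractionSketch {

    // Regex extraction: collect capture group 1 of every match in the text.
    public static List<String> regexAll(String text, String regex) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    // XPath extraction: evaluate an expression against well-formed XML
    // and return the string value of the result.
    public static String xpathString(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<items><item id=\"1\">alpha</item><item id=\"2\">beta</item></items>";
        System.out.println(regexAll(xml, "id=\"(\\d+)\""));           // prints [1, 2]
        System.out.println(xpathString(xml, "/items/item[@id='2']")); // prints beta
    }
}
```

Mixed extraction amounts to chaining such steps, for example selecting a node with XPath and then applying a regex to its text.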

Plugins

Screenshots

Crawler list

Crawler test

Debug

Logs

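Disclaimer

Do not use spider-flow for any work that may violate the law or ethical constraints. Please use spider-flow responsibly, comply with the robots protocol, and do not use spider-flow for any illegal purpose. By choosing to use spider-flow, you agree to abide by these terms. The authors bear no legal risk or loss arising from your violation of them, and you are solely responsible for all consequences.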

Extension points: exported contracts (how you extend this code)

ExpressionEngine (Interface)
Expression engine [14 implementers]
spider-flow-api/src/main/java/org/spiderflow/ExpressionEngine.java
SpiderFlowMapper (Interface)
Crawler repository; persists crawlers to the database. @author Administrator
spider-flow-core/src/main/java/org/spiderflow/core/mapper/SpiderFlowMapper.java
ShapeExecutor (Interface)
Executor interface. @author jmxd [10 implementers]
spider-flow-api/src/main/java/org/spiderflow/executor/ShapeExecutor.java
FunctionMapper (Interface)
(no doc)
spider-flow-core/src/main/java/org/spiderflow/core/mapper/FunctionMapper.java
ThreadSubmitStrategy (Interface)
(no doc) [8 implementers]
spider-flow-api/src/main/java/org/spiderflow/concurrent/ThreadSubmitStrategy.java
DataSourceMapper (Interface)
(no doc)
spider-flow-core/src/main/java/org/spiderflow/core/mapper/DataSourceMapper.java
FunctionExecutor (Interface)
(no doc) [11 implementers]
spider-flow-api/src/main/java/org/spiderflow/executor/FunctionExecutor.java
FlowNoticeMapper (Interface)
(no doc)
spider-flow-core/src/main/java/org/spiderflow/core/mapper/FlowNoticeMapper.java
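
The listing above shows several executor contracts with multiple implementers. As a rough sketch of how a flowchart platform can dispatch each node to the executor that handles its shape, the example below uses a simplified stand-in interface. `NodeExecutor`, its two methods, and `ExecutorRegistrySketch` are assumptions made for illustration; they are not the actual `ShapeExecutor` or `FunctionExecutor` signatures, which are not shown on this page.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not part of spider-flow.
public class ExecutorRegistrySketch {

    // Assumed, simplified executor contract (NOT the real ShapeExecutor).
    public interface NodeExecutor {
        String shape();                          // node type this executor handles
        void run(Map<String, Object> variables); // reads and writes flow variables
    }

    private final Map<String, NodeExecutor> byShape = new HashMap<>();

    // Register an executor under the shape name it declares.
    public void register(NodeExecutor executor) {
        byShape.put(executor.shape(), executor);
    }

    // Look up the executor for a node's shape and run it.
    public void execute(String shape, Map<String, Object> variables) {
        NodeExecutor executor = byShape.get(shape);
        if (executor == null) {
            throw new IllegalArgumentException("No executor for shape: " + shape);
        }
        executor.run(variables);
    }

    // Builds a registry with one custom executor and runs a single node.
    public static Map<String, Object> demo() {
        ExecutorRegistrySketch registry = new ExecutorRegistrySketch();
        registry.register(new NodeExecutor() {
            @Override public String shape() { return "greet"; }
            @Override public void run(Map<String, Object> variables) {
                variables.put("greeting", "hello");
            }
        });
        Map<String, Object> variables = new LinkedHashMap<>();
        registry.execute("greet", variables);
        return variables;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints {greeting=hello}
    }
}
```

Keying executors by a declared shape name lets a plugin add a new node type by registering one more implementation, without changing the dispatch code.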

Core symbols: most depended-on inside this repo

$
called by 1518
spider-flow-web/src/main/resources/static/js/layui/layui.all.js
push
called by 1227
spider-flow-core/src/main/java/org/spiderflow/core/expression/ExpressionTemplateContext.java
find
called by 1217
spider-flow-web/src/main/java/org/spiderflow/controller/FlowNoticeController.java
data
called by 1204
spider-flow-core/src/main/java/org/spiderflow/core/io/HttpRequest.java
get
called by 398
spider-flow-api/src/main/java/org/spiderflow/concurrent/ThreadSubmitStrategy.java
getValue
called by 391
spider-flow-web/src/main/java/org/spiderflow/model/SpiderWebSocketContext.java
remove
called by 311
spider-flow-web/src/main/java/org/spiderflow/controller/TaskController.java
test
called by 308
spider-flow-web/src/main/java/org/spiderflow/controller/DataSourceController.java

Shape

Function 1,891
Method 858
Class 138
Interface 16
Enum 6

Languages

TypeScript 65%
Java 35%

Modules by API surface

spider-flow-web/src/main/resources/static/js/codemirror/codemirror.js · 460 symbols
spider-flow-web/src/main/resources/static/js/jquery.easyui.min.js · 394 symbols
spider-flow-web/src/main/resources/static/js/cron/jquery.easyui.min.js · 394 symbols
spider-flow-web/src/main/resources/static/js/mxgraph/mxgraph.js · 157 symbols
spider-flow-web/src/main/resources/static/js/mxgraph/mxgraph.min.js · 134 symbols
spider-flow-core/src/main/java/org/spiderflow/core/expression/parsing/Ast.java · 130 symbols
spider-flow-web/src/main/resources/static/js/codemirror/javascript.js · 95 symbols
spider-flow-web/src/main/resources/static/js/layui/layui.all.js · 66 symbols
spider-flow-web/src/main/resources/static/js/cron/jquery-2.1.4.min.js · 64 symbols
spider-flow-core/src/main/java/org/spiderflow/core/model/SpiderFlow.java · 22 symbols
spider-flow-api/src/main/java/org/spiderflow/model/SpiderNode.java · 22 symbols
spider-flow-web/src/main/java/org/spiderflow/model/SpiderWebSocketContext.java · 20 symbols

Dependencies: from manifests, versioned

com.alibaba:druid-spring-boot-starter
com.alibaba:transmittable-thread-local
com.baomidou:mybatis-plus-boot-starter
commons-codec:commons-codec
mysql:mysql-connector-java
org.apache.commons:commons-csv
org.apache.commons:commons-text
org.spiderflow:spider-flow-api

Datastores touched

(mysql) Database · 1 repo
spiderflow Database · 1 repo

For agents

$ claude mcp add spider-flow \
  -- python -m otcore.mcp_server <graph>
