First, prepare the following basic environment:

- JDK 8 or 11
- Git
- JetBrains IntelliJ IDEA
- Maven

Installing these tools is not covered here. If you already write Java, this part should be familiar.

One Windows-specific note: if you plan to debug connectors such as Hive or Iceberg, you will very likely run into HADOOP_HOME or winutils problems later. Those are covered separately at the end of this article.

2. Fork the repository and clone the code

You can jump to the repository directly from the official documentation.

Official repository: https://github.com/apache/seatunnel

After forking, your own repository address will look something like this: https://github.com/LeonYoah/seatunnel.git

Why fork first:

- It makes submitting your own changes later more convenient.
- Your work does not directly affect the official repository.
- If you intend to open a PR, this is also the more common workflow.

If your network is unstable, you can clone through a proxy address:

    git clone https://cdn.gh-proxy.org/https://github.com/LeonYoah/seatunnel.git
    cd seatunnel
    git checkout 2.3.13-release

If Git reports that this branch does not exist locally, add the official repository as an upstream remote first:

    git remote add upstream https://cdn.gh-proxy.org/https://github.com/apache/seatunnel.git
    git fetch upstream --prune
    git checkout 2.3.13-release

One more commonly used address worth remembering: https://gh-proxy.com/

3. Maven build and IDEA settings

Open the seatunnel project in IDEA, then go to Project Structure and point the JDK to JDK 8.

Next, check the Maven settings. If you are fine with dependencies being stored in the system default location (on the C: drive) and your network is good, this step can be kept simple, because IDEA bundles its own Maven.

If you want to specify the Maven path yourself, install your own Maven first, then configure mirrors in $MAVEN_HOME/conf/settings.xml. The file below is only a reference; remember to change the local repository path to your own:

    <?xml version="1.0" encoding="UTF-8"?>
    <settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 http://maven.apache.org/xsd/settings-1.0.0.xsd">
      <localRepository>D:\apache-maven-3.8.6\ck</localRepository>
      <proxies>
      </proxies>
      <servers>
      </servers>
      <mirrors>
        <mirror>
          <id>alimaven</id>
          <name>aliyun maven</name>
          <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
          <mirrorOf>central</mirrorOf>
        </mirror>
        <mirror>
          <id>repo1</id>
          <mirrorOf>central</mirrorOf>
          <name>central repo</name>
          <url>http://repo1.maven.org/maven2/</url>
        </mirror>
        <mirror>
          <id>repo2</id>
          <name>Mirror from Maven Repo2</name>
          <url>https://repo.spring.io/plugins-release/</url>
          <mirrorOf>central</mirrorOf>
        </mirror>
      </mirrors>
      <profiles>
      </profiles>
    </settings>

Then confirm the result in the IDEA Maven settings. One option is easy to miss: "Use settings from .mvn/maven.config". Remember to uncheck it.

After that, run the following two commands:

    # Format the code
    mvn spotless:apply

    # Build and install
    mvn clean install -Dmaven.test.skip=true -T 1C

Once the build finishes, you can start debugging locally.

4. Start debugging from the example module

The commonly used launcher class is:

    org.apache.seatunnel.example.engine.SeaTunnelEngineLocalExample

- First edit the run configuration so that dependencies with "provided" scope are added to the classpath.
- Then click Shorten command line and select JAR manifest.
- Finally, run it in Debug mode.

5. Launching with your own configuration file

The code contains this line:

    String configurePath = args.length > 0 ? args[0] : "/examples/fake_to_console.conf";

Its meaning is straightforward:

- If you pass an argument at startup, the configuration file you passed in is read.
- If you pass nothing, /examples/fake_to_console.conf is read by default.
- So the easiest approach is usually to put your own configuration file in the examples directory.

6. Case study: how to investigate and fix a pg cdc error

Here is a real problem as an example. Someone in the user group reported that a pg cdc job was failing.

Original stack trace:

        at org.apache.seatunnel.connectors.cdc.base.source.reader.IncrementalSourceSplitReader.fetch(IncrementalSourceSplitReader.java:94)
        at org.apache.seatunnel.connectors.seatunnel.common.source.reader.fetcher.FetchTask.run(FetchTask.java:54)
        ... 7 more
    Caused by: org.apache.seatunnel.common.utils.SeaTunnelException: Read split SnapshotSplit(tableId=traffic.public.users, splitKeyType=ROW<id INT>, splitStart=null, splitEnd=null, lowWatermark=null, highWatermark=null) error due to java.lang.NullPointerException.
        at org.apache.seatunnel.connectors.cdc.base.source.reader.external.IncrementalSourceScanFetcher.checkReadException(IncrementalSourceScanFetcher.java:216)
        at org.apache.seatunnel.connectors.cdc.base.source.reader.external.IncrementalSourceScanFetcher.pollSplitRecords(IncrementalSourceScanFetcher.java:117)
        at org.apache.seatunnel.connectors.cdc.base.source.reader.IncrementalSourceSplitReader.fetch(IncrementalSourceSplitReader.java:91)
        ... 8 more
    Caused by: io.debezium.DebeziumException: java.lang.NullPointerException
        at org.apache.seatunnel.connectors.seatunnel.cdc.postgres.source.reader.snapshot.PostgresSnapshotSplitReadTask.execute(PostgresSnapshotSplitReadTask.java:112)
        at org.apache.seatunnel.connectors.seatunnel.cdc.postgres.source.reader.snapshot.PostgresSnapshotFetchTask.execute(PostgresSnapshotFetchTask.java:65)
        at org.apache.seatunnel.connectors.cdc.base.source.reader.external.IncrementalSourceScanFetcher.lambda$submitTask$0(IncrementalSourceScanFetcher.java:96)
        ... 5 more
    Caused by: java.lang.NullPointerException
        at org.apache.seatunnel.connectors.seatunnel.cdc.postgres.source.reader.snapshot.PostgresSnapshotSplitReadTask.createDataEventsForTable(PostgresSnapshotSplitReadTask.java:183)
        at org.apache.seatunnel.connectors.seatunnel.cdc.postgres.source.reader.snapshot.PostgresSnapshotSplitReadTask.createDataEvents(PostgresSnapshotSplitReadTask.java:170)
        at org.apache.seatunnel.connectors.seatunnel.cdc.postgres.source.reader.snapshot.PostgresSnapshotSplitReadTask.doExecute(PostgresSnapshotSplitReadTask.java:136)
        at org.apache.seatunnel.connectors.seatunnel.cdc.postgres.source.reader.snapshot.PostgresSnapshotSplitReadTask.execute(PostgresSnapshotSplitReadTask.java:107)
        ... 7 more

When a problem comes with an explicit error like this, the first thing to check is the last Caused by in the stack trace. In most cases the final Caused by is where the original exception came from.

The key information in this stack trace is:

    Caused by: java.lang.NullPointerException
        at org.apache.seatunnel.connectors.seatunnel.cdc.postgres.source.reader.snapshot.PostgresSnapshotSplitReadTask.createDataEventsForTable(PostgresSnapshotSplitReadTask.java:183)

From here the troubleshooting approach is clear:

- Locate the specific file and line number.
- Set a breakpoint in IDEA.
- Reproduce the problem in your local environment.
- Follow the call chain to see why the variable is null.

In IDEA, press Shift twice and search for