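I am trying to import data from SQL Server into Google Cloud Storage, and later I will load it into BigQuery. I am doing all of this on Google Cloud.
I have completed the initial steps: I downloaded Sqoop and the SQL Server JDBC driver files and uploaded them to a Google Cloud Storage bucket. I also created a Dataproc cluster to submit the Sqoop job to, but the submit command fails with errors.
I am following this process (https://medium.com/datamindedbe/import-sql-server-data-in-bigquery-d640441d5d56). In my case, I am trying to extract a single table first.
What I tried
Code for submitting the Sqoop job to the Dataproc cluster: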
CLUSTERNAME="sqoop-cluster"
BUCKET="gs://sqoop-bucket-20092021"
libs=`gsutil ls $BUCKET/jars | paste -sd, --`
JDBC_STR="jdbc:sqlserver://RUKSQLRS01:1433;databaseName=RUKDataWarehouse"
SQL_USER="RUKSQLDataWarehouse_Reporting"
SQL_PASS="gs://sqoop-bucket-20092021/creds/sqoop.password"
TABLE="LBD_Task"
SCHEMA="dbo"
gcloud dataproc jobs submit hadoop \
--region europe-west2 \
--cluster="$CLUSTERNAME" \
--jars=$libs \
--class=org.apache.sqoop.Sqoop \
-- \
import \
-Dorg.apache.sqoop.splitter.allow_text_splitter=true \
-Dmapreduce.job.user.classpath.first=true \
--connect "$JDBC_STR" \
--username "$SQL_USER" \
--password-file "$SQL_PASS" \
--table "$SCHEMA.$TABLE" \
--warehouse-dir "$BUCKET/output/$TABLE" \
--num-mappers 1 \
--as-avrodatafile
I am getting the following error:
21/09/22 11:30:46 WARN tool.SqoopTool: $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
21/09/22 11:30:46 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
21/09/22 11:30:48 WARN sqoop.ConnFactory: $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
21/09/22 11:30:48 INFO manager.SqlManager: Using default fetchSize of 1000
21/09/22 11:30:48 INFO tool.CodeGenTool: Beginning code generation
21/09/22 11:31:02 ERROR manager.SqlManager: Error executing statement: com.microsoft.sqlserver.jdbc.SQLServerException: The TCP/IP connection to the host RUKSQLRS01, port 1433 has failed. Error: "RUKSQLRS01. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.".
com.microsoft.sqlserver.jdbc.SQLServerException: The TCP/IP connection to the host RUKSQLRS01, port 1433 has failed. Error: "RUKSQLRS01. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.".
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:227)
at com.microsoft.sqlserver.jdbc.SQLServerException.ConvertConnectExceptionToSQLServerException(SQLServerException.java:284)
at com.microsoft.sqlserver.jdbc.SocketFinder.findSocket(IOBuffer.java:2435)
at com.microsoft.sqlserver.jdbc.TDSChannel.open(IOBuffer.java:635)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:2010)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:1687)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectInternal(SQLServerConnection.java:1528)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:866)
at com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:569)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:904)
at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:59)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:763)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:786)
at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:289)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:260)
at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:246)
at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:327)
at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1872)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1671)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:106)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:501)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.google.cloud.hadoop.services.agent.job.shim.HadoopRunClassShim.main(HadoopRunClassShim.java:19)
21/09/22 11:31:02 ERROR tool.ImportTool: Import failed: java.io.IOException: No columns to generate for ClassWriter
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1677)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:106)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:501)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
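at com.google.cloud.hadoop.services.agent.job.shim.HadoopRunClassShim.main(HadoopRunClassShim.java:19)
Posted on 2021-09-28 15:42:35
This looks like a network issue. Your SQL Server is outside GCP, and you are trying to reach it by hostname (RUKSQLRS01), which the Dataproc cluster most likely cannot resolve or route to. You need to do one of two things: either use the SQL Server's external IP and add a firewall rule on the SQL Server side that allows access from GCP, or set up a VPN between the GCP VPC network and the SQL Server's network and reach the SQL Server through its internal IP.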
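Before resubmitting the Sqoop job, it may help to confirm the diagnosis with a quick TCP check from a cluster node (for example, after SSH-ing into the Dataproc master). The following is only a minimal sketch: the function name check_tcp is hypothetical, and it assumes bash and the coreutils timeout command are available on the node.

```shell
#!/usr/bin/env bash
# check_tcp HOST PORT [TIMEOUT_SECONDS]   (hypothetical helper, not part of Sqoop or gcloud)
# Returns 0 if a TCP connection to HOST:PORT can be opened, non-zero otherwise.
check_tcp() {
  local host="$1" port="$2" wait_s="${3:-5}"
  # bash's /dev/tcp pseudo-device opens a TCP socket;
  # timeout bounds the attempt so a silently dropped packet does not hang.
  timeout "$wait_s" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Example usage (run on a cluster node; host and port are taken from the question):
#   if check_tcp RUKSQLRS01 1433; then
#     echo "SQL Server port is reachable"
#   else
#     echo "cannot reach RUKSQLRS01:1433 (check DNS, firewall, or VPN)"
#   fi
```

If the check fails for the hostname but succeeds for the server's IP address, the problem is name resolution, and putting the IP into JDBC_STR should be enough. If it fails for both, the firewall rule or VPN described above is still missing.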
https://stackoverflow.com/questions/69284258