I've searched for ages and still can't find a complete example of reading a Hive database from Spark. Could some kind soul share one?
1
liprais 2016-03-18 09:00:15 +08:00 via iPhone
Have you read the documentation?
Copy Hive's configuration files into Spark's conf/ directory, then create a HiveContext and you're done.
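That "copy the config" step can be sketched as a single shell command; `$HIVE_HOME` and `$SPARK_HOME` below are placeholder install paths, not something stated in the thread:

```shell
# Let Spark's HiveContext find the existing Hive metastore by
# copying Hive's config into Spark's conf/ directory.
# $HIVE_HOME and $SPARK_HOME are placeholders for your install paths.
cp "$HIVE_HOME/conf/hive-site.xml" "$SPARK_HOME/conf/"
```

Without hive-site.xml in place, HiveContext spins up a local, empty Derby-backed metastore instead of connecting to your existing one.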
2
anonymoustian OP @liprais Could you be more specific? I read the Spark docs, but there are only a sentence or two about this.
3
liprais 2016-03-18 09:26:16 +08:00
@anonymoustian
SparkConf conf = new SparkConf().setAppName("JavaRFormulaExample");
JavaSparkContext jsc = new JavaSparkContext(conf);
HiveContext hiveContext = new HiveContext(jsc);
4
anonymoustian OP
5
liprais 2016-03-18 11:02:26 +08:00
@anonymoustian
"Configuration of Hive is done by placing your hive-site.xml, core-site.xml (for security configuration), hdfs-site.xml (for HDFS configuration) file in conf/. Please note when running the query on a YARN cluster (cluster mode), the datanucleus jars under the lib directory and hive-site.xml under conf/ directory need to be available on the driver and all executors launched by the YARN cluster. The convenient way to do this is adding them through the --jars option and --file option of the spark-submit command."

Copy those three files (hive-site.xml, core-site.xml, hdfs-site.xml) into Spark's conf/ directory. Then read and write like this:

// sc is an existing JavaSparkContext.
HiveContext sqlContext = new org.apache.spark.sql.hive.HiveContext(sc.sc());

// Queries are expressed in HiveQL.
sqlContext.sql("select * from YOUR_HIVE_TABLE_NAME").collect();
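For the YARN cluster-mode case the quote describes, the submit command might look like this sketch. The class name, app jar, and datanucleus jar versions are placeholders (check your Spark lib/ directory for the actual file names), and note the real spark-submit flag is --files, plural:

```shell
# Hedged sketch: ship hive-site.xml and the datanucleus jars to the
# driver and executors in YARN cluster mode.
# com.example.YourApp, your-app.jar, and the jar versions are
# placeholders; list the jars actually present in $SPARK_HOME/lib.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.YourApp \
  --jars "$SPARK_HOME/lib/datanucleus-api-jdo-3.2.6.jar,$SPARK_HOME/lib/datanucleus-core-3.2.10.jar,$SPARK_HOME/lib/datanucleus-rdbms-3.2.9.jar" \
  --files "$SPARK_HOME/conf/hive-site.xml" \
  your-app.jar
```

In client mode this shipping step is unnecessary, since the driver runs on the machine where conf/ already holds the files.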
6
anonymoustian OP @liprais Where is Spark's conf directory?
7
liprais 2016-03-18 13:07:48 +08:00