<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>pyspark &#8211; Other Things</title>
	<atom:link href="https://blog.adamzolo.com/tag/pyspark/feed/" rel="self" type="application/rss+xml" />
	<link>https://blog.adamzolo.com</link>
	<description>Blog about Things by Adam Zolotarev</description>
	<lastBuildDate>Thu, 28 Mar 2019 14:02:50 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.1</generator>
	<item>
		<title>Starting with PySpark &#8211; configuration</title>
		<link>https://blog.adamzolo.com/starting-with-pyspark-configuration/</link>
					<comments>https://blog.adamzolo.com/starting-with-pyspark-configuration/#respond</comments>
		
		<dc:creator><![CDATA[Adam Zolo]]></dc:creator>
		<pubDate>Thu, 28 Mar 2019 13:34:41 +0000</pubDate>
				<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[pyspark]]></category>
		<guid isPermaLink="false">http://blog.adamzolo.com/?p=835</guid>

					<description><![CDATA[PySpark is a pain to configure. For this guide I am using macOS Mojave, Spark version 2.4.0, and Python 3. Start by downloading Spark from https://spark.apache.org/downloads.html. Extract it anywhere; your home directory is fine. Install the Java SDK. Important: some later versions don&#8217;t seem to be compatible with Spark 2.4.0. Version 8 seems to work: https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html Install pyspark:&#8230;<p><a class="more-link" href="https://blog.adamzolo.com/starting-with-pyspark-configuration/" title="Continue reading &#8216;Starting with PySpark &#8211; configuration&#8217;">Continue reading <span class="meta-nav">&#8594;</span></a></p>]]></description>
										<content:encoded><![CDATA[
<p>PySpark is a pain to configure.</p>



<p>For this guide I am using:<br>macOS Mojave<br>Spark version 2.4.0<br>Python 3</p>



<p>Start by downloading Spark from <a href="https://spark.apache.org/downloads.html">https://spark.apache.org/downloads.html</a>. Extract the archive anywhere; the configuration in this guide assumes it ends up in your home directory as ~/spark-2.4.0-bin-hadoop2.7.</p>



<p>Install the Java SDK. Important: some later versions don&#8217;t seem to be compatible with Spark 2.4.0. Version 8 seems to work: <a href="https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html">https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html</a></p>
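
<p>If you are not sure which Java version is active, the check can be sketched in Python. This is a minimal sketch, not part of the original setup: the helper names <strong>java_major_version</strong> and <strong>installed_java_major</strong> are my own, and it assumes <strong>java -version</strong> prints a quoted version string such as "1.8.0_201" (Java 8 and earlier) or "11.0.2" (Java 9 and later).</p>

```python
import re
import subprocess


def java_major_version(version_output):
    """Extract the major Java version from `java -version` output.

    Java 8 and earlier report strings like "1.8.0_201"; Java 9+ report "11.0.2".
    Returns None when no version string is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy scheme: "1.8" means Java 8, so the second number is the major version.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major():
    """Run `java -version` and parse it; returns None if java cannot be run."""
    try:
        result = subprocess.run(
            ["java", "-version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError:
        return None
    # `java -version` usually writes to stderr, so look at both streams.
    return java_major_version(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("Could not detect a Java version")
    elif major == 8:
        print("Java 8 detected")
    else:
        print("Java", major, "detected; Spark 2.4.0 may not run on it")
```

<p>Run it from the same terminal you will launch pyspark from, since that is the Java the launcher will pick up.</p>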



<p>Install pyspark: <strong>pip install pyspark</strong></p>
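
<p>pip may install into a different interpreter than the python3 that Spark will use, so it can be worth confirming that the package is visible. A minimal sketch, assuming you run it with that same python3; the helper name <strong>is_importable</strong> is hypothetical and only the standard library is used.</p>

```python
import importlib.util


def is_importable(module_name):
    """Return True if the current interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_importable("pyspark"):
        print("pyspark is importable")
    else:
        print("pyspark not found; try: python3 -m pip install pyspark")
```

<p>The check only locates the package; it does not start Spark or touch Java.</p>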



<p>Add the following to your ~/.zshrc or ~/.bash_profile, depending on which shell you use:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: plain; title: ; notranslate">
# Location of the extracted Spark download
export SPARK_HOME=~/spark-2.4.0-bin-hadoop2.7
export SPARK_PATH=$SPARK_HOME
export PATH=$SPARK_HOME/bin:$PATH

# Start the driver in a Jupyter notebook; workers use python3
export PYSPARK_DRIVER_PYTHON=&quot;jupyter&quot;
export PYSPARK_DRIVER_PYTHON_OPTS=&quot;notebook&quot;
export PYSPARK_PYTHON=python3
alias snotebook=&#039;$SPARK_PATH/bin/pyspark --master local&#x5B;2]&#039;

# Spark 2.4.0 ships py4j 0.10.7; the zip name must match the file in $SPARK_HOME/python/lib
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.7-src.zip:$PYTHONPATH
export PYSPARK_SUBMIT_ARGS=&quot;--master local&#x5B;2] pyspark-shell&quot;

# Select Java 8 explicitly, since later versions may not work with Spark 2.4.0
export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)
</pre></div>
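
<p>The py4j file name in PYTHONPATH has to match whatever your Spark download actually ships, and a wrong name is easy to miss. Here is a minimal sketch for looking it up, assuming the standard layout of the Spark download with the zip under python/lib; the helper names <strong>find_py4j_zips</strong> and <strong>describe_spark_home</strong> are hypothetical.</p>

```python
import glob
import os


def find_py4j_zips(spark_home):
    """List py4j source zips bundled under <spark_home>/python/lib."""
    pattern = os.path.join(spark_home, "python", "lib", "py4j-*-src.zip")
    return sorted(glob.glob(pattern))


def describe_spark_home(environ=None):
    """Summarize whether SPARK_HOME is set and which py4j zip it ships."""
    environ = os.environ if environ is None else environ
    spark_home = environ.get("SPARK_HOME")
    if not spark_home:
        return "SPARK_HOME is not set"
    spark_home = os.path.expanduser(spark_home)
    if not os.path.isdir(spark_home):
        return "SPARK_HOME does not exist: " + spark_home
    zips = find_py4j_zips(spark_home)
    if not zips:
        return "No py4j zip found under " + spark_home
    return "py4j zip for PYTHONPATH: " + zips[0]


if __name__ == "__main__":
    print(describe_spark_home())
```

<p>Run it in a fresh terminal after editing your profile; if the reported file name differs from the one in your PYTHONPATH line, update the export to match.</p>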


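<p>Remember to reload your shell configuration so the changes take effect: open a new terminal window, or run <strong>source ~/.zshrc</strong> (or <strong>source ~/.bash_profile</strong>).</p>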



<p>Now, when you enter <strong>pyspark</strong> in your console, it&#8217;ll open a Jupyter notebook, because PYSPARK_DRIVER_PYTHON is set to jupyter.</p>



<p>You can check that a Spark context is available by running this in your new notebook:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: plain; title: ; notranslate">
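from pyspark import SparkContext
# Reuse the context created by the pyspark launcher, or create one if none exists
sc = SparkContext.getOrCreate()
print(sc.version)  # prints the version of Spark that is actually in use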
</pre></div>





<p>References: <a href="https://medium.com/@yajieli/installing-spark-pyspark-on-mac-and-fix-of-some-common-errors-355a9050f735">https://medium.com/@yajieli/installing-spark-pyspark-on-mac-and-fix-of-some-common-errors-355a9050f735</a><br></p>
]]></content:encoded>
					
					<wfw:commentRss>https://blog.adamzolo.com/starting-with-pyspark-configuration/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
